Although personalized automatic speech recognition (ASR) models have recently been improved to recognize even severely impaired speech, model performance may degrade over time for persons with degenerating speech. The aims of this study were to (1) analyze the change of performance of ASR over time in individuals with degrading speech, and (2) explore mitigation strategies to optimize recognition throughout disease progression. Speech was recorded by four individuals with degrading speech due to amyotrophic lateral sclerosis (ALS). Word error rates (WER) across recording sessions were computed for three ASR models: Unadapted Speaker Independent (U-SI), Adapted Speaker Independent (A-SI), and Adapted Speaker Dependent (A-SD or personalized). The performance of all models degraded significantly over time as speech became more impaired, but the A-SD model improved markedly when updated with recordings from the severe stages of speech progression. Recording additional utterances early in the disease before significant speech degradation did not improve the performance of A-SD models. This emphasizes the importance of continuous recording (and model retraining) when providing personalized models for individuals with progressive speech impairments.View details
Frontiers in Marine Science, vol. 8 (2021), pp. 165
Passive acoustic monitoring is a well-established tool for researching the occurrence, movements, and ecology of a wide variety of marine mammal species. Advances in hardware and data collection have exponentially increased the volumes of passive acoustic data collected, such that discoveries are now limited by the time required to analyze rather than collect the data. In order to address this limitation, we trained a deep convolutional neural network (CNN) to identify humpback whale song in over 187,000 h of acoustic data collected at 13 different monitoring sites in the North Pacific over a 14-year period. The model successfully detected 75 s audio segments containing humpback song with an average precision of 0.97 and average area under the receiver operating characteristic curve (AUC-ROC) of 0.992. The model output was used to analyze spatial and temporal patterns of humpback song, corroborating known seasonal patterns in the Hawaiian and Mariana Islands, including occurrence at remote monitoring sites beyond well-studied aggregations, as well as novel discovery of humpback whale song at Kingman Reef, at 5∘ North latitude. This study demonstrates the ability of a CNN trained on a small dataset to generalize well to a highly variable signal type across a diverse range of recording and noise conditions. We demonstrate the utility of active learning approaches for creating high-quality models in specialized domains where annotations are rare. These results validate the feasibility of applying deep learning models to identify highly variable signals across broad spatial and temporal scales, enabling new discoveries through combining large datasets with cutting edge tools.View details
No Results Found
We're always looking for more talented, passionate people.