Michael P Brenner

Michael P Brenner

Michael is an applied mathematician, interested in the interface between machine learning and science.
Authored Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
    Ariel Goldstein
    Avigail Grinstein-Dabush
    Haocheng Wang
    Zhuoqiao Hong
    Bobbi Aubrey
    Samuel A. Nastase
    Zaid Zada
    Eric Ham
    Harshvardhan Gazula
    Eliav Buchnik
    Werner Doyle
    Sasha Devore
    Patricia Dugan
    Roi Reichart
    Daniel Friedman
    Orrin Devinsky
    Adeen Flinker
    Uri Hasson
    Nature Communications (2024)
    Preview abstract Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we densely record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. We demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns using stringent zero-shot mapping. The common geometric patterns allow us to predict the brain embedding of a given left-out word in IFG based solely on its geometrical relationship to other nonoverlapping words in the podcast. Furthermore, we show that contextual embeddings better capture the geometry of IFG embeddings than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain. View details
    Preview abstract We developed dysarthric speech intelligibility classifiers on 551,176 disordered speech samples contributed by a diverse set of 468 speakers, with a range of self-reported speaking disorders and rated for their overall intelligibility on a fivepoint scale. We trained three models following different deep learning approaches and evaluated them on ∼94K utterances from 100 speakers. We further found the models to generalize well (without further training) on the TORGO database (100% accuracy), UASpeech (0.93 correlation), ALS-TDI PMP (0.81 AUC) datasets as well as on a dataset of realistic unprompted speech we gathered (106 dysarthric and 76 control speakers, ∼2300 samples). View details
    Context-Aware Abbreviation Expansion Using Large Language Models
    Ajit Narayanan
    Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2022(2022) (to appear)
    Preview abstract Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language models (LLMs). Through zero-shot, few-shot, and fine-tuning experiments on four public conversation datasets, we show that for replies to the initial turn of a dialog, an LLM with 64B parameters is able to exactly expand over 70% of phrases with abbreviation length up to 10, leading to an effective keystroke saving rate of up to about 77% on these exact expansions. Including a small amount of context in the form of a single conversation turn more than doubles abbreviation expansion accuracies compared to having no context, an effect that is more pronounced for longer phrases. Additionally, the robustness of models against typo noise can be enhanced through fine-tuning on noisy data. View details
    Preview abstract Amyotrophic Lateral Sclerosis (ALS) disease progression is usually measured using the subjective, questionnaire-based revised ALS Functional Rating Scale (ALSFRS-R). A purely objective measure for tracking disease progression would be a powerful tool for evaluating real-world drug effectiveness, efficacy in clinical trials, as well as identifying participants for cohort studies. Here we develop a machine learning based objective measure for ALS disease progression, based on voice samples and accelerometer measurements. The ALS Therapy Development Institute (ALS-TDI) collected a unique dataset of voice and accelerometer samples from consented individuals - 584 people living with ALS over four years. Participants carried out prescribed speaking and limb-based tasks. 542 participants contributed 5814 voice recordings, and 350 contributed 13009 accelerometer samples, while simultaneously measuring ALSFRS-R. Using the data from 475 participants, we trained machine learning (ML) models, correlating voice with bulbar-related FRS scores and accelerometer with limb related scores. On the test set (n=109 participants) the voice models achieved an AUC of 0.86 (95% CI, 0.847-0.884) , whereas the accelerometer models achieved a median AUC of 0.73 . We used the models and self-reported ALSFRS-R scores to evaluate the real-world effects of edaravone, a drug recently approved for use in ALS, on 54 test participants. In the test cohort, the digital data input into the ML models produced objective measures of progression rates over the duration of the study that were consistent with self-reported scores. This demonstrates the value of these tools for assessing both disease progression and potentially drug effects. In this instance, outcomes from edaravone treatment, both self-reported and digital-ML, resulted in highly variable outcomes from person to person. View details
    Shared computational principles for language processing in humans and deep language models
    Ariel Goldstein
    Zaid Zada
    Eliav Buchnik
    Amy Price
    Bobbi Aubrey
    Samuel A. Nastase
    Harshvardhan Gazula
    Gina Choe
    Aditi Rao
    Catherine Kim
    Colton Casto
    Lora Fanda
    Werner Doyle
    Daniel Friedman
    Patricia Dugan
    Lucia Melloni
    Roi Reichart
    Sasha Devore
    Adeen Flinker
    Liat Hasenfratz
    Omer Levy,
    Kenneth A. Norman
    Orrin Devinsky
    Uri Hasson
    Nature Neuroscience(2022)
    Preview abstract Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest that autoregressive DLMs provide a new and biologically feasible computational framework for studying the neural basis of language. View details
    COVID-19 Open-Data: a global-scale spatially granular meta-dataset for coronavirus disease
    Oscar Wahltinez
    Aurora Cheung
    Ruth Alcantara
    Donny Cheung
    Anthony Erlinger
    Matt Lee
    Pranali Yawalkar
    Paula Lê
    Ofir Picazo Navarro
    Scientific Data(2022)
    Preview abstract This paper introduces the COVID-19 Open Dataset (COD), available at goo.gle/covid-19-open-data. A static copy is of the dataset is also available at https://doi.org/10.6084/m9.figshare.c.5399355. This is a very large “meta-dataset” of COVID-related data, containing epidemiological information, from 22,579 unique locations within 232 different countries and independent territories. For 62 of these countries we have state-level data, and for 23 of these countries we have county-level data. For 15 countries, COD includes cases and deaths stratified by age or sex. COD also contains information on hospitalizations, vaccinations, and other relevant factors such as mobility, non-pharmaceutical interventions and static demographic attributes. Each location is tagged with a unique identifier so that these different types of information can be easily combined. The data is automatically extracted from 121 different authoritative sources, using scalable open source software. This paper describes the format and construction of the dataset, and includes a preliminary statistical analysis of its content, revealing some interesting patterns. View details
    Preview abstract Speech samples from over 1000 individuals with impaired speech have been submitted for Project Euphonia, aimed at improving automated speech recognition for atypical speech. We provide an update on the contents of the corpus, which recently passed 1 million utterances, and review key lessons learned from this project. The reasoning behind decisions such as phrase set composition, prompted vs extemporaneous speech, metadata and data quality efforts are explained based on findings from both technical and user-facing research. View details
    Preview abstract Severe speech impairments limit the precision and range of producible speech sounds. As a result, generic automatic speech recognition (ASR) and keyword spotting (KWS) systems are unable to accurately recognize the utterances produced by individuals with severe speech impairments. This paper describes an approach in which simple speech sounds, namely isolated open vowels (e.g., /a/), are used in lieu of more motorically-demanding keywords. A neural network (NN) is trained to detect these isolated open vowels uttered by individuals with speech impairments against background noise. The NN is trained with a two-phase approach. The pre-training phase uses samples from unimpaired speakers along with samples of background noises and unrelated speech; then the fine-tuning stage uses samples of vowel samples collected from individuals with speech impairments. This model can be built into an experimental mobile app that allows users to activate preconfigured actions such as alerting caregivers. Preliminary user testing indicates the model has the potential to be a useful and flexible emergency communication channel for motor- and speech-impaired individuals. View details
    Preview abstract Automatic classification of disordered speech can provide an objective tool for identifying the presence and severity of a speech impairment. Classification approaches can also help identify hard-to-recognize speech samples to teach ASR systems about the variable manifestations of impaired speech. Here, we develop and compare different deep learning techniques to classify the intelligibility of disordered speech on selected phrases. We collected samples from a diverse set of 661 speakers with a variety of self-reported disorders speaking 29 words or phrases, which were rated by speech-language pathologists for their overall intelligibility using a five-point Likert scale. We then evaluated classifiers developed using 3 approaches: (1) a convolutional neural network (CNN) trained for the task, (2) classifiers trained on non-semantic speech representations from CNNs that used an unsupervised objective [1], and (3) classifiers trained on the acoustic (encoder) embeddings from an ASR system trained on typical speech [2]. We find that the ASR encoder’s embeddings considerably outperform the other two on detecting and classifying disordered speech. Further analysis shows that the ASR embeddings cluster speech by the spoken phrase, while the non-semantic embeddings cluster speech by speaker. Also, longer phrases are more indicative of intelligibility deficits than single words. View details
    Preview abstract Objective. This study aimed to (1) evaluate the performance of personalized Automatic Speech Recognition (ASR) models on disordered speech samples representing a wide range of etiologies and speech severities, and (2) compare the accuracy of these models to that of speaker-independent ASR models developed on and for typical speech as well as expert human listeners. Methods. 432 individuals with self-reported disordered speech recorded at least 300 short phrases using a web-based application. Word error rates (WER) were computed using three different ASR models and expert human transcribers. Metadata were collected to evaluate the potential impact of participant, atypical speech, and technical factors on recognition accuracy. Results. The accuracy of personalized models for recognizing disordered speech was high (WER: 4.6%), and significantly better than speaker-independent models (WER: 31%). Personalized models also outperformed human transcribers (WER gain: 9%) with relative gains in accuracy as high as 80%. The most significant gain in recognition performance was for the most severely affected speakers. Low SNR and fewer training utterances adversely affected recognition even for speakers with mild speech impairments. Conclusions. Personalized ASR models have significant potential for improving communication for persons with impaired speech. View details