Amir Feder
Authored Publications
Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
Ariel Goldstein
Avigail Grinstein-Dabush
Haocheng Wang
Zhuoqiao Hong
Bobbi Aubrey
Samuel A. Nastase
Zaid Zada
Eric Ham
Harshvardhan Gazula
Eliav Buchnik
Werner Doyle
Sasha Devore
Patricia Dugan
Roi Reichart
Daniel Friedman
Orrin Devinsky
Adeen Flinker
Uri Hasson
Nature Communications (2024)
Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. We demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns using stringent zero-shot mapping. The common geometric patterns allow us to predict the brain embedding of a given left-out word in IFG based solely on its geometrical relationship to other nonoverlapping words in the podcast. Furthermore, we show that contextual embeddings better capture the geometry of IFG embeddings than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.
LLMs Accelerate Annotation for Medical Information Extraction
Akshay Goel
Almog Gueta
Omry Gilon
Chang Liu
Xiaohong Hao
Bolous Jaber
Shashir Reddy
Rupesh Kartha
Jean Steiner
Machine Learning for Health (ML4H), PMLR (2023)
The unstructured nature of clinical notes within electronic health records often conceals vital patient-related information, making it challenging to access or interpret. To uncover this hidden information, specialized Natural Language Processing (NLP) models are required. However, training these models necessitates large amounts of labeled data, a process that is both time-consuming and costly when relying solely on human experts for annotation. In this paper, we propose an approach that combines Large Language Models (LLMs) with human expertise to create an efficient method for generating ground truth labels for medical text annotation. By utilizing LLMs in conjunction with human annotators, we significantly reduce the human annotation burden, enabling the rapid creation of labeled datasets. We rigorously evaluate our method on a medical information extraction task, demonstrating that our approach not only substantially cuts down on human intervention but also maintains high accuracy. The results highlight the potential of using LLMs to improve the utilization of unstructured clinical data, allowing for the swift deployment of tailored NLP solutions in healthcare.
The reliance of text classifiers on spurious correlations can lead to poor generalization at deployment, raising concerns about their use in safety-critical domains such as healthcare. In this work, we propose to use counterfactual data augmentation, guided by knowledge of the causal structure of the data, to simulate interventions on spurious features and to learn more robust text classifiers. We show that this strategy is appropriate in prediction problems where the label is spuriously correlated with an attribute. Under the assumptions of such problems, we discuss the favorable sample complexity of counterfactual data augmentation, compared to importance re-weighting. Pragmatically, we match examples using auxiliary data, based on diff-in-diff methodology, and use a large language model (LLM) to represent a conditional probability of text. Through extensive experimentation on learning caregiver-invariant predictors of clinical diagnoses from medical narratives and on semi-synthetic data, we demonstrate that our method for simulating interventions improves out-of-distribution (OOD) accuracy compared to baseline invariant learning algorithms.
Shared computational principles for language processing in humans and deep language models
Ariel Goldstein
Zaid Zada
Eliav Buchnik
Amy Price
Bobbi Aubrey
Samuel A. Nastase
Harshvardhan Gazula
Gina Choe
Aditi Rao
Catherine Kim
Colton Casto
Lora Fanda
Werner Doyle
Daniel Friedman
Patricia Dugan
Lucia Melloni
Roi Reichart
Sasha Devore
Adeen Flinker
Liat Hasenfratz
Omer Levy
Kenneth A. Norman
Orrin Devinsky
Uri Hasson
Nature Neuroscience (2022)
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest that autoregressive DLMs provide a new and biologically feasible computational framework for studying the neural basis of language.
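The "post-onset surprise" in principle (2) of the abstract above is conventionally measured as the negative log-probability the model assigned to the incoming word before its onset. A minimal sketch, with a made-up vocabulary and probabilities purely for illustration:

```python
import math

# Pre-onset next-word prediction over a toy vocabulary (illustrative values only).
pred_probs = {"dog": 0.6, "cat": 0.3, "rug": 0.1}

def surprisal(probs, actual_word):
    """Post-onset surprise: -log2 p(actual word | context), in bits."""
    return -math.log2(probs[actual_word])

low_surprise = surprisal(pred_probs, "dog")   # expected word -> small surprise
high_surprise = surprisal(pred_probs, "rug")  # unexpected word -> large surprise
```

The same quantity can be computed for a human listener's implicit predictions or for a DLM's softmax output, which is what lets the two systems be compared on the same narrative.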
Building a Clinically-Focused Problem List From Medical Notes
Birju Patel
Cathy Cheung
Liwen Xu
Peter Clardy
Rachana Fellinger
LOUHI 2022: The 13th International Workshop on Health Text Mining and Information Analysis (2022)
Clinical notes often contain vital information not observed in other structured data, but their unstructured nature can lead to critical patient-related information being lost. To make sure this valuable information is utilized for patient care, algorithms that summarize notes into a problem list are often proposed. Focusing on identifying medically-relevant entities in the free-form text, these solutions are often detached from a canonical ontology and do not allow downstream use of the detected text-spans. As a solution, we present here a system for generating a canonical problem list from medical notes, consisting of two major stages. At the first stage, annotation, we use a transformer model to detect all clinical conditions which are mentioned in a single note. These clinical conditions are then grounded to a predefined ontology, and are linked to spans in the text. At the second stage, summarization, we aggregate over the set of clinical conditions detected in all of the patient's notes, and produce a concise patient summary that organizes their important conditions.
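The two-stage shape of the system described above (detect and ground mentions, then aggregate across notes) can be sketched as follows. The regex "detector" and the tiny ontology are toy stand-ins for the transformer model and the predefined ontology; none of the names here come from the paper.

```python
import re

# Toy ontology mapping surface forms to canonical condition names (illustrative).
ONTOLOGY = {"htn": "Hypertension", "hypertension": "Hypertension",
            "dm2": "Type 2 diabetes", "diabetes": "Type 2 diabetes"}

def annotate(note):
    """Stage 1: detect condition mentions as text spans and ground them."""
    spans = []
    for m in re.finditer(r"[a-z0-9]+", note.lower()):
        if m.group() in ONTOLOGY:
            spans.append((m.start(), m.end(), ONTOLOGY[m.group()]))
    return spans

def summarize(notes):
    """Stage 2: aggregate grounded conditions over all notes into a problem list."""
    problems = set()
    for note in notes:
        problems.update(cond for _, _, cond in annotate(note))
    return sorted(problems)

problem_list = summarize(["Pt with HTN, stable.", "History of DM2 and hypertension."])
# → ['Hypertension', 'Type 2 diabetes']
```

Grounding to a canonical ontology is what makes the aggregation in stage 2 possible: "HTN" and "hypertension" collapse to one problem rather than two free-text spans.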
Section Classification in Clinical Notes with Multi-task Transformers
Fan Zhang
LOUHI 2022: The 13th International Workshop on Health Text Mining and Information Analysis (2022)
Clinical notes are the backbone of electronic health records, often containing vital information not observed in other structured data. Unfortunately, the unstructured nature of clinical notes can lead to critical patient-related information being lost. Algorithms that organize clinical notes into distinct sections are often proposed in order to allow medical professionals to better access information in a given note. These algorithms, however, often assume a given partition over the note, and only classify section types given this information. In this paper, we propose a multi-task solution for note sectioning, where one model can identify context changes and label each section with its medically-relevant title. Results on in-distribution (MIMIC-III) and out-of-distribution (private held-out) datasets reveal that our multi-task approach can successfully identify note sections across different hospital systems.
Structured Understanding of Assessment and Plans in Clinical Documentation
Doron Yaya-Stupp
Ronnie Barequet
I-Ching Lee
Eyal Oren
Eran Ofek
Alvin Rajkomar
medRxiv (2022)
Physicians record their detailed thought-processes about diagnoses and treatments as unstructured text in a section of a clinical note called the assessment and plan. This information is more clinically rich than structured billing codes assigned for an encounter but harder to reliably extract given the complexity of clinical language and documentation habits. We describe and release a dataset containing annotations of 579 admission and progress notes from the publicly available and de-identified MIMIC-III ICU dataset with over 30,000 labels identifying active problems, their assessment, and the category of associated action items (e.g. medication, lab test). We also propose deep-learning based models that approach human performance, with an F1 score of 0.88. We found that by employing weak supervision and domain-specific data-augmentation, we could improve generalization across departments and reduce the number of human labeled notes without sacrificing performance.
Useful Confidence Measures: Beyond the Max Score
NeurIPS 2022 Workshop on Distribution Shifts (DistShift) (2022) (to appear)
An important component in deploying machine learning (ML) in safety-critical applications is having a reliable measure of confidence in the ML's predictions. For a classifier $f$ producing a probability vector $f(x)$ over the candidate classes, the confidence is typically taken to be $\max_i f(x)_i$. This approach is potentially limited, as it disregards the rest of the probability vector. In this work, we derive several confidence measures that depend on information beyond the maximum score, such as margin-based and entropy-based measures, and empirically evaluate their usefulness. We focus on NLP tasks and Transformer-based models. We show that in the "out of the box" regime (where the scores of $f$ are used as is), using only the maximum score to inform the confidence measure is highly suboptimal. In the post-processing regime (where the scores of $f$ can be improved using additional held-out data), this remains true (though the differences are less pronounced), with entropy-based confidence emerging as a surprisingly useful measure.
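The three families of measures named in the abstract above (max score, margin, entropy) are all simple functions of the probability vector $f(x)$. A minimal sketch, with toy probability vectors as assumptions:

```python
import numpy as np

def confidence_measures(p):
    """Confidence measures computed from a probability vector p over classes."""
    p = np.asarray(p, dtype=float)
    top2 = np.sort(p)[-2:]  # two largest scores, ascending
    return {
        "max_score": float(p.max()),                # the usual max-score confidence
        "margin": float(top2[1] - top2[0]),         # gap between top-1 and top-2
        "neg_entropy": float(np.sum(p * np.log(p + 1e-12))),  # higher = more peaked
    }

peaked = confidence_measures([0.9, 0.05, 0.05])   # confident prediction
flat = confidence_measures([0.4, 0.35, 0.25])     # uncertain prediction
```

Margin and negative entropy both use mass the max score ignores: a vector of [0.4, 0.35, 0.25] and one of [0.4, 0.3, 0.3] share the same max score but differ in margin and entropy.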
Learning and Evaluating a Differentially Private Pre-trained Language Model
Shlomo Hoory
Avichai Tendler
Findings of the Association for Computational Linguistics: EMNLP 2021, Association for Computational Linguistics, Punta Cana, Dominican Republic, pp. 1178-1189
Contextual language models have led to significantly better results on a plethora of language understanding tasks, especially when pre-trained on the same data as the downstream task. While this additional pre-training usually improves performance, it often leads to information leakage and therefore risks the privacy of individuals mentioned in the training data. One method to guarantee the privacy of such individuals is to train a differentially private model, but this usually comes at the expense of model performance. Moreover, it is hard to tell, given a privacy parameter $\epsilon$, what the effect was on the trained representation and whether it maintained relevant information while improving privacy. To improve privacy and guide future practitioners and researchers, we demonstrate here how to train a differentially private pre-trained language model (i.e., BERT) with a privacy guarantee of $\epsilon=0.5$ with only a small degradation in performance. We experiment on a dataset of clinical notes, training a model on an entity extraction (EE) task and comparing it to a similar model trained without differential privacy. Finally, we present a series of experiments showing how to interpret the differentially private representation and understand the information lost and maintained in this process.
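Differentially private training of the kind the abstract above describes is standardly done with DP-SGD: clip each example's gradient to bound its influence, then add calibrated Gaussian noise to the aggregate. The conceptual single step below uses made-up constants; a real DP pre-training run (e.g., of BERT) would also use a privacy accountant to track the cumulative $\epsilon$ budget, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.1):
    """One conceptual DP-SGD update: clip per-example gradients, add noise, average."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds clip_norm (bounds sensitivity).
        clipped.append(g * min(1.0, clip_norm / norm))
    # Gaussian noise scaled to the clipping norm masks any single example's contribution.
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        scale=noise_mult * clip_norm, size=per_example_grads[0].shape)
    return noisy_sum / len(per_example_grads)

grads = [rng.normal(size=4) for _ in range(32)]  # toy per-example gradients
update = dp_sgd_step(grads)
```

The tension the paper studies falls out of these two knobs: tighter clipping and more noise give a smaller $\epsilon$ but degrade the learned representation.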
Active Deep Learning to Detect Demographic Traits in Free-Form Clinical Notes
Danny Vainstein
Roni Rosenfeld
Tzvika Hartman
Journal of Biomedical Informatics (2020)
The free-form portions of clinical notes are a significant source of information for research, but before they can be used, they must be de-identified to protect patients' privacy. De-identification efforts have focused on known identifier types (names, ages, dates, addresses, IDs, etc.). However, a note can contain residual "Demographic Traits" (DTs), unique enough to re-identify the patient when combined with other such facts. Here we examine whether any residual risks remain after removing these identifiers. After manually annotating over 140,000 words worth of medical notes, we found no remaining directly identifying information, and a low prevalence of demographic traits, such as marital status or housing type. We developed an annotation guide to the discovered DTs and used it to label MIMIC-III and i2b2-2006 clinical notes as test sets. We then designed a "bootstrapped" active learning iterative process for identifying DTs: we tentatively labeled as positive all sentences in the DT-rich note sections, used these to train a binary classifier, manually corrected acute errors, and retrained the classifier. This train-and-correct process may be iterated. Our active learning process significantly improved the classifier's accuracy. Moreover, our BERT-based model outperformed non-neural models when trained on both tentatively labeled data and manually relabeled examples. To facilitate future research and benchmarking, we also produced and made publicly available our human-annotated DT-tagged datasets. We conclude that directly identifying information is virtually non-existent in the multiple medical note types we investigated. Demographic traits are present in medical notes, but can be detected with high accuracy using a cost-effective human-in-the-loop active learning process, and redacted if desired.
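The bootstrapped train-and-correct loop described above can be simulated end to end on synthetic data. Everything in this sketch is a toy stand-in: sentences are 1-D scores, the "classifier" is a learned threshold, the "DT-rich section" heuristic is noisy, and an oracle plays the role of the human annotator correcting the most confident disagreements each round.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
true_labels = rng.random(n) < 0.2                       # a minority of true DT sentences
scores = rng.normal(loc=np.where(true_labels, 2.0, 0.0))  # DT sentences score higher

# Step 1: tentative labels -- everything from "DT-rich sections" marked positive.
in_dt_rich_section = scores + rng.normal(size=n) > 0.5  # noisy section heuristic
labels = in_dt_rich_section.copy()

def train_threshold(x, y):
    """Trivial 1-D classifier: pick the cutoff that best fits the current labels."""
    candidates = np.linspace(x.min(), x.max(), 101)
    accs = [((x > t) == y).mean() for t in candidates]
    return candidates[int(np.argmax(accs))]

# Steps 2-4, iterated: train, surface confident disagreements, correct, retrain.
for _ in range(3):
    t = train_threshold(scores, labels)
    preds = scores > t
    wrong = np.flatnonzero(preds != labels)
    # "Manually correct acute errors": fix the disagreements farthest from the cutoff.
    fix = wrong[np.argsort(-np.abs(scores[wrong] - t))[:20]]
    labels[fix] = true_labels[fix]                      # oracle stands in for the annotator

final_acc = ((scores > train_threshold(scores, labels)) == true_labels).mean()
```

The point of the loop is economy: the annotator only ever reviews the small set of confident disagreements each round, rather than relabeling the whole corpus.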