Avinatan Hassidim
Authored Publications
Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
Ariel Goldstein
Avigail Grinstein-Dabush
Haocheng Wang
Zhuoqiao Hong
Bobbi Aubrey
Samuel A. Nastase
Zaid Zada
Eric Ham
Harshvardhan Gazula
Eliav Buchnik
Werner Doyle
Sasha Devore
Patricia Dugan
Roi Reichart
Daniel Friedman
Orrin Devinsky
Adeen Flinker
Uri Hasson
Nature Communications (2024)
Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we densely record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. We demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns using stringent zero-shot mapping. The common geometric patterns allow us to predict the brain embedding of a given left-out word in IFG based solely on its geometrical relationship to other nonoverlapping words in the podcast. Furthermore, we show that contextual embeddings better capture the geometry of IFG embeddings than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.
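The zero-shot mapping idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the ridge regression, the fold count, the array shapes, and the synthetic data are not taken from the paper. The one property the sketch tries to preserve is that folds are split by word identity, so a held-out word never appears in the training set.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W minimizes ||XW - Y||^2 + alpha * ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def zero_shot_scores(model_emb, brain_emb, word_ids, n_folds=5, alpha=1.0, seed=0):
    """Correlate predicted and actual brain embeddings for held-out words.

    Folds are split by unique word identity, so no word in a test fold
    has any occurrence in the corresponding training fold.
    """
    rng = np.random.default_rng(seed)
    unique_words = rng.permutation(np.unique(word_ids))
    scores = np.full(len(word_ids), np.nan)
    for fold_words in np.array_split(unique_words, n_folds):
        test = np.isin(word_ids, fold_words)
        train = ~test
        # Center with training statistics only, to avoid leaking test data.
        x_mean = model_emb[train].mean(axis=0)
        y_mean = brain_emb[train].mean(axis=0)
        W = fit_ridge(model_emb[train] - x_mean, brain_emb[train] - y_mean, alpha)
        pred = (model_emb[test] - x_mean) @ W + y_mean
        for i, p in zip(np.flatnonzero(test), pred):
            scores[i] = np.corrcoef(p, brain_emb[i])[0, 1]
    return scores

# Synthetic stand-ins (hypothetical sizes): 400 word tokens drawn from 100 word types.
rng = np.random.default_rng(0)
n_tokens, d_model, d_brain = 400, 16, 32
word_ids = rng.integers(0, 100, size=n_tokens)
model_emb = rng.normal(size=(n_tokens, d_model))
true_map = rng.normal(size=(d_model, d_brain))
brain_emb = model_emb @ true_map + 0.5 * rng.normal(size=(n_tokens, d_brain))

scores = zero_shot_scores(model_emb, brain_emb, word_ids)
print("mean held-out correlation:", round(float(scores.mean()), 3))
```

Because the synthetic brain embeddings are a noisy linear function of the model embeddings, the held-out correlations are high here; real recordings would not be expected to behave this cleanly.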
Health AI Developer Foundations
Atilla Kiraly
Sebastien Baur
Kenneth Philbrick
Fereshteh Mahvar
Liron Yatziv
Tiffany Chen
Bram Sterling
Nick George
Fayaz Jamil
Jing Tang
Kai Bailey
Faruk Ahmed
Akshay Goel
Abbi Ward
Lin Yang
Shravya Shetty
Daniel Golden
Tim Thelin
Rory Pilgrim
Can "John" Kirmizi
arXiv (2024)
Robust medical Machine Learning (ML) models have the potential to revolutionize healthcare by accelerating clinical research, improving workflows and outcomes, and producing novel insights or capabilities. Developing such ML models from scratch is cost-prohibitive and requires substantial compute, data, and time (e.g., expert labeling). To address these challenges, we introduce Health AI Developer Foundations (HAI-DEF), a suite of pre-trained, domain-specific foundation models, tools, and recipes to accelerate building ML for health applications. The models cover various modalities and domains, including radiology (X-rays and computed tomography), histopathology, dermatological imaging, and audio. These models provide domain-specific embeddings that facilitate AI development with less labeled data, shorter training times, and reduced computational costs compared to traditional approaches. In addition, we utilize a common interface and style across these models, and prioritize usability to enable developers to integrate HAI-DEF efficiently. We present model evaluations across various tasks and conclude with a discussion of their application and evaluation, covering the importance of ensuring efficacy, fairness, and equity. Finally, while HAI-DEF and specifically the foundation models lower the barrier to entry for ML in healthcare, we emphasize the importance of validation with problem- and population-specific data for each desired usage setting. This technical report will be updated over time as more modalities and features are added.
View details
Preview abstract
Computing efficient traffic signal plans is often based on the amount of traffic in an intersection, its distribution over the various intersection movements and hours, and performance metrics such as traffic delay. In their simple and typical form, plans are fixed for the same hour across weekdays. This allows low operating costs without the need for traffic detection and monitoring tools. A critical factor in the potential efficiency of such plans is the similarity of traffic patterns over the days along each of the intersection movements. We refer to such similarity as the traffic stability of the intersection and define simple metrics to measure it based on traffic volume and traffic delay. In this paper, we propose an automatic probe-data-based method for city-wide estimation of traffic stability. We discuss how such measures can be used for signal planning, such as in selecting plan resolution or as an indication of which intersections can benefit from dynamic but expensive traffic detection tools. We also identify events of major changes in the traffic characteristics of an intersection. We demonstrate the framework by using real traffic statistics to study traffic stability in the city of Haifa across its 162 intersections. We study the impact of the time of day on stability, detect major changes in traffic, and find intersections with high and low stability.
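The abstract does not spell out its stability metrics, so the following is only a hypothetical illustration of what a volume-based measure could look like: the coefficient of variation of traffic volume across days, computed separately for each hour. The function name, the choice of statistic, and the example counts are all assumptions, not the authors' definitions.

```python
import numpy as np

def volume_variability(volumes):
    """Per-hour coefficient of variation of traffic volume across days.

    volumes: array of shape (n_days, n_hours) for one intersection movement.
    Lower values mean the hour looks more alike from day to day, which is
    the situation where a fixed signal plan is most likely to fit.
    """
    volumes = np.asarray(volumes, dtype=float)
    mean = volumes.mean(axis=0)
    std = volumes.std(axis=0)
    # Hours with zero mean traffic are assigned 0 by convention in this sketch.
    return np.divide(std, mean, out=np.zeros_like(mean), where=mean > 0)

# Made-up counts: three days, two hours per day.
stable_cv = volume_variability([[100, 200], [100, 200], [100, 200]])
variable_cv = volume_variability([[50, 200], [150, 200], [100, 200]])
print(stable_cv, variable_cv)
```

A delay-based variant could reuse the same function with per-hour delay measurements in place of volumes.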
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Paul Roit
Johan Ferret
Geoffrey Cideron
Matthieu Geist
Sertan Girgin
Léonard Hussenot
Nikola Momchev
Piotr Stanczyk
Nino Vieillard
Olivier Pietquin
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (2023), 6252–6272
Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. We use reinforcement learning with reference-free, textual-entailment rewards to optimize for factual consistency and explore the ensuing trade-offs, as improved consistency may come at the cost of less informative or more extractive summaries. Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience and conciseness of the generated summaries.
A Neural Encoder for Earthquake Rate Forecasting
Oleg Zlydenko
Brendan Meade
Alexandra Sharon Molchanov
Sella Nevo
Yohai Bar-Sinai
Scientific Reports (2023)
Forecasting the timing of earthquakes is a long-standing challenge. Moreover, it is still debated how to formulate this problem in a useful manner, or to compare the predictive power of different models. Here, we develop a versatile neural encoder of earthquake catalogs, and apply it to the fundamental problem of earthquake rate prediction, in the spatio-temporal point process framework. The epidemic-type aftershock sequence (ETAS) model effectively learns a small number of parameters to constrain assumed functional forms for the space and time relationships of earthquake sequences (e.g., Omori-Utsu law). Here we introduce learned spatial and temporal embeddings for point process earthquake forecast models that capture complex correlation structures. We demonstrate the generality of this neural representation as compared with the ETAS model using train-test data splits and how it enables the incorporation of additional geophysical information. In rate prediction tasks, the generalized model shows > 4% improvement in information gain per earthquake and the simultaneous learning of anisotropic spatial structures analogous to fault traces. The trained network can also be used to perform short-term prediction tasks, showing similar improvement while providing a 1,000-fold reduction in run-time.
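For readers unfamiliar with the terms in this abstract, here is a small sketch of two of them: the Omori-Utsu law, which gives the aftershock rate as K / (t + c)^p, and information gain per earthquake, taken here as the log-likelihood difference between a model and a reference divided by the number of events. The parameter defaults and the example log-likelihoods are illustrative assumptions, not values from the paper.

```python
import math

def omori_utsu_rate(t, K=1.0, c=0.01, p=1.1):
    """Aftershock rate at time t after a mainshock: K / (t + c) ** p."""
    return K / (t + c) ** p

def information_gain_per_event(loglik_model, loglik_reference, n_events):
    """Average log-likelihood improvement per earthquake over a reference model."""
    return (loglik_model - loglik_reference) / n_events

# The rate decays with time since the mainshock.
early = omori_utsu_rate(1.0)
late = omori_utsu_rate(10.0)

# Made-up log-likelihoods for a test catalog of 50 events.
gain = information_gain_per_event(-90.0, -100.0, 50)
print(early, late, gain)
```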
Caravan - A global community dataset for large-sample hydrology
Frederik Kratzert
Nans Addor
Tyler Erickson
Martin Gauch
Lukas Gudmundsson
Daniel Klotz
Sella Nevo
Guy Shalev
Scientific Data, 10 (2023), pp. 61
High-quality datasets are essential to support hydrological science and modeling. Several CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) datasets exist for specific countries or regions; however, these datasets lack standardization, which makes global studies difficult. This paper introduces a dataset called Caravan (a series of CAMELS) that standardizes and aggregates seven existing large-sample hydrology datasets. Caravan includes meteorological forcing data, streamflow data, and static catchment attributes (e.g., geophysical, sociological, climatological) for 6830 catchments. Most importantly, Caravan is both a dataset and open-source software that allows members of the hydrology community to extend the dataset to new locations by extracting forcing data and catchment attributes in the cloud. Our vision is for Caravan to democratize the creation and use of globally-standardized large-sample hydrology datasets. Caravan is a truly global open-source community resource.
AI Increases Global Access to Reliable Flood Forecasts
Asher Metzger
Dana Weitzner
Frederik Kratzert
Guy Shalev
Martin Gauch
Sella Nevo
Shlomo Shenzis
Tadele Yednkachw Tekalign
Vusumuzi Dube
arXiv (2023)
Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Here we show that AI-based forecasting achieves reliability in predicting extreme riverine events in ungauged watersheds at up to a 5-day lead time that is similar to or better than the reliability of nowcasts (0-day lead time) from a current state of the art global modeling system (the Copernicus Emergency Management Service Global Flood Awareness System). Additionally, we achieve accuracies over 5-year return period events that are similar to or better than current accuracies over 1-year return period events. This means that AI can provide flood warnings earlier and over larger and more impactful events in ungauged basins. The model developed in this paper was incorporated into an operational early warning system that produces publicly available (free and open) forecasts in real time in over 80 countries. This work highlights a need for increasing the availability of hydrological data to continue to improve global access to reliable flood warnings.
Building a Clinically-Focused Problem List From Medical Notes
Birju Patel
Cathy Cheung
Liwen Xu
Peter Clardy
Rachana Fellinger
LOUHI 2022: The 13th International Workshop on Health Text Mining and Information Analysis (2022)
Clinical notes often contain vital information not observed in other structured data, but their unstructured nature can lead to critical patient-related information being lost. To make sure this valuable information is utilized for patient care, algorithms that summarize notes into a problem list are often proposed. Focusing on identifying medically-relevant entities in the free-form text, these solutions are often detached from a canonical ontology and do not allow downstream use of the detected text spans. As a solution, we present here a system for generating a canonical problem list from medical notes, consisting of two major stages. In the first stage, annotation, we use a transformer model to detect all clinical conditions that are mentioned in a single note. These clinical conditions are then grounded to a predefined ontology and linked to spans in the text. In the second stage, summarization, we aggregate over the set of clinical conditions detected across all of the patient's notes, and produce a concise patient summary that organizes their important conditions.
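The second stage described above, aggregation across notes, can be sketched roughly as follows. This is a hypothetical illustration only: the `Mention` record, the concept identifiers, and the ranking by number of notes are assumptions, and the annotation stage (the transformer model) is represented simply by its assumed output.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    """One detected condition, as the annotation stage might emit it."""
    note_id: str
    concept_id: str  # hypothetical identifier in the target ontology
    start: int       # character offsets of the supporting span in the note
    end: int

def summarize(mentions, min_notes=1):
    """Aggregate mentions into (concept_id, number_of_notes) pairs.

    Concepts are ranked by how many distinct notes mention them, with ties
    broken alphabetically so the output is deterministic.
    """
    notes_by_concept = defaultdict(set)
    for m in mentions:
        notes_by_concept[m.concept_id].add(m.note_id)
    ranked = sorted(notes_by_concept.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(cid, len(notes)) for cid, notes in ranked if len(notes) >= min_notes]

# Made-up annotations for one patient with two notes.
mentions = [
    Mention("n1", "C1", 0, 5),
    Mention("n2", "C1", 3, 9),
    Mention("n1", "C2", 10, 15),
    Mention("n1", "C1", 20, 25),
]
problem_list = summarize(mentions)
print(problem_list)
```

Keeping the span offsets on each mention is what allows a summary entry to be traced back to the text that supports it.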
Structured Understanding of Assessment and Plans in Clinical Documentation
Doron Yaya-Stupp
Ronnie Barequet
I-Ching Lee
Eyal Oren
Eran Ofek
Alvin Rajkomar
medRxiv (2022)
Physicians record their detailed thought processes about diagnoses and treatments as unstructured text in a section of a clinical note called the assessment and plan. This information is more clinically rich than structured billing codes assigned for an encounter but harder to reliably extract given the complexity of clinical language and documentation habits. We describe and release a dataset containing annotations of 579 admission and progress notes from the publicly available and de-identified MIMIC-III ICU dataset with over 30,000 labels identifying active problems, their assessment, and the category of associated action items (e.g., medication, lab test). We also propose deep-learning-based models that approach human performance, with an F1 score of 0.88. We found that by employing weak supervision and domain-specific data augmentation, we could improve generalization across departments and reduce the number of human-labeled notes without sacrificing performance.
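As a reminder of what the reported F1 score measures, the snippet below computes it from true-positive, false-positive, and false-negative counts as the harmonic mean of precision and recall. The counts are made up for illustration and are unrelated to the paper's results.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall)."""
    if tp == 0:
        return 0.0  # no correct predictions, so precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: 8 correct labels, 2 spurious, 2 missed.
example_f1 = f1_score(tp=8, fp=2, fn=2)
print(example_f1)
```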
TRUE: Re-evaluating Factual Consistency Evaluation
Or Honovich
Hagai Taitelbaum
Vered Cohen
Thomas Scialom
NAACL 2022, The Association for Computational Linguistics (2022)
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability. Automatic factual consistency evaluation may help alleviate this limitation by accelerating evaluation cycles, filtering inconsistent outputs and augmenting training data. While attracting increasing attention, such evaluation metrics are usually developed and evaluated in silos for a single task or dataset, slowing their adoption. Moreover, previous meta-evaluation protocols focused on system-level correlations with human annotations, which leave the example-level accuracy of such metrics unclear.
In this work, we introduce TRUE: a comprehensive study of factual consistency metrics on a standardized collection of existing texts from diverse tasks, manually annotated for factual consistency. Our standardization enables an example-level meta-evaluation protocol that is more actionable and interpretable than previously reported correlations, yielding clearer quality measures. Across diverse state-of-the-art metrics and 11 datasets we find that large-scale NLI and question generation-and-answering-based approaches achieve strong and complementary results. We recommend those methods as a starting point for model and metric developers, and hope TRUE will foster progress towards even better methods.
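One way to make an example-level meta-evaluation concrete is to score each example with a metric and measure how well those scores separate consistent from inconsistent texts, for instance with ROC AUC. The sketch below is an illustration under that assumption; the abstract itself does not name the statistic, and the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """Probability that a consistent example outscores an inconsistent one.

    labels: 1 for factually consistent, 0 for inconsistent (human annotation).
    scores: the metric's score for each example; ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one consistent and one inconsistent example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up annotations and metric scores for four examples.
perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
partial = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(perfect, partial)
```

Unlike a system-level correlation, a score of this kind says directly how often the metric ranks an individual consistent output above an inconsistent one.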