
Unlocking rich genetic insights through multimodal AI with M-REGLE
June 23, 2025
Yuchen Zhou, Software Engineer, and Farhad Hormozdiari, Staff Research Scientist, Google Research
M-REGLE (Multimodal REpresentation learning for Genetic discovery on Low-dimensional Embeddings) is an AI method that simultaneously analyzes multiple health data streams. Jointly learning from diverse data types creates richer representations, and significantly boosts the discovery of genetic links to disease.
Quick links
Everything from medical specialists with cutting-edge technology to simple smartwatches are generating data on an unprecedented scale. The aggregation of electronic health records, medical imaging, diagnostic tests, genomic data, and even real-time measurements from smartwatches creates a wealth of data for researchers and clinicians to analyze. These diverse data streams often carry unique and overlapping signals, even within the same organ system.
In the cardiovascular system, for example, an electrocardiogram (ECG) measures the heart's electrical activity, while a photoplethysmogram (PPG) — common in smartwatches — tracks blood volume changes. The co-analysis of these modalities can simultaneously assess both the heart’s electrical system and its pumping efficiency, thus providing a more complete picture of heart health. Integrating these physiological signatures with genetic information from large nation-level biobanks could enable the identification of the genetic underpinnings of disease.
Our earlier work, REGLE, was successful for genetic discovery using health data, but it was designed for a single data type (i.e., the unimodal setting). Alternatively, analyzing each modality separately and then trying to piece together the findings later (what we refer to as U-REGLE or Unimodal REGLE) also might not be the most efficient way. U-REGLE could miss subtle shared information between different modalities. Instead, we hypothesized that jointly modeling these complementary data streams would boost the important biological signals, reduce noise, and lead to more powerful genetic discoveries.
Here we present our recent paper, “Utilizing multimodal AI to improve genetic analyses of cardiovascular traits”, which we published in the American Journal of Human Genetics. We developed a multimodal version of REGLE, called M-REGLE, that allows the analysis of multiple types of clinical data together at once. M-REGLE produces lower reconstruction error, identifies more genetic associations, and outperforms risk scores in predicting cardiac disease compared to its predecessor, U-REGLE.

M-REGLE overview steps compared to running our previous model REGLE on each modality separately (U-REGLE, or Unimodal REGLE).
The challenge: Seeing the whole picture
The central premise of M-REGLE is that different clinical modalities, especially those pertaining to a single organ system (like the circulatory system), encode both complementary and overlapping information. In a 12-lead ECG, for example, the different leads are placed in distinct locations on the body. To determine the location of a heart attack, or diagnose arrhythmias, physicians analyze information from specific leads. M-REGLE’s approach, which combines multiple modalities (like the 12 leads of the ECG or one lead plus PPG data) before the representation learning process, offers a more accurate tool that is superior at finding genetic associations, analyzing complex physiological data, and predicting disease.
To effectively do this, M-REGLE employs a robust, multi-step approach that uses joint learning. Instead of looking at 12 different ECG leads or an ECG and a PPG waveform separately, M-REGLE first combines them. It then uses a convolutional variational autoencoder (CVAE) to learn a compressed, combined "signature" (latent factors) from these multiple data streams. The CVAE is designed to capture the most essential information in a lower-dimensional, largely uncorrelated representation. It consists of encoder and decoder networks where the encoder compresses the ECG and PPG waveforms to latent factors and the decoder network reconstructs the waveforms from the created latent factors. To ensure the learned factors are truly independent, principal component analysis (PCA) is applied to these CVAE-generated signatures. Finally, we find associations (significant correlation) between computed independent factors and genetic data via genome-wide association studies (GWAS). The results from these individual GWAS are statistically combined to pinpoint genetic variations associated with the underlying physiological system.
Better learned representations
M-REGLE advances U-REGLE to consistently produce better "learned representations" of the data. Medical data, like an ECG, consists of hundreds of individual data points. When analyzing multiple medical modalities, instead of processing the modalities individually, M-REGLE captures the most important characteristics and condenses it into “latent factors.” This approach resulted in significantly lower reconstruction errors and did a better job of capturing the essential information from the original waveforms compared to learning from each modality separately. For 12-lead ECGs, M-REGLE reduced reconstruction error by 72.5%.
Interpretability sheds some light on embeddings
One of the advantages of generative AI is its interpretability capability. In our study, we used M-REGLE embeddings to show the connection between these embeddings and ECG and PPG waveforms, specifically how altering individual embedding coordinates changes the reconstructed ECG and PPG waveforms from the M-REGLE decoder.
We focused on identifying coordinates that would best distinguish between samples with and without atrial fibrillation (AFib). M-REGLE embeddings at position 4, 6, and 10 were found to be the most distinctive. When we changed values in the 4th M-REGLE embedding from [-2, 2] while keeping the rest of M-REGLE embeddings fixed, we observed corresponding changes in the reconstructed ECG lead I and PPG: the T-wave segment of ECG lead I changed in magnitude, and the dicrotic notch of the PPG signal showed a small alteration. The dicrotic notch provides valuable information about cardiovascular function and health. For example, a less prominent or absent dicrotic notch is often associated with increased arterial stiffness.

The effect of varying the 4th M-REGLE embedding on the reconstructed ECG Lead I and PPG, which leads to a reduction in the magnitude of the T-wave segment of ECG lead I (left) and a change in prominence of the dichroic notch in the PPG (right).
Enhanced genetic discovery
M-REGLE also made improvements over U-REGLE in the identification of genetic associations with cardiovascular disease. For 12-lead ECGs, M-REGLE identified 19.3% more associated genetic loci (regions in the genome) than the unimodal approach. For ECG lead I + PPG, M-REGLE found 13.0% more loci. Importantly, a vast majority of these findings (24/35 for 12 lead ECG and 11/12 for ECG lead I + PPG) replicated known genetic associations for ECG or PPG traits as reported in the GWAS catalog. M-REGLE also uncovered several new loci not previously associated with these traits, some of which showed links to cardiovascular traits in other databases.

A 3-way Venn diagram of the GWAS catalog loci, loci discovered by M-REGLE (12-lead ECG) and loci discovered by U-REGLE. GWAS catalog indicates previously discovered loci.
Improved polygenic risk scores
A polygenic risk score (PRS) quantifies an individual's genetic risk for a disease. We found that PRS developed using genetic variants identified by M-REGLE (from the 12-lead ECG data) significantly outperformed those from U-REGLE in predicting cardiac disease, most notably in atrial fibrillation (AFib). M-REGLE's PRS was significantly better at identifying individuals at risk. These PRS improvements for AFib were not just seen in the UK Biobank, but also independently validated in other large datasets, like the Indiana Biobank, EPIC-Norfolk, and the British Women's Heart and Health Study.
Why does M-REGLE work?
The power of M-REGLE lies in how it handles information. By considering multiple modalities at the outset, M-REGLE gains three main advantages. First, it efficiently captures shared information, learning it once instead of multiple times across each modality. Second, it boosts the unique and complementary signals that each modality provides. Third, M-REGLE reduces noise, as information from one modality might help clarify or filter out noise in another. This all leads to a clearer, more robust signal for powerful downstream genetic analysis.
The future is multimodal
This research is a step forward in leveraging the rich, multimodal health data becoming increasingly available. M-REGLE offers a way to uncover new genetic links to complex disease, improve our ability to predict disease risk, and potentially identify new targets for therapies. In addition, with the rise of smart wearables continuously collecting physiological data like ECG and PPG, methods like M-REGLE will be crucial for translating health data into insights and ultimately, better health outcomes.
Acknowledgements
This work represents a collaborative achievement by many contributors and institutions. We sincerely thank our collaborators for their essential input: Yuchen Zhou, Justin Cosentino, Howard Yang, Andrew Carroll, Cory Y. McLean, Babak Behsaz (Google); Zachary R. McCaw (University of North Carolina); Tae-Hwi Schwantes-An, Dongbing Lai (Indiana University); Mahantesh I. Biradar, Robert Luben, Jorgen Engmann, Rui Providencia, Anthony P. Khawaja (University College London); Patricia B Munroe (Queen Mary University of London). Our thanks also go to Anastasiya Belyaeva for reviewing the manuscript, Greg Corrado, Shravya Shetty, and Michael Brenner for their support, and Monique Brouillette for her help in writing this blog post.