A step towards making heart health screening accessible for billions with PPG signals

July 25, 2024

Mayank Daswani, Software Engineer, and Sujay Kakarmath, Product Manager, Google Research, Health AI Team

We describe an approach that uses photoplethysmograph (PPG) data for the potential early detection of cardiovascular disease risk, and we release tooling that enables the collection of PPG signals using smartphones.

Heart attack, stroke and other cardiovascular diseases remain the leading cause of death worldwide, claiming millions of lives each year. Yet, essential heart health screenings remain inaccessible for billions of people across the globe. Gaining access to health facilities and laboratories can be challenging and unreliable for many around the world, even for simple things like blood pressure and body mass index (BMI) measurements. As a result, countless individuals remain unaware of their heart disease risk until it is very late and they cannot benefit from life-saving preventative care.

In contrast, most (54%) people in the world have access to a smartphone. Signals obtained from smartphones and wearables are promising pathways to non-invasive care. In fact, early studies demonstrate how smartphone cameras can be used to accurately measure heart rate and respiratory rate, which could provide valuable diagnostics for healthcare providers.

With this in mind, in our paper “Predicting cardiovascular disease risk using photoplethysmography and deep learning”, published in PLOS Global Public Health, we show that photoplethysmographs (PPGs), which use light to measure variations in blood flow, hold significant promise for detecting risk of cardiovascular disease early, which could be particularly valuable in low-resource settings. We demonstrate that PPG signals from a simple fingertip device, combined with basic metadata including age, sex, and smoking status, can predict an individual’s risk for major long-term heart health issues, such as heart attacks, strokes, and related deaths. These predictions have similar accuracy to traditional screenings that typically require blood pressure, BMI and cholesterol measurements. To encourage the collection of smartphone PPG data paired with long-term cardiovascular outcomes, we are open-sourcing a software library that makes it easier to collect PPG signals from Android smartphones.

PPGs-1-Overview

Cardiovascular risk stratification is done using a variety of risk scores. The inputs to these scores range from less accessible sources of information, like hospital measurements and labs, to more accessible measurements like BMI and blood pressure. Typically there is a trade-off between accessibility and quality of the risk prediction as we move along this spectrum. However, the method we propose is at least as accurate as risk scores based on office-based measurements while being more accessible.

What are PPGs?

As your heart beats, the amount of blood flowing through even the smallest blood vessels in your body changes slightly. PPGs measure these slight fluctuations using light — most often infrared light — shone on your fingertip or earlobe. You’ve likely encountered PPGs if you’ve ever used a pulse oximeter to measure your blood oxygen levels, or worn a smartwatch or fitness tracker. You can also get PPG signals by recording a video of your finger covering your phone camera. Several studies have investigated the utility of PPGs for various cardiovascular assessments such as blood pressure monitoring, vascular aging and arterial stiffness. Further, prior research at Google has demonstrated that smartphone-derived PPG signals can accurately measure heart rate and respiratory rate.
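To make the idea of a camera-derived PPG concrete, here is a minimal sketch of how a pulse rate could be read off such a signal. It assumes the video has already been reduced to one mean brightness value per frame; the function name, the 40 to 200 beats-per-minute search band, and the synthetic input are illustrative assumptions, not part of the open-source library or the method in the paper.

```python
import numpy as np

def estimate_heart_rate_bpm(frame_means, fps, min_bpm=40.0, max_bpm=200.0):
    """Estimate pulse rate from per-frame mean brightness of a fingertip video."""
    signal = np.asarray(frame_means, dtype=float)
    signal = signal - signal.mean()  # remove the constant (DC) brightness level
    # Magnitude spectrum of the remaining pulsatile component.
    spectrum = np.abs(np.fft.rfft(signal))
    freqs_hz = np.fft.rfftfreq(signal.size, d=1.0 / fps)
    # Only search frequencies that correspond to plausible heart rates.
    band = (freqs_hz >= min_bpm / 60.0) & (freqs_hz <= max_bpm / 60.0)
    peak_hz = freqs_hz[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz

# Synthetic stand-in for a recording: a 1.2 Hz (72 bpm) pulse, 10 s at 30 fps.
fps = 30.0
t = np.arange(300) / fps
synthetic_ppg = 100.0 + 0.5 * np.sin(2 * np.pi * 1.2 * t)
bpm = estimate_heart_rate_bpm(synthetic_ppg, fps)
```

Real recordings are noisier than this synthetic trace, so a practical pipeline would likely add filtering and motion-artifact handling before the frequency analysis.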

PPGs-2a-Illustration

Our method operates on finger PPG signals that can be easily collected from devices like pulse oximeters and smartphones, and translates the PPG signal, together with some easily collected metadata, into a cardiovascular risk score.

Using PPGs to predict long-term heart health

Unfortunately, there are few large datasets that pair PPG data with long-term cardiovascular outcomes. To capture a statistically useful number of such outcomes in a general population, a dataset needs to be quite large and should typically cover a span of 5–10 years. Recently, biobanks have become a popular way to collect such paired longitudinal data for a wide range of biomarkers and outcomes.

For our purposes, we made use of the UK Biobank, a large, de-identified biomedical dataset involving approximately 500,000 consented individuals from the UK, paired with a large number of long-term outcomes for heart attack, stroke, and related deaths. We use the subset of UK Biobank that contains PPG signals, filtered to participants aged 40–74 to better mirror previous studies on predicting cardiovascular disease. This results in around 200,000 participants, which we then split into training, validation and test sets.
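The cohort construction described above can be sketched in a few lines. The age bounds come from the text; the 80/10/10 split fractions, the random seed, and the function name are assumptions made only for illustration, since the split proportions are not stated here.

```python
import numpy as np

def filter_and_split(ages, min_age=40, max_age=74, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return participant indices for train/validation/test after an age filter.

    The split fractions are hypothetical; only the 40-74 age range is from the text.
    """
    ages = np.asarray(ages)
    eligible = np.flatnonzero((ages >= min_age) & (ages <= max_age))
    # Shuffle once so the three subsets are disjoint random samples.
    shuffled = np.random.default_rng(seed).permutation(eligible)
    n_train = int(fractions[0] * shuffled.size)
    n_val = int(fractions[1] * shuffled.size)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Toy example: one participant for each age from 30 to 89.
toy_ages = np.arange(30, 90)
train_idx, val_idx, test_idx = filter_and_split(toy_ages)
```

Splitting by participant index, rather than by individual recording, keeps every person in exactly one subset.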

Our method operates in two stages. We first build generally useful representations (model embeddings) of PPGs by training a 1D-ResNet18 model to predict multiple attributes of an individual (e.g., age, sex, BMI, hypertension status, etc.) using only the PPG signal. We then employ the resulting embeddings and associated metadata as features of a survival model for predicting 10-year incidence of major adverse cardiac events. The survival model is a Cox proportional hazards model, which is often used to study long-term outcomes when individuals may be lost to follow-up, and is also common in estimating disease risk.
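For readers unfamiliar with Cox models, the sketch below shows the quantity such a model minimizes during fitting: the negative log partial likelihood. In the setting described above, each row of `features` would hold a participant's PPG embedding concatenated with metadata; here the arrays are tiny made-up values, the function name is hypothetical, and ties are handled with the Breslow approximation. This is a textbook formulation, not the paper's implementation.

```python
import numpy as np

def cox_negative_log_partial_likelihood(beta, features, times, events):
    """Negative log partial likelihood of a Cox proportional hazards model.

    beta: coefficient vector; features: (n, d) array;
    times: follow-up time per person; events: 1 if the event was observed,
    0 if the person was censored (for example, lost to follow-up).
    """
    risk_scores = features @ beta  # log relative hazard per person
    nll = 0.0
    for i in np.flatnonzero(events):
        # Everyone still under observation when person i had their event.
        at_risk = times >= times[i]
        nll -= risk_scores[i] - np.log(np.sum(np.exp(risk_scores[at_risk])))
    return nll

# Toy data: one feature, three people, all with observed events.
features = np.array([[0.5], [0.1], [-0.3]])
times = np.array([1.0, 2.0, 3.0])
events = np.array([1, 1, 1])
nll_at_zero = cox_negative_log_partial_likelihood(np.zeros(1), features, times, events)
nll_at_one = cox_negative_log_partial_likelihood(np.ones(1), features, times, events)
```

Censored participants contribute no term of their own but still appear in the at-risk sets of earlier events, which is how the model uses partial follow-up.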

We compare this method to several baselines that estimate risk scores while including additional signals like blood pressure and BMI. We find that our PPG embeddings can provide predictions with comparable accuracy without relying on these additional signals. One standard way to evaluate the overall value of a survival model is the concordance index (C-index). On this metric, we show that a survival model using age, sex, BMI, smoking status and systolic blood pressure has a C-index of 70.9%, while a survival model that replaces BMI and systolic blood pressure with our easily obtainable PPG features has a C-index of 71.1% and passes a statistical non-inferiority test.
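The C-index measures how often a model ranks pairs of people correctly: among pairs where we know who had an event first, it is the fraction in which that person received the higher risk score. A minimal sketch of Harrell's version follows, with a hypothetical function name and made-up toy values; it is quadratic in the number of participants and meant only to show the definition.

```python
import numpy as np

def concordance_index(risk_scores, times, events):
    """Harrell's C-index: fraction of comparable pairs ranked correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored person cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:  # person i had the event before time j
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable

# Toy data: four people, the third is censored at time 6.
toy_risk = np.array([0.9, 0.3, 0.5, 0.1])
toy_times = np.array([2.0, 4.0, 6.0, 8.0])
toy_events = np.array([1, 1, 0, 1])
c_index = concordance_index(toy_risk, toy_times, toy_events)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 70.9% and 71.1% figures above should be read.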

PPGs-3-RiskThreshold

The Kaplan-Meier survival curve of our deep learning system (DLS) is stratified by whether our system predicts the individual to be low or high risk. The threshold is determined by matching the specificity (63.6%) of a simple blood pressure screening-based algorithm on the same data (systolic blood pressure > 140 mmHg). The stratified curves show that individuals deemed high risk have a significantly higher probability of a major cardiovascular event than those deemed low risk, over a ten-year time horizon.
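Matching a specificity target amounts to picking the risk score below which the desired share of event-free individuals falls. The sketch below illustrates this under a simplifying assumption: the outcome is treated as a binary flag rather than a time-to-event variable. The function name and the toy scores are hypothetical; only the 63.6% target comes from the text.

```python
import numpy as np

def threshold_at_specificity(risk_scores, had_event, target_specificity):
    """Risk threshold at which roughly target_specificity of event-free
    individuals score at or below the threshold (and are called low risk)."""
    scores = np.asarray(risk_scores, dtype=float)
    negatives = scores[~np.asarray(had_event, dtype=bool)]
    return np.quantile(negatives, target_specificity)

# Toy data: 100 event-free people and 100 people with an event,
# with overlapping score ranges.
toy_scores = np.concatenate([np.arange(100.0), np.arange(50.0, 150.0)])
toy_had_event = np.concatenate([np.zeros(100, dtype=bool), np.ones(100, dtype=bool)])

threshold = threshold_at_specificity(toy_scores, toy_had_event, 0.636)
predicted_high_risk = toy_scores > threshold
specificity = np.mean(~predicted_high_risk[~toy_had_event])
sensitivity = np.mean(predicted_high_risk[toy_had_event])
```

Fixing specificity to that of the blood pressure rule lets the two approaches be compared on sensitivity alone, since both then flag a similar share of event-free people.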

Outlook

This breakthrough could make heart health screening accessible to billions of people in the future. However, further research is necessary to confirm the generalizability of our findings to other populations beyond the UK Biobank cohort we studied. As it stands, there are no other datasets large enough to show how PPGs can be used to estimate cardiovascular risk. Our findings are, therefore, an important first step that justifies global investments in prospective data collection.

In addition to geographic generalizability, further research is also essential to confirm that our model can work across skin types, as inconsistencies have been reported in the literature around oxygenation estimates from PPG signals. The UK Biobank study used an infrared sensor (PulseTrace PCA2) that partially mitigates the differences in absorption due to skin pigmentation by using the optimal wavelength (940 nm). There’s also further evidence that this is much less of a problem with state-of-the-art sensors. Our model also relies on waveform shape obtained at this optimal wavelength, rather than a comparison between waveforms obtained at different wavelengths (as SpO2 estimation does), and therefore we expect it to be less susceptible to this bias. Nevertheless, it is important to confirm this with actual data.

Lastly, for this model to be deployed on smartphones, our findings must be replicated with PPG signals from smartphones, which is currently infeasible due to a lack of data. We hope that our open-source software library will make it easy for other researchers to collect PPG signals from Android smartphones to help overcome this problem. We will also be making PPG embeddings from our work available through UK Biobank Returns.

We believe that by collaborating with the global community, we can transform the fight against heart disease, especially in low-resource environments. By combining the ubiquity of smartphones with the power of AI, we can usher in a future where life-saving, cost-effective heart health screenings are accessible to all.

Acknowledgements

This work involved the efforts of a multidisciplinary team of software engineers, researchers, clinicians and cross-functional contributors. Key contributors to this project include: Wei-Hung Weng, Sebastien Baur, Diego Ardila, Christina Chen, Lauren Harrell, Mariam Jabara, Babak Behsaz, Cory Y. McLean, Alicia Martin, Preeti Singh, Narayanan Sundararajan, Yossi Matias, Greg Corrado, Leor Stern, Shravya Shetty, Shruthi Prabhakara, Sunny Virmani, Jamie Rogers, Yun Liu, Fred Hersch, Madhuram Jajoo, Divya Ramnath, Jing Tang, Chandrashekar Sankarapu, and Arun Samudrala. We also thank Dr. Goodarz Danaei (Bernard Lown Professor of Cardiovascular Health at the Harvard Chan School of Public Health) and Dr. Yogeshwar Kalkonde (Public Health Physician and Researcher at Sangwari, India) for lending their subject matter expertise to this project. We would also like to extend a special thanks to Tom Small for the animation used in this blog post.