Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

people standing in front of a screen with images and a chipboard

Publications

people standing in front of a screen with images and a chipboard

Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
1 - 15 of 443 publications
    Towards Generalist Biomedical AI
    Danny Driess
    Andrew Carroll
    Chuck Lau
    Ryutaro Tanno
    Ira Ktena
    Anil Palepu
    Basil Mustafa
    Aakanksha Chowdhery
    Simon Kornblith
    Philip Mansfield
    Sushant Prakash
    Renee Wong
    Sunny Virmani
    Christopher Semturs
    Sara Mahdavi
    Bradley Green
    Ewa Dominowska
    Joelle Barral
    Karan Singhal
    Pete Florence
    NEJM AI(2024)
    Preview abstract BACKGROUND: Medicine is inherently multimodal, requiring the simultaneous interpretation and integration of insights between many data modalities spanning text, imaging, genomics, and more. Generalist biomedical artificial intelligence systems that flexibly encode, integrate, and interpret these data might better enable impactful applications ranging from scientific discovery to care delivery. METHODS: To catalyze development of these models, we curated MultiMedBench, a new multimodal biomedical benchmark. MultiMedBench encompasses 14 diverse tasks, such as medical question answering, mammography and dermatology image interpretation, radiology report generation and summarization, and genomic variant calling. We then introduced Med-PaLM Multimodal (Med-PaLM M), our proof of concept for a generalist biomedical AI system that flexibly encodes and interprets biomedical data including clinical language, imaging, and genomics with the same set of model weights. To further probe the capabilities and limitations of Med-PaLM M, we conducted a radiologist evaluation of model-generated (and human) chest x-ray reports. RESULTS: We observed encouraging performance across model scales. Med-PaLM M reached performance competitive with or exceeding the state of the art on all MultiMedBench tasks, often surpassing specialist models by a wide margin. In a side-by-side ranking on 246 retrospective chest x-rays, clinicians expressed a pairwise preference for Med-PaLM Multimodal reports over those produced by radiologists in up to 40.50% of cases, suggesting potential clinical utility. CONCLUSIONS: Although considerable work is needed to validate these models in real-world cases and understand if cross-modality generalization is possible, our results represent a milestone toward the development of generalist biomedical artificial intelligence systems. View details
    Mindful Breathing as an Effective Technique in the Management of Hypertension
    Aravind Natarajan
    Hulya Emir-Farinas
    Hao-Wei Su
    Frontiers in Physiology, N/A(2024), N/A
    Preview abstract Introduction: Hypertension is one of the most important, modifiable risk factors for cardiovascular disease. The popularity of wearable devices provides an opportunity to test whether device guided slow mindful breathing may serve as a non-pharmacological treatment in the management of hypertension. Methods: Fitbit Versa-3 and Sense devices were used for this study. In addition, participants were required to own an FDA or Health Canada approved blood pressure measuring device. Advertisements were shown to 655,910 Fitbit users, of which 7,365 individuals expressed interest and filled out the initial survey. A total of 1,918 participants entered their blood pressure readings on at least 1 day and were considered enrolled in the study. Participants were instructed to download a guided mindful breathing app on their smartwatch device, and to engage with the app once a day prior to sleep. Participants measured their systolic and diastolic blood pressure prior to starting each mindful breathing session, and again after completion. All measurements were self reported. Participants were located in the United States or Canada. Results: Values of systolic and diastolic blood pressure were reduced following mindful breathing. There was also a decrease in resting systolic and diastolic measurements when measured over several days. For participants with a systolic pressure ≥ 130 mmHg, there was a decrease of 9.7 mmHg following 15 min of mindful breathing at 6 breaths per minute. When measured over several days, the resting systolic pressure decreased by an average of 4.3 mmHg. Discussion: Mindful breathing for 15 min a day, at a rate of 6 breaths per minute is effective in lowering blood pressure, and has both an immediate, and a short term effect (over several days). This large scale study demonstrates that device guided mindful breathing with a consumer wearable for 15 min a day is effective in lowering blood pressure, and a helpful complement to the standard of care. View details
    An intentional approach to managing bias in embedding models
    Atilla P. Kiraly
    Alexander D'Amour
    Jungyeon Park
    Rory Pilgrim
    Charles Lau
    Heather Cole-Lewis
    Shravya Shetty
    Krish Eswaran
    Leo Anthony Celi
    The Lancet Digital Health, 6(2024), E126-E130
    Preview abstract Advances in machine learning for health care have brought concerns about bias from the research community; specifically, the introduction, perpetuation, or exacerbation of care disparities. Reinforcing these concerns is the finding that medical images often reveal signals about sensitive attributes in ways that are hard to pinpoint by both algorithms and people. This finding raises a question about how to best design general purpose pretrained embeddings (GPPEs, defined as embeddings meant to support a broad array of use cases) for building downstream models that are free from particular types of bias. The downstream model should be carefully evaluated for bias, and audited and improved as appropriate. However, in our view, well intentioned attempts to prevent the upstream components—GPPEs—from learning sensitive attributes can have unintended consequences on the downstream models. Despite producing a veneer of technical neutrality, the resultant end-to-end system might still be biased or poorly performing. We present reasons, by building on previously published data, to support the reasoning that GPPEs should ideally contain as much information as the original data contain, and highlight the perils of trying to remove sensitive attributes from a GPPE. We also emphasise that downstream prediction models trained for specific tasks and settings, whether developed using GPPEs or not, should be carefully designed and evaluated to avoid bias that makes models vulnerable to issues such as distributional shift. These evaluations should be done by a diverse team, including social scientists, on a diverse cohort representing the full breadth of the patient population for which the final model is intended. View details
    Artificial Intelligence in Healthcare: A Perspective from Google
    Lily Peng
    Lisa Lehmann
    Artificial Intelligence in Healthcare, Elsevier(2024)
    Preview abstract Artificial Intelligence (AI) holds the promise of transforming healthcare by improving patient outcomes, increasing accessibility and efficiency, and decreasing the cost of care. Realizing this vision of a healthier world for everyone everywhere requires partnerships and trust between healthcare systems, clinicians, payers, technology companies, pharmaceutical companies, and governments to drive innovations in machine learning and artificial intelligence to patients. Google is one example of a technology company that is partnering with healthcare systems, clinicians, and researchers to develop technology solutions that will directly improve the lives of patients. In this chapter we share landmark trials of the use of AI in healthcare. We also describe the application of our novel system of organizing information to unify data in electronic health records (EHRs) and bring an integrated view of patient records to clinicians. We discuss our consumer focused innovation in dermatology to help guide search journeys for personalized information about skin conditions. Finally, we share a perspective on how to embed ethics and a concern for all patients into the development of AI. View details
    Preview abstract Importance: Interest in artificial intelligence (AI) has reached an all-time high, and health care leaders across the ecosystem are faced with questions about where, when, and how to deploy AI and how to understand its risks, problems, and possibilities. Observations: While AI as a concept has existed since the 1950s, all AI is not the same. Capabilities and risks of various kinds of AI differ markedly, and on examination 3 epochs of AI emerge. AI 1.0 includes symbolic AI, which attempts to encode human knowledge into computational rules, as well as probabilistic models. The era of AI 2.0 began with deep learning, in which models learn from examples labeled with ground truth. This era brought about many advances both in people’s daily lives and in health care. Deep learning models are task-specific, meaning they do one thing at a time, and they primarily focus on classification and prediction. AI 3.0 is the era of foundation models and generative AI. Models in AI 3.0 have fundamentally new (and potentially transformative) capabilities, as well as new kinds of risks, such as hallucinations. These models can do many different kinds of tasks without being retrained on a new dataset. For example, a simple text instruction will change the model’s behavior. Prompts such as “Write this note for a specialist consultant” and “Write this note for the patient’s mother” will produce markedly different content. Conclusions and Relevance: Foundation models and generative AI represent a major revolution in AI’s capabilities, ffering tremendous potential to improve care. Health care leaders are making decisions about AI today. While any heuristic omits details and loses nuance, the framework of AI 1.0, 2.0, and 3.0 may be helpful to decision-makers because each epoch has fundamentally different capabilities and risks. View details
    Socio-spatial equity analysis of relative wealth index and emergency obstetric care accessibility in urban Nigeria
    Kerry L. M. Wong
    Aduragbemi Banke-Thomas
    Tope Olubodun
    Peter M. Macharia
    Charlotte Stanton
    Narayanan Sundararajan
    Yash Shah
    Mansi Kansal
    Swapnil Vispute
    Olakunmi Ogunyemi
    Uchenna Gwacham-Anisiobi
    Jia Wang
    Ibukun-Oluwa Omolade Abejirinde
    Prestige Tatenda Makanga
    Bosede B. Afolabi
    Lenka Beňová
    Communications Medicine, 4(2024), pp. 34
    Preview abstract Background Better geographical accessibility to comprehensive emergency obstetric care (CEmOC) facilities can significantly improve pregnancy outcomes. However, with other factors, such as affordability critical for care access, it is important to explore accessibility across groups. We assessed CEmOC geographical accessibility by wealth status in the 15 most-populated Nigerian cities. Methods We mapped city boundaries, verified and geocoded functional CEmOC facilities, and assembled population distribution for women of childbearing age and Meta’s Relative Wealth Index (RWI). We used the Google Maps Platform’s internal Directions Application Programming Interface to obtain driving times to public and private facilities. City-level median travel time (MTT) and number of CEmOC facilities reachable within 60 min were summarised for peak and non-peak hours per wealth quintile. The correlation between RWI and MTT to the nearest public CEmOC was calculated. Results We show that MTT to the nearest public CEmOC facility is lowest in the wealthiest 20% in all cities, with the largest difference in MTT between the wealthiest 20% and least wealthy 20% seen in Onitsha (26 vs 81 min) and the smallest in Warri (20 vs 30 min). Similarly, the average number of public CEmOC facilities reachable within 60 min varies (11 among the wealthiest 20% and six among the least wealthy in Kano). In five cities, zero facilities are reachable under 60 min for the least wealthy 20%. Those who live in the suburbs particularly have poor accessibility to CEmOC facilities. Conclusions Our findings show that the least wealthy mostly have poor accessibility to care. Interventions addressing CEmOC geographical accessibility targeting poor people are needed to address inequities in urban settings. View details
    Differences between Patient and Clinician Submitted Images: Implications for Virtual Care of Skin Conditions
    Rajeev Rikhye
    Grace Eunhae Hong
    Preeti Singh
    Margaret Ann Smith
    Aaron Loh
    Vijaytha Muralidharan
    Doris Wong
    Rory Sayres
    Michelle Phung
    Nicolas Betancourt
    Bradley Fong
    Rachna Sahasrabudhe
    Khoban Nasim
    Alec Eschholz
    Kat Chou
    Peggy Bui
    Yuan Liu
    Justin Ko
    Steven Lin
    Mayo Clinic Proceedings: Digital Health(2024)
    Preview abstract Objective: To understand and highlight the differences in clinical, demographic, and image quality characteristics between patient-taken (PAT) and clinic-taken (CLIN) photographs of skin conditions. Patients and Methods: This retrospective study applied logistic regression to data from 2500 deidentified cases in Stanford Health Care’s eConsult system, from November 2015 to January 2021. Cases with undiagnosable or multiple conditions or cases with both patient and clinician image sources were excluded, leaving 628 PAT cases and 1719 CLIN cases. Demographic characteristic factors, such as age and sex were self-reported, whereas anatomic location, estimated skin type, clinical signs and symptoms, condition duration, and condition frequency were summarized from patient health records. Image quality variables such as blur, lighting issues and whether the image contained skin, hair, or nails were estimated through a deep learning model. Results: Factors that were positively associated with CLIN photographs, post-2020 were as follows: age 60 years or older, darker skin types (eFST V/VI), and presence of skin growths. By contrast, factors that were positively associated with PAT photographs include conditions appearing intermittently, cases with blurry photographs, photographs with substantial nonskin (or nail/hair) regions and cases with more than 3 photographs. Within the PAT cohort, older age was associated with blurry photographs. Conclusion: There are various demographic, clinical, and image quality characteristic differences between PAT and CLIN photographs of skin concerns. The demographic characteristic differences present important considerations for improving digital literacy or access, whereas the image quality differences point to the need for improved patient education and better image capture workflows, particularly among elderly patients. View details
    Understanding metric-related pitfalls in image analysis validation
    Annika Reinke
    Lena Maier-Hein
    Paul Jager
    Shravya Shetty
    Understanding Metrics Workgroup
    Nature Methods (2024)
    Preview abstract Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation. View details
    Validation of a deep learning system for the detection of diabetic retinopathy in Indigenous Australians
    Mark Chia
    Fred Hersch
    Rory Sayres
    Pearse Keane
    Angus Turner
    British Journal of Ophthalmology, 108(2024), pp. 268-273
    Preview abstract Background/aims: Deep learning systems (DLSs) for diabetic retinopathy (DR) detection show promising results but can underperform in racial and ethnic minority groups, therefore external validation within these populations is critical for health equity. This study evaluates the performance of a DLS for DR detection among Indigenous Australians, an understudied ethnic group who suffer disproportionately from DR-related blindness. Methods: We performed a retrospective external validation study comparing the performance of a DLS against a retinal specialist for the detection of more-than-mild DR (mtmDR), vision-threatening DR (vtDR) and all-cause referable DR. The validation set consisted of 1682 consecutive, single-field, macula-centred retinal photographs from 864 patients with diabetes (mean age 54.9 years, 52.4% women) at an Indigenous primary care service in Perth, Australia. Three-person adjudication by a panel of specialists served as the reference standard. Results: For mtmDR detection, sensitivity of the DLS was superior to the retina specialist (98.0% (95% CI, 96.5 to 99.4) vs 87.1% (95% CI, 83.6 to 90.6), McNemar’s test p<0.001) with a small reduction in specificity (95.1% (95% CI, 93.6 to 96.4) vs 97.0% (95% CI, 95.9 to 98.0), p=0.006). For vtDR, the DLS’s sensitivity was again superior to the human grader (96.2% (95% CI, 93.4 to 98.6) vs 84.4% (95% CI, 79.7 to 89.2), p<0.001) with a slight drop in specificity (95.8% (95% CI, 94.6 to 96.9) vs 97.8% (95% CI, 96.9 to 98.6), p=0.002). For all-cause referable DR, there was a substantial increase in sensitivity (93.7% (95% CI, 91.8 to 95.5) vs 74.4% (95% CI, 71.1 to 77.5), p<0.001) and a smaller reduction in specificity (91.7% (95% CI, 90.0 to 93.3) vs 96.3% (95% CI, 95.2 to 97.4), p<0.001). Conclusion: The DLS showed improved sensitivity and similar specificity compared with a retina specialist for DR detection. This demonstrates its potential to support DR screening among Indigenous Australians, an underserved population with a high burden of diabetic eye disease. View details
    Crowdsourcing Dermatology Images with Google Search Ads: Creating a Diverse and Representative Dataset of Real-World Skin Conditions
    Abbi Ward
    Ashley Carrick
    Christopher Semturs
    Dawn Siegel
    Jay Hartford
    Jimmy Li
    Julie Wang
    Justin Ko
    Pradeep Kumar S
    Renee Wong
    Sriram Lakshminarasimhan
    Steven Lin
    Sunny Virmani
    arXiv(2024)
    Preview abstract Background Health datasets from clinical sources do not reflect the breadth and diversity of disease in the real world, impacting research, medical education and artificial intelligence (AI) tool development. Dermatology is a suitable area to develop and test a new and scalable method to create representative health datasets. Methods We used Google Search advertisements to solicit contributions of images of dermatology conditions, demographic and symptom information from internet users in the United States (US) over 265 days starting March 2023. With informed contributor consent, we described and released this dataset containing 10,106 images from 5058 contributions, with dermatologist labels as well as Fitzpatrick Skin Type and Monk Skin Tone labels for the images. Results We received 22 ± 14 submissions/day over 265 days. Female contributors (66.04%) and younger individuals (52.3% < age 40) had a higher representation in the dataset compared to the US population, and 36.6% of contributors had a non-White racial or ethnic identity. Over 97.5% of contributions were genuine images of skin conditions. Image quality had no impact on dermatologist confidence in assigning a differential diagnosis. The dataset consists largely of short duration (54% with onset < 7 days ago) allergic, infectious, and inflammatory conditions. Fitzpatrick skin type distribution is well-balanced, considering the geographical origin of the dataset and the absence of enrichment for population groups or skin tones. Interpretation Search ads are effective at crowdsourcing images of health conditions. The SCIN dataset bridges important gaps in the availability of representative images of common skin conditions. View details
    Health equity assessment of machine learning performance (HEAL): a framework and dermatology AI model case study
    Terry Spitz
    Malcolm Chelliah
    Heather Cole-Lewis
    Yuan Liu
    Stephanie Farquhar
    Qinghan Xue
    Jenna Lester
    Cían Hughes
    Patricia Strachan
    Fraser Tan
    Peggy Bui
    Craig Mermel
    Lily Peng
    Sunny Virmani
    Christopher Semturs
    Ivor Horn
    Cameron Chen
    The Lancet eClinicalMedicine(2024)
    Preview abstract Background Artificial intelligence (AI) has repeatedly been shown to encode historical inequities in healthcare. We aimed to develop a framework to quantitatively assess the performance equity of health AI technologies and to illustrate its utility via a case study. Methods Here, we propose a methodology to assess whether health AI technologies prioritise performance for patient populations experiencing worse outcomes, that is complementary to existing fairness metrics. We developed the Health Equity Assessment of machine Learning performance (HEAL) framework designed to quantitatively assess the performance equity of health AI technologies via a four-step interdisciplinary process to understand and quantify domain-specific criteria, and the resulting HEAL metric. As an illustrative case study (analysis conducted between October 2022 and January 2023), we applied the HEAL framework to a dermatology AI model. A set of 5420 teledermatology cases (store-and-forward cases from patients of 20 years or older, submitted from primary care providers in the USA and skin cancer clinics in Australia), enriched for diversity in age, sex and race/ethnicity, was used to retrospectively evaluate the AI model's HEAL metric, defined as the likelihood that the AI model performs better for subpopulations with worse average health outcomes as compared to others. The likelihood that AI performance was anticorrelated to pre-existing health outcomes was estimated using bootstrap methods as the probability that the negated Spearman's rank correlation coefficient (i.e., “R”) was greater than zero. Positive values of R suggest that subpopulations with poorer health outcomes have better AI model performance. Thus, the HEAL metric, defined as p (R >0), measures how likely the AI technology is to prioritise performance for subpopulations with worse average health outcomes as compared to others (presented as a percentage below). Health outcomes were quantified as disability-adjusted life years (DALYs) when grouping by sex and age, and years of life lost (YLLs) when grouping by race/ethnicity. AI performance was measured as top-3 agreement with the reference diagnosis from a panel of 3 dermatologists per case. Findings Across all dermatologic conditions, the HEAL metric was 80.5% for prioritizing AI performance of racial/ethnic subpopulations based on YLLs, and 92.1% and 0.0% respectively for prioritizing AI performance of sex and age subpopulations based on DALYs. Certain dermatologic conditions were significantly associated with greater AI model performance compared to a reference category of less common conditions. For skin cancer conditions, the HEAL metric was 73.8% for prioritizing AI performance of age subpopulations based on DALYs. Interpretation Analysis using the proposed HEAL framework showed that the dermatology AI model prioritised performance for race/ethnicity, sex (all conditions) and age (cancer conditions) subpopulations with respect to pre-existing health disparities. More work is needed to investigate ways of promoting equitable AI performance across age for non-cancer conditions and to better understand how AI models can contribute towards improving equity in health outcomes. View details
    Exploring the Feasibility of Remote Cardiac Auscultation Using Earphones
    Tao Chen
    Yongjie Yang
    Xiuzhen Guo
    Jie Xiong
    Shangguan Longfei
    MobiCom 2024: The 30th Annual International Conference On Mobile Computing And Networking
    Preview abstract The elderly over 65 accounts for 80% of COVID deaths in the United States. In response to the pandemic, the federal, state governments, and commercial insurers are promoting video visits, through which the elderly can access specialists at home over the Internet, without the risk of COVID exposure. However, the current video visit practice barely relies on video observation and talking. The specialist could not assess the patient's health conditions by performing auscultations. This paper tries to address this key missing component in video visits by proposing Asclepius, a hardware-software solution that turns the patient's earphones into a stethoscope, allowing the specialist to hear the patient's fine-grained heart sound (i.e., PCG signals) in video visits. To achieve this goal, we contribute a low-cost plug-in peripheral that repurposes the earphone's speaker into a microphone and uses it to capture the patient's minute PCG signals from her ear canal. As the PCG signals suffer from strong attenuation and multi-path effects when propagating from the heart to ear canals, we then propose efficient signal processing algorithms coupled with a data-driven approach to de-reverberate and further correct the amplitude and frequency distortion in raw PCG receptions. We implement Asclepius on a 2-layer PCB board and follow the IRB protocol to evaluate its performance with 30 volunteers. Our extensive experiments show that Asclepius can effectively recover Phonocardiogram (PCG) signals with different types of earphones. The feedback from cardiologists also confirms the efficacy and efficiency of our system. PCG signal samples and benchmark results can be found at an anonymous link https://asclepius-system.github.io/ View details
    Towards Conversational Diagnostic AI
    Anil Palepu
    Khaled Saab
    Jan Freyberg
    Ryutaro Tanno
    Amy Wang
    Brenna Li
    Nenad Tomašev
    Karan Singhal
    Le Hou
    Albert Webson
    Kavita Kulkarni
    Sara Mahdavi
    Christopher Semturs
    Juro Gottweis
    Joelle Barral
    Kat Chou
    Arxiv(2024) (to appear)
    Preview abstract At the heart of medicine lies the physician-patient dialogue, where skillful history-taking paves the way for accurate diagnosis, effective management, and enduring trust. Artificial Intelligence (AI) systems capable of diagnostic dialogue could increase accessibility, consistency, and quality of care. However, approximating clinicians' expertise is an outstanding grand challenge. Here, we introduce AMIE (Articulate Medical Intelligence Explorer), a Large Language Model (LLM) based AI system optimized for diagnostic dialogue. AMIE uses a novel self-play based simulated environment with automated feedback mechanisms for scaling learning across diverse disease conditions, specialties, and contexts. We designed a framework for evaluating clinically-meaningful axes of performance including history-taking, diagnostic accuracy, management reasoning, communication skills, and empathy. We compared AMIE's performance to that of primary care physicians (PCPs) in a randomized, double-blind crossover study of text-based consultations with validated patient actors in the style of an Objective Structured Clinical Examination (OSCE). The study included 149 case scenarios from clinical providers in Canada, the UK, and India, 20 PCPs for comparison with AMIE, and evaluations by specialist physicians and patient actors. AMIE demonstrated greater diagnostic accuracy and superior performance on 28 of 32 axes according to specialist physicians and 24 of 26 axes according to patient actors. Our research has several limitations and should be interpreted with appropriate caution. Clinicians were limited to unfamiliar synchronous text-chat which permits large-scale LLM-patient interactions but is not representative of usual clinical practice. While further research is required before AMIE could be translated to real-world settings, the results represent a milestone towards conversational diagnostic AI. View details
    Enhancing molecular selectivity using Wasserstein distance based reweighing
    Pratik Worah
    TBD (likely arxiv, PMLR or PNAS)(2024)
    Preview abstract Given a training data-set $\mathcal{S}$, and a reference data-set $\mathcal{T}$, we design a simple and efficient algorithm to reweigh the loss function such that the limiting distribution of the neural network weights that result from training on $\mathcal{S}$ approaches the limiting distribution that would have resulted by training on $\mathcal{T}$. Such reweighing can be used to correct for Train-Test distribution shift, when we don't have access to the labels of $\mathcal{T}$. It can also be used to perform (soft) multi-criteria optimization on neural nets, when we have access to the labels of $\mathcal{T}$, but $\mathcal{S}$ and $\mathcal{T}$ have few common points. As a motivating application, we train a graph neural net to recognize small molecule binders to MNK2 (a MAP Kinase, responsible for cell signaling) which are non-binders to MNK1 (a very similar protein), even in the absence of training data common to both data-sets. We are able to tune the reweighing parameters so that overall change in holdout loss is negligible, but the selectivity, i.e., the fraction of top 100 MNK2 binders that are MNK1 non-binders, increases from 54\% to 95\%, as a result of our reweighing. We expect the algorithm to be applicable in other settings as well, since we prove that when the metric entropy of the input data-sets is bounded, our random sampling based greedy algorithm outputs a close to optimal reweighing, i.e., the two invariant distributions of network weights will be provably close in total variation distance. View details
    Geographical accessibility to emergency obstetric care in urban Nigeria using closer-to-reality travel time estimates
    Aduragbemi Banke-Thomas
    Kerry L. M. Wong
    Tope Olubodun
    Peter M. Macharia
    Narayanan Sundararajan
    Yash Shah
    Mansi Kansal
    Swapnil Vispute
    Olakunmi Ogunyemi
    Uchenna Gwacham-Anisiobi
    Jia Wang
    Ibukun-Oluwa Omolade Abejirinde
    Prestige Tatenda Makanga
    Ngozi Azodoh
    Charles Nzelu, PhD
    Charlotte Stanton
    Bosede B. Afolabi
    Lenka Beňová
    Lancet Global Health(2024)
    Preview abstract Background Better accessibility of emergency obstetric care (CEmOC) facilities can significantly reduce maternal and perinatal deaths. However, pregnant women living in urban settings face additional complex challenges travelling to facilities. We estimated geographical accessibility and coverage to the nearest, second nearest, and third nearest public and private CEmOC facilities in the 15 largest Nigerian cities. Methods We mapped city boundaries, verified and geocoded functional CEmOC facilities, and assembled population distribution for women of childbearing age (WoCBA). We used Google Maps Platform’s internal Directions Application Programming Interface (API) to derive driving times to public, private, or either facility-type. Median travel time (MTT) and percentage of WoCBA able to reach care were summarised for eight traffic scenarios (peak and non-peak hours on weekdays and weekends) by city and within-city (wards) under different travel time thresholds (<15, <30, <60 min). Findings City-level MTT to the nearest CEmOC facility ranged from 18min (Maiduguri) to 46min (Kaduna). Within cities, MTT varied by location, with informal settlements and peripheral areas being the worst off. The percentages of WoCBA within 60min to their nearest public CEmOC were nearly universal; whilst the percentages of WoCBA within 30min reach to their nearest public CEmOC were between 33% in Aba to over 95% in Ilorin and Maiduguri. During peak traffic times, the median number of public CEmOC facilities reachable by WoCBA under 30min was zero in eight of 15 cities. Interpretation This approach provides more context-specific, finer, and policy-relevant evidence to support improving CEmOC service accessibility in urban Africa. View details