Greg Corrado
Greg Corrado is a senior research scientist interested in biological neuroscience, artificial intelligence, and scalable machine learning. He has published in fields ranging across behavioral economics, neuromorphic device physics, systems neuroscience, and deep learning. At Google he has worked for some time on brain-inspired computing, and most recently has served as one of the founding members and the co-technical lead of Google's large-scale deep neural networks project.
          
        
        
      Authored Publications
    
  
  
  
    
    
  
      
            
Closing the AI generalisation gap by adjusting for dermatology condition distribution differences across clinical settings
Rajeev Rikhye, Aaron Loh, Grace Hong, Margaret Ann Smith, Vijaytha Muralidharan, Doris Wong, Michelle Phung, Nicolas Betancourt, Bradley Fong, Rachna Sahasrabudhe, Khoban Nasim, Alec Eschholz, Basil Mustafa, Jan Freyberg, Terry Spitz, Kat Chou, Peggy Bui, Justin Ko, Steven Lin
The Lancet eBioMedicine (2025)
          
          
        
        
        
          
Background: Generalisation of artificial intelligence (AI) models to a new setting is challenging. In this study, we seek to understand the robustness of a dermatology AI model and whether it generalises from telemedicine cases to a new setting including both patient-submitted photographs (“PAT”) and clinician-taken photographs in-clinic (“CLIN”).
Methods: A retrospective cohort study involving 2500 cases previously unseen by the AI model, including both PAT and CLIN cases, from 22 clinics in the San Francisco Bay Area, spanning November 2015 to January 2021. The primary outcome measure for the AI model and dermatologists was the top-3 accuracy, defined as whether their top 3 differential diagnoses contained the top reference diagnosis from a panel of dermatologists per case.
Findings: The AI performed similarly between PAT and CLIN images (74% top-3 accuracy in CLIN vs. 71% in PAT); however, dermatologists were more accurate in PAT images (79% in CLIN vs. 87% in PAT). We demonstrate that demographic factors were not associated with AI or dermatologist errors; instead, several categories of conditions were associated with AI model errors (p < 0.05). Resampling CLIN and PAT to match skin condition distributions to the AI development dataset reduced the observed differences (AI: 84% CLIN vs. 79% PAT; dermatologists: 77% CLIN vs. 89% PAT). We demonstrate a series of steps to close the generalisation gap, requiring progressively more information about the new dataset, ranging from the condition distribution to additional training data for rarer conditions. When using additional training data and testing on the dataset without resampling to match AI development, we observed comparable performance from end-to-end AI model fine-tuning (85% in CLIN vs. 83% in PAT) vs. fine-tuning solely the classification layer on top of a frozen embedding model (86% in CLIN vs. 84% in PAT).
Interpretation: AI algorithms can be efficiently adapted to new settings without additional training data by recalibrating the existing model, or with targeted data acquisition for rarer conditions and retraining just the final layer.
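The two quantitative ideas in this abstract, top-3 accuracy and adjusting a model for a different condition distribution, can be illustrated with a minimal Python sketch. The abstract does not specify the recalibration procedure, so the prior-shift reweighting below is a standard textbook correction swapped in for illustration, not the authors' method. All function names, condition names, and numbers are hypothetical.

```python
def top3_accuracy(ranked_predictions, reference_labels):
    """Fraction of cases whose reference diagnosis is among the first 3 predictions."""
    hits = sum(
        ref in preds[:3]
        for preds, ref in zip(ranked_predictions, reference_labels)
    )
    return hits / len(reference_labels)


def adjust_for_prior_shift(probs, source_prior, target_prior):
    """Reweight class probabilities by target_prior / source_prior, then renormalise.

    Assumes only the condition prevalence differs between settings
    (an assumption of this sketch, not a claim about the study).
    """
    weighted = {c: p * target_prior[c] / source_prior[c] for c, p in probs.items()}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


# Hypothetical ranked differentials for three cases.
ranked = [["a", "b", "c", "d"], ["b", "c", "d", "a"], ["d", "c", "b", "a"]]
refs = ["c", "a", "d"]
print(top3_accuracy(ranked, refs))

# Hypothetical model output and condition prevalences.
probs = {"eczema": 0.5, "psoriasis": 0.3, "melanoma": 0.2}
source_prior = {"eczema": 0.6, "psoriasis": 0.3, "melanoma": 0.1}
target_prior = {"eczema": 0.2, "psoriasis": 0.3, "melanoma": 0.5}
adjusted = adjust_for_prior_shift(probs, source_prior, target_prior)
print(max(adjusted, key=adjusted.get))
```

In this toy example the reweighting moves the top prediction toward the condition that is more common in the new setting, which is the kind of adjustment that needs only the new condition distribution and no additional training data.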
              
  
            
Triaging mammography with artificial intelligence: an implementation study
Sarah M. Friedewald, Sunny Jansen, Fereshteh Mahvar, Timo Kohlberger, David V. Schacht, Sonya Bhole, Dipti Gupta, Scott Mayer McKinney, Stacey Caron, David Melnick, Mozziyar Etemadi, Samantha Winter, Alejandra Maciel, Luca Speroni, Martha Sevenich, Arnav Agharwal, Rubin Zhang, Gavin Duggan, Shiro Kadowaki, Atilla Kiraly, Jie Yang, Basil Mustafa, Krish Eswaran, Shravya Shetty
Breast Cancer Research and Treatment (2025)
          
          
        
        
        
          
              Purpose
Many breast centers are unable to provide immediate results at the time of screening mammography, which results in delayed patient care. Implementing artificial intelligence (AI) could identify patients who may have breast cancer and accelerate the time to diagnostic imaging and biopsy diagnosis.
Methods
In this prospective, randomized, unblinded, controlled implementation study, we enrolled 1000 screening participants between March 2021 and May 2022. The experimental group used an AI system to prioritize a subset of cases for same-visit radiologist evaluation, and same-visit diagnostic workup if necessary. The control group followed the standard of care. The primary operational endpoints were time to additional imaging (TA) and time to biopsy diagnosis (TB).
Results
The final cohort included 463 experimental and 392 control participants. The one-sided Mann-Whitney U test was employed for analysis of TA and TB. In the control group, the TA was 25.6 days [95% CI 22.0–29.9] and TB was 55.9 days [95% CI 45.5–69.6]. In comparison, the experimental group's mean TA was reduced by 25% (6.4 fewer days [one-sided 95% CI > 0.3], p<0.001) and mean TB was reduced by 30% (16.8 fewer days [95% CI > 5.1], p=0.003). The time reduction was more pronounced for AI-prioritized participants in the experimental group. All participants eventually diagnosed with breast cancer were prioritized by the AI.
Conclusions
Implementing AI prioritization can accelerate care timelines for patients requiring additional workup, while maintaining the efficiency of delayed interpretation for most participants. Reducing diagnostic delays could contribute to improved patient adherence, decreased anxiety and addressing disparities in access to timely care.
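For readers unfamiliar with the one-sided Mann-Whitney U test named in the Results, the following is a minimal pure-Python sketch of how such a comparison works. The waiting times are made up for illustration and are not data from the study. The p-value uses the large-sample normal approximation without tie or continuity corrections, which is a simplification; a statistics library would normally be used instead.

```python
import math


def mann_whitney_one_sided(faster, slower):
    """Return (U, p) for H1: values in `faster` tend to be smaller than in `slower`.

    U counts pairs where the `faster` value exceeds the `slower` value
    (ties count one half), so a small U supports H1.
    """
    m, n = len(faster), len(slower)
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in faster
        for b in slower
    )
    mean_u = m * n / 2
    sd_u = math.sqrt(m * n * (m + n + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 0.5 * math.erfc(-z / math.sqrt(2))  # lower-tail normal probability
    return u, p


# Hypothetical waiting times in days, not data from the study.
experimental_days = [10, 12, 15, 18, 20, 22]
control_days = [19, 24, 25, 28, 30, 35]
u_stat, p_value = mann_whitney_one_sided(experimental_days, control_days)
print(f"U = {u_stat}, one-sided p = {p_value:.4f}")
```

Because the test compares ranks and not raw values, it suits skewed time-to-event measurements such as days to imaging or biopsy.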
              
  
            
A personal health large language model for sleep and fitness coaching
Anastasiya Belyaeva, Zhun Yang, Nick Furlotte, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Schneider, Robby Bryant, Ryan Gomes, Allen Jiang, Roy Lee, Javier Perez, Jamie Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, Mark Malhotra, Shwetak Patel, Shravya Shetty, Jiening Zhan, Daniel McDuff
Nature Medicine (2025)
          
          
        
        
        
          
              Although large language models (LLMs) show promise for clinical healthcare applications, their utility for personalized health monitoring using wearable device data remains underexplored. Here we introduce the Personal Health Large Language Model (PH-LLM), designed for applications in sleep and fitness. PH-LLM is a version of the Gemini LLM that was finetuned for text understanding and reasoning when applied to aggregated daily-resolution numerical sensor data. We created three benchmark datasets to assess multiple complementary aspects of sleep and fitness: expert domain knowledge, generation of personalized insights and recommendations and prediction of self-reported sleep quality from longitudinal data. PH-LLM achieved scores that exceeded a sample of human experts on multiple-choice examinations in sleep medicine (79% versus 76%) and fitness (88% versus 71%). In a comprehensive evaluation involving 857 real-world case studies, PH-LLM performed similarly to human experts for fitness-related tasks and improved over the base Gemini model in providing personalized sleep insights. Finally, PH-LLM effectively predicted self-reported sleep quality using a multimodal encoding of wearable sensor data, further demonstrating its ability to effectively contextualize wearable modalities. This work highlights the potential of LLMs to revolutionize personal health monitoring via tailored insights and predictions from wearable data and provides datasets, rubrics and benchmark performance to further accelerate personal health-related LLM research.
              
  
            
Performance of a Deep Learning Diabetic Retinopathy Algorithm in India
Arthur Brant, Xiang Yin, Lu Yang, Divleen Jeji, Sunny Virmani, Anchintha Meenu, Naresh Babu Kannan, Florence Thng, Lily Peng, Ramasamy Kim
JAMA Network Open (2025)
          
          
        
        
        
          
              Importance: While prospective studies have investigated the accuracy of artificial intelligence (AI) for detection of diabetic retinopathy (DR) and diabetic macular edema (DME), to date, little published data exist on the clinical performance of these algorithms.
Objective: To evaluate the clinical performance of an automated retinal disease assessment (ARDA) algorithm in the postdeployment setting at Aravind Eye Hospital in India.
Design, Setting, and Participants: This cross-sectional analysis involved an approximate 1% sample of fundus photographs from patients screened using ARDA. Images were graded via adjudication by US ophthalmologists for DR and DME, and ARDA’s output was compared against the adjudicated grades at 45 sites in Southern India. Patients were randomly selected between January 1, 2019, and July 31, 2023.
Main Outcomes and Measures: Primary analyses were the sensitivity and specificity of ARDA for severe nonproliferative DR (NPDR) or proliferative DR (PDR). Secondary analyses focused on sensitivity and specificity for sight-threatening DR (STDR) (DME or severe NPDR or PDR).
Results: Among the 4537 patients with 4537 images with adjudicated grades, mean (SD) age was 55.2 (11.9) years and 2272 (50.1%) were male. Among the 3941 patients with gradable photographs, 683 (17.3%) had any DR, 146 (3.7%) had severe NPDR or PDR, 109 (2.8%) had PDR, and 398 (10.1%) had STDR. ARDA’s sensitivity and specificity for severe NPDR or PDR were 97.0% (95% CI, 92.6%-99.2%) and 96.4% (95% CI, 95.7%-97.0%), respectively. Positive predictive value (PPV) was 50.7% and negative predictive value (NPV) was 99.9%. The clinically important miss rate for severe NPDR or PDR was 0% (eg, some patients with severe NPDR or PDR were interpreted as having moderate DR and referred to clinic). ARDA’s sensitivity for STDR was 95.9% (95% CI, 93.0%-97.4%) and specificity was 94.9% (95% CI, 94.1%-95.7%); PPV and NPV were 67.9% and 99.5%, respectively.
Conclusions and Relevance: In this cross-sectional study investigating the clinical performance of ARDA, sensitivity and specificity for severe NPDR and PDR exceeded 96%, and ARDA caught 100% of patients with severe NPDR and PDR for ophthalmology referral. This preliminary large-scale postmarketing report of the performance of ARDA after screening 600 000 patients in India underscores the importance of monitoring and publishing an algorithm's clinical performance, consistent with recommendations by regulatory bodies.
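The relationship between the reported sensitivity, specificity, prevalence, and predictive values can be illustrated with Bayes' rule. The following is a minimal sketch, not the study's analysis code: the function name `predictive_values` is hypothetical, and the inputs are the rounded figures quoted in the abstract for severe NPDR or PDR (sensitivity 97.0%, specificity 96.4%, prevalence 3.7%). Because the study presumably worked from raw counts, the output only approximates the published PPV and NPV.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' rule.

    All arguments are fractions in [0, 1]. The four terms below are the
    expected shares of the screened population in each confusion-matrix cell.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values quoted in the abstract for severe NPDR or PDR.
ppv, npv = predictive_values(sensitivity=0.970, specificity=0.964, prevalence=0.037)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")
```

With a prevalence below 4%, even a specificity above 96% leaves roughly as many false positives as true positives, which is why the PPV sits near 50% while the NPV stays close to 100%.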
              
  
          
        
      
    
        
        
          
          
          
              Importance: Interest in artificial intelligence (AI) has reached an all-time high, and health care leaders across the ecosystem are faced with questions about where, when, and how to deploy AI and how to understand its risks, problems, and possibilities.
Observations: While AI as a concept has existed since the 1950s, all AI is not the same. Capabilities and risks of various kinds of AI differ markedly, and on examination 3 epochs of AI emerge. AI 1.0 includes symbolic AI, which attempts to encode human knowledge into computational rules, as well as probabilistic models. The era of AI 2.0 began with deep learning, in which models learn from examples labeled with ground truth. This era brought about many advances both in people’s daily lives and in health care. Deep learning models are task-specific, meaning they do one thing at a time, and they primarily focus on classification and prediction. AI 3.0 is the era of foundation models and generative AI. Models in AI 3.0 have fundamentally new (and potentially transformative) capabilities, as well as new kinds of risks, such as hallucinations. These models can do many different kinds of tasks without being retrained on a new dataset. For example, a simple text instruction will change the model’s behavior. Prompts such as “Write this note for a specialist consultant” and “Write this note for the patient’s mother” will produce markedly different content.
Conclusions and Relevance: Foundation models and generative AI represent a major revolution in AI’s capabilities, offering tremendous potential to improve care. Health care leaders are making decisions about AI today. While any heuristic omits details and loses nuance, the framework of AI 1.0, 2.0, and 3.0 may be helpful to decision-makers because each epoch has fundamentally different capabilities and risks.
              
  
          
        
      
    
        
          
            
              Assistive AI in Lung Cancer Screening: A Retrospective Multinational Study in the United States and Japan
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Atilla Kiraly
                      
                    
                
              
            
              
                
                  
                    
                    
                      
                        Corbin Cunningham
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Ryan Najafi
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Jie Yang
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Chuck Lau
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Diego Ardila
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Scott Mayer McKinney
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Rory Pilgrim
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Mozziyar Etemadi
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Sunny Jansen
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Lily Peng
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Shravya Shetty
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Neeral Beladia
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Krish Eswaran
                      
                    
                  
              
            
          
          
          
          
            Radiology: Artificial Intelligence (2024)
          
          
        
        
        
          
          
          
              Lung cancer is the leading cause of cancer death worldwide, with 1.8 million deaths in 2020 [1]. Studies have concluded that low-dose computed tomography lung cancer screening can reduce mortality by up to 61% [2], and updated 2021 US guidelines expanded eligibility. As screening efforts rise, AI can play an important role, but must be unobtrusively integrated into existing clinical workflows. In this work, we introduce a state-of-the-art, cloud-based AI system providing lung cancer risk assessments without requiring any user input. We demonstrate its efficacy in assisting lung cancer screening under both US and Japanese screening settings using different patient populations and screening protocols. Technical improvements over a previously described system include a focus on earlier cancer detection for improved accuracy, introduction of an effective assistive user interface, and a system designed to integrate into typical clinical workflows. The stand-alone AI system was evaluated on 3085 individuals, achieving area under the curve (AUC) scores of 91.7% (95%CI [89.6, 95.2]), 93.3% (95%CI [90.2, 95.7]), and 89.1% (95%CI [77.7, 97.3]) on three datasets (two from the US and one from Japan), respectively. To evaluate the system’s assistive ability, we conducted two retrospective multi-reader multi-case studies on 627 cases read by experienced board-certified radiologists (average 20 years of experience [7,40]) using local PACS systems in the respective US and Japanese screening settings. The studies measured the reader’s level of suspicion (LoS) and categorical responses for scores and management recommendations under country-specific screening protocols. The radiologists’ AUC for LoS increased with AI assistance by 2.3% (95%CI [0.1-4.5], p=0.022) for the US study and by 2.3% (95%CI [-3.5-8.1], p=0.179) for the Japan study. Specificity for recalls increased by 5.5% (95%CI [2.7-8.5], p<0.0001) for the US and 6.7% (95%CI [4.7-8.7], p<0.0001) for the Japan study.
No significant reduction in other metrics occurred. This work advances the state-of-the-art in lung cancer detection, introduces generalizable interface concepts that can be applicable to similar AI applications, and demonstrates its potential impact on diagnostic AI in global lung cancer screening with results suggesting a substantial drop in unnecessary follow-up procedures without impacting sensitivity.
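For readers less familiar with the AUC metric reported above, it can be read as the probability that a randomly chosen cancer case receives a higher risk score than a randomly chosen non-cancer case. The sketch below illustrates that definition only; the function name `auc` and the toy scores are hypothetical and are not taken from the study, whose evaluation pipeline is not described here.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case scores higher; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy risk scores, invented for illustration.
cancer = [0.9, 0.8, 0.35]
no_cancer = [0.7, 0.3, 0.2, 0.1]
print(f"AUC = {auc(cancer, no_cancer):.3f}")
```

The pairwise form is quadratic in the number of cases, so it suits small examples; rank-based implementations in standard libraries compute the same quantity more efficiently.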
              
  
          
        
      
    
        
          
            
              Towards Conversational Diagnostic AI
            
          
        
        
          
            
              
                
                  
                    
                
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
    
    
    
    
    
                      
                        Khaled Saab
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Jan Freyberg
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Ryutaro Tanno
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Amy Wang
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Brenna Li
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Nenad Tomašev
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Karan Singhal
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Le Hou
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Albert Webson
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Kavita Kulkarni
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Sara Mahdavi
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Juro Gottweis
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Joelle Barral
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Kat Chou
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            arXiv (2024) (to appear)
          
          
        
        
        
          
          
          
              At the heart of medicine lies the physician-patient dialogue, where skillful history-taking paves the way for accurate diagnosis, effective management, and enduring trust. Artificial Intelligence (AI) systems capable of diagnostic dialogue could increase accessibility, consistency, and quality of care. However, approximating clinicians' expertise is an outstanding grand challenge. Here, we introduce AMIE (Articulate Medical Intelligence Explorer), a Large Language Model (LLM) based AI system optimized for diagnostic dialogue.
AMIE uses a novel self-play based simulated environment with automated feedback mechanisms for scaling learning across diverse disease conditions, specialties, and contexts. We designed a framework for evaluating clinically-meaningful axes of performance including history-taking, diagnostic accuracy, management reasoning, communication skills, and empathy. We compared AMIE's performance to that of primary care physicians (PCPs) in a randomized, double-blind crossover study of text-based consultations with validated patient actors in the style of an Objective Structured Clinical Examination (OSCE). The study included 149 case scenarios from clinical providers in Canada, the UK, and India, 20 PCPs for comparison with AMIE, and evaluations by specialist physicians and patient actors. AMIE demonstrated greater diagnostic accuracy and superior performance on 28 of 32 axes according to specialist physicians and 24 of 26 axes according to patient actors. Our research has several limitations and should be interpreted with appropriate caution. Clinicians were limited to unfamiliar synchronous text-chat which permits large-scale LLM-patient interactions but is not representative of usual clinical practice. While further research is required before AMIE could be translated to real-world settings, the results represent a milestone towards conversational diagnostic AI.
              
  
          
        
      
    
        
          
            
              Searching for Dermatology Information Online using Images vs Text: a Randomized Study
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Justin Krogue
                      
                    
                
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Jay Hartford
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Natalie Salaets
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Kimberley Raiford
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Dounia Berrada
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Harsh Kharbanda
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Lou Wang
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Peggy Bui
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            medRxiv (2024)
          
          
        
        
        
          
          
          
              Background: Skin conditions are extremely common worldwide, and are an important cause of both anxiety and morbidity. Since the advent of the internet, individuals have used text-based search (eg, “red rash on arm”) to learn more about concerns on their skin, but this process is often hindered by the inability to accurately describe the lesion’s morphology. In this study, we surveyed respondents’ experiences with an image-based search, compared to the traditional text-based search experience.
Methods: An internet-based survey was conducted to evaluate the experience of text-based vs image-based search for skin conditions. We recruited respondents from an existing cohort of volunteers in a commercial survey panel; survey respondents who met inclusion/exclusion criteria, including willingness to take photos of a visible concern on their body, were enrolled. Respondents were asked to use the Google mobile app to conduct both regular text-based search (Google Search) and image-based search (Google Lens) for their concern, with the order of text vs. image search randomized. Satisfaction with each search experience along six different dimensions was recorded and compared, and respondents’ preferences for the different search types along these same six dimensions were recorded.
Results: 372 respondents were enrolled in the study, with 44% self-identifying as women, 86% as White and 41% over age 45. The rates of respondents who were at least moderately familiar with searching for skin conditions using text-based search and image-based search were 81.5% and 63.5%, respectively. After using both search modalities, respondents were highly satisfied with both image-based and text-based search, with >90% at least somewhat satisfied in each dimension and no significant differences seen between text-based and image-based search when examining the responses on an absolute scale per search modality. When asked to directly rate their preferences in a comparative way, survey respondents preferred image-based search over text-based search in 5 out of 6 dimensions, with an absolute 9.9% more preferring image-based search over text-based search overall (p=0.004). 82.5% (95% CI 78.2 - 86.3) reported a preference to leverage image-based search (alone or in combination with text-based search) in future searches. Of those who would prefer to use a combination of both, 64% indicated they would like to start with image-based search, indicating that image-based search may be the preferred entry point for skin-related searches.
Conclusion: Despite being less familiar with image-based search upon study inception, survey respondents generally preferred image-based search to text-based search and overwhelmingly wanted to include this in future searches. These results suggest the potential for image-based search to play a key role in people searching for information regarding skin concerns.
              
  
          
        
      
    
        
          
            
              Prospective Multi-Site Validation of AI to Detect Tuberculosis and Chest X-Ray Abnormalities
            
          
        
        
          
            
              
                
                  
                    
    
    
    
    
    
                      
                        Sahar Kazemzadeh
                      
                    
                
              
            
              
                
                  
                    
                    
                      
                        Atilla Kiraly
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Nsala Sanjase
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Minyoi Maimbolwa
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Brian Shuma
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Shahar Jamshy
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Christina Chen
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Arnav Agharwal
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Chuck Lau
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Daniel Golden
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Jin Yu
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Eric Wu
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Kat Chou
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Shravya Shetty
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Krish Eswaran
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Rory Pilgrim
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Monde Muyoyeta
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
          
          
          
          
            NEJM AI (2024)
          
          
        
        
        
          
          
          
              Background
Using artificial intelligence (AI) to interpret chest X-rays (CXRs) could support accessible triage tests for active pulmonary tuberculosis (TB) in resource-constrained settings.
Methods
The performance of two cloud-based CXR AI systems — one to detect TB and the other to detect CXR abnormalities — in a population with a high TB and human immunodeficiency virus (HIV) burden was evaluated. We recruited 1978 adults who had TB symptoms, were close contacts of known TB patients, or were newly diagnosed with HIV at three clinical sites. The TB-detecting AI (TB AI) scores were converted to binary using two thresholds: a high-sensitivity threshold and an exploratory threshold designed to resemble radiologist performance. Ten radiologists reviewed images for signs of TB, blinded to the reference standard. Primary analysis measured AI detection noninferiority to radiologist performance. Secondary analysis evaluated AI detection as compared with the World Health Organization (WHO) targets (90% sensitivity, 70% specificity). Both used an absolute margin of 5%. The abnormality-detecting AI (abnormality AI) was evaluated for noninferiority to a high-sensitivity target suitable for triaging (90% sensitivity, 50% specificity).
Results
Of the 1910 patients analyzed, 1827 (96%) had conclusive TB status, of which 649 (36%) were HIV positive and 192 (11%) were TB positive. The TB AI’s sensitivity and specificity were 87% and 70%, respectively, at the high-sensitivity threshold and 78% and 82%, respectively, at the balanced threshold. Radiologists’ mean sensitivity was 76% and mean specificity was 82%. At the high-sensitivity threshold, the TB AI was noninferior to average radiologist sensitivity (P<0.001) but not to average radiologist specificity (P=0.99) and was higher than the WHO target for specificity but not sensitivity. At the balanced threshold, the TB AI was comparable to radiologists. The abnormality AI’s sensitivity and specificity were 97% and 79%, respectively, with both meeting the prespecified targets.
Conclusions
The CXR TB AI was noninferior to radiologists for active pulmonary TB triaging in a population with a high TB and HIV burden. Neither the TB AI nor the radiologists met WHO recommendations for sensitivity in the study population. AI can also be used to detect other CXR abnormalities in the same population.
              
  
          
        
      
    
        
          
            
              A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models
            
          
        
        
          
            
              
                
                  
                    
                
              
            
              
                
                  
                    
                    
    
    
    
    
    
                      
                        Heather Cole-Lewis
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Nenad Tomašev
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Liam McCoy
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Leo Anthony Celi
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Alanna Walton
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Chirag Nagpal
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Akeiylah DeWitt
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Philip Mansfield
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Sushant Prakash
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Joelle Barral
                      
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Ivor Horn
                      
                    
                  
              
            
              
                
                  
                    
                    
                      
                        Karan Singhal
                      
                    
                  
              
            
          
          
          
          
            Nature Medicine (2024)
          
          
        
        
        
          
          
          
              Large language models (LLMs) hold promise to serve complex health information needs but also have the potential to introduce harm and exacerbate health disparities. Reliably evaluating equity-related model failures is a critical step toward developing systems that promote health equity. We present resources and methodologies for surfacing biases with potential to precipitate equity-related harms in long-form, LLM-generated answers to medical questions and conduct a large-scale empirical case study with the Med-PaLM 2 LLM. Our contributions include a multifactorial framework for human assessment of LLM-generated answers for biases and EquityMedQA, a collection of seven datasets enriched for adversarial queries. Both our human assessment framework and our dataset design process are grounded in an iterative participatory approach and review of Med-PaLM 2 answers. Through our empirical study, we find that our approach surfaces biases that may be missed by narrower evaluation approaches. Our experience underscores the importance of using diverse assessment methodologies and involving raters of varying backgrounds and expertise. While our approach is not sufficient to holistically assess whether the deployment of an artificial intelligence (AI) system promotes equitable health outcomes, we hope that it can be leveraged and built upon toward a shared goal of LLMs that promote accessible and equitable healthcare.
              
  