Tiya Tiyasirichokchai
Google Research
Authored Publications
TRINDs: Assessing the Diagnostic Capabilities of Large Language Models for Tropical and Infectious Diseases
Steve Adudans
Oluwatosin Akande
Chintan Ghate
Sylvanus Aitkins
Geoffrey Siwo
Lynda Osadebe
Nenad Tomašev
Eric Ndombi
Neglected tropical diseases (NTDs) and infectious diseases disproportionately affect the poorest regions of the world. While large language models (LLMs) have shown promise for medical question answering, little work has focused specifically on tropical and infectious diseases. We introduce TRINDs, a dataset of 52 tropical and infectious diseases with demographic and semantic clinical and consumer augmentations. We evaluate how various contexts and counterfactual locations influence LLM performance. Results show that LLMs perform best when provided with contextual information such as demographics, location, and symptoms. We also develop TRINDs-LM, a tool that enables users to enter symptoms and contextual information and receive the most likely diagnosis. In addition to the LLM evaluations, we conducted a human expert baseline study with 7 medical and public health experts to assess how accurately human experts diagnose tropical and infectious diseases. This work demonstrates methods for creating and evaluating datasets for testing and optimizing LLMs, and the use of a tool that could improve digital diagnosis and tracking of NTDs.
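As a rough illustration of the context and counterfactual-location evaluation described above, the sketch below builds prompt variants with different subsets of context and swaps the location field. It is not the TRINDs pipeline: the field names, the prompt wording, the helper functions, and the sample case are all hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical context fields; the abstract names demographics and location
# as examples of context, but the real dataset schema is not given here.
CONTEXT_FIELDS = ("demographics", "location")


def build_prompt(case, include=()):
    """Build one prompt from symptoms plus the selected context fields."""
    parts = [f"Symptoms: {case['symptoms']}"]
    for field in CONTEXT_FIELDS:
        if field in include:
            parts.append(f"{field.capitalize()}: {case[field]}")
    parts.append("What is the most likely diagnosis?")
    return "\n".join(parts)


def context_variants(case):
    """Return one prompt per subset of context fields (an ablation grid)."""
    variants = {}
    for size in range(len(CONTEXT_FIELDS) + 1):
        for combo in combinations(CONTEXT_FIELDS, size):
            variants[combo] = build_prompt(case, combo)
    return variants


def counterfactual_location(case, new_location):
    """Copy the case with only the location changed; the input is untouched."""
    return {**case, "location": new_location}


# Placeholder case for illustration only (not taken from the dataset).
example_case = {
    "symptoms": "fever, chills",
    "demographics": "34-year-old female",
    "location": "Location A",
}
prompts = context_variants(example_case)
swapped = counterfactual_location(example_case, "Location B")
```

Each prompt in `prompts` could then be sent to a model and scored against the known diagnosis, so that accuracy can be compared across context subsets and across original versus counterfactual locations.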
Crowdsourcing Dermatology Images with Google Search Ads: Creating a Diverse and Representative Dataset of Real-World Skin Conditions
Abbi Ward
Ashley Carrick
Dawn Siegel
Jay Hartford
Jimmy Li
Julie Wang
Justin Ko
Pradeep Kumar S
Renee Wong
Sriram Lakshminarasimhan
Steven Lin
Sunny Virmani
arXiv (2024)
Background
Health datasets from clinical sources do not reflect the breadth and diversity of disease in the real world, impacting research, medical education and artificial intelligence (AI) tool development. Dermatology is a suitable area to develop and test a new and scalable method to create representative health datasets.
Methods
We used Google Search advertisements to solicit contributions of images of dermatology conditions, along with demographic and symptom information, from internet users in the United States (US) over 265 days starting March 2023. With informed contributor consent, we described and released this dataset containing 10,106 images from 5,058 contributions, with dermatologist labels as well as Fitzpatrick Skin Type and Monk Skin Tone labels for the images.
Results
We received 22 ± 14 submissions/day over 265 days. Female contributors (66.04%) and younger individuals (52.3% < age 40) had a higher representation in the dataset compared to the US population, and 36.6% of contributors had a non-White racial or ethnic identity. Over 97.5% of contributions were genuine images of skin conditions. Image quality had no impact on dermatologist confidence in assigning a differential diagnosis. The dataset consists largely of short-duration (54% with onset < 7 days ago) allergic, infectious, and inflammatory conditions. The Fitzpatrick skin type distribution is well-balanced, considering the geographical origin of the dataset and the absence of enrichment for population groups or skin tones.
Interpretation
Search ads are effective at crowdsourcing images of health conditions. The SCIN dataset bridges important gaps in the availability of representative images of common skin conditions.
A mobile-optimized artificial intelligence system for gestational age and fetal malpresentation assessment
Ryan Gomes
Bellington Vwalika
Chace Lee
Angelica Willis
Joan T. Price
Christina Chen
Margaret P. Kasaro
James A. Taylor
Elizabeth M. Stringer
Scott Mayer McKinney
Ntazana Sindano
William Goodnight, III
Justin Gilmer
Benjamin H. Chi
Charles Lau
Terry Spitz
Kris Liu
Jonny Wong
Rory Pilgrim
Akib Uddin
Lily Hao Yi Peng
Kat Chou
Jeffrey S. A. Stringer
Shravya Ramesh Shetty
Communications Medicine (2022)
Background
Fetal ultrasound is an important component of antenatal care, but a shortage of adequately trained healthcare workers has limited its adoption in low-to-middle-income countries. This study investigated the use of artificial intelligence for fetal ultrasound in under-resourced settings.
Methods
Blind sweep ultrasounds, consisting of six freehand ultrasound sweeps, were collected by sonographers in the USA and Zambia, and novice operators in Zambia. We developed artificial intelligence (AI) models that used blind sweeps to predict gestational age (GA) and fetal malpresentation. AI GA estimates and standard fetal biometry estimates were compared to a previously established ground truth, and evaluated for difference in absolute error. Fetal malpresentation (non-cephalic vs cephalic) was compared to sonographer assessment. On-device AI model run-times were benchmarked on Android mobile phones.
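The comparison of absolute errors described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: it uses a normal-approximation confidence interval, and the function names, the margin value, and the toy numbers are all hypothetical.

```python
import math
import statistics


def paired_error_difference(ai_est, std_est, truth, z=1.96):
    """Mean paired difference in absolute error (AI minus standard).

    Returns the mean difference and a normal-approximation 95% CI.
    Negative values mean the AI estimate had the smaller error.
    """
    diffs = [abs(a - t) - abs(s - t) for a, s, t in zip(ai_est, std_est, truth)]
    mean = statistics.mean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, (mean - z * sem, mean + z * sem)


def is_non_inferior(ci_upper, margin):
    """Non-inferiority holds if the CI's upper bound stays below the margin."""
    return ci_upper < margin


# Toy gestational ages in days; illustrative values only.
truth = [100, 150, 200, 250]
ai = [101, 149, 202, 250]
standard = [103, 148, 204, 251]

mean_diff, (ci_low, ci_high) = paired_error_difference(ai, standard, truth)
# The 1.0-day margin is an assumed value chosen for the example.
non_inferior = is_non_inferior(ci_high, margin=1.0)
```

Pairing the errors per study matters here because both estimates are made on the same pregnancies, so the difference is taken case by case before averaging.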
Results
Here we show that GA estimation accuracy of the AI model is non-inferior to standard fetal biometry estimates (error difference -1.4 ± 4.5 days, 95% CI -1.8, -0.9, n=406). Non-inferiority is maintained when blind sweeps are acquired by novice operators performing only two of six sweep motion types. Fetal malpresentation AUC-ROC is 0.977 (95% CI, 0.949, 1.00, n=613); sonographers and novices have similar AUC-ROC. Software run-times on mobile phones for both diagnostic models are less than 3 seconds after completion of a sweep.
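For readers unfamiliar with the AUC-ROC metric reported above, the sketch below computes it from its rank-based definition: the probability that a randomly chosen positive case (here, non-cephalic) receives a higher score than a randomly chosen negative case, with ties counted as half. The function name and the toy labels and scores are assumptions for illustration and are unrelated to the study data.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model scores, where higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    # Count a win when the positive outranks the negative, half a win on ties.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 2 positives and 3 negatives.
toy_auc = auc_roc([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.3, 0.4])
```

The pairwise form is quadratic in the number of cases, which is fine for a few hundred studies; a sort-based implementation would be preferable for much larger sets.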
Conclusions
The gestational age model is non-inferior to the clinical standard and the fetal malpresentation model has high AUC-ROCs across operators and devices. Our AI models are able to run on-device, without internet connectivity, and provide feedback scores to assist in upleveling the capabilities of lightly trained ultrasound operators in low-resource settings.