Jake Garrison

I was born in Spokane, Washington, and attended the University of Washington for both undergraduate and graduate school. I built my own electric car when I was about 16, and that project served as a catalyst for my educational interests in college. As an undergraduate, I studied electrical engineering with a focus on power electronics and battery management for electric vehicles, as well as analog audio circuits and digital signal processing. I was also an early researcher in autonomous driving using multimodal deep neural networks, and I developed various AI-driven apps and software in startup-style environments. For graduate school, I joined the UW UbiComp Lab, where I researched novel health sensing on mobile devices; my thesis was on sound-based lung function testing. I continue this line of research at Google Health.
Authored Publications
Google Publications
    Abstract: Health-related acoustic signals, such as cough and breathing sounds, are relevant for medical diagnosis and continuous health monitoring. Most existing machine learning approaches for health acoustics are trained and evaluated on specific tasks, limiting their generalizability across various healthcare applications. In this paper, we leverage a self-supervised learning framework, SimCLR with a Slowfast NFNet backbone, for contrastive learning of health acoustics. A crucial aspect of optimizing Slowfast NFNets for this application lies in identifying effective audio augmentations. We conduct an in-depth analysis of various audio augmentation strategies and demonstrate that an appropriate augmentation strategy enhances the performance of the Slowfast NFNet audio encoder across a diverse set of health acoustic tasks. Our findings reveal that when augmentations are combined, they can produce synergistic effects that exceed the benefits seen when each is applied individually.
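As a rough illustration of the augmentation idea described in this abstract (not the paper's actual pipeline), the Python sketch below composes a few common waveform augmentations to produce two views of a clip for SimCLR-style contrastive pretraining; the specific augmentations, parameters, and clip length are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x, snr_db=20.0):
    """Add white noise at a target signal-to-noise ratio (illustrative)."""
    noise = rng.standard_normal(len(x))
    signal_power = np.mean(x ** 2) + 1e-12
    noise_power = signal_power / (10 ** (snr_db / 10))
    return x + noise * np.sqrt(noise_power / (np.mean(noise ** 2) + 1e-12))

def random_gain(x, low_db=-6.0, high_db=6.0):
    """Scale the waveform by a random gain drawn in decibels."""
    return x * 10 ** (rng.uniform(low_db, high_db) / 20)

def random_crop(x, crop_len):
    """Take a random fixed-length crop (pad if the clip is too short)."""
    if len(x) <= crop_len:
        return np.pad(x, (0, crop_len - len(x)))
    start = rng.integers(0, len(x) - crop_len)
    return x[start:start + crop_len]

def make_views(x, crop_len=32000):
    """Return two independently augmented views of one clip (SimCLR-style)."""
    def augment(wav):
        y = random_crop(wav, crop_len)
        y = random_gain(y)
        y = add_noise(y)
        return y
    return augment(x), augment(x)

# The two views would be embedded by the audio encoder and pulled together
# by a contrastive (e.g., NT-Xent) loss during pretraining.
clip = rng.standard_normal(48000)   # stand-in for a ~3 s, 16 kHz recording
view_a, view_b = make_views(clip)
```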
    Towards Accurate Differential Diagnosis with Large Language Models
    Daniel McDuff
    Anil Palepu
    Amy Wang
    Yash Sharma
    Kavita Kulkarni
    Le Hou
    Sara Mahdavi
    Sushant Prakash
    Anupam Pathak
    Shwetak Patel
    Ewa Dominowska
    Juro Gottweis
    Joelle Barral
    Kat Chou
    Jake Sunshine
    arXiv (2023)
    Abstract: An accurate differential diagnosis (DDx) is a cornerstone of medical care, often reached through an iterative process of interpretation that combines clinical history, physical examination, investigations and procedures. Interactive interfaces powered by Large Language Models (LLMs) present new opportunities to both assist and automate aspects of this process. In this study, we introduce an LLM optimized for diagnostic reasoning, and evaluate its ability to generate a DDx alone or as an aid to clinicians. 20 clinicians evaluated 302 challenging, real-world medical cases sourced from the New England Journal of Medicine (NEJM) case reports. Each case report was read by two clinicians, who were randomized to one of two assistive conditions: either assistance from search engines and standard medical resources, or LLM assistance in addition to these tools. All clinicians provided a baseline, unassisted DDx prior to using the respective assistive tools. Our LLM for DDx exhibited standalone performance that exceeded that of unassisted clinicians (top-10 accuracy 59.1% vs 33.6%, [p = 0.04]). Comparing the two assisted study arms, the DDx quality score was higher for clinicians assisted by our LLM (top-10 accuracy 51.7%) compared to clinicians without its assistance (36.1%) (McNemar's Test: 45.7, p < 0.01) and clinicians with search (44.4%) (4.75, p = 0.03). Further, clinicians assisted by our LLM arrived at more comprehensive differential lists than those without its assistance. Our study suggests that our LLM for DDx has potential to improve clinicians' diagnostic reasoning and accuracy in challenging cases, meriting further real-world evaluation for its ability to empower physicians and widen patients' access to specialist-level expertise.
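For readers unfamiliar with the reported metrics, the sketch below shows one way to compute top-10 accuracy and a continuity-corrected McNemar statistic over paired per-case outcomes; the data and variable names are illustrative placeholders, not the study's analysis code.

```python
from scipy.stats import chi2

def top_k_hit(ranked_ddx, true_dx, k=10):
    """1 if the true diagnosis appears in the top-k entries of a ranked differential."""
    return int(true_dx in ranked_ddx[:k])

# Paired per-case top-10 hits for two arms evaluated on the same cases
# (e.g., LLM-assisted vs. search-assisted); these values are made up.
arm_a = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
arm_b = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]

top10_a = sum(arm_a) / len(arm_a)
top10_b = sum(arm_b) / len(arm_b)

# McNemar's test uses only the discordant pairs (one arm right, the other wrong).
b = sum(1 for x, y in zip(arm_a, arm_b) if x == 1 and y == 0)
c = sum(1 for x, y in zip(arm_a, arm_b) if x == 0 and y == 1)
stat = (abs(b - c) - 1) ** 2 / (b + c)      # continuity-corrected statistic
p_value = chi2.sf(stat, df=1)
print(f"top-10: {top10_a:.2f} vs {top10_b:.2f}, McNemar stat={stat:.2f}, p={p_value:.3f}")
```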
    Abstract: Learned speech representations can drastically improve performance on tasks with limited labeled data. However, due to their size and complexity, learned representations have limited utility in mobile settings where run-time performance can be a significant bottleneck. In this work, we propose a class of lightweight speech embedding models that run efficiently on mobile devices based on the recently proposed TRILL speech embedding. We combine novel architectural modifications with existing speedup techniques to create embedding models that are fast enough to run in real-time on a mobile device and exhibit minimal performance degradation on a benchmark of non-semantic speech tasks. One such model (FRILL) is 32x faster on a Pixel 1 smartphone and 40% the size of TRILL, with an average decrease in accuracy of only 2%. To our knowledge, FRILL is the highest quality non-semantic embedding designed for use on mobile devices. Furthermore, we demonstrate that these representations are useful for mobile health tasks such as non-speech human sounds detection and face-masked speech detection. Our training and evaluation code is publicly available.
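As a loose illustration of shrinking a large audio embedding into a mobile-friendly model (not the FRILL recipe itself), the sketch below regresses a small student network onto embeddings from a frozen teacher; the architecture, loss, and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical small student that maps log-mel frames to a teacher-sized embedding.
class SmallAudioEmbedder(nn.Module):
    def __init__(self, n_mels=64, emb_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(128, emb_dim),
        )

    def forward(self, log_mel):          # (batch, n_mels, frames)
        return self.net(log_mel)

def distillation_step(student, teacher_embed, log_mel, optimizer):
    """One step of regressing the student's output onto frozen teacher embeddings."""
    optimizer.zero_grad()
    pred = student(log_mel)
    loss = nn.functional.mse_loss(pred, teacher_embed)
    loss.backward()
    optimizer.step()
    return loss.item()

student = SmallAudioEmbedder()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
log_mel = torch.randn(8, 64, 96)        # fake batch of log-mel spectrograms
teacher_embed = torch.randn(8, 512)     # embeddings from a frozen large model (e.g., TRILL)
loss = distillation_step(student, teacher_embed, log_mel, optimizer)
```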
    Whosecough: In-the-Wild Cougher Verification Using Multitask Learning
    Matt Whitehill
    Shwetak Patel
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 896-900
    Abstract: Current automatic cough counting systems can determine how many coughs are present in an audio recording. However, they cannot determine who produced the cough. This limits their usefulness, as most systems are deployed in locations with multiple people (e.g., a smart home device in a four-person home). Previous models trained solely on speech performed reasonably well on forced coughs [1]. By incorporating coughs into the training data, the model performance should improve. However, since limited natural cough data exists, training on coughs can lead to model overfitting. In this work, we overcome this problem by using multitask learning, where the second task is speaker verification. Our model achieves 82.15% classification accuracy amongst four users on a natural, in-the-wild cough dataset, outperforming human evaluators on average by 9.82%.
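A minimal sketch of the multitask idea in this abstract: a shared encoder feeding a cougher-identification head and an auxiliary speaker-verification head, trained with a weighted sum of the two losses. The architecture, loss weighting, and label setup are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Toy shared encoder over log-mel input; the real model differs."""
    def __init__(self, n_mels=64, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )

    def forward(self, x):                # (batch, n_mels, frames)
        return self.net(x)

class MultitaskCoughModel(nn.Module):
    def __init__(self, n_coughers=4, n_speakers=1000, hidden=128):
        super().__init__()
        self.encoder = SharedEncoder(hidden=hidden)
        self.cougher_head = nn.Linear(hidden, n_coughers)   # task 1: who coughed
        self.speaker_head = nn.Linear(hidden, n_speakers)   # task 2: auxiliary speaker task

    def forward(self, x):
        z = self.encoder(x)
        return self.cougher_head(z), self.speaker_head(z)

model = MultitaskCoughModel()
ce = nn.CrossEntropyLoss()
x = torch.randn(8, 64, 96)                      # fake log-mel batch
cough_labels = torch.randint(0, 4, (8,))
speaker_labels = torch.randint(0, 1000, (8,))
cough_logits, speaker_logits = model(x)
# Weighted multitask loss; the auxiliary speaker task regularizes the cough task.
loss = ce(cough_logits, cough_labels) + 0.5 * ce(speaker_logits, speaker_labels)
loss.backward()
```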
    SpiroConfidence: Determining the Validity of Smartphone Based Spirometry Using Machine Learning
    Varun Viswanath
    Shwetak Patel
    Engineering in Medicine and Biology Conference (EMBC)
    Abstract: Prior work has shown that smartphone spirometry can effectively measure lung function using the phone's built-in microphone and could one day play a critical role in making spirometry more usable, accessible, and cost-effective. Although traditional spirometry is performed with the guidance of a medical expert, smartphone spirometry lacks the ability to provide the patient feedback or guarantee the quality of a patient's spirometry efforts. Smartphone spirometry is particularly susceptible to poorly performed efforts because any sounds in the environment (e.g., a person's voice) or mistakes in the effort (e.g., coughs or short breaths) can invalidate the results. We introduce two approaches to analyze and estimate the quality of smartphone spirometry efforts. A gradient boosting model achieves 98.2% precision and 86.6% recall identifying invalid efforts when given expert tuned audio features, while a Gated-Convolutional Recurrent Neural Network achieves 98.3% precision and 88.0% recall and automatically develops patterns from a Mel-spectrogram, a more general audio feature.
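As a small illustration of the gradient-boosting approach to flagging invalid efforts (using placeholder features and labels rather than the paper's expert-tuned audio features), a minimal scikit-learn sketch:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Placeholder per-effort features (e.g., summary statistics of the recording);
# label 1 = invalid effort (cough, speech, short breath), 0 = valid effort.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
y = rng.integers(0, 2, 500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("precision:", precision_score(y_te, pred), "recall:", recall_score(y_te, pred))
```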