Natalie Harris

Natalie is a research engineer in Google Research and a member of the Brain Health Research team. She joined Google in 2014 and has since worked in the Kirkland/Seattle, Zurich, and London offices. She has worked on multiple teams across Google and DeepMind, most recently on continuous prediction of adverse events using electronic health records. Her current focus is machine learning for healthcare, with a particular interest in fairness and ethics. She earned her BS in Computer Science from the University of Washington.
Authored Publications
    Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. Importantly, the success of any mitigation strategy strongly depends on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift that one is encountering in practice. In this work, we adopt a causal framing to motivate conditional independence tests as a key tool for characterizing distribution shifts. Using our approach in two medical applications, we show that this knowledge can help diagnose failures of fairness transfer, including cases where real-world shifts are more complex than is often assumed in the literature. Based on these results, we discuss potential remedies at each step of the machine learning pipeline.
    Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records
    Nenad Tomašev
    Sebastien Baur
    Anne Mottram
    Xavier Glorot
    Jack William Rae
    Michal Zielinski
    Harry Askham
    Andre Saraiva
    Valerio Magliulo
    Clemens Meyer
    Suman Venkatesh Ravuri
    Alistair Connell
    Cían Hughes
    Julien Cornebise
    Hugh Montgomery
    Geraint Rees
    Christopher Laing
    Clifton R. Baker
    Thomas Osborne
    Ruth Reeves
    Demis Hassabis
    Dominic King
    Mustafa Suleyman
    Trevor John Back
    Christopher Nielsen
    Martin Gamunu Seneviratne
    Shakir Mohamed
    Nature Protocols (2021)
    Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.
    Multi-task prediction of organ dysfunction in the ICU using sequential sub-network routing
    Diana Mincu
    Eric Loreaux
    Anne Mottram
    Hugh Montgomery
    Ali Connell
    Nenad Tomašev
    Martin Seneviratne
    Journal of the American Medical Informatics Association (JAMIA) (2021)
    Introduction: Multi-task learning (MTL) using electronic health records (EHRs) allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer, that is, impaired learning when tasks are not appropriately selected. We introduce a sequential sub-network routing (SeqSNR) architecture, which uses soft parameter sharing to find related tasks and encourage cross-learning between them. Materials and Methods: Using the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we train deep neural network models to predict the onset of six endpoints, including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task models (ST) with naive multi-task (shared bottom, SB) and SeqSNR in terms of discriminative performance and label efficiency. Results: SeqSNR showed a modest yet statistically significant performance boost across at least 4 out of 6 tasks compared to SB and ST. When the size of the training dataset was reduced for a given task, SeqSNR outperformed ST in all cases, showing an average AUPRC boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively. Discussion and Conclusion: Multi-task learning has variable performance compared to single-task learning, with the possibility of negative transfer. The SeqSNR architecture outperforms SB and ST in discriminative performance and shows superior performance in terms of label efficiency. SeqSNR should be considered for multi-task predictive modeling using EHR data.