Jump to Content
Afroz Mohiuddin

Afroz Mohiuddin

Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Deciphering clinical abbreviations with a privacy protecting machine learning system
    Alvin Rajkomar
    Eric Loreaux
    Yuchen Liu
    Benny Li
    Ming-Jun Chen
    Yi Zhang
    Juraj Gottweis
    Nature Communications (2022)
    Preview abstract Physicians write clinical notes with abbreviations and shorthand that are difficult to decipher. Abbreviations can be clinical jargon (writing “HIT” for “heparin induced thrombocytopenia”), ambiguous terms that require expertise to disambiguate (using “MS” for “multiple sclerosis” or “mental status”), or domain-specific vernacular (“cb” for “complicated by”). Here we train machine learning models on public web data to decode such text by replacing abbreviations with their meanings. We report a single translation model that simultaneously detects and expands thousands of abbreviations in real clinical notes with accuracies ranging from 92.1%-97.1% on multiple external test datasets. The model equals or exceeds the performance of board-certified physicians (97.6% vs 88.7% total accuracy). Our results demonstrate a general method to contextually decipher abbreviations and shorthand that is built without any privacy-compromising data. View details
    Rethinking Attention with Performers
    Valerii Likhosherstov
    David Martin Dohan
    Peter Hawkins
    Jared Quincy Davis
    Lukasz Kaiser
    Adrian Weller
    accepted to ICLR 2021 (oral presentation) (to appear)
    Preview abstract We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness. To approximate softmax attention-kernels, Performers use a novel Fast Attention Via positive Orthogonal Random features approach (FAVOR+), which may be of independent interest for scalable kernel methods. FAVOR+ can be also used to efficiently model kernelizable attention mechanisms beyond softmax. This representational power is crucial to accurately compare softmax with other kernels for the first time on large-scale tasks, beyond the reach of regular Transformers, and investigate optimal attention-kernels. Performers are linear architectures fully compatible with regular Transformers and with strong theoretical guarantees: unbiased or nearly-unbiased estimation of the attention matrix, uniform convergence and low estimation variance. We tested Performers on a rich set of tasks stretching from pixel-prediction through text models to protein sequence modeling. We demonstrate competitive results with other examined efficient sparse and dense attention methods, showcasing effectiveness of the novel attention-learning paradigm leveraged by Performers. View details
    Model-Based Reinforcement Learning for Atari
    Blazej Osinski
    Chelsea Finn
    Henryk Michalewski
    Konrad Czechowski
    Lukasz Mieczyslaw Kaiser
    Mohammad Babaeizadeh
    Piotr Kozakowski
    Piotr Milos
    Roy H Campbell
    Ryan Sepassi
    Sergey Levine
    NIPS'18 (2020)
    Preview abstract Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with orders of magnitude fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games and achieve competitive results with only 100K interactions between the agent and the environment (400K frames), which corresponds to about two hours of real-time play. View details
    No Results Found