Brian Patrick Williams

Dr. Brian Williams is a member of the Applied Science team. He has been at Google since 2011. He has a Master's in Physics from Imperial College and a PhD in Computer Vision from the University of Oxford.
Authored Publications
An AI system to help scientists write expert-level empirical software
Eser Aygün
Anastasiya Belyaeva
Gheorghe Comanici
Hao Cui
Renee Johnston
Zahra Shamsi
David Smalling
James Thompson
Sarah Martinson
Lai Wei
Yuchen Zhou
Qian-Ze Zhu
Matthew Abraham
Erica Brand
Anna Bulanova
Jeffrey Cardille
Chris Co
Scott Ellsworth
Grace Joseph
Malcolm Kane
Ryan Krueger
Johan Kartiwa
Jackson Cui
Paul Raccuglia
Julie Wang
Kat Chou
James Manyika
Lizzie Dorfman
Shibl Mourad
Nature (2026)
The cycle of scientific discovery is frequently bottlenecked by the slow, manual creation of software to support computational experiments. To address this, we present Empirical Research Assistance (ERA), an AI system that creates expert-level scientific software whose goal is to maximize a quality metric. The system uses a Large Language Model (LLM) and Tree Search (TS) to systematically improve the quality metric and intelligently navigate the large space of possible solutions. ERA achieves expert-level results when it explores and integrates complex research ideas from external sources. The effectiveness of tree search is demonstrated across a diverse range of tasks. In bioinformatics, ERA discovered 40 novel methods for single-cell data analysis that outperformed the top human-developed methods on a public leaderboard. In epidemiology, ERA generated 14 models that outperformed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations. ERA also produced expert-level software for geospatial analysis, neural activity prediction in zebrafish, and numerical solution of integrals, and a novel rule-based construction for time series forecasting. By devising and implementing novel solutions to diverse tasks, ERA represents a significant step towards accelerating scientific progress. Keywords: Tree Search, Generative AI, Scorable Scientific Tasks, Empirical Software
Mapping the ionosphere with millions of phones
Jamie Smith
Anton Geraschenko
Jade Morton
Frank van Diggelen
Nature (2024)
The ionosphere is a layer of weakly ionized plasma bathed in Earth’s geomagnetic field extending about 50–1,500 kilometres above Earth. The ionospheric total electron content varies in response to Earth’s space environment, interfering with Global Navigation Satellite System (GNSS) signals, resulting in one of the largest sources of error for position, navigation and timing services. Networks of high-quality ground-based GNSS stations provide maps of ionospheric total electron content to correct these errors, but large spatiotemporal gaps in data from these stations mean that these maps may contain errors. Here we demonstrate that a distributed network of noisy sensors, in the form of millions of Android phones, can fill in many of these gaps and double the measurement coverage, providing an accurate picture of the ionosphere in areas of the world underserved by conventional infrastructure. Using smartphone measurements, we resolve features such as plasma bubbles over India and South America, solar-storm-enhanced density over North America and a mid-latitude ionospheric trough over Europe. We also show that the resulting ionosphere maps can improve location accuracy, which is our primary aim. This work demonstrates the potential of using a large distributed network of smartphones as a powerful scientific instrument for monitoring Earth.
ProtSeq: towards high-throughput, single-molecule protein sequencing via amino acid conversion into DNA barcodes
Jessica Hong
Michael Connor Gibbons
Ali Bashir
Diana Wu
Shirley Shao
Zachary Cutts
Mariya Chavarha
Ye Chen
Lauren Schiff
Mikelle Foster
Victoria Church
Llyke Ching
Sara Ahadi
Anna Hieu-Thao Le
Alexander Tran
Michelle Therese Dimon
Phillip Jess
Marc Berndl
iScience, 25 (2022), pp. 32
We demonstrate early progress toward constructing a high-throughput, single-molecule protein sequencing technology utilizing barcoded DNA aptamers (binders) to recognize terminal amino acids of peptides (targets) tethered on a next-generation sequencing chip. DNA binders deposit unique, amino acid identifying barcodes on the chip. The end goal is that over multiple binding cycles, a sequential chain of DNA barcodes will identify the amino acid sequence of a peptide. Toward this, we demonstrate successful target identification with two sets of target-binder pairs: DNA-DNA and Peptide-Protein. For DNA-DNA binding, we show assembly and sequencing of DNA barcodes over 6 consecutive binding cycles. Intriguingly, our computational simulation predicts that a small set of semi-selective DNA binders offers significant coverage of the human proteome. Toward this end, we introduce a binder discovery pipeline that ultimately could merge with the chip assay into a technology called ProtSeq, for future high-throughput, single-molecule protein sequencing.
Impacts of social distancing policies on mobility and COVID-19 case growth in the US
Gregory Alexander Wellenius
Swapnil Suresh Vispute
Valeria Espinosa
Thomas Tsai
Jonathan Hennessy
Andrew Dai
Krishna Kumar Gadepalli
Adam Boulanger
Adam Pearce
Chaitanya Kamath
Arran Schlosberg
Catherine Bendebury
Chinmoy Mandayam
Charlotte Stanton
Shailesh Bavadekar
Christopher David Pluntke
Damien Desfontaines
Benjamin H. Jacobson
Zan Armstrong
Andrew Philip Widdowson
Katherine Chou
Andrew Nathaniel Oplinger
Ashish K. Jha
Evgeniy Gabrilovich
Nature Communications (2021)
Social distancing has emerged as the primary mitigation strategy to combat the COVID-19 pandemic in the United States. However, large-scale evaluations of the effectiveness of social distancing policies are lacking. We used aggregated mobility data to quantify the impact of social distancing policies on observed changes in mobility. Declarations of states of emergency resulted in approximately a 10% reduction in time spent outside places of residence and an increase in visits to grocery stores and pharmacies. Subsequent implementation of ≥1 social distancing policies resulted in an additional 25% reduction in mobility in the following week. The seven states that subsequently ordered residents to shelter in place on or before March 23, 2020 observed an additional 29% reduction in time spent outside the residence. Our findings suggest that state-wide mandates are highly effective in achieving the goals of social distancing to minimize the transmission of COVID-19.
Applying Deep Neural Network Analysis to High-Content Image-Based Assays
Scott L. Lipnick
Nina R. Makhortova
Minjie Fan
Zan Armstrong
Thorsten M. Schlaeger
Liyong Deng
Wendy K. Chung
Liadan O'Callaghan
Anton Geraschenko
Dosh Whye
Marc Berndl
Jon Hazard
Arunachalam Narayanaswamy
D. Michael Ando
Lee L. Rubin
SLAS DISCOVERY: Advancing Life Sciences R&D, 0 (2019), pp. 2472555219857715
The etiological underpinnings of many CNS disorders are not well understood. This is likely due to the fact that individual diseases aggregate numerous pathological subtypes, each associated with a complex landscape of genetic risk factors. To overcome these challenges, researchers are integrating novel data types from numerous patients, including imaging studies capturing broadly applicable features from patient-derived materials. These datasets, when combined with machine learning, potentially hold the power to elucidate the subtle patterns that stratify patients by shared pathology. In this study, we interrogated whether high-content imaging of primary skin fibroblasts, using the Cell Painting method, could reveal disease-relevant information among patients. First, we showed that technical features such as batch/plate type, plate, and location within a plate lead to detectable nuisance signals, as revealed by a pre-trained deep neural network and analysis with deep image embeddings. Using a plate design and image acquisition strategy that accounts for these variables, we performed a pilot study with 12 healthy controls and 12 subjects affected by the severe genetic neurological disorder spinal muscular atrophy (SMA), and evaluated whether a convolutional neural network (CNN) generated using a subset of the cells could distinguish disease states on cells from the remaining unseen control–SMA pair. Our results indicate that these two populations could effectively be differentiated from one another and that model selectivity is insensitive to batch/plate type. One caveat is that the samples were also largely separated by source. These findings lay a foundation for how to conduct future studies exploring diseases with more complex genetic contributions and unknown subtypes.