Mohit Agarwal
Mohit Agarwal is a Research Engineer at Google, where he focuses on advancing geospatial inference and large language models (LLMs). He joined Google in 2023, bringing with him a strong foundation in data-driven research and machine learning applications. Prior to Google, Mohit worked as a Quantitative Strategist at Goldman Sachs, where he specialized in leveraging data science and machine learning techniques to drive equity research insights. Mohit completed his Ph.D. in Electrical and Computer Engineering at the Georgia Institute of Technology, where he worked under the guidance of Dr. Raghupathy Sivakumar. His doctoral research explored the intersection of Brain-Computer Interfaces (BCIs) and Human-Assisted Machine Learning, focusing on novel approaches to human-centered AI.
Mohit holds a Bachelor's degree from the Indian Institute of Technology Kanpur, which he completed in 2014.
Authored Publications
General Geospatial Inference with a Population Dynamics Foundation Model
Chaitanya Kamath
Shravya Shetty
David Schottlander
Yael Mayer
Joydeep Paul
Jamie McPike
Sheila de Guia
Niv Efron
(2024) (to appear)
Understanding complex relationships between human behavior and local contexts is crucial for various applications in public health, social science, and environmental studies. Traditional approaches often use small sets of manually curated, domain-specific variables to represent human behavior and struggle to capture these intricate connections, particularly when dealing with diverse data types. To address this challenge, this work introduces a novel approach based on graph neural networks (GNNs). We first construct a large dataset encompassing human-centered variables aggregated at postal code and county levels across the United States. This dataset captures rich information on human behavior (internet search behavior and mobility patterns) along with environmental factors (local facility availability, temperature, and air quality). Next, we propose a GNN-based framework designed to encode the connections between these diverse features alongside the inherent spatial relationships between postal codes and their containing counties. We then demonstrate the effectiveness of our approach by benchmarking the model on 27 target variables spanning three distinct domains: health, socioeconomic factors, and environmental measurements. Through spatial interpolation, extrapolation, and super-resolution tasks, we show that the proposed method can effectively utilize the rich feature set to achieve accurate predictions across diverse geospatial domains.