General Geospatial Inference with a Population Dynamics Foundation Model

Mohit Agarwal

Mimi Sun

Chaitanya Kamath

Arbaaz Muslim

Prithul Sarker

Joydeep Paul

Hector Yee

Marcin Sieniek

Kim Jablonski

Yael Mayer

David Fork

Sheila de Guia

Jamie McPike

Adam Boulanger

Tomer Shekel

David Schottlander

Yao Xiao

Manjit Chakravarthy Manukonda

Yun Liu

Neslihan Bulut

Sami Abu-El-Haija

Arno Eigenwillig

Bryan Perozzi

Monica Bharel

Von Nguyen

Luke Barrington

Niv Efron

Yossi Matias

Greg Corrado

Krish Eswaran

Shruthi Prabhakara

Shravya Shetty

Gautam Prasad

(2024) (to appear)


Abstract

Supporting the health and well-being of dynamic populations around the world requires governmental agencies, organizations, and researchers to understand and reason over complex relationships between human behavior and local contexts. This support includes identifying populations at elevated risk and gauging where to target limited aid resources. Traditional approaches to these classes of problems often entail developing manually curated, task-specific features and models to represent human behavior and the natural and built environment, which can be challenging to adapt to new, or even related, tasks. To address this, we introduce the Population Dynamics Foundation Model (PDFM), which aims to capture the relationships between diverse data modalities and is applicable to a broad range of geospatial tasks. We first construct a geo-indexed dataset for postal codes and counties across the United States, capturing rich aggregated information on human behavior from maps, busyness, and aggregated search trends, and environmental factors such as weather and air quality. We then model this data and the complex relationships between locations using a graph neural network, producing embeddings that can be adapted to a wide range of downstream tasks using relatively simple models. We evaluate the effectiveness of our approach by benchmarking it on 27 downstream tasks spanning three distinct domains: health indicators, socioeconomic factors, and environmental measurements. The approach achieves state-of-the-art performance on geospatial interpolation across all 27 tasks, surpassing existing location encoders based on satellite and geotagged imagery. In addition, it achieves state-of-the-art performance in extrapolation and super-resolution for 25 of the 27 tasks. We also show that the PDFM can be combined with a state-of-the-art forecasting foundation model, TimesFM, to predict unemployment and poverty, achieving performance that surpasses fully supervised forecasting.
The full set of embeddings and sample code are publicly available for researchers. In conclusion, we have demonstrated a general-purpose approach to geospatial modeling tasks critical to understanding population dynamics by leveraging a rich set of complementary, globally available datasets that can be readily adapted to previously unseen machine learning tasks.
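
To make the idea of adapting embeddings with "relatively simple models" concrete, the following is a minimal sketch of an interpolation-style setup: a ridge regression is fit on per-location embedding vectors and scored on randomly held-out locations. Everything here is an assumption for illustration only. The embeddings and the target are synthetic random data, the embedding dimension of 32 is arbitrary, and `fit_ridge` and `r2_score` are hypothetical helpers, not part of the released sample code. It does not reproduce the benchmark in the paper.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                      # center features
    yc = y - y_mean                      # center target
    d = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ yc)
    b = y_mean - x_mean @ w              # recover the intercept
    return w, b


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins (assumption): one embedding row per location and a
# target that is a noisy linear function of the embedding.
rng = np.random.default_rng(0)
n_locations, dim = 500, 32
embeddings = rng.normal(size=(n_locations, dim))
true_w = rng.normal(size=dim)
target = embeddings @ true_w + 0.1 * rng.normal(size=n_locations)

# Interpolation-style split: hold out a random 20% of locations.
perm = rng.permutation(n_locations)
test_idx, train_idx = perm[:100], perm[100:]

w, b = fit_ridge(embeddings[train_idx], target[train_idx], alpha=1.0)
pred = embeddings[test_idx] @ w + b
r2 = r2_score(target[test_idx], pred)
print(f"held-out R^2: {r2:.3f}")
```

With real embeddings, the synthetic arrays would be replaced by the published per-location vectors and a measured target variable. An extrapolation-style evaluation would instead hold out whole regions rather than random locations.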
