
Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.



Showing 1–15 of 167 publications
    Table-based reasoning with large language models (LLMs) is a promising direction to tackle many table understanding tasks, such as table-based question answering and fact verification. Compared with generic reasoning, table-based reasoning requires the extraction of underlying semantics from both free-form questions and semi-structured tabular data. Chain-of-Thought and similar approaches incorporate the reasoning chain in the form of textual context, but it is still an open question how to effectively leverage tabular data in the reasoning chain. We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts. Specifically, we guide LLMs using in-context learning to iteratively generate operations and update the table to represent a tabular reasoning chain. LLMs can therefore dynamically plan the next operation based on the results of the previous ones. This continuous evolution of the table forms a chain, showing the reasoning process for a given tabular problem. The chain carries structured information about the intermediate results, enabling more accurate and reliable predictions. Chain-of-Table achieves new state-of-the-art performance on the WikiTQ, FeTaQA, and TabFact benchmarks across multiple LLM choices.
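    To make the loop concrete, here is a minimal Python sketch of a Chain-of-Table style iteration as the abstract describes it: the LLM repeatedly picks a table operation, the table is updated, and the evolving table is fed back until the model answers. The operation set, prompt wording, and the call_llm / apply_operation helpers are illustrative assumptions, not the paper's actual prompts or code.

```python
# A minimal sketch of a Chain-of-Table style loop (hypothetical helpers,
# not the paper's implementation).
import json
import pandas as pd

def call_llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call."""
    raise NotImplementedError("plug in an LLM client here")

def apply_operation(table: pd.DataFrame, op: str, args: dict) -> pd.DataFrame:
    """Apply one atomic table operation, producing the next table in the chain."""
    if op == "select_columns":
        return table[args["columns"]]
    if op == "select_rows":
        return table.query(args["condition"])
    if op == "sort_by":
        return table.sort_values(args["column"], ascending=args.get("ascending", True))
    raise ValueError(f"unknown operation: {op}")

def chain_of_table(table: pd.DataFrame, question: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        prompt = (
            f"Question: {question}\n"
            f"Current table:\n{table.to_csv(index=False)}\n"
            f"Operations so far: {history}\n"
            'Reply with JSON like {"op": "select_rows", "args": {...}} '
            'or {"op": "answer", "args": {"text": "..."}}.'
        )
        step = json.loads(call_llm(prompt))        # LLM plans the next operation
        if step["op"] == "answer":
            return step["args"]["text"]
        table = apply_operation(table, step["op"], step["args"])  # evolve the table
        history.append(step["op"])
    return call_llm(f"Answer the question from this table:\n{table.to_csv(index=False)}")
```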
    Automatic Histograms: Leveraging Language Models for Text Dataset Exploration
    Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '24), ACM, Honolulu, HI, USA (2024), pp. 9
    Making sense of unstructured text datasets is perennially difficult, yet increasingly relevant with the rise of large language models. Data practitioners often rely on dataset summaries, especially distributions of various derived features. Some features, like toxicity or topics, are relevant to many datasets, but many interesting features are domain specific, e.g., instruments and genres for a music dataset, or diseases and symptoms for a medical dataset. Accordingly, data practitioners often run custom analyses for each dataset, which is cumbersome and difficult, or use unsupervised methods. We present AutoHistograms, a visualization tool that leverages LLMs. AutoHistograms automatically identifies relevant entity-based features, visualizes their distributions, and allows the user to interactively query the dataset for new categories of entities. In a user study with data practitioners (n=10), we observe that participants were able to quickly onboard to AutoHistograms, use the tool to identify actionable insights, and conceptualize a broad range of applicable use cases. We also describe a variety of usage scenarios from different types of users to highlight how this app can provide value in many different contexts. Finally, we present a quantitative evaluation of the tool. Together, this tool and user study contribute to the growing field of LLM-assisted sensemaking tools.
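    As a rough illustration of the workflow the abstract describes, the sketch below tags each document with LLM-extracted entities for a user-chosen category and histograms the results; the prompt and extract_entities helper are hypothetical, not the tool's implementation.

```python
# A rough sketch of the AutoHistograms idea: tag documents with entities for a
# chosen category, then histogram the tags across the dataset.
from collections import Counter

def extract_entities(text: str, category: str) -> list[str]:
    """Placeholder: prompt an LLM, e.g.
    f"List every {category} mentioned in the text below, one per line:\n{text}"
    and split its reply into entity strings."""
    raise NotImplementedError

def auto_histogram(documents: list[str], category: str) -> Counter:
    """Distribution of a user-chosen entity category (e.g. 'instrument',
    'disease') over an unstructured text dataset."""
    counts = Counter()
    for doc in documents:
        counts.update(set(extract_entities(doc, category)))  # count once per doc
    return counts

# Usage: auto_histogram(song_descriptions, "instrument").most_common(20)
```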
    Progressive Partitioning for Parallelized Query Execution in Google’s Napa
    Junichi Tatemura
    Yanlai Huang
    Jim Chen
    Yupu Zhang
    Kevin Lai
    Divyakant Agrawal
    Brad Adelberg
    Shilpa Kolhar
    49th International Conference on Very Large Data Bases, VLDB (2023), pp. 3475-3487
    Napa powers Google's critical data warehouse needs. It uses a Log-Structured Merge (LSM) tree for real-time data ingestion and achieves sub-second query latency for billions of queries per day. Napa handles a wide variety of query workloads, from full-table scans to range scans and multi-key lookups. Our design challenge is to handle this diverse query workload running concurrently. In particular, a large percentage of our query volume consists of external reporting queries characterized by multi-key lookups with strict sub-second latency targets. Query parallelization, achieved by processing a query in parallel over partitions of the input data (i.e., the SIMD model of computation), is an important technique for meeting these low latency targets. Traditionally, the effectiveness of parallelizing a query is highly dependent on its alignment with the data partitioning established at write time. Unfortunately, such a write-time partitioning scheme cannot handle the highly variable parallelization requirements that arise on a per-query basis. The key to Napa's success is its ability to adapt its query parallelization on a per-query basis. This paper describes an index-based approach to partitioning data for queries with sub-second latency requirements. Napa's approach is progressive in that it provides good partitioning within the time budgeted for partitioning. Since the end-to-end query time also includes the time spent partitioning, there is a tradeoff between the time spent partitioning and the evenness of the resulting partitions. Our approach balances these opposing considerations to provide sub-second querying for billions of queries each day. We use production data to establish the effectiveness of Napa's approach across workloads ranging from easy to handle to the most pathological.
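    A toy sketch of budget-bounded, progressively refined partitioning in the spirit of the abstract (not Napa's actual index-based algorithm): a larger time budget yields more, and more even, partitions.

```python
# Toy progressive partitioning: keep splitting the largest partition at its
# median key until the time budget runs out; each resulting partition can then
# be scanned by one worker in parallel.
import time

def progressive_partition(sorted_keys: list[int], budget_s: float, max_parts: int = 64):
    parts = [sorted_keys]                          # start with a single partition
    deadline = time.monotonic() + budget_s
    while len(parts) < max_parts and time.monotonic() < deadline:
        largest = max(parts, key=len)
        if len(largest) < 2:
            break
        mid = len(largest) // 2                    # split at the median key
        parts.remove(largest)
        parts.extend([largest[:mid], largest[mid:]])
    return parts

# Usage: parts = progressive_partition(list(range(1_000_000)), budget_s=0.005)
```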
    Building on the simplicity and power of declarative queries combined with strongly consistent transactional semantics has allowed the Spanner database to scale to many thousands of machines running an aggregate of over 2 billion queries per second on over 8 exabytes of data. This includes some of the largest applications in the world, serving well over a billion users each. The appetite for database storage continues to grow, potentially reaching zettabyte scale (1 billion terabytes) by 2030. However, the end of Moore and Dennard scaling means that the cost of the infrastructure to run those databases could grow much faster than it has in the past. In this talk I will give my perspective on the challenges to reaching zettabyte scale, and the hardware technologies and approaches most (and least) likely to be successful.
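    A quick back-of-the-envelope on the growth these figures imply, assuming 1 zettabyte = 1000 exabytes and taking 8 EB as the 2023 starting point:

```python
# Growth implied by the abstract's figures (assumed: 1 ZB = 1000 EB, 2023 start).
current_eb, target_eb, years = 8, 1000, 2030 - 2023
annual_growth = (target_eb / current_eb) ** (1 / years)   # roughly 2x per year
print(f"required growth: ~{annual_growth:.1f}x per year over {years} years")
```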
    Are we cobblers without shoes? Making Computer Science data FAIR
    Carole Goble
    Communications of the ACM, vol. 66 (1) (2023)
    No abstract (Viewpoint article).
    Data Commons
    Prashanth Radhakrishnan
    Bo Xu
    Carolyn Au
    Wei Sun
    Jehangir Amjad
    Ajai Tirumali
    Jennifer Chen
    Julia Wu
    Natalie Diaz
    Samantha Piekos
    Prem Ramaswami
    James Manyika
    (2023)
    Publicly available data from open sources (e.g., Census [1], BLS [2], WHO [3], IPCC [4]) are vital resources for policymakers, students, and researchers across different disciplines. Combining data from different sources requires the user to reconcile differences in schemas, formats, assumptions, and more. This data wrangling is time-consuming and tedious, and it must be repeated by every user of the data. Our goal with Data Commons is to address this problem by doing the wrangling once and making the processed data widely available via standard schemas and Cloud APIs. Data Commons is a distributed network of sites that publish data in a common schema and interoperate using the Data Commons APIs. Data from different Data Commons can be 'joined' easily. The aggregate of these Data Commons can be viewed as a single Knowledge Graph. This paper describes the architecture of Data Commons, some of the major deployments, and highlights directions for future work.
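    The "join once the schemas agree" idea can be illustrated with a tiny sketch; the place identifiers and column names below are made up, not the actual Data Commons schema or API.

```python
# Two sources keyed on the same place identifier merge trivially once they are
# mapped into a shared schema; the wrangling happens once, upstream.
import pandas as pd

census_population = pd.DataFrame({
    "place_id": ["geo/06", "geo/48"],              # shared entity IDs (illustrative)
    "population": [39_500_000, 29_100_000],
})
bls_unemployment = pd.DataFrame({
    "place_id": ["geo/06", "geo/48"],
    "unemployment_rate": [4.7, 4.0],
})

combined = census_population.merge(bls_unemployment, on="place_id")
print(combined)
```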
    Firestore: The NoSQL Serverless Database for the Application Developer
    Ram Kesavan
    David Gay
    Daniel Thevessen
    Jimit Shah
    C. Mohan
    2023 IEEE 39th International Conference on Data Engineering (ICDE), pp. 3367-3379
    Recent years have seen explosive growth in web and mobile application development. Such applications typically have rapid development cycles and expect mobile-friendly features and serverless characteristics such as rapid deployment (with minimal provisioning), scalability to handle workload spikes, and convenient pay-as-you-go billing. Google's Firestore is a NoSQL serverless database with real-time notification capability, and together with the Firebase ecosystem it greatly simplifies common app development challenges while letting the application developer focus primarily on their business logic and user experience. This paper presents the Firestore architecture, how it satisfies the aforementioned requirements, and how its real-time notification system works in tandem with Firebase client libraries to allow mobile applications to provide a smooth user experience even across network connectivity issues.
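    A small sketch of the write-plus-real-time-listen pattern using the public google-cloud-firestore Python client; the collection, document, and field names are placeholder examples.

```python
# Write a document and register a real-time listener so the app is notified of
# server-side changes without polling (placeholder collection/field names).
from google.cloud import firestore

db = firestore.Client()                            # uses default credentials
doc_ref = db.collection("chat_rooms").document("room42")

def on_change(doc_snapshots, changes, read_time):
    # Invoked by the client library whenever the document changes.
    for snap in doc_snapshots:
        print("latest state:", snap.to_dict())

watch = doc_ref.on_snapshot(on_change)             # real-time notification channel
doc_ref.set({"topic": "launch plans", "members": 3})
# ... later: watch.unsubscribe()
```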
    In-path Oracles for Road Networks
    Debajyoti Ghosh
    Kiran Khatter
    Hanan Samet
    International Journal of Geo-Information, vol. 12(7) (2023), pp. 277
    Many spatial applications benefit from fast answers to a seemingly simple spatial query: is a point of interest (POI) 'in-path' to the shortest path between a source and a destination? In-path here refers to POIs that are either on the shortest path or reachable within a bounded yet small detour from it. Answering in-path queries quickly is contingent on being able to determine whether a POI is in-path without computing shortest paths at run time, which calls for a precomputation-based solution. The key technical contribution is an in-path oracle, built by precomputation, that records the pairs of sources and destinations that are in-path with respect to a given POI location. For a road network with $n$ nodes and $m$ POIs, an $O(m \times n)$-sized oracle is envisioned based on a reduction using the well-separated pair decomposition of the road network. Furthermore, the oracle can be indexed in a database using a B-tree, and hundreds of thousands of in-path queries per second can be answered. Experimental results on a real road-network POI dataset showcase the superiority of this technique compared to a suitable baseline: the proposed approach answers 1.5 million in-path queries per second, compared to a few hundred per second with existing approaches.
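    A brute-force toy version of the oracle idea (not the paper's WSPD-based construction) makes the precompute-then-lookup structure clear; it assumes a small, connected weighted graph.

```python
# Precompute, for each POI, the (source, destination) pairs for which the POI
# lies within a bounded detour of the shortest path; queries become set lookups.
import itertools
import networkx as nx

def build_oracle(G: nx.Graph, pois: list, detour: float) -> dict:
    # Assumes a connected graph with 'weight' edge attributes.
    dist = dict(nx.all_pairs_dijkstra_path_length(G, weight="weight"))
    oracle = {p: set() for p in pois}
    for s, t in itertools.permutations(G.nodes, 2):
        for p in pois:
            # POI is "in-path" if going s -> p -> t adds at most `detour`.
            if dist[s][p] + dist[p][t] <= dist[s][t] + detour:
                oracle[p].add((s, t))
    return oracle

def in_path(oracle: dict, poi, source, dest) -> bool:
    return (source, dest) in oracle[poi]           # no run-time shortest path needed
```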
    Detection and Prevention of Silent Data Corruption in an Exabyte-scale Database System
    The 18th IEEE Workshop on Silicon Errors in Logic – System Effects, IEEE (2022)
    Google's Spanner database serves multiple exabytes of data at well over a billion queries per second, distributed over a significant fraction of Google's fleet. Silent data corruption events due to hardware error are detected and prevented by Spanner several times per week. For every detected error there is some number of undetected errors that in rare (but not black swan) events cause corruption, either transiently for reads or durably for writes, potentially violating the most fundamental contract that a database system makes with its users: to store and retrieve data with absolute reliability and availability. We describe the work we have done to detect and prevent silent data corruptions and (equally importantly) to remove faulty machines from the fleet, both manually and automatically. We present a simplified analytic model of corruption that provides some insights into the most effective ways to prevent end-user corruption events. We have made qualitative gains in the detection and prevention of SDC events, but quantitative analysis remains difficult. We discuss various potential trajectories in hardware (un)reliability and how they will affect our ability to build reliable database systems on commodity hardware.
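    A toy corruption model, with made-up rates, of the kind of analysis the abstract alludes to: the silent-corruption exposure scales with the fraction of errors that detection misses.

```python
# All numbers below are assumed for illustration; they are not the paper's.
p_corrupt_per_query = 1e-9      # assumed hardware corruption rate per query
detection_coverage = 0.99       # assumed fraction caught by checks/verification
queries_per_day = 1e9 * 86400   # "over a billion queries per second"

expected_silent_per_day = queries_per_day * p_corrupt_per_query * (1 - detection_coverage)
print(f"expected undetected corruptions/day: {expected_silent_per_day:,.0f}")
```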
    The Open Reaction Database
    Abigail G. Doyle
    Connor W. Coley
    Joel M. Hawkins
    Klavs F. Jensen
    Michael R. Maser
    Michael Wleklinski
    Spencer D. Dreher
    (2021)
    Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine learning models. We present the Open Reaction Database (ORD), an open access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry. The data, schema, supporting code, and web-based user interfaces are all publicly available on GitHub. Our vision is that a consistent data representation and infrastructure to support data sharing will enable downstream applications that will greatly improve the state of the art with respect to computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.
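    For intuition, here is an illustrative structured record of the kind such a schema imposes on otherwise free-form reaction reports; the field names and values are made up and are not the actual ORD schema.

```python
# Illustrative (not actual ORD-schema) reaction record.
reaction = {
    "inputs": [
        {"smiles": "CC(=O)Cl", "role": "reactant", "amount_mmol": 1.0},
        {"smiles": "c1ccccc1N", "role": "reactant", "amount_mmol": 1.1},
    ],
    "conditions": {"temperature_c": 25, "solvent": "DCM"},
    "outcomes": [{"smiles": "CC(=O)Nc1ccccc1", "yield_percent": 87}],
    "provenance": {"source": "example notebook entry", "doi": None},
}
```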
    The Covid Tracking Project was the most reliable source for COVID-19 data with race/ethnicity at the state level until it stopped collecting data on March 7, 2021. The CDC's Case Surveillance Restricted Access and Public Use with Geography datasets are the only available replacements for the Covid Tracking Project's dataset, and they additionally include county-level data and age along with race/ethnicity. This paper evaluates the completeness of the CDC datasets at the state and county levels in terms of (1) the total number of cases included compared to the New York Times, and (2) the number of cases included with race/ethnicity data compared to the Covid Tracking Project. The CDC's Restricted Access dataset contains 78% of the cases in the New York Times up to April 15, 2021, and 65% of cases have race/ethnicity information vs. 67% in the Covid Tracking Project. The dataset's completeness has steadily and gradually improved over time; e.g., the first available version from May 2020 had race/ethnicity information for only 43% of cases. At the state and county levels, the dataset's completeness has also improved with a state-level average of 62% of cases with race/ethnicity in April 2021 vs. 46% in June 2020. However, the dataset's completeness at the state level is highly variable; for example, Minnesota has 102% of the cases included in the New York Times, while Louisiana has only 4% of the cases in the New York Times. Minnesota has 91% of cases with race/ethnicity, while Louisiana has only 19% with race/ethnicity (vs. 94% in the Covid Tracking Project). Texas alone is missing 2.8M cases, accounting for more than a third of the total 7.1M missing cases. New York is missing race/ethnicity for 1.3M cases and California for 1.1M cases, accounting for more than a quarter of the 8.6M cases missing race/ethnicity when combined. The CDC's Public Use with Geography dataset is similar to the Restricted Access dataset for total case counts, but is less complete due to more privacy suppression; e.g., only 49% of cases have race/ethnicity information.
    Validating Data and Models in Continuous ML pipelines
    Evan Rosen
    Gene Huang
    Mike Dreves
    Neoklis Polyzotis
    Zhuo Peng
    IEEE TCDE (2021)
    Production ML is more than writing the code for the trainer. It requires processes and tooling that enable a larger team to share, track, analyze, and monitor not only the code for ML but also the artifacts (Datasets, Models, ...) that are manipulated and generated in these production ML pipelines. In this paper we describe the tools we developed at Google for the analysis and validation of two of the most important types of artifacts: Datasets and Models. These tools are currently deployed in production at Google and other large organizations. Our approach is heavily inspired by well-known principles of data-management systems. Ultimately, we want to enable users to trust their data and models, and understand how data properties affect the quality of the generated ML models.
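    As a sketch of this style of schema-driven artifact validation, the open-source TensorFlow Data Validation library can be used as follows; the file paths are placeholders, and the snippet is an illustration rather than the exact pipeline described in the paper.

```python
# Schema-driven dataset validation: summarize a trusted dataset, infer a
# schema, then validate new batches against it before (re)training.
import tensorflow_data_validation as tfdv

train_stats = tfdv.generate_statistics_from_csv(data_location="train.csv")
schema = tfdv.infer_schema(statistics=train_stats)   # expected types, domains, presence

new_stats = tfdv.generate_statistics_from_csv(data_location="new_batch.csv")
anomalies = tfdv.validate_statistics(statistics=new_stats, schema=schema)
tfdv.display_anomalies(anomalies)   # e.g. missing features, out-of-range values
```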
    Automating information extraction from form-like documents at scale is a pressing need due to its potential impact on automating business workflows across many industries like financial services, insurance, and healthcare. The key challenge is that form-like documents in these business workflows can be laid out in virtually infinitely many ways; hence, a good solution to this problem should generalize to documents with unseen layouts and languages. A solution requires a holistic understanding of both the textual segments and the visual cues within a document, which is non-trivial. While the natural language processing and computer vision communities are starting to tackle this problem, there has not been much focus on (1) data efficiency, and (2) the ability to generalize across different document types and languages. In this paper, we show that when we have only a small number of labeled documents for training (~50), a straightforward transfer learning approach from a larger labeled corpus with considerably different structure yields up to a 27 F1 point improvement over simply training on the small corpus in the target domain. We improve on this with a simple multi-domain transfer learning approach, currently in production use, and show that it yields up to a further 8 F1 point improvement. We argue that data efficiency is critical to enable information extraction systems to scale to handle hundreds of different document types, and that learning good representations is critical to accomplishing this.
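    A generic sketch of the recipe being evaluated (pretrain on a large labeled corpus, then fine-tune on roughly 50 target-domain documents); the model and data handling are placeholders, not the paper's extraction architecture.

```python
# Fine-tune a source-domain model on a small target-domain corpus.
import torch
from torch import nn, optim

def finetune(model: nn.Module, target_docs, epochs: int = 10, lr: float = 1e-5):
    """Continue training a pretrained model on ~50 labeled target documents."""
    opt = optim.AdamW(model.parameters(), lr=lr)   # small LR to avoid forgetting
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for features, labels in target_docs:       # (tensor, tensor) pairs
            opt.zero_grad()
            loss = loss_fn(model(features), labels)
            loss.backward()
            opt.step()
    return model

# source_model = train_on_large_corpus(...)   # structurally different domain
# target_model = finetune(source_model, small_target_corpus)
```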
    Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google
    Kevin Lai
    Min Chen
    Jim Chen
    Ming Dai
    Thanh Do
    Haoyu Gao
    Haoyan Geng
    Raman Grover
    Bo Huang
    Yanlai Huang
    Adam Li
    Jianyi Liang
    Tao Lin
    Li Liu
    Yao Liu
    Xi Mao
    Maya Meng
    Prashant Mishra
    Jay Patel
    Vijayshankar Raman
    Sourashis Roy
    Mayank Singh Shishodia
    Tianhang Sun
    Justin Tang
    Junichi Tatemura
    Sagar Trehan
    Ramkumar Vadali
    Prasanna Venkatasubramanian
    Joey Zhang
    Kefei Zhang
    Yupu Zhang
    Zeleng Zhuang
    Divyakant Agrawal
    Jeff Naughton
    Sujata Sunil Kosalge
    Hakan Hacıgümüş
    Proceedings of the VLDB Endowment (PVLDB), vol. 14 (12) (2021), pp. 2986-2998
    There are numerous Google services that continuously generate vast amounts of log data, which are used to provide valuable insights to internal and external business users. We need to store and serve these planet-scale data sets under extremely demanding requirements: scalability, sub-second query response times, availability even in the case of entire data center failures, strong consistency guarantees, and the ability to ingest a massive stream of updates coming from applications used around the globe. We have developed and deployed in production an analytical data management system, called Napa, to meet these requirements. Napa is the backend for multiple internal and external clients at Google, so there is a strong expectation of variance-free, robust query performance. At its core, Napa's principal technology for robust query performance is the aggressive use of materialized views, which are maintained consistently as new data is ingested across multiple data centers. Our clients also demand flexibility in being able to adjust their query performance, data freshness, and costs to suit their unique needs. Robust query processing and flexible configuration of client databases are the hallmarks of Napa's design. Most of the related work in this area takes advantage of full flexibility to design the whole system without the need to support a diverse set of preexisting use cases, whereas Napa must deal with the hard constraints of applications that differ on which characteristics of the system are most important to optimize. Those constraints led us to make particular design decisions and also to devise new techniques to meet the challenges. In this paper, we share our experiences in designing, implementing, deploying, and running Napa in production with some of Google's most demanding applications.
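    A toy sketch of the materialized-view idea at Napa's core: aggregates are updated as data is ingested so queries read the precomputed view instead of scanning logs; Napa's actual LSM and multi-datacenter machinery is, of course, far more involved.

```python
# Incrementally maintained aggregate view over ingested log batches.
from collections import defaultdict

class CountView:
    """Materialized view: event count per (country, day)."""
    def __init__(self):
        self.counts = defaultdict(int)

    def apply_batch(self, rows):
        # Called at ingestion time, so the view stays consistent with the base data.
        for row in rows:
            self.counts[(row["country"], row["day"])] += 1

    def query(self, country, day):
        return self.counts[(country, day)]         # no log scan at query time

view = CountView()
view.apply_batch([{"country": "US", "day": "2021-07-01"},
                  {"country": "JP", "day": "2021-07-01"}])
print(view.query("US", "2021-07-01"))
```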
    Dataset or Not? A study on the veracity of semantic markup for dataset pages
    Tarfah Alrashed
    Omar Benjelloun
    20th International Semantic Web Conference (ISWC 2021) (to appear)
    Semantic markup, such as Schema.org, allows providers on the Web to describe content using a shared controlled vocabulary. This markup is invaluable in enabling a broad range of applications, from vertical search engines, to rich snippets in search results, to actions on emails, to many others. In this paper, we focus on semantic markup for datasets, specifically in the context of developing a vertical search engine for datasets on the Web, Google's Dataset Search. Dataset Search relies on Schema.org to identify pages that describe datasets. While Schema.org was the core enabling technology for this vertical search, we also discovered that we need to address the following problem: pages from 61% of internet hosts that provide Schema.org/Dataset markup do not actually describe datasets. We analyze the veracity of dataset markup for Dataset Search's Web-scale corpus and categorize pages where this markup is not reliable. We then propose a way to drastically increase the quality of the dataset metadata corpus by developing a deep neural-network classifier that identifies whether or not a page with Schema.org/Dataset markup is a dataset page. Our classifier achieves 96.7% recall at the 95% precision point. This level of precision enables Dataset Search to circumvent the noise in semantic markup and to use the metadata to provide high quality results to users.
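    As a stand-in for the classifier described, a minimal text classifier over page content illustrates the task; the features and model here are simplifications, not the paper's deep neural network.

```python
# Classify whether a page carrying Schema.org/Dataset markup really describes
# a dataset (toy features and model; training data shown is illustrative).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pages = ["Download the 2019 air quality measurements as CSV ...",
         "Blog post about our new product launch ..."]
labels = [1, 0]                      # 1 = real dataset page, 0 = not

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(pages, labels)               # in practice: a large labeled Web corpus
print(clf.predict(["Census microdata files available for download"]))
```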