General Science
We aim to transform scientific research itself. Many scientific endeavors can benefit from large-scale experimentation, data gathering, and machine learning (including deep learning). We aim to accelerate scientific research by applying Google’s computational power and techniques in areas such as drug discovery, biological pathway modeling, microscopy, medical diagnostics, materials science, and agriculture. We collaborate closely with world-class research partners to help solve important problems with large scientific or humanitarian benefit.
Recent Publications
A scalable system to measure contrail formation on a per-flight basis
Erica Brand
Sebastian Eastham
Carl Elkin
Thomas Dean
Zebediah Engberg
Ulrike Hager
Joe Ng
Dinesh Sanekommu
Tharun Sankar
Marc Shapiro
Environmental Research Communications (2024)
In this work we describe a scalable, automated system to determine from satellite data whether a given flight has made a persistent contrail.
The system works by comparing flight segments to contrails detected by a computer vision algorithm running on images from the GOES-16 Advanced Baseline Imager. We develop a 'flight matching' algorithm and use it to label each flight segment as a 'match' or 'non-match'. We perform this analysis on 1.6 million flight segments and compare these labels to existing contrail prediction methods based on weather forecast data. The result is an analysis of which flights make persistent contrails that is several orders of magnitude larger than any previous work. We find that, in many cases, current contrail prediction models fail to correctly predict whether a flight segment will be matched to a contrail.
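As a rough illustration of labeling a flight segment as a 'match' or 'non-match', the sketch below compares a segment position against detected contrail positions. The abstract does not specify the matching criteria, so everything here is an assumption made for illustration: the `Observation` type, the `label_segment` function, the distance threshold `max_km`, and the time window `max_age_s` are hypothetical, and the sketch ignores effects such as wind advection that a real system would likely need to handle.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Observation:
    """A point in space and time (hypothetical type for this sketch)."""
    lat_deg: float
    lon_deg: float
    time_s: float


def haversine_km(a: Observation, b: Observation) -> float:
    """Great-circle distance between two observations, in kilometers."""
    phi1, phi2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon_deg - a.lon_deg)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def label_segment(segment, contrails, max_km=10.0, max_age_s=7200.0):
    """Return 'match' if any detected contrail lies close to the segment
    and was observed no earlier than the flight and within max_age_s of it.

    Both thresholds are assumed values, not taken from the paper.
    """
    for contrail in contrails:
        age_s = contrail.time_s - segment.time_s
        if 0.0 <= age_s <= max_age_s and haversine_km(segment, contrail) <= max_km:
            return "match"
    return "non-match"
```

A segment would be passed in with the contrail detections from the satellite images covering the following hours; comparing the resulting labels with a forecast-based prediction for the same segment is then a simple tabulation.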
Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Here we show that AI-based forecasting achieves reliability in predicting extreme riverine events in ungauged watersheds at up to a 5-day lead time that is similar to or better than the reliability of nowcasts (0-day lead time) from a current state-of-the-art global modeling system (the Copernicus Emergency Management Service Global Flood Awareness System). Additionally, we achieve accuracies over 5-year return period events that are similar to or better than current accuracies over 1-year return period events. This means that AI can provide flood warnings earlier and over larger and more impactful events in ungauged basins. The model developed in this paper was incorporated into an operational early warning system that produces publicly available (free and open) forecasts in real time in over 80 countries. This work highlights a need for increasing the availability of hydrological data to continue to improve global access to reliable flood warnings.
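For readers unfamiliar with return periods, the arithmetic behind "5-year" versus "1-year" events can be sketched as follows. This uses the standard definition that a T-year event has an annual exceedance probability of 1/T, and it assumes that years are independent; the function names are illustrative and are not taken from the paper.

```python
def annual_exceedance_probability(return_period_years: float) -> float:
    """Probability that a T-year event is exceeded in any given year."""
    if return_period_years < 1.0:
        raise ValueError("return period must be at least 1 year")
    return 1.0 / return_period_years


def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """Probability of at least one exceedance over the horizon,
    assuming independent years (a simplifying assumption)."""
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years
```

Under these assumptions a 5-year event has a 20% chance of occurring in any one year, so it is much rarer, and typically more damaging, than an event with a 1-year return period.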
This is an invited OFC 2024 conference workshop talk on a new, lower-power design choice for datacenter optics: linear pluggable optics. In this talk I will discuss the fundamental performance constraints facing linear pluggable optics and their implications for DCN and ML use cases.
Global extreme heat forecasting using neural weather models
Amy McGovern
Jason Hickey
Artificial Intelligence for the Earth Systems, vol. 2 (2023), e220035
Heatwaves are projected to increase in frequency and severity with global warming. Improved warning systems would help reduce the associated loss of lives, wildfires, power disruptions, and reduction in crop yields. In this work, we explore the potential for deep learning systems trained on historical data to forecast extreme heat on short, medium and subseasonal time scales. To this purpose, we train a set of neural weather models (NWMs) with convolutional architectures to forecast surface temperature anomalies globally, 1 to 28 days ahead, at ~200-km resolution and on the cubed sphere. The NWMs are trained using the ERA5 reanalysis product and a set of candidate loss functions, including the mean-square error and exponential losses targeting extremes. We find that training models to minimize custom losses tailored to emphasize extremes leads to significant skill improvements in the heatwave prediction task, relative to NWMs trained on the mean-square-error loss. This improvement is accomplished with almost no skill reduction in the general temperature prediction task, and it can be efficiently realized through transfer learning, by retraining NWMs with the custom losses for a few epochs. In addition, we find that the use of a symmetric exponential loss reduces the smoothing of NWM forecasts with lead time. Our best NWM is able to outperform persistence in a regressive sense for all lead times and temperature anomaly thresholds considered, and shows positive regressive skill relative to the ECMWF subseasonal-to-seasonal control forecast after 2 weeks.
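To make the idea of a loss that emphasizes extremes concrete, here is a minimal sketch comparing the mean-square error with a squared error weighted exponentially by the magnitude of the target anomaly. The abstract does not give the functional form of the losses used, so `symmetric_exp_weighted_loss` and its `scale` parameter are hypothetical and should not be read as the authors' formulation.

```python
import math


def mse_loss(pred, target):
    """Plain mean-square error over paired predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def symmetric_exp_weighted_loss(pred, target, scale=2.0):
    """Squared error weighted by exp(|target| / scale).

    Large positive and large negative anomalies get the same extra
    weight, which is what 'symmetric' means in this sketch. The form
    and the default scale are assumptions for illustration only.
    """
    total = 0.0
    for p, t in zip(pred, target):
        weight = math.exp(abs(t) / scale)
        total += weight * (p - t) ** 2
    return total / len(target)
```

With this weighting, a 1-degree error on a 4-degree anomaly costs far more than the same error on a near-zero anomaly, whereas the mean-square error treats both identically. Retraining an already trained model for a few epochs with such a loss corresponds to the transfer-learning approach the abstract mentions.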
Google has developed an AI-based river and inundation forecasting system, and is partnering with governments and water agencies around the world to provide real-time flood alerts directly to individuals, communities, and NGOs through existing Google information channels like Maps, Search, and Android Alerts. This talk will cover the background, development, and impact of this effort.
Chimane-Mosetén
Jeanette Sakel
Amazonian Languages: An International Handbook, De Gruyter Mouton (2023)
Chimane-Mosetén (also known as Mosetenan; ISO 639-3: cas; Glottocode: mose1249) is a dialect continuum spoken by 13,500–16,000 people in the Amazonian region of northern Bolivia. It has not been convincingly shown to be related to any other language. Its status as an isolate makes it unique in many respects, not least in its combination of features typical of both Amazonian and Andean languages. Like its closer geographical neighbors in Amazonian Bolivia, including Movima, Tacana, Reyesano, and Cavineña, it exhibits contrastive nasality in the vowel system and is head marking and predominantly agglutinative. Bound pronominal forms marking arguments in the clause have the same form as bound pronominals marking possessors. Subordinate clauses typically involve nominalized verbs. Unlike most of its Amazonian neighbors, on the other hand, it does not have a semantically-based classifier or gender system but instead features arbitrarily assigned masculine or feminine gender. It also does not feature any incorporation of nouns, adverbs, or adpositions. It has an extensive oblique case-marking system, though core case-marking does not occur. More similar to Quechua and other Andean languages, it features a complex predicate-argument agreement system in which one or more agreement suffixes cross-reference the subject and object arguments of a transitive verb. It also has a large class of lexical numbers following a decimal numeral system.