Grey Nearing

I am a Research Scientist on the Google Flood Forecasting team, and work on water-related topics.

Authored Publications
    An artificial neural network to estimate the foliar and ground cover input variables of the Rangeland Hydrology and Erosion Model
    Mahmoud Saeedimoghaddam
    David Goodrich
    Mariano Hernandez
    David Phillip Guertin
    Loretta J. Metz
    Guillermo Ponce-Campos
    Haiyan Wei
    Shea Burns
    Sarah E. McCord
    Mark A. Nearing
    C. Jason Williams
    Carrie-Ann Houdeshell
    Mashrekur Rahman
    Menberu B. Meles
    Steve Barker
    Journal of Hydrology (2024)
    Preview abstract Models like the Rangeland Hydrology and Erosion Model (RHEM) are useful for estimating soil erosion; however, they rely on input parameters that are sometimes difficult or expensive to measure. Specifically, RHEM requires information about foliar and ground cover fractions that generally must be measured in situ, which makes it difficult to use models like RHEM to produce erosion or soil risk maps for areas exceeding the size of a hillslope, such as a large watershed. We previously developed a deep learning emulator of RHEM that has low computational expense and can, in principle, be run over large areas (e.g., over the continental US). In this paper, we develop a deep learning model to estimate the RHEM ground cover inputs from remote sensing time series, reducing the need for extensive field surveys to produce erosion maps. We achieve a prediction accuracy on hillslope runoff of r² = 0.9, and on soil loss and sediment yield of r² = 0.4, at 66,643 field locations within the US. We demonstrate how this approach can be used for mapping by developing runoff, soil loss, and sediment yield maps over a 1356 km² region of interest in Nebraska. View details
    On the predictability of turbulent fluxes from land: PLUMBER2 MIP experimental description and preliminary results
    Gab Abramowitz
    Anna Ukkola
    Sanaa Hobeichi
    Jon Cranko Page
    Mathew Lipson
    Martin De Kauwe
    Sam Green
    Claire Brenner
    Jonathan Frame
    Martyn Clark
    Martin Best
    Peter Anthoni
    Gabriele Arduini
    Souhail Boussetta
    Silvia Caldararu
    Kyeungwoo Cho
    Matthias Cuntz
    David Fairbairn
    Craig Ferguson
    Hyungjun Kim
    Yeonjoo Kim
    Jürgen Knauer
    David Lawrence
    Xiangzhong Luo
    Sergey Malyshev
    Tomoko Nitta
    Jerome Ogee
    Keith Oleson
    Catherine Ottlé
    Philippe Peylin
    Patricia de Rosnay
    Heather Rumbold
    Bob Su
    Nicolas Vuichard
    Anthony Walker
    Xiaoni Wang-Faivre
    Yunfei Wang
    Yijian Zeng
    Hydrology and Earth System Sciences Discussions (2024)
    Preview abstract Accurate representation of the turbulent exchange of carbon, water, and heat between the land surface and the atmosphere is critical for modelling global energy, water, and carbon cycles, both in future climate projections and weather forecasts. We describe a Model Intercomparison Project (MIP) that compares the surface turbulent heat flux predictions of around 20 different land models provided with in-situ meteorological forcing, evaluated with measured surface fluxes using quality-controlled data from 170 eddy-covariance based flux tower sites. Several out-of-sample empirical model predictions of site fluxes are used as benchmarks to quantify the degree to which land model performance could improve across a broad range of metrics. The performance discrepancy between empirical and physically-based model predictions also provides a potential pathway to understand sources of model error. Sites with unusual behaviour, complicated processes, poor data quality or uncommon flux magnitude will be more difficult to predict for both mechanistic and empirical models. Results suggest that latent heat flux and net ecosystem exchange of CO2 are better predicted by land models than sensible heat flux, which at least conceptually would appear to have fewer physical processes controlling it. Land models that are implemented in Earth System Models also appear to perform notably better than stand alone ecosystem (including demographic) models, at least in terms of the fluxes examined here. Flux tower data quality is also explored as an uncertainty source, with the difference between energy-balance corrected versus raw fluxes examined, as well as filtering for low wind speed periods. Land model performance does not appear to improve with energy-balance corrected data, and indeed some results raised questions about whether the correction process itself was appropriate. In both cases results were broadly consistent, with simple out-of-sample empirical models, including linear regression, comfortably outperforming mechanistic land models. The PLUMBER2 approach, and its openly-available data, enable precise isolation of the locations and conditions in which model developers can know that a given land model can improve, allowing information pathways and discrete parametrisations in models to be identified and targeted for model development. View details
    AI Increases Global Access to Reliable Flood Forecasts
    Asher Metzger
    Dana Weitzner
    Frederik Kratzert
    Guy Shalev
    Martin Gauch
    Sella Nevo
    Shlomo Shenzis
    Tadele Yednkachw Tekalign
    Vusumuzi Dube
    arXiv (2023)
    Preview abstract Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Here we show that AI-based forecasting achieves reliability in predicting extreme riverine events in ungauged watersheds at up to a 5-day lead time that is similar to or better than the reliability of nowcasts (0-day lead time) from a current state of the art global modeling system (the Copernicus Emergency Management Service Global Flood Awareness System). Additionally, we achieve accuracies over 5-year return period events that are similar to or better than current accuracies over 1-year return period events. This means that AI can provide flood warnings earlier and over larger and more impactful events in ungauged basins. The model developed in this paper was incorporated into an operational early warning system that produces publicly available (free and open) forecasts in real time in over 80 countries. This work highlights a need for increasing the availability of hydrological data to continue to improve global access to reliable flood warnings. View details
    In Defense of Metrics: Metrics Sufficiently Encode Typical Human Preferences Regarding Hydrological Model Performance
    Martin Gauch
    Frederik Kratzert
    Hoshin Gupta
    Juliane Mai
    Bryan A. Tolson
    Sepp Hochreiter
    Daniel Klotz
    Water Resources Research, 59, e2022WR033918 (2023)
    Preview abstract Building accurate rainfall–runoff models is an integral part of hydrological science and practice. The variety of modeling goals and applications has led to a large suite of evaluation metrics for these models. Yet, hydrologists still put considerable trust into visual judgment, although it is unclear whether such judgment agrees or disagrees with existing quantitative metrics. In this study, we tasked 622 experts to compare and judge more than 14,000 pairs of hydrographs from 13 different models. Our results show that expert opinion broadly agrees with quantitative metrics and results in a clear preference for a Machine Learning model over traditional hydrological models. The expert opinions are, however, subject to significant amounts of inconsistency. Nevertheless, where experts agree, we can predict their opinion purely from quantitative metrics, which indicates that the metrics sufficiently encode human preferences in a small set of numbers. While there remains room for improvement of quantitative metrics, we suggest that the hydrologic community should reinforce their benchmarking efforts and put more trust in these metrics. View details
    Preview abstract Google has developed an AI-based river and inundation forecasting system, and is partnering with governments and water agencies around the world to provide real-time flood alerts directly to individuals, communities, and NGOs through existing Google information channels like Maps, Search, and Android Alerts. This talk will cover the background, development, and impact of this effort. View details
    Caravan - A global community dataset for large-sample hydrology
    Frederik Kratzert
    Nans Addor
    Tyler Erickson
    Martin Gauch
    Lukas Gudmundsson
    Daniel Klotz
    Sella Nevo
    Guy Shalev
    Scientific Data, 10 (2023), pp. 61
    Preview abstract High-quality datasets are essential to support hydrological science and modeling. Several CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) datasets exist for specific countries or regions; however, these datasets lack standardization, which makes global studies difficult. This paper introduces a dataset called Caravan (a series of CAMELS) that standardizes and aggregates seven existing large-sample hydrology datasets. Caravan includes meteorological forcing data, streamflow data, and static catchment attributes (e.g., geophysical, sociological, climatological) for 6830 catchments. Most importantly, Caravan is both a dataset and open-source software that allows members of the hydrology community to extend the dataset to new locations by extracting forcing data and catchment attributes in the cloud. Our vision is for Caravan to democratize the creation and use of globally-standardized large-sample hydrology datasets. Caravan is a truly global open-source community resource. View details
    Preview abstract We develop a deep-learning-based convolutional-regression model that estimates the volumetric soil moisture content in the top ~5 cm. Input predictors include Sentinel-1 (active radar), Sentinel-2 (optical imagery), and SMAP (passive radar) as well as geophysical variables from SoilGrids and modelled soil moisture fields from GLDAS. The model was trained and evaluated on data from ~1300 in-situ sensors globally over the period 2015–2021 and obtained an average per-sensor correlation of 0.72 and ubRMSE of 0.0546. These results are benchmarked against 13 other soil moisture estimates at different locations, and an ablation study was used to identify important predictors. View details
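The two scores quoted in the abstract above (per-sensor correlation and ubRMSE) can be illustrated with a minimal sketch. It assumes the standard definitions: Pearson correlation, and unbiased RMSE computed as the RMSE after the mean of each series is removed. The function names and the `series_by_sensor` dictionary layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def ubrmse(pred, obs):
    """Unbiased RMSE: RMSE between the two series after removing each mean."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    pred_anom = pred - pred.mean()
    obs_anom = obs - obs.mean()
    return float(np.sqrt(np.mean((pred_anom - obs_anom) ** 2)))

def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    return float(np.corrcoef(pred, obs)[0, 1])

def per_sensor_average(series_by_sensor):
    """Average both scores over sensors.

    series_by_sensor maps a sensor id to a (pred, obs) pair of equal-length
    sequences. This layout is an assumption made for illustration.
    """
    rs = [pearson_r(p, o) for p, o in series_by_sensor.values()]
    us = [ubrmse(p, o) for p, o in series_by_sensor.values()]
    return float(np.mean(rs)), float(np.mean(us))
```

Because the mean is removed before the error is computed, a constant offset between predictions and observations does not affect ubRMSE; only errors in the variability are penalised.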
    On strictly enforced mass conservation constraints for modelling the Rainfall-Runoff process
    Jonathan Frame
    Frederik Kratzert
    Hoshin Gupta
    Paul Ullrich
    Hydrological Processes (2023)
    Preview abstract It has been proposed that conservation laws might not be beneficial for accurate hydrological modelling due to errors in input (precipitation) and target (streamflow) data (particularly at the event time scale), and this might explain why deep learning models (which are not based on enforcing closure) can out-perform catchment-scale conceptual and process-based models at predicting streamflow. We test this hypothesis with two forcing datasets that disagree in total, long-term precipitation. We analyse the role of strictly enforced mass conservation for matching a long-term mass balance between precipitation input and streamflow output using physics-informed (mass conserving) machine learning and find that: (1) enforcing closure in the rainfall-runoff mass balance does appear to harm the overall skill of hydrological models; (2) deep learning models learn to account for spatiotemporally variable biases in data; (3) however, this ‘closure’ effect accounts for only a small fraction of the difference in predictive skill between deep learning and conceptual models. View details
    Global Flood Forecasting at a Fine Catchment Resolution using Machine Learning
    Asher Metzger
    Dana Weitzner
    Frederik Kratzert
    Guy Shalev
    Sella Nevo
    Shlomo Shenzis
    Tadele Yednkachw Tekalign
    (2022)
    Preview abstract Machine learning has been shown to be a promising tool for hydrological modeling. We have used this technology to develop an operational real-time global streamflow prediction model. The model architecture is based primarily on an LSTM (Long Short-Term Memory), which is a form of RNN (Recurrent Neural Network) that includes a state vector similar to dynamical systems models. Our model has been shown to outperform physical and conceptual hydrologic models across time and spatial scales. The main advantage of this ML approach is that models can be trained (calibrated) over many diverse catchments simultaneously rather than being calibrated separately per catchment. This advantage is especially important when modeling on a global scale where the model is trained on a very large number of catchments that have diverse climatology and geographical settings. Consequently, the model learns different rainfall-runoff dynamics of rivers across these settings and is able to predict accordingly. Once the model is trained (a very short process in comparison to calibrating traditional global models), it can be applied almost anywhere where basin attributes are available, in particular, at ungauged locations. We use globally available, near-real-time datasets for training and inference, which allows running the model operationally. The global datasets used are: the HydroSHEDS database for global catchment delineation and static attributes; meteorological forcing data from ECMWF, including the ERA5-Land reanalysis and the IFS HRES real-time forecasts and re-forecasts; NASA’s IMERG (early) global precipitation estimates; the CPC Global Unified Gauge-Based Analysis of Daily Precipitation; and global streamflow datasets such as GRDC and Caravan for streamflow discharge labels. View details
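The "state vector" mentioned in the abstract above can be made concrete with a minimal sketch of one step of a textbook LSTM cell in NumPy. This is a generic cell and is not the operational model. The weight layout (gates stacked as input, forget, candidate, output) and all names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One time step of a standard LSTM cell.

    x: input vector at this step, shape (D,)
    h: hidden state, shape (H,)
    c: cell state (the state vector carried through time), shape (H,)
    W: input weights, shape (4H, D); U: recurrent weights, shape (4H, H)
    b: bias, shape (4H,)
    Gate order in the stacked weights is an assumption: input, forget,
    candidate, output.
    """
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    g = np.tanh(z[2 * H:3 * H])    # candidate state update
    o = sigmoid(z[3 * H:4 * H])    # output gate
    c_new = f * c + i * g          # state update: keep part of the old state, add new input
    h_new = o * np.tanh(c_new)     # hidden state exposed to the next layer
    return h_new, c_new

def run_sequence(xs, W, U, b, hidden_size):
    """Run the cell over a sequence of inputs, starting from a zero state."""
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
    return h, c
```

The update `c_new = f * c + i * g` is where the analogy to a dynamical systems model comes from: the cell state is carried from one time step to the next and modified by the current input, much like a storage term in a conceptual hydrological model.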