Protecting cities with AI-driven flash flood forecasting

March 12, 2026

Oleg Zlydenko, Software Engineer, and Deborah Cohen, Research Scientist, Google Research

We’re expanding our global flood forecasting coverage with the roll-out of flash flood predictions in urban areas. Using a novel AI training method based on news data, we provide up to 24 hours of advance notice for these rapid-onset events. This expansion is a critical step toward enhancing global climate resilience and keeping communities safe.

According to the World Meteorological Organization (WMO), flash floods account for approximately 85% of flood-related fatalities worldwide. They typically occur within six hours of heavy rain, turn city streets into gushing rivers, and take more than 5,000 lives annually, making them one of the world’s deadliest disasters. Early warning systems (EWS) are essential for keeping communities safe and informed. They have been proven to save lives and mitigate damage: even a 12-hour lead time can provide a 60% reduction in flash flood damage. However, a stark “warning gap” exists between countries. While the most developed nations benefit from robust forecasting, life-saving infrastructure is largely absent across vast regions of the Global South, where fewer than half of developing countries have access to multi-hazard EWS. This leaves billions of people without the advance notice that makes a critical difference.

To address this, today we’re announcing the roll-out of Urban Flash Flood forecasts on Flood Hub. By leveraging a new AI-powered methodology, we can now predict the risk of flash floods in urban areas up to 24 hours in advance. These predictions build on years of research and mark a significant breakthrough in our flood forecasting capabilities and an expansion of our flood coverage.

To date, our Flood Forecasting Initiative has focused on riverine floods, where rivers overflow their banks over a relatively slow period. While our forecasts cover over 2 billion people in 150 countries for the most significant riverine flood events, urban flash floods present a unique challenge. Unlike riverine floods, flash floods are characterized by their rapid onset, requiring a fundamentally different forecasting approach.

The challenge: The "invisible" flood

One challenge in forecasting flash floods is a lack of "ground truth" data. Riverine machine learning models are trained on physical stream gauges that measure water levels or streamflow. By training models on historical river gauge measurements, we can accurately predict localized water rises and anticipate when a river is likely to exceed its flood banks. We have also successfully extended these predictions to ungauged locations to provide more global coverage of riverine floods.

Flash floods, however, can happen anywhere and often far from any stream gauge. In urban environments, the complex interaction between intense rainfall, impermeable surfaces, and drainage systems makes traditional physical modeling computationally prohibitive at a global scale. Furthermore, without a historical record of exactly where and when flash floods have occurred in the past, traditional supervised ML models cannot learn the patterns necessary to predict them.

To address the lack of historical data, we used Groundsource, a new AI-powered methodology to extract ground truth from unstructured data with high precision. This enabled us to create the Groundsource dataset of past flash flood events. We used Gemini to analyze publicly available news reports that mention floods to confirm flood event details (e.g., clear locations and times). These entries were then aggregated to create a dataset of historical flooding events, which we used to train and evaluate our new flash flood model in urban areas.
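The aggregation step can be pictured as grouping per-article extractions by place and date. The sketch below is illustrative only: the `Report` fields, the `min_reports` threshold, the exact-match grouping, and the sample records are all assumptions made for this example, not the Groundsource pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    """One flood mention extracted from a news article (hypothetical schema)."""
    place: str  # normalized location name
    day: date   # date the flood was reported to occur
    url: str    # source article


def aggregate_events(reports, min_reports=1):
    """Group reports by (place, day); keep groups with enough distinct sources."""
    sources = defaultdict(set)
    for r in reports:
        sources[(r.place, r.day)].add(r.url)
    return {
        key: len(urls)
        for key, urls in sorted(sources.items())
        if len(urls) >= min_reports
    }


# Made-up records, including one duplicate article.
reports = [
    Report("Manila", date(2024, 7, 24), "https://example.com/a"),
    Report("Manila", date(2024, 7, 24), "https://example.com/b"),
    Report("Manila", date(2024, 7, 24), "https://example.com/a"),
    Report("Barcelona", date(2024, 11, 4), "https://example.com/c"),
]
events = aggregate_events(reports, min_reports=2)
print(events)  # only the Manila event has two distinct sources
```

Requiring more than one distinct source per event is one simple way to trade coverage for precision; a real pipeline would also need fuzzy matching of locations and dates.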

The scaling challenge: Local precision vs. global reach

Specialized, hyper-local early warning systems have been engineered to address flash floods from rainfall in specific urban settings, with examples in Florida (US), Barranquilla (Colombia), Manila (Philippines), Nakhon Si Thammarat (Thailand), Mayaguez (Puerto Rico), and Barcelona (Spain). These systems typically rely on a network of physical sensors monitoring variables like direct and radar-inferred precipitation, water levels, and flow velocities. While highly accurate for their specific locations, they are difficult to scale because of the high cost of hardware deployment and the need for site-specific calibration algorithms and engineering expertise.

At a broader level, initiatives such as the WMO’s Flash Flood Guidance System (FFGS), the European Runoff Index based on Climatology (ERIC) flash flood indicator, and the US National Weather Service (NWS) Flash Flood Warnings system provide wider coverage through remote sensing and numerical weather models. These systems, however, encounter significant hurdles regarding global implementation. A primary issue is their dependency on high-resolution hydrological maps and radar-based weather forecasts, resources that are largely unavailable within the Global South. Furthermore, the reliance on professional hydrologists to interpret complex model data and distribute actionable warnings presents a second major challenge.

To achieve near-global reach, our model uses only global weather products (NASA IMERG, NOAA CPC) as well as real-time global weather forecasts from the ECMWF Integrated Forecast System (IFS) High Resolution (HRES) atmospheric model and the AI-based medium-range global weather forecasting model by Google DeepMind. The system currently operates at a 20x20 kilometer spatial resolution, a constraint primarily driven by the resolution of globally available data sources.

The model: Focusing on the city

Trained on Groundsource, the new flash flood model is designed to answer a specific question: Given the forecasted weather and local conditions, is a flash flood likely to occur in this area in the next 24 hours?

The model uses a recurrent neural network (RNN) architecture built on long short-term memory (LSTM) units, which are well suited to processing time-series data. In addition to the meteorological time-series inputs, it incorporates static geographic, geophysical, and anthropogenic attributes, such as urbanization density, topography, and soil absorption rates.
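To make the idea of combining time-series and static inputs concrete, here is a minimal, untrained LSTM written in plain Python. It is a sketch under stated assumptions, not the production model: appending the static attributes to every time step, the layer sizes, the random weights, and the input values are all choices made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


class TinyLSTM:
    """Single-layer LSTM with a sigmoid output (random, untrained weights)."""

    GATES = "ifgo"  # input, forget, cell candidate, output

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        n_cols = n_inputs + n_hidden
        self.w = {k: [[rng.uniform(-0.1, 0.1) for _ in range(n_cols)]
                      for _ in range(n_hidden)] for k in self.GATES}
        self.b = {k: [0.0] * n_hidden for k in self.GATES}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.n_hidden = n_hidden

    def step(self, x, h, c):
        z = x + h  # concatenate the current input with the previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.w[k], z), self.b[k])]
               for k in self.GATES}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        cand = [math.tanh(v) for v in pre["g"]]
        o = [sigmoid(v) for v in pre["o"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def predict(self, series, static):
        """Return a value in (0, 1) read from the final hidden state."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x_t in series:
            # Assumption: static attributes are appended to every time step.
            h, c = self.step(list(x_t) + list(static), h, c)
        return sigmoid(sum(w * hj for w, hj in zip(self.w_out, h)))


# Made-up inputs: two meteorological features per step, three static attributes.
series = [[0.0, 0.1], [2.5, 0.3], [7.0, 0.9]]
static = [0.8, 0.2, 0.5]
model = TinyLSTM(n_inputs=2 + 3, n_hidden=4)
p = model.predict(series, static)
print(f"flood score: {p:.3f}")
```

With random weights the score is meaningless; the point is only the data flow, in which each time step sees both the changing weather inputs and the fixed description of the location.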

We focused our initial launch on urban areas, providing forecasts for the majority of the world’s population. We made this choice because the training data (news reports) is naturally denser in these locations. Currently, the model predicts impact in areas with population densities greater than 100 people per square kilometer.

Regions in the world covered by our model.

Evaluation results

We evaluated our model against the Groundsource dataset, noting that reported precision metrics are likely underestimates. Because some real-world floods go unreported in the media, valid alerts can be misclassified as false positives. A manual audit of a random subset of the dataset (100 alerts per continent) substantiated this discrepancy, revealing that many false positives were in fact verified flood events, and confirmed that the actual precision is higher than raw metrics show. We calculated recall on floods from the Global Disaster Alert and Coordination System (GDACS) to estimate how well our model captures the most impactful flood events.
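The effect of unreported floods on precision can be shown with a small worked example. The functions and numbers below are illustrative assumptions: alerts and events are keyed by a hypothetical (grid cell, day) pair, and `audited_precision` is one simple way to fold a manual audit into the estimate, not the procedure used for the reported metrics.

```python
def precision(alerts, reported_events):
    """Fraction of alerts that match a reported event."""
    return len(alerts & reported_events) / len(alerts)


def recall(alerts, reference_events):
    """Fraction of reference events that were covered by an alert."""
    return len(alerts & reference_events) / len(reference_events)


def audited_precision(raw_precision, audited_false_positives, confirmed_floods):
    """Raise raw precision by the share of audited 'false' alerts that were real."""
    confirmed_rate = confirmed_floods / audited_false_positives
    return raw_precision + (1.0 - raw_precision) * confirmed_rate


# Keys are (grid_cell_id, day_index); all values are made up.
alerts = {(1, 0), (1, 1), (2, 0), (3, 4)}
news_events = {(1, 0), (2, 0), (5, 2)}
reference_events = {(1, 0), (5, 2)}

raw = precision(alerts, news_events)    # 2 of 4 alerts match a news event
rec = recall(alerts, reference_events)  # 1 of 2 reference events was alerted
adj = audited_precision(raw, audited_false_positives=2, confirmed_floods=1)
print(raw, rec, adj)  # 0.5 0.5 0.75
```

If half of the audited unmatched alerts turn out to be real floods, the estimated precision in this toy case rises from 0.5 to 0.75, which is the direction of the bias described above.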

Detailed performance metrics are shown in the plots below. The key insight is that the precision and recall of our model in much of the Global South (for example, South America and Southeast Asia) are equivalent to the performance in the richest countries, which typically benefit from modern instrumentation and local forecasting experts. For comparison, we estimated the performance of the NWS Flash Flood Warnings in the U.S. using the same metrics. To ensure consistency, we adjusted NWS data to match our resolution (20x20 kilometer grids over 24-hour windows). The recall of the NWS forecasts is 22% and the precision is 44% (an underestimate, for the reason described above). This provides context for the difficulty of the problem, and shows that our model achieves similar results in many of the countries that are most frequently affected by floods.
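Matching another system's warnings to a 20x20 kilometer, 24-hour resolution amounts to bucketing each warning by grid cell and time window. The sketch below assumes a simplified representation: each warning is a single point in projected kilometer coordinates with integer start and end hours. Real warnings are polygons, and the actual adjustment applied to NWS data is not described here.

```python
CELL_KM = 20
WINDOW_HOURS = 24


def to_grid_windows(x_km, y_km, start_hour, end_hour):
    """Return the (cell_x, cell_y, window) keys touched by one warning.

    Coordinates are kilometers in a projected system; times are integer hours
    since an arbitrary epoch, with end_hour exclusive.
    """
    cell_x = int(x_km // CELL_KM)
    cell_y = int(y_km // CELL_KM)
    first = start_hour // WINDOW_HOURS
    last = (end_hour - 1) // WINDOW_HOURS
    return {(cell_x, cell_y, w) for w in range(first, last + 1)}


def coarsen(warnings):
    """Collapse many fine-grained warnings into a set of coarse keys."""
    keys = set()
    for x_km, y_km, start_hour, end_hour in warnings:
        keys |= to_grid_windows(x_km, y_km, start_hour, end_hour)
    return keys


# Made-up warnings: (x_km, y_km, start_hour, end_hour).
warnings = [
    (3.0, 12.5, 5, 9),     # cell (0, 0), window 0
    (18.0, 4.0, 20, 23),   # same cell and window, so it adds no new key
    (41.0, 12.5, 22, 27),  # cell (2, 0), spans windows 0 and 1
]
print(sorted(coarsen(warnings)))  # [(0, 0, 0), (2, 0, 0), (2, 0, 1)]
```

Once both systems are expressed as sets of the same keys, precision and recall can be computed for each in the same way.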

Some gaps remain, however. In map (b) below, we only show countries with at least 10 events in GDACS, the minimum we need to estimate recall. Many countries in Africa still lack ground truth beyond Groundsource, making it difficult to reliably estimate our model's accuracy there.

Top row: Precision (a) and recall (b) by country. Bottom row: Event count in Groundsource (c) and GDACS (d) by country. We exclude countries with fewer than 10 ground truth events, as their metrics would be very noisy. For a clear visual display, we limit the precision and recall color scale to 50%.
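The two display rules in the caption, dropping countries with too few events and capping the color scale at 50%, can be expressed in a few lines. The country names and counts below are placeholders, and the function names are hypothetical.

```python
MIN_EVENTS = 10
COLOR_SCALE_MAX = 0.5


def recall_by_country(stats, min_events=MIN_EVENTS):
    """stats maps country -> (events_detected, events_total)."""
    return {
        country: detected / total
        for country, (detected, total) in stats.items()
        if total >= min_events
    }


def color_value(metric, cap=COLOR_SCALE_MAX):
    """Map a metric to [0, 1] on a color scale that saturates at `cap`."""
    return min(metric, cap) / cap


# Placeholder counts, not real evaluation data.
stats = {"Country A": (6, 20), "Country B": (3, 4), "Country C": (9, 12)}
recalls = recall_by_country(stats)
for country, value in recalls.items():
    print(country, round(value, 2), round(color_value(value), 2))
```

Country B is dropped because four events are too few for a stable estimate, and Country C saturates the color scale because its recall exceeds the 50% cap.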

Building global climate resilience

This launch is part of our Google Earth AI family of geospatial models and datasets and is a critical step supporting Google’s Crisis Resilience effort, but it is just the beginning. We are actively working to improve the model's generalization to rural areas, move to a finer spatial resolution for more hyper-local forecasts, and integrate even more real-time weather data sources.

As we focus on the future of our communities and our planet, the importance of scalable, AI-driven adaptation tools has never been clearer. By expanding our coverage to include the rapid-onset threats that affect cities most, we hope to provide governments, individuals, and international organizations with the information they need to stay safe in a changing climate.

Acknowledgements

Many people were involved in the development of this effort. We would like to especially thank those from Google Research: Aviel Niego, Avinatan Hassidim, Benny Mosheyev, Dan Korenfeld, Deborah Cohen, Dem Gerolemou, Gila Loike, Grey Nearing, Hadas Fester, Ido Zemach, Juliet Rothenberg, Martin Gauch, Oleg Zlydenko, Oren Gilon, Reuven Sayag, Rotem Mayo, Shmuel Fronman, Shruti Verma, Tzvika Stein, Yossi Matias, and Yuval Shildan.
