Caravan - A global community dataset for large-sample hydrology

Frederik Kratzert
Nans Addor
Tyler Erickson
Martin Gauch
Lukas Gudmundsson
Daniel Klotz
Sella Nevo
Guy Shalev
Scientific Data, vol. 10 (2023), article 61


High-quality datasets are essential to support hydrological science and modeling. Several CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) datasets exist for specific countries or regions; however, these datasets lack standardization, which makes global studies difficult. This paper introduces a dataset called Caravan (a series of CAMELS) that standardizes and aggregates seven existing large-sample hydrology datasets. Caravan includes meteorological forcing data, streamflow data, and static catchment attributes (e.g., geophysical, sociological, climatological) for 6830 catchments. Most importantly, Caravan is both a dataset and open-source software that allows members of the hydrology community to extend the dataset to new locations by extracting forcing data and catchment attributes in the cloud. Our vision is for Caravan to democratize the creation and use of globally standardized large-sample hydrology datasets. Caravan is a truly global open-source community resource.