HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin

Frederik Kratzert
Martin Gauch
Daniel Klotz
Hydrology and Earth System Sciences (2024)

Abstract

Machine learning (ML) has played an increasing role in the hydrological sciences. In particular, Long Short-Term Memory (LSTM) networks are popular for rainfall–runoff modeling. A large majority of studies that use this type of model do not follow best practices, and one mistake is particularly common: training deep learning models on small, homogeneous data sets, typically data from only a single hydrological basin. In this position paper, we show that LSTM rainfall–runoff models perform best when trained with data from a large number of basins.