Hydrological Concept Formation inside Long Short-Term Memory (LSTM) networks
Abstract
Long Short-Term Memory networks (LSTMs), a deep learning architecture specialized for sequential inputs, have demonstrated state-of-the-art accuracy in several large-sample benchmarking studies for rainfall-runoff modelling. Conceptual and physically-based models have fixed internal representations (equations and parameters) that explicitly describe known or hypothesized physical processes, sometimes derived from measurements at the field or laboratory scale. In contrast, an LSTM learns its intermediate representations during training, based only on the training data and the selected objective function. It is thus not clear, a priori, whether the LSTM internal states represent hydrological processes that can be extracted into a form interpretable by humans. Small-scale experiments have demonstrated that the internal states of LSTMs can be interpreted. By extracting the tensors that represent the learned translation from inputs (precipitation, temperature) to outputs (discharge), this research seeks to understand what information the LSTM captures about the hydrological system. We assess the hypothesis that the LSTM replicates real-world processes and that we can extract information about these processes from the internal states of the LSTM. We examine the cell-state vector, which represents the memory of the LSTM, and explore the ways in which the LSTM learns to reproduce stores of water, such as soil moisture and snow cover. We use a simple regression approach (a probe) to map the LSTM cell-state vector to our target stores (soil moisture and snow). Good correlations (R² > 0.8) between the probe outputs and the target variables of interest provide evidence that the LSTM contains information that reflects known hydrological processes, comparable with the concept of variable-capacity soil moisture stores.
The implications of this study are threefold: 1) LSTMs reproduce known hydrological processes. 2) While conceptual models have theoretical assumptions embedded in the model a priori, the LSTM derives these from the data. These learned representations are interpretable by scientists. 3) LSTMs can be used to gain an estimate of intermediate stores of water such as soil moisture. We therefore argue that deep learning approaches can be used to advance our scientific goals as well as our predictive goals.
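The regression probe described in the abstract can be illustrated with a minimal sketch. It assumes a linear probe fitted by least squares and uses synthetic data in place of real LSTM cell states and soil moisture observations; the names `fit_linear_probe`, `r_squared`, `cell_states` and `soil_moisture`, the array sizes, and the small ridge term are all hypothetical choices made for illustration, not details taken from the study.

```python
import numpy as np


def fit_linear_probe(cell_states, target, ridge=1e-6):
    """Fit target ~ cell_states @ w + b by lightly regularised least squares."""
    # Append a column of ones so the last coefficient acts as the intercept.
    X = np.hstack([cell_states, np.ones((cell_states.shape[0], 1))])
    # Normal equations with a tiny ridge term for numerical stability.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    coef = np.linalg.solve(A, X.T @ target)
    return coef[:-1], coef[-1]


def r_squared(y_true, y_pred):
    """Coefficient of determination between observations and probe output."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins (assumption): 2000 time steps of a 64-dimensional
# cell-state vector, and a "soil moisture" series that depends on 5 cells.
rng = np.random.default_rng(0)
n_steps, n_cells = 2000, 64
cell_states = rng.normal(size=(n_steps, n_cells))
true_w = np.zeros(n_cells)
true_w[:5] = [0.8, -0.5, 0.3, 0.6, -0.4]
soil_moisture = cell_states @ true_w + 0.1 * rng.normal(size=n_steps)

# Fit the probe on the first part of the record, evaluate on the rest.
split = 1500
w, b = fit_linear_probe(cell_states[:split], soil_moisture[:split])
pred = cell_states[split:] @ w + b
score = r_squared(soil_moisture[split:], pred)
print(f"held-out R^2 = {score:.3f}")
```

With real data, `cell_states` would be the cell-state vectors extracted from a trained LSTM at each time step and `soil_moisture` an independent estimate of the store. Evaluating on held-out time steps, as in this sketch, helps separate information actually present in the cell state from what the probe itself could fit.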