Item talk:Q146517

Exploring the exceptional performance of a deep learning stream temperature model and the value of streamflow data

Stream water temperature (Ts) is critically important for aquatic ecosystem health. Ts is strongly affected by groundwater-surface water interactions, which can be learned from streamflow records, but process-based models have previously struggled to absorb such information effectively because of parameter equifinality. Based on the long short-term memory (LSTM) deep learning architecture, we developed a basin-centric, lumped, daily mean Ts model, trained it over 118 data-rich basins with no major dams in the conterminous United States, and obtained strong results. At a national scale, we obtained a median root-mean-square error of 0.69°C, a Nash–Sutcliffe model efficiency coefficient of 0.985, and a correlation of 0.994, which are marked improvements over values previously reported in the literature. Adding streamflow observations as a model input strongly improved the model's performance. In the absence of measured streamflow, we showed that a two-stage model could be used, in which simulated streamflow from a pre-trained LSTM model (Qsim) still benefited the Ts model even though no new information was brought directly into the inputs of the Ts model. The model indirectly used information learned from streamflow observations provided during the training of Qsim, potentially to improve its internal representation of physically meaningful variables. Our results indicate that strong relationships exist between basin-averaged forcing variables, catchment attributes, and Ts, and that these relationships can be simulated by a single model trained on continental-scale data.
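The three evaluation metrics quoted in the abstract can be illustrated with a minimal sketch. The abstract does not say how the metrics were computed or aggregated, so the code below makes several assumptions: the correlation is taken to be Pearson's r, each metric is computed per basin and then summarized by the median across basins, and days with missing observations are skipped. The function names (`rmse`, `nse`, `pearson_r`, `median_metrics`) and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def _paired(obs, sim):
    """Return obs and sim as float arrays, keeping only days where both are finite."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    keep = np.isfinite(obs) & np.isfinite(sim)
    return obs[keep], sim[keep]


def rmse(obs, sim):
    """Root-mean-square error, in the units of the inputs (e.g. degrees C)."""
    obs, sim = _paired(obs, sim)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = _paired(obs, sim)
    return float(1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))


def pearson_r(obs, sim):
    """Pearson correlation coefficient (assumed; the abstract says only 'correlation')."""
    obs, sim = _paired(obs, sim)
    return float(np.corrcoef(obs, sim)[0, 1])


def median_metrics(basins):
    """Median of per-basin metrics. `basins` maps a basin id to an (obs, sim) pair."""
    pairs = list(basins.values())
    return {
        "rmse": float(np.median([rmse(o, s) for o, s in pairs])),
        "nse": float(np.median([nse(o, s) for o, s in pairs])),
        "r": float(np.median([pearson_r(o, s) for o, s in pairs])),
    }


# Toy data for illustration only; these are not values from the study.
toy_basins = {
    "basin_a": ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]),     # perfect simulation
    "basin_b": ([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]),     # constant +1 bias
    "basin_c": ([1.0, np.nan, 3.0, 5.0], [1.5, 2.0, 3.5, 5.5]),  # one missing observation
}
summary = median_metrics(toy_basins)
```

Under these assumptions, a constant bias lowers NSE and raises RMSE while leaving the correlation at 1, which is one reason the three metrics are reported together.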