Coupling machine learning and physical modelling for predicting runoff at catchment scale

In this paper, we present an approach that combines data-driven and physical modelling to predict runoff occurrence and volume at catchment scale. To that end, we first estimated the runoff volume of recorded storms using the Green-Ampt infiltration model. Then, we used machine learning algorithms, namely LightGBM (LGBM) and a Deep Neural Network (DNN), to predict the outputs of the physical model from a set of atmospheric variables (relative humidity, temperature, atmospheric pressure, and wind velocity) collected before or immediately after the beginning of the storm. Results for a small urban catchment in Madrid show that the DNN performed better than LGBM in predicting both runoff occurrence and volume. Moreover, enriching the primary atmospheric input variables with auxiliary variables (e.g., storm intensity data recorded during the first hour, or rain volume and intensity estimates obtained from auxiliary regression methods) substantially improved model performance. We show that data-driven algorithms shaped by physical criteria can be successfully built by allowing the data-driven algorithm to learn from the output of physical models. This represents a novel approach to physics-informed data-driven modelling that departs from common practice in machine-learning-based hydrological modelling.
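
As an illustration of the labelling step described above, the following minimal Python sketch shows one way a Green-Ampt calculation could turn a recorded storm into the two targets a data-driven model would learn: runoff occurrence and runoff depth. It is not the implementation used in this study. It assumes a simple explicit time-stepping scheme, and the function names (`green_ampt_runoff`, `make_targets`) and the soil parameters (hydraulic conductivity `K`, wetting-front suction head `psi`, moisture deficit `dtheta`) are hypothetical placeholders chosen for readability, not calibrated values for the Madrid catchment.

```python
def green_ampt_runoff(rain_mm_per_h, dt_h, K=10.0, psi=100.0, dtheta=0.3):
    """Explicit Green-Ampt stepping for one storm.

    rain_mm_per_h: rainfall intensity per time step (mm/h)
    dt_h: time step length (h)
    K, psi, dtheta: illustrative soil parameters (mm/h, mm, dimensionless)
    Returns (cumulative infiltration in mm, cumulative runoff in mm).
    """
    F = 0.0       # cumulative infiltration depth (mm)
    runoff = 0.0  # cumulative rainfall excess (mm)
    for intensity in rain_mm_per_h:
        # Green-Ampt infiltration capacity f = K * (1 + psi * dtheta / F),
        # treated as unbounded while nothing has infiltrated yet (F == 0).
        capacity = K * (1.0 + psi * dtheta / F) if F > 0.0 else float("inf")
        infiltrated = min(intensity, capacity) * dt_h
        F += infiltrated
        # Rain that exceeds the infiltration capacity becomes runoff.
        runoff += intensity * dt_h - infiltrated
    return F, runoff


def make_targets(storms, dt_h):
    """Build learning targets (occurrence flag and runoff depth) per storm."""
    targets = []
    for rain in storms:
        _, runoff = green_ampt_runoff(rain, dt_h)
        targets.append({"occurs": runoff > 0.0, "runoff_mm": runoff})
    return targets


if __name__ == "__main__":
    # Two made-up storms: a gentle one and an intense one (mm/h, 30-min steps).
    storms = [[5.0, 5.0, 5.0], [60.0, 60.0]]
    for target in make_targets(storms, dt_h=0.5):
        print(target)
```

In a workflow of this kind, the targets would be paired with the atmospheric variables observed around storm onset and passed to a classifier (occurrence) and a regressor (volume). The explicit scheme is chosen here only for brevity; an implicit solution of the ponded Green-Ampt equation would be a reasonable alternative.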