Prediction of algal blooms via data-driven machine learning models: an evaluation using data from a well-monitored mesotrophic lake
[Abstract] With increasing lake monitoring data, data-driven machine learning (ML) models might be able to capture the complex algal bloom dynamics that cannot be completely described in process-based (PB) models. We applied two ML models, the gradient boost regressor (GBR) and long short-term memory (LSTM) network, to predict algal blooms and seasonal changes in algal chlorophyll concentrations (Chl) in a mesotrophic lake. Three predictive workflows were tested, one based solely on available measurements and the others applying a two-step approach, first estimating lake nutrients that have limited observations and then predicting Chl using observed and pre-generated environmental factors. The third workflow was developed using hydrodynamic data derived from a PB model as additional training features in the two-step ML approach. The performance of the ML models was superior to a PB model in predicting nutrients and Chl. The hybrid model further improved the prediction of the timing and magnitude of algal blooms. A data sparsity test based on shuffling the order of training and testing years showed the accuracy of ML models decreased with increasing sample interval, and model performance varied with training–testing year combinations.
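The two-step approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the feature names (temperature, inflow), the 14-day nutrient sampling interval, the 400-day training split, and the default hyperparameters are all assumptions made for illustration, and scikit-learn's GradientBoostingRegressor stands in for the GBR model named in the abstract.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic daily drivers (hypothetical stand-ins for observed features).
n_days = 600
day = np.arange(n_days)
temperature = 12 + 8 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 1, n_days)
inflow = rng.gamma(2.0, 1.5, n_days)
X_obs = np.column_stack([temperature, inflow])

# Synthetic targets: a nutrient concentration and Chl (assumed relationships).
nutrient = 0.5 + 0.1 * inflow + rng.normal(0, 0.05, n_days)
chl = 2.0 + 0.3 * temperature + 4.0 * nutrient + rng.normal(0, 0.5, n_days)

# Assumption: the first 400 days are for training, the rest for testing.
train = day < 400
test = ~train

# Assumption: nutrients are only measured every 14 days.
sampled = np.arange(0, n_days, 14)
sampled_train = sampled[sampled < 400]

# Step 1: learn nutrients from the sparse samples, then estimate them daily.
nutrient_model = GradientBoostingRegressor(random_state=0)
nutrient_model.fit(X_obs[sampled_train], nutrient[sampled_train])
nutrient_est = nutrient_model.predict(X_obs)

# Step 2: predict Chl from observed drivers plus the pre-generated nutrients.
X_two_step = np.column_stack([X_obs, nutrient_est])
chl_model = GradientBoostingRegressor(random_state=0)
chl_model.fit(X_two_step[train], chl[train])
chl_pred = chl_model.predict(X_two_step[test])

rmse = float(np.sqrt(np.mean((chl_pred - chl[test]) ** 2)))
print(f"Two-step test RMSE on synthetic data: {rmse:.2f}")
```

In this sketch the hybrid workflow would correspond to appending further columns (for example, hydrodynamic outputs from a PB model) to X_obs before both steps, and a sparsity test to varying the sampling interval used to build `sampled`.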
[Publication date] [Issuing institution]
[Validity level] [Subject classification] Civil and structural engineering
[Keywords] [Timeliness]