A comparative evaluation of non-linear time series analysis and singular spectrum analysis for the modelling of air pollution
[Abstract] ENGLISH ABSTRACT: Air pollution is a major concern in the Cape Metropole. A major contributor to the air pollution problem is road transport. For this reason, a national vehicle emissions study is in progress with the aim of developing a national policy regarding motor vehicle emissions and control. Such a policy could bring about vehicle emission control and regulatory measures, which may have far-reaching social and economic effects. Air pollution models are important tools in predicting the effectiveness and the possible secondary effects of such policies. It is therefore essential that these models are fundamentally sound to maintain a high level of prediction accuracy. Complex air pollution models are available, but they require spatial, time-resolved information on emission sources and a vast amount of processing power. It is unlikely that South African cities will have the necessary spatial, time-resolved emission information in the near future. An alternative air pollution model is one that is based on the Gaussian Plume Model. This model, however, relies on gross simplifying assumptions that affect model accuracy. It is proposed that statistical and mathematical analysis techniques will be the most viable approach to modelling air pollution in the Cape Metropole. These techniques make it possible to establish statistical relationships between pollutant emissions, meteorological conditions and pollutant concentrations without gross simplifying assumptions or excessive information requirements. This study investigates two analysis techniques that fall into the aforementioned category, namely, Non-linear Time Series Analysis (specifically, the method of delay co-ordinates) and Singular Spectrum Analysis (SSA). During the past two decades, important progress has been made in the field of Non-linear Time Series Analysis. An entire toolbox of methods is available to assist in identifying non-linear determinism and to enable the construction of predictive models.
It is argued that the dynamics that govern a pollution system are inherently non-linear due to the strong correlation with weather patterns and the complexity of the chemical reactions and physical transport of the pollutants. In addition to this, a statistical technique (the method of surrogate data) showed that a pollution data set, oxides of nitrogen (NOx), displayed a degree of non-linearity, albeit with a high degree of noise contamination. This suggested that a pollution data set will be amenable to non-linear analysis and, hence, Non-linear Time Series Analysis was applied to the data set. SSA, on the other hand, is a linear data analysis technique that decomposes the time series into statistically independent components. The basis functions, in terms of which the data is decomposed, are data-adaptive, which makes the technique well suited to the analysis of non-linear systems exhibiting anharmonic oscillations. The statistically independent components, into which the data has been decomposed, have limited harmonic content. Consequently, these components are more amenable to prediction than the time series itself. The fact that SSA's ability has been proven in the analysis of short, noisy non-linear signals prompted the use of this technique. The aim of the study was to establish which of these two techniques is best suited to the modelling of air pollution data. To this end, a univariate model to predict NOx concentrations was constructed using each of the techniques. The prediction ability of each model was assumed indicative of the accuracy of the model. It was therefore used as the basis against which the two techniques were evaluated. The procedure used to construct the model and to quantify the model accuracy, for both the Non-linear Time Series Analysis model and the SSA model, was consistent so as to allow for unbiased comparison. In both cases, no noise reduction schemes were applied to the data prior to the construction of the model.
The accuracy of a 48-hour step-ahead prediction scheme and a 100-hour step-ahead prediction scheme was used to compare the two techniques. The accuracy of the SSA model was markedly superior to that of the Non-linear Time Series model. The paramount reason for the superior accuracy of the SSA model is its ability to analyse and cope with noisy data sets such as the NOx data set. This observation provides evidence to suggest that Singular Spectrum Analysis is better suited to the modelling of air pollution data. It should therefore be the analysis technique of choice when more advanced, multivariate modelling of air pollution data is carried out. It is recommended that noise reduction schemes, which decontaminate the data without destroying important higher order dynamics, should be researched. The application of an effective noise reduction scheme could lead to an improvement in model accuracy. In addition to this, the univariate SSA model should be extended to a more complex multivariate model that explicitly encompasses variables such as traffic flow and weather patterns. This will explicitly expose the inter-relationships between the variables and will enable sensitivity studies and the evaluation of a multitude of scenarios.
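For readers unfamiliar with the two techniques named in the abstract, the following is a minimal sketch of their first steps: building delay co-ordinate vectors from a scalar series, and decomposing a series into additive SSA components. It is not the thesis implementation. The function names, the embedding dimension, the lag, the window length and the synthetic series are all assumptions chosen for illustration, and the sketch stops short of the prediction models the study compares.

```python
import numpy as np


def delay_embed(x, dim, lag):
    """Method of delay co-ordinates: row t is (x[t], x[t+lag], ..., x[t+(dim-1)*lag])."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and lag")
    return np.column_stack([x[i * lag : i * lag + n_vectors] for i in range(dim)])


def ssa_components(x, window):
    """Basic SSA: embed, take the SVD, then diagonal-average each rank-one term."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j : j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = np.empty((len(s), n))
    for i in range(len(s)):
        elementary = s[i] * np.outer(u[:, i], vt[i])
        # Flip the rows so that np.diagonal walks the anti-diagonals,
        # then average each one to map the matrix back to a series.
        flipped = elementary[::-1]
        components[i] = [flipped.diagonal(d).mean() for d in range(-(window - 1), k)]
    return components, s


# Hypothetical synthetic series (trend + oscillation + noise), not NOx data.
rng = np.random.default_rng(0)
t = np.arange(200)
series = 0.02 * t + np.sin(2 * np.pi * t / 24) + 0.3 * rng.standard_normal(200)

vectors = delay_embed(series, dim=3, lag=6)         # assumed dim and lag
components, singular_values = ssa_components(series, window=48)  # assumed window
smooth = components[:3].sum(axis=0)                 # keep the leading components
```

The SSA components sum back to the original series, so keeping only the leading ones acts as a data-adaptive filter. How many components to keep, and how to choose the window, dimension and lag, are modelling decisions this sketch does not address.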
[Publication date] [Publisher] Stellenbosch University
[Validity level] [Subject classification]
[Keywords] [Timeliness]