Lasso Guarantees for Dependent Data
[Abstract] Serially correlated high-dimensional data are prevalent in the big data era. In order to predict and learn the complex relationships among multiple time series, high-dimensional modeling has gained importance in various fields such as control theory, statistics, economics, finance, genetics and neuroscience. We study a number of high-dimensional statistical problems involving different classes of mixing processes. For example, given a sequence (X_t), one might be interested in predicting X_t using the past observations X_{t−1}, ..., X_{t−d}. Vector autoregressive (VAR) models naturally admit a linear autoregressive representation that allows us to study the joint evolution of the time series. In the high-dimensional setting, where both the number of components of the time series and the order of the model are allowed to grow with the sample size, consistent estimation is impossible without structural assumptions on the transition matrices. One such structural assumption is sparsity. The lasso program is a celebrated method for sparse estimation. The majority of theoretical and empirical results on the lasso, however, assume iid data. In addition, it is common for real data sets to contain missing and/or corrupted data. In the autoregressive scenario, both the independent and dependent variables are affected, and hence the problem requires careful consideration. We study this problem based upon the framework proposed in Loh and Wainwright [2012]. Furthermore, many theoretical results on estimation of high-dimensional time series require specifying an underlying data generating model (DGM). Instead, we assume only (strict) stationarity and mixing conditions to establish finite sample consistency of the lasso for data coming from various families of distributions. When the DGM is nonlinear, the lasso estimate corresponds to that of a best linear predictor, which we assume is sparse.
We provide results for three different combinations of dependence and tail behavior of the time series: α-mixing and Gaussian, β-mixing and subgaussian tails, and β-mixing and subweibull tails. To prove our results for the second setting, we derive a novel Hanson-Wright type concentration inequality for β-mixing subgaussian random vectors that may be of independent interest. Together, these results extend to non-Gaussian, non-Markovian and nonlinear time series models, as the examples we provide demonstrate.
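As an illustration of the setting described above, the following is a minimal sketch of lasso estimation of a sparse VAR(1) transition matrix from a single dependent sample path. It is not taken from the thesis: the dimension, sample size, transition matrix, penalty level `lam`, and the helper names `simulate_var1` and `lasso_var1` are all assumptions chosen for illustration, and the solver is plain proximal gradient descent (ISTA), used here only because it needs nothing beyond NumPy.

```python
import numpy as np


def simulate_var1(A, T, rng, burn_in=200):
    """Simulate X_t = A X_{t-1} + e_t with standard Gaussian noise e_t."""
    p = A.shape[0]
    x = np.zeros(p)
    out = np.empty((T, p))
    for t in range(T + burn_in):
        x = A @ x + rng.standard_normal(p)
        if t >= burn_in:  # discard the burn-in so the path is close to stationary
            out[t - burn_in] = x
    return out


def soft_threshold(Z, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)


def lasso_var1(series, lam, n_iter=500):
    """ISTA for min_B (1/2n) ||Y - X B||_F^2 + lam * ||B||_1.

    Rows of X are X_{t-1} and rows of Y are X_t, so B estimates A^T;
    the function returns the estimate of A.
    """
    X, Y = series[:-1], series[1:]
    n = X.shape[0]
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = 1.0 / np.linalg.eigvalsh(X.T @ X / n)[-1]
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / n
        B = soft_threshold(B - step * grad, step * lam)
    return B.T


# Hypothetical sparse, stable transition matrix (all eigenvalues equal 0.5).
rng = np.random.default_rng(0)
p = 10
A_true = 0.5 * np.eye(p)
A_true[0, 3] = 0.3
A_true[5, 8] = -0.3

series = simulate_var1(A_true, T=500, rng=rng)
A_hat = lasso_var1(series, lam=0.05)  # lam is an illustrative choice, not a tuned value
rel_err = np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true)
print("relative Frobenius error:", rel_err)
print("nonzero entries in estimate:", int(np.count_nonzero(A_hat)))
```

The rows of the design matrix here are consecutive observations of one time series, so they are dependent rather than iid, which is the situation the guarantees in the abstract address. The Gaussian noise puts this example in the first of the three settings listed; a heavier-tailed or nonlinear generating process could be substituted in `simulate_var1` without changing the estimator.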
[Publication date] [Publisher] University of Michigan
[Validity level] High dimensional Time Series [Subject classification]
[Keywords] Lasso; High dimensional Time Series; Mixing Processes; Statistics and Numeric Data; Science; Statistics [Timeliness]