Modeling and Estimating Multi-Block Interactions for High-Dimensional Stationary Time Series
[Abstract] Modeling and estimating interactions among multiple groups of variables is an important task for understanding the structure of complex systems. For time series in particular, the interdependence structure can take the form of contemporaneous correlations or of lead-lag cross-relations. This thesis addresses a number of topics related to such interdependence structures under high-dimensional scaling. The first part of the thesis considers modeling and estimating interactions between observable blocks of variables, as well as their respective within-block dependence structures, in high-dimensional independent and identically distributed (iid) settings as well as temporally dependent ones. In the iid case, we model the blocks of variables of interest through a multi-layered Gaussian graphical model and introduce a penalized maximum likelihood estimation (MLE) procedure that provides both statistical and algorithmic guarantees, leveraging the structure of the log-likelihood function and its bi-convex nature. For the case where the data exhibit temporal dependence, the blocks are modeled through a stable Vector Autoregressive (VAR) system with group Granger-causal ordering. Building upon the work for the iid case, we estimate their lead-lag relationships, as well as the contemporaneous dependence structure, using a penalized MLE criterion under different structural assumptions on the transition matrices (sparse or low rank). We establish theoretical properties for the estimates analogous to those in the iid case, modulo an additional cost due to the temporal dependence in the data. Moreover, we devise a testing procedure for the presence of such group Granger causality, tailoring it to the posited structural assumptions on the transition matrix that couples the blocks.
The devised estimation and testing procedures are assessed via numerical experiments and further illustrated on a real data example from economics that examines the impact of the stock market on major macroeconomic indicators.
However, large stable VAR systems have the inherent limitation that the transition matrix needs to be very sparse or to have small average magnitude in order to satisfy the stationarity constraint. This raises the question of whether the VAR model is the appropriate modeling framework for a very large number of time series. To this end, we consider systems of time series that can be summarized by a small set of latent factors. In the second part of this thesis, we focus on estimating the interaction between an observable process and a dynamically evolving latent factor process. Specifically, we extend the factor-augmented vector autoregressive (FAVAR) model, popular in applied economics, to high dimensions and study estimation of the model parameters by formulating an optimization problem that involves a low-rank-plus-sparse type decomposition. Moreover, we investigate model identifiability issues and establish theoretical properties for the proposed estimator. The performance of the proposed method is evaluated on synthetic data, and the model is further illustrated on an economic data set that examines interlinkages between commodity prices and macroeconomic variables. Along a slightly different line of inquiry, where contemporaneous dependence rather than lead-lag relationships is of prime interest, we extend the approximate factor model, in which correlations among the idiosyncratic (error) components are assumed to be weak, to the case where moderate-to-strong correlations are allowed. Using a formulation similar to the FAVAR problem, we propose an algorithm to estimate the model parameters and investigate its statistical and algorithmic properties.
The model and the quality of the resulting estimates are illustrated on log-returns of stock prices of large financial institutions.
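As a rough illustration of the stationarity constraint mentioned in the abstract, the sketch below checks stability of a VAR(1) system through the spectral radius of its transition matrix. It is not taken from the thesis: the function names, the Gaussian random transition matrices, the dimension of 200, and the chosen scales and density are all assumptions made for illustration only.

```python
import numpy as np


def spectral_radius(A):
    """Largest modulus among the eigenvalues of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))


def is_stable_var1(A):
    """A VAR(1) process X_t = A X_{t-1} + e_t is stable iff rho(A) < 1."""
    return spectral_radius(A) < 1.0


def random_transition(p, scale, density, rng):
    """Hypothetical generator: p x p Gaussian entries with standard
    deviation `scale`, each kept independently with probability `density`."""
    values = rng.normal(0.0, scale, size=(p, p))
    mask = rng.random((p, p)) < density
    return values * mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 200  # illustrative dimension, not from the thesis
    settings = [
        ("dense, moderate entries", 0.10, 1.00),
        ("dense, small entries", 0.02, 1.00),
        ("sparse, moderate entries", 0.10, 0.02),
    ]
    for label, scale, density in settings:
        A = random_transition(p, scale, density, rng)
        print(f"{label}: rho = {spectral_radius(A):.3f}, "
              f"stable = {is_stable_var1(A)}")
```

For a dense matrix with independent entries, the spectral radius grows roughly like the entry standard deviation times the square root of the dimension, which is one way to see why a large stable system needs either small entries or few of them.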
[Publication date] [Publisher] University of Michigan
[Validity level] optimization [Subject classification]
[Keywords] high-dimensional time series;optimization;regularization;statistical error bound;convergence;Statistics and Numeric Data;Science;Statistics [Timeliness]