Contributions to Functional Data Analysis and High-Throughput Screening Assay Analysis.
[Abstract] Modern science is characterized by the complex nature of the available data sets, and statisticians are developing new techniques for analyzing large and complex data. This dissertation contributes to the analysis of several such challenging types of data.

The second chapter explores mixture regression, a method that clusters a sample and simultaneously estimates a separate regression model for each group. The conventional method treats the covariate as deterministic, so the covariate carries no information about a subject's group membership, which leads to poor prediction performance. Although this assumption may be reasonable in experiments, in observational data the covariate often behaves differently across the groups. We introduce a new approach that incorporates this covariate heterogeneity. The method is developed for functional covariates as well as multivariate covariates. We explore the method both numerically and analytically, and present a real-data analysis in which the new approach outperforms the traditional approach.

The third chapter explores the regularization approach to functional linear regression models. While several authors have contributed to this problem, the derivations are often technical, and it is difficult to see what lies behind the approach. Our goal is twofold. The first goal is to provide a much simpler derivation of the optimal convergence rate for prediction in a general setting. The second goal is to extend the practical aspects of the model by accommodating discrete observations and multiple predictors.

The fourth chapter explores high-throughput screening (HTS) assay analysis. HTS assays can be used as less expensive alternatives to conventional animal and cell culture assays. In this context, a predictive relationship between the HTS and conventional assays must be defined. In some applications, the lowest value among the conventional assays is of primary interest, in which case it may be advantageous to predict this minimum value directly rather than in two stages, first predicting each assay separately and then taking the minimum. We explore an approach that focuses the modeling effort directly on the parameter of interest rather than on the high-dimensional nuisance parameter. We apply this method to the ToxCast data of the EPA and to the 60 cell line screen of the NCI.
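
As a rough illustration of the second chapter's point about mixture regression, the contrast between the conventional model and a covariate-informed variant can be sketched as follows. The notation is assumed here for illustration and is not taken from the dissertation: $K$ groups with latent label $Z$, mixing proportions $\pi_k$, group-specific regression densities $f_k(y \mid x)$, and group-specific covariate densities $g_k(x)$.

```latex
% Conventional mixture regression: x is treated as fixed, so before y is
% observed the membership probability is \pi_k whatever the value of x.
p(y \mid x) = \sum_{k=1}^{K} \pi_k \, f_k(y \mid x),
\qquad P(Z = k \mid x) = \pi_k .

% Covariate heterogeneity (illustrative): x has its own density g_k in each
% group, so Bayes' rule makes the membership probability depend on x.
P(Z = k \mid x) = \frac{\pi_k \, g_k(x)}{\sum_{j=1}^{K} \pi_j \, g_j(x)},
\qquad
p(y \mid x) = \sum_{k=1}^{K} P(Z = k \mid x) \, f_k(y \mid x).
```

Under this sketch, a new subject's covariate alone shifts the weights placed on the group-specific regressions, which is one way the covariate can inform prediction.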
[Publication date] [Publishing institution] University of Michigan
[Validity level] Functional Regression [Subject classification]
[Keywords] Functional Data Analysis;Functional Regression;Mixture Regression;Minimax Rate;High-throughput Screening;Profile Likelihood;Public Health;Mathematics;Science (General);Statistics and Numeric Data;Health Sciences;Science;Statistics [Timeliness]