Modeling temporally-regulated effects on distributions
[Abstract] We present a nonparametric framework for modeling an evolving sequence of (estimated) probability distributions, which distinguishes the effects of sequential progression on the observed distribution from extraneous sources of noise (i.e., latent variables which perturb the distributions independently of the sequence index). To discriminate between these two types of variation, our methods leverage the underlying assumption that the effects of sequential progression follow a consistent trend. Our methods are motivated by the recent rise of single-cell RNA-sequencing time course experiments, in which an important analytic goal is the identification of genes relevant to the progression of a biological process of interest at cellular resolution. As existing statistical tools are not suited for this task, we introduce a new regression model for (ordinal-value, univariate-distribution) covariate-response pairs, where the class of regression functions reflects coherent changes to the distributions over increasing levels of the covariate, a concept we refer to as trends in distributions. Through a simulation study and extensive application of our ideas to data from recent single-cell gene-expression time course experiments, we demonstrate numerous strengths of our framework. Finally, we characterize both the theoretical properties of the proposed estimators and the generality of our trend assumption across diverse types of underlying sequential-progression effects, thus highlighting the utility of our framework for a wide variety of other applications involving the analysis of distributions with associated ordinal labels.
[Publishing institution] Massachusetts Institute of Technology