A Stochastic Segmentation Model for Recurrent Copy Number Alteration Analysis
[Abstract] Recurrent DNA copy number alterations (CNAs) are key genetic events in the study of human genetics and disease. Analysis of recurrent DNA CNA data often involves inferring each sample's true signal level and the cross-sample recurrent regions at each location. For the analysis of multiple-sample CNA data, we propose a new stochastic segmentation model and an associated inference procedure with attractive statistical and computational properties. An important feature of our model is that it yields explicit formulas for the posterior probabilities of recurrence at each location, which can be used to estimate the recurrent regions directly. We also propose an approximation method whose computational complexity is linear in sequence length, which makes the model applicable to higher-density data. Simulation studies and analyses of an ovarian cancer dataset with 15 samples and a lung cancer dataset with 10 samples illustrate the advantages of the proposed model.
[Keywords] Categorical states; Hidden Markov models; Multiple change-points; Recurrent CNAs