Methods for Sequence Based Studies of Complex Traits.
[Abstract] Thousands of loci have been associated with complex diseases and traits. However, there is still much we do not know about the biology of disease. Many reasons for this are possible, including the strong focus of genetic association studies over the past 5-10 years on common single nucleotide polymorphisms. In this dissertation, we focus on methods for the design and analysis of sequence-based studies, to enable the assessment of other types of variants, particularly rare variants and copy number variants.
In the first chapter, we attempt to resolve the debate regarding the best strategy for studying very rare, disease-associated variants, particularly singleton variants appearing only once in a sample. We estimated the sensitivity to detect singleton variants using both simulations and analysis of real data. We extended this to determine the power of an association study for discrete traits, evaluating the burden of singletons under a variety of situations. We found that sensitivity to detect singletons increases with sequencing depth, plateauing when depth reaches ~25x. For a fixed sequencing capacity, we estimated that power is maximized when samples are sequenced at 15-20x coverage, which produced an optimal trade-off between singleton discovery and sample size. In our assessments, increasing coverage beyond 15-20x (and correspondingly decreasing sample size) results in reduced power.
In the second chapter, we extend this analysis to a quantitative trait framework. Despite the different disease model, the results for quantitative traits are remarkably similar to those for binary traits. For constant sequencing effort, power is maximized at 11-16x coverage across the variety of parameter values examined. Increasing coverage further at the cost of reduced sample size results in decreased power.
In the final chapter, we turn to another type of variant that may aid in understanding the etiology of disease: copy number variation.
Copy number variants (CNVs) are associated with many diseases, especially psychiatric disorders, and there is great interest in methods to accurately detect and genotype CNVs. We developed a method that uses read depth information to estimate copy number for a set of sequenced individuals.
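As a rough illustration of how read depth can be turned into a copy number estimate, the sketch below scales each sample's depth in a candidate region by that sample's own background depth and rounds to the nearest integer. This is a generic, simplified approach and is not taken from the dissertation; the function name `estimate_copy_number`, the diploid baseline of 2, and the example depths are all assumptions made for illustration.

```python
from statistics import median


def estimate_copy_number(region_depth, background_depths, ploidy=2):
    """Naive read-depth copy number estimate for one sample (illustrative only).

    region_depth: mean read depth of the sample in the candidate CNV region.
    background_depths: depths of the same sample in regions assumed to be
        copy-neutral (hypothetical input; used as the per-sample baseline).
    ploidy: copy number assumed for the background regions (2 = diploid).
    """
    baseline = median(background_depths)
    if baseline <= 0:
        raise ValueError("background depth must be positive")
    # Depth is assumed to scale linearly with the number of copies present.
    ratio = region_depth / baseline
    return max(0, round(ploidy * ratio))


# Made-up depths for three sequenced individuals: (region depth, background depths).
samples = {
    "s1": (30.0, [29.0, 31.0, 30.0]),  # depth matches background
    "s2": (15.5, [30.0, 32.0, 31.0]),  # about half the background depth
    "s3": (44.0, [28.0, 30.0, 29.0]),  # about 1.5 times the background depth
}

calls = {name: estimate_copy_number(d, bg) for name, (d, bg) in samples.items()}
print(calls)
```

A practical method would also need to account for effects such as GC content, mappability, and sampling noise at low coverage, none of which this sketch models.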
[Publication date] [Publisher] University of Michigan
[Validity level] Genetics [Subject classification]
[Keywords] Statistical genetics;Genetics;Statistics and Numeric Data;Health Sciences;Science;Biostatistics [Timeliness]