Efficient Methods for Analysis of Genome Scale Data.
[Abstract] In my dissertation I develop and evaluate gene-mapping methods that can extract useful information from large, complex datasets in which many genetic markers and outcomes are measured. The first part extends the variance components approach to incorporate repeated phenotype measurements and establishes a general formula for cost-effectiveness analysis. The second part proposes a discrete-generation framework for the coalescent model that rapidly simulates large (>100 Mb) sequences from a population under a flexible population history and allows recombination rates to vary along the genome. The third part develops a case-control association mapping strategy that uses genetic data to match individuals and accounts for unknown population structure. The fourth part describes a genome-wide map of genetic variants that influence global gene expression, integrating data on >50,000 mRNA transcript levels and >400,000 genetic markers. Using this dataset, I systematically evaluate the accuracy and power of genotype imputation with respect to different aspects of the phenotypic traits of interest and the genetic markers being tested.
[Publication date] [Publishing institution] University of Michigan
[Validity level] Linkage Analysis of Repeated Measures [Subject classification]
[Keywords] Genome-wide Association Study;Linkage Analysis of Repeated Measures;Gene Expression;Population Structure;Genotype Imputation;Expression Quantitative Trait Loci (eQTL);Ecology and Evolutionary Biology;Genetics;Mathematics;Science (General);Statistics and Numeric Data;Science;Biostatistics [Timeliness]