Prospective Identification of Long-range Transcriptional Regulatory Regions via Integrative Genomics.
[Abstract] Eukaryotic transcription is regulated by transcription factor (TF) proteins that are recruited to the proximal promoter as well as to distal genomic elements such as enhancers. In this work we present a generalizable in silico strategy to locate these functional regulatory elements, some of which may lie hundreds of kilobases from the gene, using genomic sequence, gene expression, and protein interaction data. As a case study, we examine the previously reported urogenital enhancers of the Gata2 gene from several perspectives (sequence features, annotated expression data, and protein interactions). Additionally, the ENCODE project (http://genome.gov/ENCODE) has recently reported several new observations, some of which are examined in the context of this problem. We consider four main problems:
1. Identification of upstream TF effectors via novel network inference procedures that account for the dynamic spatio-temporal context of transcriptional regulation.
2. Finding sequence preferences of known regulatory regions via motif discovery.
3. Graph mining techniques to infer structural features of TF-interaction graphs during promoter-enhancer crosstalk.
4. Heterogeneous data integration across multiple modalities to predict regulatory regions from candidate sequences.
Based on these approaches, we have predicted the locations of some novel regulatory regions for the Gata3 locus. Members of the Engel laboratory have carried out transgenic assays to validate these predictions experimentally. We believe these results demonstrate the applicability of advanced computational methodologies and principled data integration to biologically relevant hypothesis generation and validation.
[Publication date] [Publishing institution] University of Michigan
[Validity level] Prospective Enhancer Discovery [Subject classification]
[Keywords] Integrative Genomics;Prospective Enhancer Discovery;Transcriptional Regulatory Network Inference;Machine Learning for Bioinformatics;Signal Processing in Bioinformatics;Electrical Engineering;Science (General);Engineering;Science;Electrical Engineering: Systems and Bioinformatics [Timeliness]