Testing the Coding Potential of Conserved Short Genomic Sequences
[Abstract] Proposed is a procedure to test whether a genomic sequence contains coding DNA; such a sequence is called a coding potential region. The procedure tests the coding potential of conserved short genomic sequences while relaxing the assumptions on the probability models of gene structures. Thus, it is expected to add candidate regions that contain coding DNA to the current genomic database. The procedure was applied to the set of highly conserved human-mouse sequences in the genome database at the University of California at Santa Cruz. For sequences containing RefSeq coding exons, the procedure detected 91.3% of the regions in this set as having coding potential, which covers 83% of the human RefSeq coding exons, at a 2.6% false positive rate. The procedure detected 12,688 novel short regions with coding potential at a false discovery rate < 0.05; 65.7% of the novel regions are between annotated genes.
[Publication date] [Publishing institution]
[Validity level] [Subject classification] Biotechnology
[Keywords] [Timeliness]