Assembly-free genome comparison based on next-generation sequencing reads and variable length patterns
[Abstract]
Background: With the advent of next-generation sequencing (NGS) technologies, a large amount of short-read data has been generated. If a reference genome is not available, the assembly of a template sequence is usually challenging because of repeats and the short length of reads. When NGS reads cannot be mapped onto a reference genome, alignment-based methods are not applicable. However, it is still possible to study the evolutionary relationship of unassembled genomes based on NGS data.
Results: We present a parameter-free, alignment-free method, called $\underline{Under_2}$, based on variable-length patterns, for the direct comparison of sets of NGS reads. We define a similarity measure using variable-length patterns, as well as reverses and reverse-complements, along with their statistical and syntactical properties. We evaluate several alignment-free statistics on the comparison of NGS reads coming from simulated and real genomes. In almost all simulations our method $\underline{Under_2}$ outperforms all other statistics. The performance gain becomes more evident when real genomes are used.
Conclusion: The new alignment-free statistic is highly successful in discriminating related genomes based on NGS read data. In almost all experiments, it outperforms traditional alignment-free statistics that are based on fixed-length patterns.
[Publication date] 2014-09-10
[Keywords] alignment-free statistics; next-generation sequencing; pattern discovery
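
To make the idea of an alignment-free comparison concrete, the following minimal Python sketch compares two sets of reads using fixed-length patterns (k-mers), the kind of baseline the abstract contrasts with variable-length patterns. It is not the method described in the abstract: the function names, the choice of cosine similarity, and the parameter `k` are all assumptions made for illustration. Each k-mer is merged with its reverse complement, since a read may come from either strand.

```python
from collections import Counter
from math import sqrt

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmer_counts(reads, k):
    """Count k-mers over a set of reads (hypothetical helper).

    A k-mer and its reverse complement are counted under one key,
    the lexicographically smaller of the two. K-mers containing
    characters other than A, C, G, T (for example N) are skipped.
    """
    counts = Counter()
    valid = set("ACGT")
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= valid:
                counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy read sets, invented for this example.
    sample_a = ["ACGTACGTTA", "TTACGGATCA"]
    sample_b = ["ACGTACGATA", "TTACGGTTCA"]
    k = 4  # fixed pattern length; an arbitrary choice for the demo
    score = cosine_similarity(
        canonical_kmer_counts(sample_a, k),
        canonical_kmer_counts(sample_b, k),
    )
    print(f"cosine similarity (k={k}): {score:.3f}")
```

Note that a fixed-length statistic like this one requires choosing `k` in advance, whereas the abstract describes its method as parameter-free.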