已收录 268921 条政策
 政策提纲
  • 暂无提纲
SWAP-Assembler: scalable and efficient genome assembly towards thousands of cores
[摘要] BackgroundThere is a widening gap between the throughput of massive parallel sequencing machines and the ability to analyze these sequencing data. Traditional assembly methods requiring long execution time and large amount of memory on a single workstation limit their use on these massive data.ResultsThis paper presents a highly scalable assembler named as SWAP-Assembler for processing massive sequencing data using thousands of cores, where SWAP is an acronym for Small World Asynchronous Parallel model. In the paper, a mathematical description of multi-step bi-directed graph (MSG) is provided to resolve the computational interdependence on merging edges, and a highly scalable computational framework for SWAP is developed to automatically preform the parallel computation of all operations. Graph cleaning and contig extension are also included for generating contigs with high quality. Experimental results show that SWAP-Assembler scales up to 2048 cores on Yanhuang dataset using only 26 minutes, which is better than several other parallel assemblers, such as ABySS, Ray, and PASHA. Results also show that SWAP-Assembler can generate high quality contigs with good N50 size and low error rate, especially it generated the longest N50 contig sizes for Fish and Yanhuang datasets.ConclusionsIn this paper, we presented a highly scalable and efficient genome assembly software, SWAP-Assembler. Compared with several other assemblers, it showed very good performance in terms of scalability and contig quality. This software is available at: https://sourceforge.net/projects/swapassembler
[发布日期] 2014-09-10 [发布机构] 
[效力级别]  [学科分类] 
[关键词] genome assembly;parallel computing;De Bruijn graph [时效性] 
   浏览次数:4      统一登录查看全文      激活码登录查看全文