A fast read alignment method based on seed-and-vote for next generation sequencing
[Abstract] Background: Next-generation sequencing technologies, together with advances in bioinformatics, are generating a growing number of reads every day. Before further analysis, these reads must be aligned to a reference genome by read alignment tools. Although many such tools exist, most do not excel in both accuracy and speed. For example, BWA has comparatively high accuracy, but its speed leaves much to be desired, and it becomes a bottleneck as the number of reads to be aligned grows every day. We believe that the speed of read alignment tools can still be improved substantially with little to no loss in accuracy. Results: Here we implement a new read alignment tool, Fast Seed-and-Vote Aligner (FSVA), which is based on seeding and voting. FSVA achieves accuracy close to that of BWA while running at very high speed. It requires only ~10–15 CPU hours for a whole-genome read alignment, which is ~5–7 times faster than BWA. Conclusions: In some cases, reads have to be aligned in a short time. Where the accuracy requirement is not very stringent, FSVA would be a promising option. FSVA is available at https://github.com/Topwood91/FSVA
[Publication date] 2016-12-23 [Publisher]
[Validity level] [Subject classification]
[Keywords] Read alignment; Seed and vote; Hash table [Timeliness]