GTZ: a fast compression and cloud transmission tool optimized for FASTQ files
[Abstract] Background: The rapid development of DNA sequencing technology is generating truly large volumes of data, demanding more storage and bandwidth. To speed up data sharing and bring data to computing resources faster and more cheaply, it is necessary to develop a compression tool that can support efficient compression and transmission of sequencing data to cloud storage. Results: This paper presents GTZ, a compression and transmission tool optimized for FASTQ files. As a reference-free lossless FASTQ compressor, GTZ treats the different lines of a FASTQ record separately, uses adaptive context modelling to estimate their characteristic probabilities, and compresses data blocks with arithmetic coding. GTZ can also compress multiple files or directories at once. Furthermore, as a tool intended for the cloud computing era, it can either save compressed data locally or transmit it directly to the cloud, at the user's choice. We evaluated the performance of GTZ on a diverse set of FASTQ benchmarks. The results show that in most cases it outperforms many other tools in terms of compression ratio, speed, and stability. Conclusions: GTZ is a tool that enables efficient lossless FASTQ data compression and simultaneous data transmission to the cloud. It is a useful tool for NGS data storage and transmission in the cloud environment. GTZ is freely available online at: https://github.com/Genetalks/gtz.
[Publication date] 2017-12-28 [Issuing organization] 
[Level of authority]  [Subject classification] 
[Keywords] FASTQ;Compression;General-purpose;Lossless;Parallel compression and transmission;Cloud computing [Timeliness] 
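
The abstract describes treating the different lines of a FASTQ record separately and modelling each with an adaptive context model before arithmetic coding. The following is a minimal sketch of that general idea, not GTZ's actual implementation: the function names `split_fastq_streams` and `adaptive_cost_bits`, the order-2 context, and the add-one smoothing are all assumptions made for illustration. Instead of running a real arithmetic coder, it reports the ideal code length (the sum of -log2 p over all symbols), which an arithmetic coder approaches closely.

```python
import math
from collections import defaultdict


def split_fastq_streams(text):
    """Split FASTQ text into separate streams, one per line type.

    Hypothetical helper: assumes the simple four-line record layout
    (identifier, bases, '+' separator, qualities).
    """
    lines = text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    return {
        "ids": "\n".join(lines[0::4]),
        "bases": "\n".join(lines[1::4]),
        "quals": "\n".join(lines[3::4]),
    }


def adaptive_cost_bits(data, order=2):
    """Ideal coding cost, in bits, of `data` under an adaptive context model.

    Each symbol is predicted from the previous `order` symbols. Counts start
    at zero and are updated after every symbol, so a decoder could rebuild
    the same model. Assumption: the alphabet is known to both sides.
    """
    n_symbols = len(set(data))
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]
        # Add-one (Laplace) smoothing keeps unseen symbols codable.
        p = (counts[ctx][sym] + 1) / (totals[ctx] + n_symbols)
        bits -= math.log2(p)
        # Adapt the model only after the symbol has been "coded".
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


if __name__ == "__main__":
    sample = "@r1\nACGTACGT\n+\nIIIIHHHH\n@r2\nACGTTTGA\n+\nIIIIIHHH\n"
    for name, stream in split_fastq_streams(sample).items():
        cost = adaptive_cost_bits(stream)
        print(f"{name}: {len(stream)} symbols, about {cost:.1f} bits")
```

Modelling each stream on its own lets the statistics of identifiers, bases, and quality scores adapt independently, which is the motivation the abstract gives for handling the lines separately.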