Use of DFT Distance Metrics for Classification of SARS-CoV-2 Genomes
[Abstract] In this work, we investigate the use of Fourier coefficients (FCs) to capture useful information about viral sequences in a computationally efficient and compact manner. Specifically, we extract the geographic submission location from the headers of SARS-CoV-2 sequences submitted to the GISAID Initiative, calculate the corresponding FCs, and use the FCs to classify these sequences by geographic location. We show that the FCs serve as useful numerical summaries of sequences, allowing manipulation, identification, and differentiation via classical mathematical and statistical methods that are not readily applicable to character strings. Further, we argue that subsets of the FCs may be usable for the same purposes, which would reduce storage requirements. We conclude by offering extensions of the research and potential directions for subsequent analyses, such as the use of other series transforms for discretely indexed signals such as genomes.
[Subject classification] Biological Sciences (General)
[Keywords] alignment-free methods; Fourier transform; genomic sequences; supervised learning; visualization of high-dimensional data
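The pipeline summarized in the abstract (encode a sequence numerically, compute Fourier coefficients, compare sequences through those coefficients) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `ENCODING` map, the choice of the first `n_coeffs` coefficients, the use of coefficient magnitudes, and the helper names `fourier_coefficients`, `dft_distance`, and `nearest_label` are all assumptions made for the example, since the abstract does not specify them.

```python
import cmath
import math

# Hypothetical nucleotide-to-number mapping (assumption; the abstract
# does not state which numerical encoding the authors use).
ENCODING = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def fourier_coefficients(seq, n_coeffs):
    """Return the first n_coeffs DFT coefficients of the encoded sequence,
    scaled by the sequence length."""
    # Unknown symbols (e.g. N) are mapped to 0.0 in this sketch.
    signal = [ENCODING.get(base, 0.0) for base in seq.upper()]
    n = len(signal)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        coeffs.append(total / n)
    return coeffs


def dft_distance(seq_a, seq_b, n_coeffs=8):
    """Euclidean distance between the coefficient magnitudes of two sequences."""
    fa = fourier_coefficients(seq_a, n_coeffs)
    fb = fourier_coefficients(seq_b, n_coeffs)
    return math.sqrt(sum((abs(a) - abs(b)) ** 2 for a, b in zip(fa, fb)))


def nearest_label(query, labelled, n_coeffs=8):
    """Assign the label of the closest labelled sequence (1-nearest neighbour).

    labelled is a list of (sequence, label) pairs.
    """
    closest = min(labelled, key=lambda pair: dft_distance(query, pair[0], n_coeffs))
    return closest[1]
```

Keeping only the first few coefficients mirrors the abstract's point that a subset of the FCs may suffice, so each sequence is stored as a handful of numbers rather than a full character string. The sketch assumes sequences of comparable length; a full-length genome would more practically be transformed with an FFT routine than with the direct sum used here.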