Parallel likelihood calculations for phylogenetic trees
[Abstract] Phylogenetic analysis is the study of evolutionary relationships among organisms. To this end, phylogenetic trees, or evolutionary trees, are used to depict the evolutionary relationships between organisms as reconstructed from DNA sequence data. The likelihood of a given tree is commonly calculated for many purposes, including inferring phylogenies, sampling from the space of likely trees and inferring other parameters governing the evolutionary process. This is done using Felsenstein's algorithm, a widely implemented dynamic programming approach that reduces the computational complexity from exponential to linear in the number of taxa. However, with the advent of efficient modern sequencing techniques, the size of data sets is rapidly increasing beyond current computational capability. Parallel computing has been used successfully to address many similar problems and is currently receiving attention in the realm of phylogenetic analysis. Work has been done using data decomposition, where the likelihood calculation is parallelised over DNA sequence sites. We propose an alternative way of parallelising the likelihood calculation, which we call segmentation, where the tree is broken down into subtrees and the likelihood of each subtree is calculated concurrently over multiple processes. We introduce our proposed system, which aims to drastically increase the size of trees that can be practically used in phylogenetic analysis. We then evaluate the system on large phylogenies constructed from both real and synthetic data, to show that larger reductions in run time are obtained when the system is used.
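To make the two ideas in the abstract concrete, the following is a minimal sketch of Felsenstein's pruning recursion for a single site, followed by a toy version of splitting the work by subtree. It is not the system described in the thesis. The Jukes-Cantor substitution model, the nested-list tree encoding, and the names `jc69`, `partial`, `site_likelihood` and `segmented_site_likelihood` are all assumptions made for illustration, and a thread pool stands in for the multiple processes mentioned in the abstract.

```python
import math
from concurrent.futures import ThreadPoolExecutor

BASES = "ACGT"

def jc69(t):
    """4x4 Jukes-Cantor transition matrix for branch length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def edge_message(child, length, site):
    """For each parent state x, P(data below child | parent in state x)."""
    p = jc69(length)
    child_l = partial(child, site)
    return [sum(p[x][y] * child_l[y] for y in range(4)) for x in range(4)]

def partial(node, site):
    """Conditional likelihoods at a node: P(data below node | node in state x).

    A leaf is a taxon name (str); an internal node is a list of
    (child, branch_length) pairs. `site` maps taxon name -> observed base.
    """
    if isinstance(node, str):
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, length in node:
        msg = edge_message(child, length, site)
        result = [result[x] * msg[x] for x in range(4)]
    return result

def site_likelihood(tree, site):
    """Serial pruning: one post-order pass, uniform base frequencies at the root."""
    return sum(0.25 * v for v in partial(tree, site))

def segmented_site_likelihood(tree, site):
    """Toy segmentation: each subtree below the root is evaluated as its own task.

    Threads are used only to keep the sketch self-contained; they illustrate
    the independence of subtrees, not a speed-up.
    """
    with ThreadPoolExecutor() as pool:
        messages = list(pool.map(lambda edge: edge_message(edge[0], edge[1], site), tree))
    root = [math.prod(m[x] for m in messages) for x in range(4)]
    return sum(0.25 * v for v in root)

if __name__ == "__main__":
    tree = [([("a", 0.1), ("b", 0.2)], 0.05), ("c", 0.3)]
    site = {"a": "A", "b": "A", "c": "G"}
    print(site_likelihood(tree, site), segmented_site_likelihood(tree, site))
```

Each node is visited once, which is where the linear cost in the number of taxa comes from. Because the conditional likelihoods of one subtree do not depend on those of its siblings, the subtrees can be evaluated independently and combined at their common parent, which is the property a subtree-based decomposition relies on.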
[Publisher] Stellenbosch University