Parallel hierarchical N-body methods
[摘要] Recent algorithmic advances utilizing hierarchical data structures have resultedin a dramatic reduction in the time required for computer simulation of N-bodysystems with long-range interactions. Computations which required O(N^2) operationscan now be done in O(N log N) or O(N). We review these tree methodsand find that they may be distinguished based on a few simple features.The Barnes-Hut (BH) algorithm has received a great deal of attention, andis the subject of the remainder of the dissertation. We present a generalization ofthe BH tree and analyze the statistical properties of such trees in detail. We alsoconsider the expected number of operations entailed by an execution of the BHalgorithm. We find an optimal value for m, the maximum number of bodies in aterminal cell, and confirm that the number of operations is O(N log N), even ifthe distribution of bodies is not uniform.The mathematical basis of all hierarchical methods is the multipole approximation.We discuss multipole approximations, for the case of arbitrary, sphericallysymmetric, and Newtonian Green's functions. We describe methods for computingmultipoles and evaluating multipole approximations in each of these cases,emphasizing the tradeoff between generality and algorithmic complexity.N-body simulations in computational astrophysics can require 10^6 or evenmore bodies. Algorithmic advances are not sufficient, in and of themselves, tomake computations of this size feasible. Parallel computation offers, a priori, thenecessary computational power in terms of speed and memory. We show how theBH algorithm can be adapted to execute in parallel. We use orthogonal recursivebisection to partition space. The logical communication structure that emergesis that of a hypercube. A local version of the BH tree is constructed in eachprocessor by iteratively exchanging data along each edge of the logical hypercube.We obtain speedups in excess of 380 on a 512 processor system for simulations ofgalaxy mergers with 180000 bodies. We analyze the performance of the parallelversion of the algorithm and find that the overhead is due primarily to interprocessorsynchronization delays and redundant computation. Communication is nota significant factor.
[发布日期] [发布机构] University:California Institute of Technology;Department:Physics, Mathematics and Astronomy
[效力级别] [学科分类]
[关键词] Physics [时效性]