Acceleration of Algebraic Recurrences on Processors with Instruction Level Parallelism
[摘要] This report describes parallelization techniques for accelerating a broad class of recurrences on processors with instruction level parallelism. We introduce a new technique, called blocked back-substitution, which has lower operation count and higher performance than previous methods. The blocked back-substitution technique requires unrolling and non-symmetric optimization of innermost loop iteration. We present metrics to characterize the performance of software-pipelined loops and compare these metrics for a range of height reduction techniques and processor architectures.
[发布日期] [发布机构] HP Development Company
[效力级别] [学科分类] 计算机科学(综合)
[关键词] [时效性]