Program redundancy analysis and optimization to improve memory performance
[Abstract] Program redundancy analysis and optimization have long been an important component of optimizing compilers, identifying and removing redundant computations to improve application performance. This thesis targets an important class of program run-time redundancy: redundant memory operations.
First, an efficient and powerful static analysis algorithm is developed to detect redundant memory operations. Previous techniques separate the tasks of scalar and memory redundancy detection and thus fail to discover redundancies that arise from the interaction between scalar and memory values. The new algorithm unifies scalar and memory redundancy detection in an integrated value numbering process and is more powerful than the separate approach.
Once the redundant memory operations are identified, program transformations are employed to remove them. This thesis demonstrates that traditional scalar redundancy removal frameworks can be easily adapted to remove both fully static and partially static memory redundancies. Experimental results show that the analysis and optimization can remove a significant amount of redundant loads (up to 40% of dynamic loads) in benchmark applications from SPEC2000 and MediaBench.
This thesis also presents a limit study on run-time memory redundancies to evaluate the effectiveness of various known memory redundancy removal methods. The effects on register allocation are quantitatively measured. The results show that register spills due to memory redundancy removal are rare in the benchmark applications, and that spilling overhead has little negative impact on application performance.
Finally, this thesis presents detailed microarchitecture simulations to measure the performance and energy efficiency benefits of memory redundancy removal. The results show that memory redundancy analysis and optimization can reduce overall application execution cycles by up to 10%, even on wide-issue architectures. It also significantly reduces both instruction and data cache accesses. Architectural-level energy consumption simulation shows that the reduction in dynamic memory instructions and cache accesses results in up to 14.8% energy savings. The performance boost and energy savings from memory redundancy analysis and optimization are especially valuable for meeting the increasingly demanding low-power and energy consumption requirements of both high-end microprocessors and battery-powered portable devices.
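As a rough illustration of why numbering scalar and memory values together can expose more redundancy, the sketch below runs a simple local value numbering pass over one basic block. It is not the thesis's algorithm: the instruction tuples, the function name `find_redundant_loads`, and the single `mem_version` counter (which conservatively assumes every store may alias every address) are all assumptions made for this example.

```python
def find_redundant_loads(block):
    """Return indices of loads whose value is already available.

    Hypothetical instruction forms (assumed for this sketch):
      (op, dst, src1, src2)   scalar operation, e.g. ("add", "a", "x", "y")
      ("load", dst, addr)     dst = memory[addr]
      ("store", addr, value)  memory[addr] = value
    """
    var_vn = {}      # variable name -> value number
    expr_vn = {}     # expression key -> value number
    counter = 0
    mem_version = 0  # bumped by every store: any store may alias any load
    redundant = []

    def fresh():
        nonlocal counter
        counter += 1
        return counter

    def vn_of(var):
        # Variables not defined in the block get a fresh number on first use.
        if var not in var_vn:
            var_vn[var] = fresh()
        return var_vn[var]

    for i, (op, dst, *srcs) in enumerate(block):
        if op == "store":
            addr, value = dst, srcs[0]
            mem_version += 1
            # A later load of the same address sees the stored value.
            expr_vn[("load", vn_of(addr), mem_version)] = vn_of(value)
            continue
        if op == "load":
            # The key uses the address's VALUE NUMBER, not its name, so
            # scalar equivalences feed directly into load equivalences.
            key = ("load", vn_of(srcs[0]), mem_version)
        else:
            key = (op, *(vn_of(s) for s in srcs))
        if key in expr_vn:
            if op == "load":
                redundant.append(i)
        else:
            expr_vn[key] = fresh()
        var_vn[dst] = expr_vn[key]
    return redundant


example = [
    ("add", "a", "x", "y"),
    ("add", "b", "x", "y"),   # b has the same value number as a
    ("load", "r1", "a"),
    ("load", "r2", "b"),      # redundant: same address value as the load above
    ("store", "p", "r1"),
    ("load", "r3", "a"),      # kept: the store may have overwritten memory[a]
    ("load", "r4", "p"),      # redundant: the value just stored is known
]
print(find_redundant_loads(example))  # -> [3, 6]
```

A pass that numbered scalars and loads separately would see `a` and `b` as different address names and miss the redundancy at index 3. The sketch is limited to a single block and to fully redundant loads; it does not model the partially redundant case mentioned in the abstract.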
[Publication date] [Publisher] Rice University
[Validity level] science [Subject classification]
[Keywords] [Timeliness]