Cache Resource Allocation in Large-Scale Chip Multiprocessors.
[Abstract] Chip multiprocessors (CMPs) have become virtually ubiquitous due to the increasing impact of power and thermal constraints on processor design, as well as the diminishing returns of building ever more complex uniprocessors. While the number of cores on a chip has increased rapidly, changes to other aspects of system design have been slower in coming. In particular, the on-chip memory hierarchy has remained largely unchanged despite the shift to multicore designs. The last level of cache on chip is generally shared by all hardware threads on the chip, creating a ripe environment for resource allocation problems. This dissertation examines cache resource allocation in large-scale chip multiprocessors. It begins with extensive supporting research in the area of shared cache metric analysis, concluding that there is no single optimal shared cache design metric. This result supports the idea that shared caches ought not explicitly attempt to achieve optimal partitions; rather, they should react only when unfavorable performance is detected. This study is followed by studies that use machine learning analysis to extract the characteristics most salient for predicting poor shared cache performance. The culmination of this dissertation is a shared cache management framework called SLAM (Scalable, Lightweight, Adaptive Management). SLAM is a scalable and feasible mechanism that detects inefficient cache usage by the hardware threads sharing the cache.
An inefficient thread can be easily punished by effectively reducing its cache occupancy via a modified cache insertion policy. The crux of SLAM is the detection of inefficient threads, which relies on two novel performance monitors in the cache that stem from the results of the machine learning studies: the Misses Per Access Counter (MPAC) and the Relative Insertion Tracker (RIT), each of which requires only tens of bits of storage per thread. SLAM not only provides a means of extracting significant performance gains over current cache designs (up to 13.1% improvement), but also provides a means of granting differentiated quality of service to various cache sharers. Particularly as commercialized virtual servers become increasingly common, the ability to provide differentiated quality of service at low cost potentially has significant value.
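The general idea of punishing an inefficient thread through the insertion policy can be illustrated with a toy model. The sketch below is not SLAM itself: the abstract does not specify how MPAC or RIT are implemented, so the per-thread miss ratio used here, the `threshold` and `min_accesses` parameters, and the choice of inserting a flagged thread's lines at the LRU position are all assumptions made for illustration. RIT is not modeled at all.

```python
from collections import defaultdict


class SharedCacheSet:
    """Toy fully associative cache set shared by several threads.

    self.stack is an LRU stack: index 0 is MRU, the last index is LRU.
    """

    def __init__(self, ways, managed, threshold=0.9, min_accesses=16):
        self.ways = ways
        self.managed = managed            # False gives plain LRU insertion
        self.threshold = threshold        # hypothetical miss-ratio cutoff
        self.min_accesses = min_accesses  # hypothetical warm-up length
        self.stack = []                   # entries are (thread_id, tag)
        self.accesses = defaultdict(int)  # per-thread access count
        self.misses = defaultdict(int)    # per-thread miss count

    def is_inefficient(self, tid):
        # Assumed stand-in for an MPAC-style monitor: misses per access.
        n = self.accesses[tid]
        return (self.managed and n >= self.min_accesses
                and self.misses[tid] / n > self.threshold)

    def access(self, tid, tag):
        """Return True on a hit, False on a miss."""
        self.accesses[tid] += 1
        line = (tid, tag)
        if line in self.stack:
            self.stack.remove(line)
            self.stack.insert(0, line)    # promote to MRU on a hit
            return True
        self.misses[tid] += 1
        if len(self.stack) == self.ways:
            self.stack.pop()              # evict the LRU line
        if self.is_inefficient(tid):
            self.stack.append(line)       # punished: insert at LRU
        else:
            self.stack.insert(0, line)    # normal: insert at MRU
        return False


def run(managed, rounds=200):
    """Thread 0 reuses 4 lines; thread 1 streams lines it never reuses."""
    cache = SharedCacheSet(ways=8, managed=managed)
    hits_thread0 = 0
    for r in range(rounds):
        hits_thread0 += cache.access(0, r % 4)
        cache.access(1, 2 * r)
        cache.access(1, 2 * r + 1)
    return hits_thread0


if __name__ == "__main__":
    print("thread 0 hits, plain LRU:", run(managed=False))
    print("thread 0 hits, managed:  ", run(managed=True))
```

In this toy workload the streaming thread evicts the reusing thread's lines before they are reused under plain LRU, whereas in the managed run the streaming thread is flagged after its warm-up and thread 0 then hits on most of its accesses. The cutoff values are arbitrary; a real design would need to choose and validate them.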
[Publication date] [Publishing institution] University of Michigan
[Validity level] Computer Science [Subject classification]
[Keywords] Chip Multiprocessor Cache Memory Systems; Computer Science; Electrical Engineering; Engineering; Computer Science & Engineering [Timeliness]