Efficient Data Center Architectures Using Non-Volatile Memory and Reliability Techniques.
[Abstract] The cost of running a data center is increasingly dominated by energy consumption, driven by power provisioning, cooling, and server components such as processors, memories, and disk drives. Meanwhile, emerging classes of complex data center workloads place a heavier burden on processing and storage hardware, since each operation involves accesses to huge datasets. Fortunately, emerging technologies promise better performance and efficiency. Non-volatile (NV) memories used in applications such as disk caches are proven ways to save energy, and, more recently, byte-addressable persistent storage such as phase-change memory (PCM) or Memristors can serve as both main memory and permanent storage, reducing data transfers between layers of the hierarchy. Further, 3D die-stacking provides a low-energy, high-bandwidth means of connecting storage with computation hardware. The challenge lies in how to optimally combine and balance system elements when data center workload demands vary significantly. Once these elements are combined, new, inherent drawbacks such as limited memory write endurance need to be countered. Further, as processors often dominate system power consumption, they become a critical target for energy optimization. Unfortunately, current CPU architectures cannot fully exploit voltage scaling, both because of the need for safety margins and because their large caches fail at higher voltages than the logic circuits. In this thesis, we address these challenges with the following novel techniques. We propose a distributed, energy-efficient data center architecture that replaces hard disk drives and DRAM main memory with non-volatile Memristors or PCM. The system is composed of a network of uniform building blocks called Nanostores that combine processors with a permanent data store. To reduce unnecessary data movement, the DRAM and disk layers are eliminated, resulting in a flattened memory hierarchy. Because NV memories wear out with the number of data writes, we propose novel wear-leveling solutions. First, we propose distributed data center wear-leveling to address SSD-based and future Nanostore-based storage, with a 3.9x improvement in lifetime. Second, we propose server-level reliability improvements for Flash memory-based disk caches that provide 20x improvements in lifetime on average. Finally, we propose a novel on-chip cache fault tolerance scheme that allows more than a 30% improvement in energy efficiency.
[Publication date] [Publisher] University of Michigan
[Validity level] Flash Memory [Subject classification]
[Keywords] Data Center; Flash Memory; Cache; Phase-change Memory; Wear-leveling; Energy; Computer Science; Electrical Engineering; Engineering; Computer Science & Engineering [Timeliness]