Collective Memory Transfers for Multi-Core Chips
[Abstract] Future performance improvements for microprocessors have shifted from clock frequency scaling towards increases in on-chip parallelism. Performance improvements for a wide variety of parallel applications require domain decomposition of data arrays from a contiguous arrangement in memory to a tiled layout for on-chip L1 data caches and scratchpads. However, DRAM performance suffers under the non-streaming access patterns generated by many independent cores. We propose collective memory scheduling (CMS), which actively takes control of collective memory transfers so that requests arrive at the memory controller in a sequential and predictable fashion. CMS uses the hierarchically tiled arrays formalism to compactly express collective operations, which greatly improves programmability over conventional prefetch or list-DMA approaches. CMS reduces application execution time by up to 32% and DRAM read power by 2.2×, compared to a baseline DMA architecture such as STI Cell.
[Publication date] 2013-11-13
[Subject classification] Mathematics (general)
[Keywords] DRAM; access stream; stencils; memory bandwidth; collective transfers
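
The access-pattern problem described in the abstract can be illustrated with a small sketch. It is a toy model, not the CMS hardware mechanism from the paper: it assumes a row-major array split into equal tiles, one tile per core, and it uses a round-robin interleaving as a stand-in for uncoordinated request arrival. All function names here are hypothetical.

```python
def tile_row_requests(rows, cols, tile_rows, tile_cols):
    """Return {core: [(start_address, length), ...]} for a row-major array.

    Each core owns one tile and needs one contiguous segment per tile row.
    Assumes rows and cols are exact multiples of the tile dimensions.
    """
    tiles_per_row = cols // tile_cols
    requests = {}
    for tr in range(rows // tile_rows):
        for tc in range(tiles_per_row):
            core = tr * tiles_per_row + tc
            requests[core] = [
                ((tr * tile_rows + r) * cols + tc * tile_cols, tile_cols)
                for r in range(tile_rows)
            ]
    return requests


def independent_order(requests):
    """Interleave the cores' requests round-robin (assumed arrival model)."""
    queues = [segments for _, segments in sorted(requests.items())]
    depth = max(len(q) for q in queues)
    return [q[i] for i in range(depth) for q in queues if i < len(q)]


def collective_order(requests):
    """Issue all segments in ascending address order, as one stream."""
    return sorted(seg for segments in requests.values() for seg in segments)


def is_sequential(order):
    """True if every segment starts exactly where the previous one ended."""
    return all(a + n == b for (a, n), (b, _) in zip(order, order[1:]))


if __name__ == "__main__":
    reqs = tile_row_requests(rows=4, cols=4, tile_rows=2, tile_cols=2)
    print("independent:", independent_order(reqs))
    print("collective: ", collective_order(reqs))
```

In this model the interleaved stream jumps between distant addresses, while the address-ordered stream walks the array once from start to end, which is the property the abstract attributes to requests arriving "in a sequential and predictable fashion".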