Fine-grain producer-initiated communication in cache-coherent multiprocessors
[Abstract] Shared-memory multiprocessors are becoming increasingly popular as a high-performance, easy-to-program, and relatively inexpensive choice for parallel computation. However, the performance of shared-memory multiprocessors is limited by memory latency. Memory latencies are higher in multiprocessors due to physical constraints and cache coherence overheads. In addition, synchronization operations, which are necessary to ensure correctness in parallel programs, add further communication overhead in shared-memory multiprocessors. Software-controlled non-binding data prefetching is a widely used consumer-initiated mechanism to hide communication latency and is currently supported on most architectures. However, on an invalidation-based cache-coherent multiprocessor, prefetching is inapplicable or insufficient for some communication patterns such as irregular communication, fine-grain pipelined loops, and synchronization. For these cases, a combination of two fine-grain, producer-initiated primitives (referred to as remote writes) is better able to reduce the latency of communication. This work demonstrates experimentally that remote writes provide significant performance benefits in cache-coherent shared-memory multiprocessors both with and without prefetching. Further, the combination of remote writes and prefetching is able to eliminate most of the memory system overheads in our applications, except for misses due to cache conflicts.
[Publication date]  [Publishing institution] Rice University
[Level of authority] Electrical [Subject classification] 