Implementing Asynchronous Checkpoint/Restart for the Concurrent Collections Model
[Abstract] It has been claimed that what simplifies parallelism can also simplify resilience. Building on that claim, we present the Concurrent Collections (CnC) programming model as an ideal target for a simple yet powerful resilience system for parallel computations. Specifically, we claim that the same attributes that simplify reasoning about parallel applications written in CnC also simplify the implementation of a checkpoint/restart system within the CnC runtime. We define these properties of CnC in the context of a model built in K. To demonstrate how these properties simplify resilience, we have implemented a simple checkpoint/restart system within Rice’s Habanero C implementation of the CnC runtime. We show how the CnC runtime can fully encapsulate the checkpointing and restarting processes, so that application programmers gain the benefits of resilience with no effort beyond implementing the application in CnC, while avoiding the synchronization overheads present in traditional techniques.
[Publication date]  [Publishing institution] Rice University
[Effect level] Collections [Subject classification] 