Predicting dataset popularity for the CMS experiment
[Abstract] The CMS experiment at the LHC accelerator at CERN relies on its computing infrastructure to stay at the frontier of High Energy Physics, searching for new phenomena and making discoveries. Even though computing plays a significant role in physics analysis, we rarely use its data to predict the behavior of the system itself. Basic information about computing resources, user activities, and site utilization can be very useful for improving the throughput of the system and its management. In this paper, we discuss a first CMS analysis of dataset popularity based on CMS meta-data, which can be used as a model for dynamic data placement and provides the foundation of a data-driven approach for the CMS computing infrastructure.
[Publication date] [Publishing institution] Cornell University, Ithaca, NY 14850, United States^1; University of Bologna, INFN-Bologna, Italy^2; Princeton University, NJ 08544, United States^3
[Validity level] Computer Science [Subject classification] Computer Science (General)
[Keywords] Computing infrastructures; Computing resource; Data-driven approach; Dynamic data; Physics analysis; Site utilization; System behaviors; User activity [Timeliness]