Final report for "Automated Diagnosis of Large Scale Parallel Applications"
[Abstract] The work performed is part of a continuing research project, PPerfDB, headed by Dr. Karavanic. We are studying the application of experiment management techniques to the problems of gathering, storing, and using performance data, with the goal of achieving completely automated diagnosis of application and system bottlenecks. This summer we focused on incorporating heterogeneous data from a variety of tools, applications, and platforms, and on designing novel techniques for automated performance diagnosis. The Experiment Management paradigm is a useful approach for designing a tool that will automatically diagnose performance problems in large-scale parallel applications. The ability to gather, store, and use performance data collected over time from different executions and with different collection tools enables more sophisticated approaches to performance diagnosis and to performance evaluation more generally. We look forward to continuing our efforts through further development and analysis of online diagnosis using historical data, and by investigating performance data and diagnoses gathered from mixed MPI/OpenMP applications.
[Publication date] 2000-11-17 [Publishing organization] Lawrence Livermore National Laboratory
[Keywords] Management; Evaluation; Diagnosis; Performance; 99 General And Miscellaneous//Mathematics, Computing, And Information Science