Online Trace Reordering for Efficient Representation of Event Partial Orders
[摘要] Distributed and parallel applications not only have distributed state but are ofteninherently non-deterministic, making them significantly more challenging to monitor anddebug. Additionally, a significant challenge when working with distributed and parallelapplications has to do with the fundamental requirement of determining the order in whichcertain actions are performed by the application. A naive approach for ordering actionswould be to impose a single order on all actions, i.e., given any two actions or events, onemust happen before the other. A global order, however, is often misleading, e.g., two eventsin two different processes may be causally independent yet one may have occurred beforethe other. A partial order of events, therefore, serves as the fundamental data structurefor ordering events in distributed and parallel applications.Traditionally, Fidge/Mattern timestamps have been used for representing event partialorders. The size of the vector timestamp depends on the number of parallel entities (traces)in the application, e.g., processes or threads. A major limitation of Fidge/Mattern time-stamps is that the total size of timestamps does not scale for large systems with hundredsor thousands of traces. Taylor proposed an efficient offset-based scheme for representinglarge event partial orders by representing deltas between timestamps of successive events.The offset-based schemes have been shown to be significantly more space efficient whentraces that communicate the most are close to each other for generating the deltas (offsets).In Taylor’s offset-based schemes the optimal order of traces is computed offline. In thiswork we adapt the offset-based schemes to dynamically reorder traces and demonstratethat very efficient scalable representations of event partial orders can be generated in anonline setting, requiring as few as 100 bytes/event for storing partial order event data forapplications with around 1000 processes.
[发布日期] [发布机构] University of Waterloo
[效力级别] [学科分类]
[关键词] Computer Science [时效性]