Parallel Pattern Search in Large, Partial-Order Data Sets on Multi-core Systems
Monitoring and debugging distributed systems is inherently difficult. Events collected during the execution of a distributed system can help developers diagnose and fix faults. Process-time diagrams are normally used to view the relationships between events and to understand how processes interact over time. A major difficulty in analyzing these event sets is that they are usually very large. Being able to search the event data therefore lets users reach points of interest quickly and check whether patterns in the data set represent the expected behaviour of the system. Considerable research has gone into improving the search algorithm for finding event patterns in large partial-order data sets. In this thesis, we build on that work by parallelizing the search algorithm. Many computers now have more than one core or processor, so it makes sense to exploit this available computing power to speed up the algorithm. The search problem itself can be modeled as a Constraint Satisfaction Problem (CSP). We develop a simple and efficient way of generating tasks (to be executed by the cores) that guarantees that no two cores will ever repeat the same work during the search. Our approach is generic and can be applied to any CSP with a large domain space. We also implement an efficient dynamic work-stealing strategy that keeps the cores busy throughout the execution of the parallel algorithm. We evaluate the efficiency and scalability of our algorithm through experiments and show that it achieves efficiencies of up to 80% on a 24-core machine.
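The idea of generating tasks that never overlap can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a toy CSP with binary constraints, and the names make_tasks, solve_subtree, and parallel_solve are hypothetical. Each task fixes the first few variables to one distinct combination of values, so every task searches a disjoint subtree. The shared queue of a thread pool stands in for dynamic load balancing here; it is not a work-stealing scheduler.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def make_tasks(domains, depth):
    """Fix the first `depth` variables to every combination of their values.

    Each prefix is distinct, so the subtrees below them never overlap.
    """
    return [list(prefix) for prefix in product(*domains[:depth])]


def consistent(assignment, constraints):
    """Check every binary constraint whose two variables are both assigned."""
    n = len(assignment)
    return all(check(assignment[i], assignment[j])
               for (i, j), check in constraints.items()
               if i < n and j < n)


def solve_subtree(prefix, domains, constraints):
    """Plain backtracking search restricted to assignments extending `prefix`."""
    if not consistent(prefix, constraints):
        return []
    if len(prefix) == len(domains):
        return [tuple(prefix)]
    solutions = []
    for value in domains[len(prefix)]:
        solutions.extend(solve_subtree(prefix + [value], domains, constraints))
    return solutions


def parallel_solve(domains, constraints, depth=1, workers=4):
    """Run one task per prefix; idle workers pull the next task from the pool.

    Threads are used only for illustration; CPU-bound Python code would need
    processes to obtain a real speedup.
    """
    tasks = make_tasks(domains, depth)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda t: solve_subtree(t, domains, constraints), tasks)
        return sorted(s for part in parts for s in part)


# Toy problem (an assumption for illustration): x0 < x1 < x2, values 0..3.
domains = [list(range(4)) for _ in range(3)]
constraints = {
    (0, 1): lambda a, b: a < b,
    (1, 2): lambda a, b: a < b,
}
solutions = parallel_solve(domains, constraints, depth=2)
```

Raising depth produces more, smaller tasks, which helps balance load at the cost of more scheduling overhead. Whatever the depth, the union of the task results is the full solution set and no assignment is examined by two tasks.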