µProfiler: A Concurrent Profiler for Concurrent C++ (µC++)
Gidzinski, Justyna Jay
MetadataShow full item record
A concurrent program, unlike a sequential program, has multiple threads of execution, resulting in numerous advantages (e.g., faster execution), but also in complex and unpredictable interaction. As a consequence, a concurrent program can easily underutilize available parallelism, and performance can be extremely difficult for users to predict and analyze on their own. A profiler is a tool that can help a user identify as well as locate potential performance problems in a program. Profiling is accomplished through monitoring of the program execution, and analyzing and visualizing the collected performance data. A profiler must display useful information in a way that allows a user to effectively and efficiently understand and analyze a program's behaviour. This thesis describes the advancement in design and implementation of µProfiler, a profiler for sequential and concurrent programs written in µC++. µC++ is a concurrent dialect of the C++ programming language, which executes in uni-processor and multi-processor shared-memory environments. Major advancements to three µProfiler metrics are presented: the Execution State, the Exact Routine Call-Graph and the Statistical Routine Call-Graph. The Execution State metric charts each state for every thread over the entire execution of the program. With high overhead and perfect accuracy, the Exact Routine Call-Graph metric provides an exact call-graph profile of the program's dynamic execution, describing the control flow among routines. With low overhead and less accuracy, the Statistical Routine Call-Graph metric provides a statistical call-graph profile of the program's dynamic execution. For each metric, advancements were made throughout the profiling process (i.e., monitoring, analysis and visualization), addressing goals such as scalability, functionality, usability and performance. The metrics provide reasonable memory overhead and, based on the comparison to related work, are state-of-the-art in functionality and provide similar run-time performance.