Profiling Concurrent Programs Using Hardware Counters
MetadataShow full item record
Concurrency is a programming tool that is widely used in applications. Concurrent user-level threads can be used to structure the execution of a program in a uniprocessor environment and/or speed up its execution in a multiprocessor setting. Unfortunately, threads may interact with each other in unpredictable ways, often leading to performance problems that are nonexistent in the sequential domain. <br /><br /> A profiler can be used to help locate performance problems in sequential and concurrent programs. A profiler is a tool that monitors, analyzes, and visualizes the execution performance of a program to help users verify its expected behaviour, and locate its bottlenecks and hotspots. One of the important tools a profiler has at its disposal is a set of hardware counters, which are specialized CPU registers that count the occurrences of hardware events as a program executes. Hardware-event counts provide extremely precise insight into the execution behaviour of a program, and can be used to pinpoint portions of code where performance is suboptimal. <br /><br /> This thesis describes the design and implementation of <em>µ</em>Profiler, which is a profiler for sequential and concurrent programs written in a concurrent dialect of the C++ programming language called <em>µ</em>C++. <em>µ</em>C++ offers user-level concurrency in a uniprocessor or multiprocessor shared-memory environment. A new architecture-abstraction layer is developed, which allows <em>µ</em>Profiler to access hardware counters on multiple CPU types. As well, two new profiling metrics are presented, which use the architecture-abstraction layer to gather hardware-event counts for <em>µ</em>C++ programs. These metrics offer performance information about <em>µ</em>C++ programs that is unavailable by any other means.