Combinatorial Problems in Compiler Optimization
Beg, Mirza Omer
MetadataShow full item record
Several important compiler optimizations such as instruction scheduling and register allocation are fundamentally hard and are usually solved using heuristics or approximate solutions. In contrast, this thesis examines optimal solutions to three combinatorial problems in compiler optimization. The first problem addresses instruction scheduling for clustered architectures, popular in embedded systems. Given a set of instructions the optimal solution gives the best possible schedule for a given clustered architectural model. The problem is solved using a decomposition technique applied to constraint programming which determines the spatial and temporal schedule using an integrated approach. The experiments show that our solver can tradeoff some compile time efficiency to solve most instances in standard benchmarks giving significant performance improvements. The second problem addresses instruction selection in the compiler code generation phase. Given the intermediate representation of code the optimal solution determines the sequence of equivalent machine instructions as it optimizes for code size. This thesis shows that a large number of benchmark instances can be solved optimally using constraint programming techniques. The third problem addressed is the placement of data in memory for efficient cache utilization. Using the data access patterns of a given program, our algorithm determines a placement to reorganize data in memory which would result in fewer cache misses. By focusing on graph theoretic placement techniques it is shown that there exist, in special cases, efficient and optimal algorithms for data placement that significantly improve cache utilization. We also propose heuristic solutions for solving larger instances for which provably optimal solutions cannot be determined using polynomial time algorithms. We demonstrate that cache hit rates can be significantly improved by using profiling techniques over a wide range of benchmarks and cache configurations.