Improving the performance of concurrent sorts in database systems
Authors
Zhang, Weiye
Publisher
University of Waterloo
Abstract
Most research on sorting has focused on improving the performance of a single sort. This thesis focuses on improving overall system throughput when multiple sorts (or other operations) are running concurrently, competing for the same resources. This is the normal environment in a database system.
A dynamic memory adjustment technique is proposed for external merge sort. It adjusts sort space at run time in response to actual input size and available memory space, and it balances memory allocation among concurrent sorts so that more sort jobs are done entirely in main memory. This significantly increases system throughput and reduces average response time.
Several read-ahead strategies that reduce disk seeks during merging are studied. Three strategies, called equal buffering, simple clustering, and clustering with atomic reads, effectively reduce disk seeks. The last two exploit existing order in the input data much better than the first. A set of formulas is derived for estimating the performance improvement resulting from these read-ahead strategies. They provide close estimates for uniformly distributed random data.
The amount of data transferred between main memory and disk is determined by the merge pattern, i.e., the order in which runs are merged. For the case when the sort space remains fixed throughout the merge phase, we derive formulas for calculating the optimum merge cost and provide methods for choosing the best merge width and buffer size. For the case when the sort space is adjustable between merge steps, four merge strategies are proposed and studied. Two are found promising for practical use.
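To illustrate how the merge pattern determines the amount of data transferred, the following minimal sketch computes the transfer cost of one classical pattern for a fixed merge width: repeatedly merge the shortest runs, padding with zero-length dummy runs so that every step merges exactly `width` runs. This is the textbook shortest-runs-first construction, not the formulas derived in the thesis. The function name `merge_cost`, its parameters, and the cost unit (records passed through a merge step, ignoring seeks and buffer sizes) are assumptions made for illustration.

```python
import heapq


def merge_cost(run_lengths, width):
    """Total records moved when always merging the `width` shortest runs.

    Hypothetical helper: cost is the sum, over all merge steps, of the
    length of the run each step produces. Seeks and buffering are ignored.
    """
    if width < 2:
        raise ValueError("merge width must be at least 2")
    heap = list(run_lengths)
    if len(heap) <= 1:
        return 0  # nothing to merge
    # Pad with zero-length dummy runs so every step merges exactly `width` runs.
    while (len(heap) - 1) % (width - 1) != 0:
        heap.append(0)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        # Merge the `width` shortest runs into one new run.
        merged = sum(heapq.heappop(heap) for _ in range(width))
        total += merged
        heapq.heappush(heap, merged)
    return total


if __name__ == "__main__":
    runs = [10, 20, 30, 40]
    for width in (2, 3, 4):
        print(width, merge_cost(runs, width))
```

Under this simplified cost model, a wider merge never moves more data, but in practice a wider merge leaves less buffer space per run, which is the trade-off between merge width and buffer size that the abstract refers to.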