Towards Automatic Initial Buffer Configuration
Buffer pools are blocks of memory used in database systems to retain frequently referenced pages. Configuring the buffer pools is a difficult, manual task that involves determining the amount of memory to devote to the buffer pools, the number of buffer pools to use, their sizes, and the database objects assigned to each buffer pool. A good buffer configuration improves query response times and system throughput by reducing the number of disk accesses. Determining a good buffer configuration requires knowledge of the database workload. Empirical studies have shown that optimizing the initial buffer configuration (determined at database design time) can improve system throughput. A good initial configuration can also provide faster convergence towards a favorable dynamic buffer allocation. Previous studies have not considered automating the buffer pool configuration process. This thesis presents two techniques that facilitate the initial buffer configuration task. First, we develop an analytic model of the GCLOCK buffer replacement policy that can be used to evaluate the effectiveness of a particular buffer configuration for a given workload. Second, to obtain the necessary model parameters, we propose a workload characterization scheme that extracts workload parameters describing the query reference patterns from the query access plans. In addition, we extend an existing multifractal model and present a multifractal skew model to represent query access skew. Our buffer model has been validated against measurements of the buffer manager of a commercial database system. The model has also been compared to an alternative GCLOCK buffer model. Our results show that our proposed model closely predicts the actual physical read rates and recognizes favorable buffer configurations. This work provides a foundation for the development of an automated buffer configuration tool.
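For readers unfamiliar with GCLOCK, the policy can be illustrated with a small simulation that counts physical reads for a page reference trace. This is only a sketch of the general policy, not the analytic model developed in the thesis. The class name, the parameter names, and the choice to reset (rather than increment) the counter on a hit are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional


class GClockBuffer:
    """Toy GCLOCK buffer pool: one reference counter per frame.

    Assumed variant: a hit resets the frame's counter to `hit_count`;
    a newly loaded page starts at `initial_count`.
    """

    def __init__(self, num_frames: int, initial_count: int = 1, hit_count: int = 1):
        if num_frames < 1:
            raise ValueError("num_frames must be at least 1")
        self.frames: List[Optional[int]] = [None] * num_frames  # page id per frame
        self.counts: List[int] = [0] * num_frames                # reference counters
        self.where: Dict[int, int] = {}                          # page id -> frame index
        self.hand = 0                                            # clock hand position
        self.initial_count = initial_count
        self.hit_count = hit_count
        self.physical_reads = 0

    def access(self, page: int) -> bool:
        """Reference a page; return True on a buffer hit, False on a miss."""
        slot = self.where.get(page)
        if slot is not None:
            self.counts[slot] = self.hit_count
            return True

        # Miss: the page must be read from disk.
        self.physical_reads += 1

        # Sweep the clock hand, decrementing counters, until a frame
        # with counter 0 is found. Empty frames start at 0, so they
        # are filled before any resident page is evicted.
        while self.counts[self.hand] > 0:
            self.counts[self.hand] -= 1
            self.hand = (self.hand + 1) % len(self.frames)

        victim = self.frames[self.hand]
        if victim is not None:
            del self.where[victim]

        self.frames[self.hand] = page
        self.counts[self.hand] = self.initial_count
        self.where[page] = self.hand
        self.hand = (self.hand + 1) % len(self.frames)
        return False


def count_physical_reads(trace: Iterable[int], num_frames: int) -> int:
    """Replay a page reference trace and return the number of physical reads."""
    pool = GClockBuffer(num_frames)
    for page in trace:
        pool.access(page)
    return pool.physical_reads


if __name__ == "__main__":
    trace = [1, 2, 1, 3, 1]
    for size in (2, 3):
        print(size, "frames:", count_physical_reads(trace, size), "physical reads")
```

Replaying the same trace against different pool sizes, as in the usage lines at the bottom, gives a crude empirical counterpart to what an analytic model would estimate without running the workload.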