|dc.description.abstract||Cloud computing has gained considerable popularity in recent years. In this paradigm, an organization, referred to as a subscriber, acquires resources from an infrastructure provider to deploy its applications and pays for these resources on a pay-as-you-go basis. Typically, an infrastructure provider charges a subscriber based on resource level and duration of usage. From the subscriber's perspective, it is desirable to acquire enough capacity to provide an acceptable quality of service while minimizing the cost. A key indicator of quality of service is response time. In this thesis, we use performance models based on queueing theory to determine the required capacity to meet a performance target given by Pr[response time ≤ x] ≥ β.
We first consider the case where resources are obtained from an infrastructure provider for a time period of one hour. This is compatible with the pricing policy of major infrastructure providers where instance usage is charged on an hourly basis. Over such a time period, web application traffic exhibits time-varying behavior. A conventional traffic model such as Poisson process does not capture this characteristic. The Markov-modulated Poisson process (MMPP), on the other hand, is capable of modeling such behavior. In our investigation of MMPP as a traffic model, an available workload generator is extended to produce a synthetic trace of job arrivals with a controlled level of time-variation, and an MMPP is fitted to the synthetic trace. The effectiveness of MMPP is evaluated by comparing the performance results through simulation, using as input the synthetic trace and job arrivals generated by the fitted MMPP.
Queueing models with MMPP arrival process are then developed to determine the required capacity to meet a performance target over a one-hour time interval. Specifically, results on response time distribution are used in an optimization to obtain estimates of the required capacity. Two models are of interest to our investigation: a single-server model and a two-stage tandem queue. For both models, it is assumed that service time is represented by a phase-type (PH) distribution and queueing discipline is FCFS. The single-server model is therefore the MMPP/PH/1 (FCFS) model. Analytic results for time-dependent response time distribution of this model are first obtained. Computation of numerical results, however, is very costly. Through numerical examples, it is found that steady-state results are a good approximation for a time interval of one hour; the computation requirement is also significantly lower. Steady-state results are then used to determine the required capacity. The effectiveness of this model in terms of predicting the required capacity to meet the performance target is evaluated using an experimental system based on the TPC-W benchmark. Results on the impact of MMPP parameters on the required capacity are also presented. The second model is a two-stage tandem queue. The accuracy of the required capacity obtained via steady-state analysis is also evaluated using the TPC-W benchmark.
We next consider the case where the infrastructure provider uses a time unit (TU) of less than one hour for charging of resource usage. We focus on scenarios where TU is comparable to the average sojourn time in an MMPP state. A one-hour operation interval is divided into a number of service intervals, each having the length one TU. At the beginning of each service interval, an estimate of the arrival rate is used as input to the M/PH/1 (FCFS) model to determine the required capacity to meet the performance target over the upcoming service interval; three heuristic algorithms are developed to estimate the arrival rate. The merit of this strategy, in terms of meeting the performance target over the operation interval and savings in capacity when compared to that determined by the single-server model, is investigated using the TPC-W benchmark.||en