
Modeling Management Metrics for Monitoring Software Systems

Date

2011-09-30T19:22:50Z

Authors

Jiang, Miao

Publisher

University of Waterloo

Abstract

Software systems are growing rapidly in size and complexity, and are becoming increasingly difficult and expensive for human operators alone to maintain. These systems are expected to be highly available, and failures in them are expensive. To meet availability and performance requirements within budget, automated and efficient approaches to system monitoring are highly desirable. Autonomic computing is an effort in this direction: it promises systems that monitor themselves, relieving human administrators of the burden of detailed operational oversight. In particular, one solution is to develop automated monitoring systems that continuously collect monitoring data from target systems, analyze the data, detect errors, and diagnose faults automatically. In this dissertation, we survey existing work based on management metrics and describe the features these solutions have in common. Based on observations of their advantages and drawbacks, we present a general solution framework consisting of four separate steps: metric modeling, system-health signature generation, system-state checking, and fault localization. Within this framework, we present two specific solutions for error detection and fault diagnosis, one based on improved linear-regression modeling and the other based on summarizing the system state with an information-theoretic measurement. We evaluate our monitoring solutions with fault-injection experiments on a J2EE benchmark and show that they are effective and efficient.
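The four framework steps named in the abstract can be illustrated with a small Python sketch. This is not the dissertation's algorithm, which this record does not describe. It assumes, purely for illustration, that metric modeling means fitting a pairwise linear regression between metrics on fault-free data, that the health signature is the set of regressions a new sample violates, and that fault localization ranks metrics by how many violated regressions they take part in. The metric names, the `tolerance` factor, and all function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b, max absolute residual)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return None  # constant metric: no usable linear relation
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    resid = max(abs(y - (a * x + b)) for x, y in zip(xs, ys))
    return a, b, resid


def train(history, tolerance=3.0, floor=1e-6):
    """Step 1 (metric modeling): one regression per metric pair.

    history maps a metric name to samples from fault-free operation.
    The allowed deviation is an assumed multiple of the worst training residual.
    """
    models = {}
    for m1, m2 in combinations(sorted(history), 2):
        fit = fit_line(history[m1], history[m2])
        if fit is not None:
            a, b, resid = fit
            models[(m1, m2)] = (a, b, tolerance * max(resid, floor))
    return models


def check(models, sample):
    """Steps 2 and 3 (signature and state checking): list the violated models."""
    return [
        (m1, m2)
        for (m1, m2), (a, b, limit) in models.items()
        if abs(sample[m2] - (a * sample[m1] + b)) > limit
    ]


def localize(violated):
    """Step 4 (fault localization): rank metrics by number of violated models."""
    counts = {}
    for pair in violated:
        for metric in pair:
            counts[metric] = counts.get(metric, 0) + 1
    return sorted(counts, key=lambda m: (-counts[m], m))
```

An empty result from `check` is read here as a healthy state. When it is not empty, the first entry returned by `localize` is the metric implicated most often, which is only a starting point for diagnosis.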

Keywords

Computer systems, System monitoring
