Rough Set Based Rule Evaluations and Their Applications
Date
2007-03-06T15:47:15Z
Authors
Li, Jiye
Publisher
University of Waterloo
Abstract
Knowledge discovery is an important process in data analysis, data
mining and machine learning. Typically knowledge is presented in the
form of rules. However, knowledge discovery systems often generate a
huge number of rules. One of the challenges we face is how to
automatically discover interesting and meaningful knowledge from
such discovered rules. It is infeasible for human beings to select
important and interesting rules manually. Our focus is therefore on
providing measures that evaluate the quality of rules in order to
facilitate the understanding of data mining results. In this
thesis, we present a series of rule evaluation techniques for the
purpose of facilitating the knowledge understanding process. These
evaluation techniques help not only to reduce the number of rules,
but also to extract higher quality rules. Empirical studies on both
artificial data sets and real world data sets demonstrate how such
techniques can contribute to practical systems such as those for
medical diagnosis and web personalization.
In the first part of this thesis, we discuss several rule evaluation
techniques proposed for rule postprocessing. We show
how properly defined rule templates can be used as a rule evaluation
approach. We propose two rough set based measures, a Rule Importance
Measure, and a Rules-As-Attributes Measure,
to rank the important and interesting rules. In the second part of
this thesis, we show how data preprocessing can help with rule
evaluation. Because well-preprocessed data is essential for
generating important rules, we propose a new approach to processing
missing attribute values in order to enhance the generated rules. In the
third part of this thesis, a rough set based rule evaluation system
is demonstrated to show the effectiveness of the measures proposed
in this thesis. Furthermore, a new user-centric web personalization
system is used as a case study to demonstrate how the proposed
evaluation measures can be used in an actual application.
Keywords
Data Mining, Rough Set, Rule Evaluations, Personalization