High-order pattern discovery and analysis of discrete-valued data sets

dc.contributor.author: Wang, Yang [en]
dc.date.accessioned: 2006-07-28T19:55:56Z
dc.date.available: 2006-07-28T19:55:56Z
dc.date.issued: 1997 [en]
dc.date.submitted: 1997 [en]
dc.description.abstract: Automatic pattern discovery from data collections and the analysis of the patterns for useful information are common and important in both science and engineering today. This discovery is especially demanding in challenging industrial and business applications where the explosive volume of data makes manual analysis virtually impossible. The problems of pattern discovery and analysis that this research addresses include: 1) the discovery of polythetic patterns; 2) the discovery of patterns in the presence of noise and uncertainties; 3) a schema for representing patterns of different orders; 4) an inference process for flexible pattern prediction; and 5) the application of pattern discovery to large database analysis and data mining. This thesis presents the design and development of a system for pattern discovery and analysis of categorical or discrete-valued data. The system starts by detecting event association patterns of different orders and provides a probabilistic inference mechanism to achieve flexible classification and prediction. Here a pattern is defined as a significant event association in a problem domain. To detect significant event associations, statistical residual analysis is used. The insights gained from the analysis of event associations of different orders and the properties of the residuals lead to a general pattern discovery paradigm, which detects patterns according to the deviations of the observed patterns from a default model. Along with the paradigm, techniques are developed to avoid exhaustive search when discovering high-order patterns in a large data set. An attributed hypergraph is proposed to represent and operate on the discovered patterns, which can be of different orders. The pattern discovery process can be viewed as a hypergraph generation process. The attributed hypergraph acts as a bridge linking the pattern discovery process with the inference process.
For pattern analysis and inference, a generalized reasoning process based on the weight of evidence is introduced. With this paradigm, flexible prediction becomes possible. This thesis also covers the implementation of the major ideas outlined in the pattern discovery framework in an integrated software system. It ends with a discussion of the experimental results of pattern discovery and analysis on data obtained from various sources, including synthetic and real-world data. Compared with existing systems, the methodology presented in this thesis offers significant advantages in both pattern discovery and pattern analysis. [en]
dc.format: application/pdf [en]
dc.format.extent: 10286272 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10012/201
dc.language.iso: en [en]
dc.pending: false [en]
dc.publisher: University of Waterloo [en]
dc.rights: Copyright: 1997, Wang, Yang. All rights reserved. [en]
dc.subject: Harvested from Collections Canada [en]
dc.title: High-order pattern discovery and analysis of discrete-valued data sets [en]
dc.type: Doctoral Thesis [en]
uws-etd.degree: Ph.D. [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: nq22245.pdf
Size: 7.7 MB
Format: Adobe Portable Document Format