Spectral Ranking and Unsupervised Feature Selection for Point, Collective and Contextual Anomaly Detection

dc.contributor.author: Zhang, Haofan
dc.date.accessioned: 2014-09-03T13:37:49Z
dc.date.available: 2014-09-03T13:37:49Z
dc.date.issued: 2014-09-03
dc.date.submitted: 2014-08-28
dc.description.abstract: Anomaly detection problems can be classified into three categories: point anomaly detection, collective anomaly detection and contextual anomaly detection. Many algorithms have been devised to address anomaly detection of a specific type in various application domains. Nevertheless, in an unsupervised setting the exact type of anomalies to be detected is generally unknown in practice, and most methods in the literature favor one kind of anomaly over the others. Applying an algorithm with an incorrect assumption is unlikely to produce reasonable results. This thesis therefore investigates the possibility of applying a uniform approach that can automatically discover different kinds of anomalies. Specifically, we are primarily interested in Spectral Ranking for Anomalies (SRA) for its potential to detect point anomalies and collective anomalies simultaneously. We show that the spectral optimization in SRA can be viewed as a relaxation of an unsupervised SVM problem under some assumptions. SRA thereby yields a bi-class classification strength measure that can be used to rank point anomalies, along with a normal vs. abnormal classification for identifying collective anomalies. However, for contextual anomaly problems in which different contexts are defined by different feature subsets, SRA and other popular methods are still not sufficient on their own. Accordingly, we propose an unsupervised backward elimination feature selection algorithm, BAHSIC-AD, which uses the Hilbert-Schmidt Independence Criterion (HSIC) to identify data instances that appear as anomalies in the subset of features that are strongly dependent on each other. Finally, we demonstrate the effectiveness of SRA combined with BAHSIC-AD by comparing its performance with other popular anomaly detection methods on a few benchmarks, including both synthetic and real-world datasets. Our computational results support the claim that, in practice, SRA combined with BAHSIC-AD can be a generally applicable method for detecting different kinds of anomalies.
dc.identifier.uri: http://hdl.handle.net/10012/8763
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject.program: Computer Science
dc.title: Spectral Ranking and Unsupervised Feature Selection for Point, Collective and Contextual Anomaly Detection
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: School of Computer Science
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name:
Zhang_Haofan.pdf
Size:
3.02 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
2.67 KB
Format:
Item-specific license agreed upon to submission
Description: