A Theoretical Study of Clusterability and Clustering Quality

dc.contributor.authorAckerman, Margareta
dc.date.accessioned2008-01-16T15:36:19Z
dc.date.available2008-01-16T15:36:19Z
dc.date.issued2008-01-16T15:36:19Z
dc.date.submitted2007
dc.description.abstractClustering is a widely used technique, with applications ranging from data mining, bioinformatics and image analysis to marketing, psychology, and city planning. Despite the practical importance of clustering, there is very limited theoretical analysis of the topic. We make a step towards building theoretical foundations for clustering by carrying out an abstract analysis of two central concepts in clustering; clusterability and clustering quality. We compare a number of notions of clusterability found in the literature. While all these notions attempt to measure the same property, and all appear to be reasonable, we show that they are pairwise inconsistent. In addition, we give the first computational complexity analysis of a few notions of clusterability. In the second part of the thesis, we discuss how the quality of a given clustering can be defined (and measured). Users often need to compare the quality of clusterings obtained by different methods. Perhaps more importantly, users need to determine whether a given clustering is sufficiently good for being used in further data mining analysis. We analyze what a measure of clustering quality should look like. We do that by introducing a set of requirements (`axioms') of clustering quality measures. We propose a number of clustering quality measures that satisfy these requirements.en
dc.identifier.urihttp://hdl.handle.net/10012/3478
dc.language.isoenen
dc.pendingfalseen
dc.publisherUniversity of Waterlooen
dc.subjectclusteringen
dc.subjectclustering qualityen
dc.subjectclusterabilityen
dc.subjectdata miningen
dc.subject.programComputer Scienceen
dc.titleA Theoretical Study of Clusterability and Clustering Qualityen
dc.typeMaster Thesisen
uws-etd.degreeMaster of Mathematicsen
uws-etd.degree.departmentSchool of Computer Scienceen
uws.peerReviewStatusUnrevieweden
uws.scholarLevelGraduateen
uws.typeOfResourceTexten

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
thesis.pdf
Size:
415.6 KB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
256 B
Format:
Item-specific license agreed upon to submission
Description: