dc.contributor.author | Feng, Guoyao | |
dc.date.accessioned | 2016-08-10 18:13:37 (GMT) | |
dc.date.available | 2016-08-10 18:13:37 (GMT) | |
dc.date.issued | 2016-08-10 | |
dc.date.submitted | 2016-08-02 | |
dc.identifier.uri | http://hdl.handle.net/10012/10620 | |
dc.description.abstract | In this thesis we present SIRUM: a system for Scalable Informative RUle Mining from multi-dimensional data. Informative rules have recently been studied in several contexts, including data summarization, data cube exploration and data quality. The objective is to produce a concise set of rules (patterns) over the values of the dimension attributes that provide the most information about the distribution of a numeric measure attribute. SIRUM optimizes this task for big, wide and distributed datasets. We implemented SIRUM in Spark and observed significant performance improvements on real data due to our optimizations. | en |
dc.language.iso | en | en |
dc.publisher | University of Waterloo | en |
dc.subject | Informative Rule Mining | en |
dc.subject | Scalable Data Processing Systems | en |
dc.title | Scalable Informative Rule Mining | en |
dc.type | Master Thesis | en |
dc.pending | false | |
uws-etd.degree.department | David R. Cheriton School of Computer Science | en |
uws-etd.degree.discipline | Computer Science | en |
uws-etd.degree.grantor | University of Waterloo | en |
uws-etd.degree | Master of Mathematics | en |
uws.contributor.advisor | Golab, Lukasz | |
uws.contributor.advisor | Keshav, Srinivasan | |
uws.contributor.affiliation1 | Faculty of Mathematics | en |
uws.published.city | Waterloo | en |
uws.published.country | Canada | en |
uws.published.province | Ontario | en |
uws.typeOfResource | Text | en |
uws.peerReviewStatus | Unreviewed | en |
uws.scholarLevel | Graduate | en |