Scalable Informative Rule Mining

Date

2016-08-10

Authors

Feng, Guoyao

Advisor

Golab, Lukasz
Keshav, Srinivasan

Publisher

University of Waterloo

Abstract

In this thesis we present SIRUM: a system for Scalable Informative RUle Mining from multi-dimensional data. Informative rules have recently been studied in several contexts, including data summarization, data cube exploration and data quality. The objective is to produce a concise set of rules (patterns) over the values of the dimension attributes that provide the most information about the distribution of a numeric measure attribute. SIRUM optimizes this task for big, wide and distributed datasets. We implemented SIRUM in Spark and observed significant performance improvements on real data due to our optimizations.

Keywords

Informative Rule Mining, Scalable Data Processing Systems
