Show simple item record

dc.contributor.authorTavares, MariaCristina 17:57:15 (GMT) 17:57:15 (GMT)
dc.description.abstractThe massive amount of current data has led to many different forms of data analysis processes that aim to explore this data to uncover valuable insights such as trends, anomalies and patterns. These processes support decision makers in their analysis of varied and changing data ranging from financial transactions to customer interactions and social network postings. These data analysis processes use a wide variety of methods, including machine learning, in several domains such as business, finance, health and smart cities. Several data analysis processes have been proposed by academia and industry, including CRISP-DM and SEMMA, to describe the phases that data analysis experts go through when solving their problems. Specifically, CRISP-DM has modeling as one of its phases, which involves selecting a modeling technique, generating a test design, building a model, and assessing the model. However, automating these data analysis modeling processes faces numerous challenges, from a software engineering perspective. First, software users expect increased flexibility from the software as to the possible variations in techniques, types of data, and parameter settings. The software is required to accommodate complex usage and deployment variations, which are difficult for non-experts. Second, variability in functionality or quality attributes increases the complexity of these systems and makes them harder to design and implement. There is a lack of a framework design that takes variability into account. Third, the lack of a more comprehensive analysis of variability makes it difficult to evaluate opportunities for automating data analysis modeling. This thesis proposes a variability-aware design approach to the data analysis modeling process. The approach involves: (i) the assessment of the variabilities inherent in CRISP-DM data analysis modeling and the provision of feature models that represent these variabilities; (ii) the definition of a preliminary framework design that captures the identified variabilities; and (iii) evaluation of the framework design in terms of possibilities of automation. Overall, this work presents, to the best of our knowledge, the first approach based on variability assessment to design data modeling process such as CRISP-DM. The approach advances the state of the art by offering a variability-aware design a solution that can enhance system flexibility and a novel software design framework to support data analysis modeling.en
dc.publisherUniversity of Waterlooen
dc.subjectData analysis modelingen
dc.subjectVariability analysisen
dc.subjectFeature modelsen
dc.subjectObject-oriented frameworken
dc.titleA Variability-Aware Design Approach to the Data Analysis Modeling Processen
dc.typeMaster Thesisen
dc.pendingfalse R. Cheriton School of Computer Scienceen Scienceen of Waterlooen
uws-etd.degreeMaster of Mathematicsen
uws.contributor.advisorAlencar, Paulo
uws.contributor.advisorCowan, Donald
uws.contributor.advisorBerry, Daniel
uws.contributor.affiliation1Faculty of Mathematicsen

Files in this item


This item appears in the following Collection(s)

Show simple item record


University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages