
dc.contributor.author: Sun, Yunjia
dc.date.accessioned: 2016-04-26 15:00:27 (GMT)
dc.date.available: 2016-08-25 04:50:20 (GMT)
dc.date.issued: 2016-04-26
dc.date.submitted: 2016-04-18
dc.identifier.uri: http://hdl.handle.net/10012/10396
dc.description.abstract: This thesis focuses on visualizations for machine learning tasks. Specifically, we create a taxonomy of existing machine learning visualizations and design a system to help machine learning novices perform labelling tasks. Many mature visualizations help people understand the performance of classifiers, including scatterplots, confusion matrices and ROC curves. However, most machine learning researchers are unaware of the visualization possibilities that exist, and many published visualizations are too task-oriented or dataset-oriented to be easily applied to other tasks. This thesis defines a taxonomy for machine learning visualizations along three dimensions: the data displayed, the advanced features to add for a specific task, and the goal of the visualization. This taxonomy seeks to help machine learning researchers select a better visualization method to analyze their data. Previous machine learning tools focus on presenting comprehensive information to experts, treating machine learning as a black box for end-users, or explaining the reason behind a prediction in a simple and clear way. However, to build a machine learning system, one needs to label data first, and many machine learning novices want to build a classifier themselves simply by labelling data. This inspired us to design and implement the Label-and-Learn system, which includes five visualizations that help users better understand their data and the likelihood of the classifier's success, and that improve their user experience. To evaluate the utility of the Label-and-Learn system, we ran user studies comparing the visualization system with a traditional system on the quality of the labels, the user's mental model of the task, and the user experience. The results show that the visualizations have no negative effect on the quality of the labels, but do improve the user's mental model and the user experience.
The success of the Label-and-Learn system should inspire further research in using visualizations to improve the user experience of data labelling in machine learning tasks. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.subject: Visualization [en]
dc.subject: Machine Learning [en]
dc.subject: Data Labelling [en]
dc.title: Novice-Centric Visualizations for Machine Learning [en]
dc.type: Master Thesis [en]
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.embargo.terms: 4 months [en]
uws.contributor.advisor: Lank, Edward
uws.contributor.advisor: Law, Edith
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]




UWSpace

University of Waterloo Library

All items in UWSpace are protected by copyright, with all rights reserved.
