Show simple item record

dc.contributor.authorGaliullin, Artur 15:48:49 (GMT) 15:48:49 (GMT)
dc.description.abstractInconsistency often arises in real-world databases and, as a result, critical queries over dirty data may lead users to make ill-informed decisions. Functional dependencies (FDs) can be used to specify intended semantics of the underlying data and aid with the cleaning task. Enumerating and evaluating all the possible repairs to FD violations is infeasible, while approaches that produce a single repair or attempt to isolate the dirty portion of data are often too destructive or constraining. In this thesis, we leverage a recent advance in data cleaning that allows sampling from a well-defined space of reasonable repairs, and provide the user with a data management tool that gives uncertain query answers over this space. We propose a framework to compute probabilistic query answers as though each repair sample were a possible world. We show experimentally that queries over many possible repairs produce results that are more useful than other approaches and that our system can scale to large datasets.en
dc.publisherUniversity of Waterlooen
dc.subjectData Cleaningen
dc.subjectProbabilistic Databasesen
dc.titleQuery Answering over Functional Dependency Repairsen
dc.typeMaster Thesisen
dc.subject.programComputer Scienceen of Computer Scienceen
uws-etd.degreeMaster of Mathematicsen

Files in this item


This item appears in the following Collection(s)

Show simple item record


University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages