Query Answering over Functional Dependency Repairs

dc.contributor.author: Galiullin, Artur
dc.date.accessioned: 2013-09-23T15:48:49Z
dc.date.available: 2013-09-23T15:48:49Z
dc.date.issued: 2013-09-23T15:48:49Z
dc.date.submitted: 2013-09-11
dc.description.abstract: Inconsistency often arises in real-world databases and, as a result, critical queries over dirty data may lead users to make ill-informed decisions. Functional dependencies (FDs) can be used to specify intended semantics of the underlying data and aid with the cleaning task. Enumerating and evaluating all the possible repairs to FD violations is infeasible, while approaches that produce a single repair or attempt to isolate the dirty portion of data are often too destructive or constraining. In this thesis, we leverage a recent advance in data cleaning that allows sampling from a well-defined space of reasonable repairs, and provide the user with a data management tool that gives uncertain query answers over this space. We propose a framework to compute probabilistic query answers as though each repair sample were a possible world. We show experimentally that queries over many possible repairs produce results that are more useful than other approaches and that our system can scale to large datasets. [en]
dc.identifier.uri: http://hdl.handle.net/10012/7890
dc.language.iso: en [en]
dc.pending: false [en]
dc.publisher: University of Waterloo [en]
dc.subject: Data Cleaning [en]
dc.subject: Probabilistic Databases [en]
dc.subject.program: Computer Science [en]
dc.title: Query Answering over Functional Dependency Repairs [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: School of Computer Science [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Artur_Galiullin.pdf
Size: 2.18 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 254 B
Format: Item-specific license agreed upon to submission
Description: