Browsing Mathematics (Faculty of) by Subject "data cleaning"
Now showing items 1-3 of 3
- Scalability aspects of data cleaning
  (University of Waterloo, 2021-01-27) Data cleaning has become one of the important pre-processing steps for many data science, data analytics, and machine learning applications. According to a survey by Gartner, more than 25% of the critical data in the world's ...
- Scaling Machine Learning Data Repair Systems for Sparse Datasets
  (University of Waterloo, 2021-01-21) Machine learning data repair systems (e.g. HoloClean) have achieved state-of-the-art performance for the data repair problem on many datasets. However, these systems face significant challenges with sparse datasets. In ...
- Structured Prediction on Dirty Datasets
  (University of Waterloo, 2021-12-20) Many errors cannot be detected or repaired without taking into account the underlying structure and dependencies in the dataset. One way of modeling the structure of the data is graphical models. Graphical models combine ...