Sharma, Rakshit
2024-04-15
2026-04-15
2024-04-15
2024-04-08
http://hdl.handle.net/10012/20438
In the dynamic landscape of Machine Learning (ML) applications, data quality is an important factor affecting the performance of ML models. This thesis proposes methods for enhancing data quality through an iterative data recapture approach. The research focuses primarily on univariate time-series data from which specific patterns can be extracted. We start by discussing existing data capture methods, in which data is collected manually or using hardware devices. The proposed methods, namely the Sessionized Recapture Strategy (SRS) and the Robust Single Capture Method (RSCM), are then described in detail; each offers a distinct strategy for iterative data recapture. The Single Capture Method (SCM) and the Recapture and Visualize Method (RVM) serve as the two baseline methods, each with its own data capture time and resulting False Positive Rate (FPR). SRS is an enhancement of RVM, and RSCM is an enhancement of SCM. This thesis also introduces an outlier detection algorithm named Outlier detection through ParameterlEss Robust Algorithm (OPERA), which, when added to RVM and SCM, yields SRS and RSCM, respectively. Compared with the baseline methods, the proposed methods show promising results and improve the quality of the captured data. The experiments are performed on two datasets: the first was captured in the Embedded Systems Lab on one of the ANVIL products for Future Technology Devices International (FTDI) chips, and the second is a publicly available Electrocardiogram (ECG) dataset provided by PhysioNet.
The thesis concludes by synthesizing key findings and offering recommendations for practitioners seeking to optimize model performance through enhanced data quality.
en
data capture verification; outlier detection; anomaly detection; parameterless; robust outlier detection; OPERA; data capture strategies; ANVIL; ECG5000; data capture issues
Impact of data quality on ML models: Improving data quality with Outlier Detection
Master Thesis