Differentially Private Learning with Noisy Labels

Mohapatra, Shubhankar

Files

Mohapatra_Shubhankar.pdf (3.42 MB)

Date

2020-05-28

Authors

Mohapatra, Shubhankar

Advisor

He, Xi
Chen, Helen

Publisher

University of Waterloo

Abstract

Supervised machine learning tasks require large labelled datasets. However, obtaining such datasets is difficult, and the result often contains noisy labels due to human error or adversarial perturbation. Recent studies have proposed multiple methods to tackle this problem in the non-private scenario, yet it remains unsolved when the dataset is private. In this work, we aim to train a model on a sensitive dataset that contains noisy labels such that (i) the model has high test accuracy and (ii) the training process satisfies (ε,δ)-differential privacy. Noisy labels, as studied in our work, are generated by flipping labels in the training set from the true source label(s) to other target label(s). Our approach, Diffindo, constructs a differentially private stochastic gradient descent algorithm which removes suspicious points based on their noisy gradients. We show experiments on datasets across multiple domains with different class balance properties. Our results show that the proposed algorithm can remove up to 100% of the points with noisy labels in the private scenario while restoring the precision of the targeted label and the testing accuracy to their no-noise counterparts.
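
The two ingredients named in the abstract, label flipping and a noisy gradient step, can be illustrated with a minimal sketch. This is not the thesis's Diffindo implementation: the function names, parameters (`flip_rate`, `clip_norm`, `noise_multiplier`) and the plain-list gradient representation are assumptions made for illustration, and the rule Diffindo uses to remove suspicious points is not given in the abstract, so it is not shown. The gradient step follows the commonly used clip-then-add-Gaussian-noise recipe for differentially private SGD and does no privacy accounting.

```python
import math
import random


def flip_labels(labels, source, target, flip_rate, rng):
    """Flip each `source` label to `target` with probability `flip_rate`.

    Hypothetical helper: models label noise as source-to-target flips.
    """
    flipped = []
    for y in labels:
        if y == source and rng.random() < flip_rate:
            flipped.append(target)
        else:
            flipped.append(y)
    return flipped


def clip_gradient(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most `clip_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each gradient, sum them, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * clip_norm, as in the
    usual DP-SGD formulation (assumed here, not taken from the thesis).
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, clip_norm)
        for i in range(dim):
            total[i] += clipped[i]
    sigma = noise_multiplier * clip_norm
    n = len(per_example_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = flip_labels([0, 1, 1, 0, 1], source=1, target=0, flip_rate=0.5, rng=rng)
    print(noisy)
    grads = [[3.0, 4.0], [0.0, 0.5]]
    print(noisy_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single training point can move the averaged gradient, which is what lets the added Gaussian noise mask that point's contribution.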

Keywords

Noisy Labels, Privacy, Differential Privacy, Private Machine Learning, Diffindo, Private health informatics, Private robust learning

URI

http://hdl.handle.net/10012/15939

Collections

Theses
Computer Science
