Novel Neural Network Repair Methods for Data Privacy and Individual Fairness
Machine learning is increasingly becoming critical to the decisions that control our lives. As these predictive models advance toward ubiquity, the demand for models that are trustworthy, fair, and privacy-preserving becomes paramount. Despite this, significant privacy, trust, fairness, and security risks make these models untrustworthy. Deep neural networks are vulnerable to attacks that reveal private information about training instances and violate regulatory guidelines. Additionally, models can display biased behavior that is difficult to detect and mitigate. To address these pressing issues, I present two main streams of research that fall under the umbrella of model repair. In the first, termed Amnesiac Machine Learning, I address the problem of privacy leakage through two unlearning algorithms that specifically remove learning from a subset of training data. I evaluate these algorithms on a novel testing suite consisting of data-leaking attacks. In the second, I present an automated system that detects algorithmic bias, isolates the features most responsible for that biased behavior, and performs model repair to mitigate that bias. In both scenarios the repaired models have performance similar to models trained from scratch for the desired purpose, while exhibiting neither privacy leakage nor biased behavior on real-world data sets.
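The core idea behind unlearning of this kind can be sketched as follows: during training, log the parameter update contributed by each batch, and to forget a subset of the data, subtract the logged updates of the batches that contained it. The toy linear model, function names, and batch layout below are illustrative assumptions, not the thesis's implementation, and the subtraction is only approximate in general because later gradients depend on earlier parameter values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_with_logging(X, y, batches, lr=0.1):
    """Train a linear model with SGD, logging each batch's parameter update."""
    w = np.zeros(X.shape[1])
    updates = []
    for idx in batches:
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / len(idx)  # squared-error gradient
        delta = -lr * grad
        w = w + delta
        updates.append(delta)
    return w, updates

def amnesiac_unlearn(w, updates, sensitive_batches):
    """Subtract the logged updates of the batches that held sensitive data."""
    for b in sensitive_batches:
        w = w - updates[b]
    return w

# Toy data: 40 points, 5 batches of 8.
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)
batches = [list(range(i, i + 8)) for i in range(0, 40, 8)]

w_final, updates = sgd_with_logging(X, y, batches)
# Suppose batch 2 contained the data subject's records we must forget.
w_unlearned = amnesiac_unlearn(w_final, updates, [2])
```

Because this removes only the targeted batches' contributions rather than retraining from scratch, the remaining model keeps most of its learned behavior, which is what the thesis then stress-tests with data-leaking attacks.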
Cite this version of the work
Laura Graves (2021). Novel Neural Network Repair Methods for Data Privacy and Individual Fairness. UWSpace. http://hdl.handle.net/10012/17181