Towards Effective Measurement of Membership Privacy Risk for Machine Learning Models

Duddu, Vasisht

Files

Duddu_Vasisht.pdf (992.36 KB)

Date

2022-07-18

Authors

Duddu, Vasisht

Advisor

N., Asokan

Publisher

University of Waterloo

Abstract

Machine learning (ML) models are trained on data that can be sensitive. Membership inference attacks (MIAs) infer whether a particular data record was used to train an ML model. This violates the membership privacy of an individual, especially in applications where the knowledge that the individual's data record is in the training data is sensitive. For instance, if a model is trained on a dataset of patients with a specific disease, inferring an individual's membership reveals that individual's health status. There is a need for a privacy metric that enables ML model builders to quantify the membership privacy risk (a) of individual training data records, (b) independently of specific MIAs, (c) in a way that assesses susceptibility to different MIAs, (d) across different applications, and (e) efficiently. None of the prior membership privacy risk metrics simultaneously meets all of these criteria. Ideally, a membership privacy risk metric will measure the memorization of individual training data records by large-capacity ML models, which prior work suggests is the cause of membership privacy risk. In practice, this can be achieved by estimating the influence of individual training data records on a model's utility. Leave-one-out (LOO) computation, i.e., the difference in model utility with and without a data record in the training dataset, can be used to measure this memorization, but at high computational cost. Shapley values are an alternative LOO approach with efficient algorithms in the literature. They measure the influence of a training data record on a model's utility and thereby the extent to which it is memorized by that model. Hence, we conjecture that Shapley values can serve as a good membership privacy risk metric to indicate the susceptibility of training data records to MIAs. In this work, we explore the following research question: can Shapley values effectively estimate the susceptibility of individual training data records to MIAs?
We validate the above conjecture by presenting SHAPr, a membership privacy metric based on Shapley values that satisfies desiderata (a) - (e) above. Using ten benchmark datasets and five MIAs, we show that SHAPr is indeed effective in estimating the susceptibility of training data records to different MIAs, as measured using F1 scores. We then focus on recall, which is more important than precision for evaluating the effectiveness of membership privacy risk metrics. We find that, using recall, SHAPr is effective in assessing susceptibility across different MIAs and datasets. We find that SHAPr is comparable to or better than prior work for effective MIAs (those with good accuracy on both members and non-members). Additionally, beyond inheriting the applications of Shapley values (e.g., data valuation), SHAPr is versatile and can be used to estimate the disproportionate vulnerability of different subgroups to MIAs. We apply SHAPr to evaluate the efficacy of several defences against MIAs. First, we show that adding noise to a subset of training data records lowers their privacy risk, but at the cost of increasing the privacy risk of the remaining training data records, making it an ineffective defence. Second, we show that the membership privacy risk of a dataset is not necessarily improved by removing high-risk training data records, thereby confirming an observation from prior work in a significantly extended setting (across ten datasets, removing up to 50% of vulnerable training data records). Third, SHAPr correctly captures the decrease in MIA accuracy when a regularization-based defence is used. Finally, SHAPr has acceptable computational cost (compared to naive LOO), varying from a few minutes for the smallest dataset to ~92 minutes for the largest dataset.
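
The two influence measures the abstract contrasts, LOO and Shapley values, can be illustrated with a minimal Python sketch. This is not the SHAPr implementation: the toy 1-nearest-neighbour utility, the one-dimensional data, the function names, and the plain Monte Carlo permutation estimator are all assumptions chosen for brevity, and the efficient algorithms the abstract refers to are not reproduced here.

```python
import random

def utility(train, test):
    """Accuracy on `test` of a 1-nearest-neighbour classifier built from `train`.
    Records are (feature, label) pairs; an empty training set scores 0."""
    if not train:
        return 0.0
    correct = 0
    for x, y in test:
        nearest = min(train, key=lambda p: abs(p[0] - x))
        correct += nearest[1] == y
    return correct / len(test)

def loo_scores(train, test):
    """Leave-one-out influence: utility with the record minus utility without it."""
    full = utility(train, test)
    return [full - utility(train[:i] + train[i + 1:], test)
            for i in range(len(train))]

def shapley_scores(train, test, num_permutations=200, seed=0):
    """Monte Carlo Shapley estimate: average marginal contribution of each
    record over random orderings of the training set."""
    rng = random.Random(seed)
    n = len(train)
    totals = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        subset, previous = [], 0.0
        for i in order:
            subset.append(train[i])
            current = utility(subset, test)
            totals[i] += current - previous
            previous = current
    return [t / num_permutations for t in totals]

# Hypothetical toy data: the record (0.5, 1) sits alone between the two clusters.
train = [(0.0, 0), (0.1, 0), (0.9, 1), (1.0, 1), (0.5, 1)]
test = [(0.05, 0), (0.2, 0), (0.8, 1), (0.95, 1), (0.45, 1)]

loo = loo_scores(train, test)
shap = shapley_scores(train, test)
for record, l, s in zip(train, loo, shap):
    print(record, round(l, 3), round(s, 3))
```

In this toy setting, LOO assigns zero influence to every record that has a same-label neighbour nearby, whereas the Shapley estimate averages over all subset sizes and so still credits those records. The per-record Shapley estimates also sum to the utility of the full training set, which gives a quick sanity check on the estimator.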