Adaptive Differential Privacy Budgeting Strategy for Optimizing Synthetic Data Generation and Privacy–Utility Trade-offs

dc.contributor.author: Padalko, Kateryna
dc.date.accessioned: 2026-05-13T15:25:27Z
dc.date.available: 2026-05-13T15:25:27Z
dc.date.issued: 2026-05-13
dc.date.submitted: 2026-05-08
dc.description.abstract: Training generative models under differential privacy (DP) requires injecting calibrated noise into gradient updates, creating an inherent trade-off between privacy protection and data quality. In standard DP-CTGAN, a single discriminator processes all features under a shared privacy budget, so noise injected to protect sensitive demographic attributes equally degrades the learning signal for non-sensitive features, an architectural limitation, not a mathematical one. We propose the Dual-Path DP-CTGAN, a discriminator architecture that partitions features into sensitive and non-sensitive paths, each governed by its own DP-SGD mechanism and Rényi DP accountant. Gradient isolation confines privacy noise to its respective path, preserving the learning signal for non-sensitive features without relaxing the formal (ε, δ)-DP guarantee. By the post-processing theorem, the generator inherits the privacy guarantees of both paths without additional composition. We embed this architecture in a Bayesian multi-objective hyperparameter optimisation pipeline that jointly evaluates utility, distributional fidelity, and empirical privacy risk, using Pareto-dominance selection to surface non-dominated configurations. Experiments on the Adult Census Income benchmark demonstrate that Dual-Path at ε = 1 achieves distributional fidelity below the non-private baseline and reduces the downstream utility gap by 79% relative to single-path DP-CTGAN at the same budget, exceeding single-path performance at ε = 5 while maintaining comparable empirical privacy risk. Per-feature analysis confirms that the fidelity gain concentrates in the feature group freed from cross-path noise contamination, providing direct evidence for the gradient isolation mechanism. These results suggest that discriminator architecture, rather than the noise mechanism itself, is the primary bottleneck limiting utility in standard DP-GAN designs.
dc.identifier.uri: https://hdl.handle.net/10012/23295
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: differential privacy
dc.subject: synthetic data generation
dc.subject: privacy–utility trade-off
dc.subject: tabular data synthesis
dc.subject: health data privacy
dc.subject: Dual-Path DP-CTGAN
dc.subject: conditional tabular GAN
dc.subject: Pareto optimization
dc.subject: privacy risk evaluation
dc.title: Adaptive Differential Privacy Budgeting Strategy for Optimizing Synthetic Data Generation and Privacy–Utility Trade-offs
dc.type: Master Thesis
uws-etd.degree: Master of Science
uws-etd.degree.department: School of Public Health Sciences
uws-etd.degree.discipline: Public Health Sciences
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Chen, Helen
uws.contributor.affiliation1: Faculty of Health
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
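The dual-path idea described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the thesis implementation: the class and function names (`DualPathDiscriminator`, `noisy_clipped_step`), the layer sizes, and the batch-level clipping are all assumptions for demonstration. A production version would clip *per-example* gradients (e.g. via Opacus) and track each path's ε with a Rényi DP accountant; batch-level clipping alone does not yield a DP guarantee.

```python
import torch
import torch.nn as nn

class DualPathDiscriminator(nn.Module):
    """Discriminator with separate sub-networks for sensitive and
    non-sensitive feature columns. The two computation graphs are
    disjoint, so noise applied to one path's gradients never reaches
    the other path: this is the "gradient isolation" in the abstract."""

    def __init__(self, sens_idx, nonsens_idx, hidden=64):
        super().__init__()
        self.sens_idx, self.nonsens_idx = sens_idx, nonsens_idx
        self.sens_path = nn.Sequential(
            nn.Linear(len(sens_idx), hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1))
        self.nonsens_path = nn.Sequential(
            nn.Linear(len(nonsens_idx), hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1))

    def forward(self, x):
        # Each path sees only its own feature slice.
        return (self.sens_path(x[:, self.sens_idx]),
                self.nonsens_path(x[:, self.nonsens_idx]))


def noisy_clipped_step(params, lr=0.01, clip_norm=1.0, noise_mult=1.1):
    """Simplified DP-SGD-style update: clip the batch gradient and add
    Gaussian noise scaled to the clipping norm. (Real DP-SGD clips
    per-example gradients; this batch-level version is illustrative.)"""
    with torch.no_grad():
        total = torch.sqrt(sum((p.grad ** 2).sum() for p in params))
        scale = min(1.0, clip_norm / (float(total) + 1e-12))
        for p in params:
            noisy = (p.grad * scale
                     + torch.randn_like(p.grad) * noise_mult * clip_norm)
            p -= lr * noisy
```

Calling `noisy_clipped_step` separately on `sens_path.parameters()` and `nonsens_path.parameters()` with different `noise_mult` values mirrors the per-path privacy budgets; by the post-processing theorem, a generator trained against both scores inherits whatever guarantee each path's mechanism provides.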

Files

Original bundle

Name: Padalko_Kateryna.pdf
Size: 32.97 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission

Collections