Adaptive Differential Privacy Budgeting Strategy for Optimizing Synthetic Data Generation and Privacy–Utility Trade-offs

dc.contributor.author: Padalko, Kateryna
dc.date.accessioned: 2026-05-13T15:25:27Z
dc.date.available: 2026-05-13T15:25:27Z
dc.date.issued: 2026-05-13
dc.date.submitted: 2026-05-08
dc.description.abstract: Training generative models under differential privacy (DP) requires injecting calibrated noise into gradient updates, creating an inherent trade-off between privacy protection and data quality. In standard DP-CTGAN, a single discriminator processes all features under a shared privacy budget, so noise injected to protect sensitive demographic attributes equally degrades the learning signal for non-sensitive features, an architectural limitation, not a mathematical one. We propose the Dual-Path DP-CTGAN, a discriminator architecture that partitions features into sensitive and non-sensitive paths, each governed by its own DP-SGD mechanism and Rényi DP accountant. Gradient isolation confines privacy noise to its respective path, preserving the learning signal for non-sensitive features without relaxing the formal (ε, δ)-DP guarantee. By the post-processing theorem, the generator inherits the privacy guarantees of both paths without additional composition. We embed this architecture in a Bayesian multi-objective hyperparameter optimisation pipeline that jointly evaluates utility, distributional fidelity, and empirical privacy risk, using Pareto-dominance selection to surface non-dominated configurations. Experiments on the Adult Census Income benchmark demonstrate that Dual-Path at ε = 1 achieves distributional fidelity below the non-private baseline and reduces the downstream utility gap by 79% relative to single-path DP-CTGAN at the same budget, exceeding single-path performance at ε = 5 while maintaining comparable empirical privacy risk. Per-feature analysis confirms that the fidelity gain concentrates in the feature group freed from cross-path noise contamination, providing direct evidence for the gradient isolation mechanism. These results suggest that discriminator architecture, rather than the noise mechanism itself, is the primary bottleneck limiting utility in standard DP-GAN designs.
dc.identifier.uri: https://hdl.handle.net/10012/23295
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: differential privacy
dc.subject: synthetic data generation
dc.subject: privacy–utility trade-off
dc.subject: tabular data synthesis
dc.subject: health data privacy
dc.subject: Dual-Path DP-CTGAN
dc.subject: conditional tabular GAN
dc.subject: Pareto optimization
dc.subject: privacy risk evaluation
dc.title: Adaptive Differential Privacy Budgeting Strategy for Optimizing Synthetic Data Generation and Privacy–Utility Trade-offs
dc.type: Master Thesis
uws-etd.degree: Master of Science
uws-etd.degree.department: School of Public Health Sciences
uws-etd.degree.discipline: Public Health Sciences
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Chen, Helen
uws.contributor.affiliation1: Faculty of Health
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
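The dual-path idea described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the thesis implementation: the class and function names (`DualPathDiscriminator`, `noisy_clipped_step`), the layer sizes, and the batch-level clipping are all assumptions for demonstration. A production version would clip *per-example* gradients (e.g. via Opacus) and track each path's ε with a Rényi DP accountant; batch-level clipping alone does not yield a DP guarantee.

```python
import torch
import torch.nn as nn

class DualPathDiscriminator(nn.Module):
    """Discriminator with separate sub-networks for sensitive and
    non-sensitive feature columns. The two computation graphs are
    disjoint, so noise applied to one path's gradients never reaches
    the other path: this is the "gradient isolation" in the abstract."""

    def __init__(self, sens_idx, nonsens_idx, hidden=64):
        super().__init__()
        self.sens_idx, self.nonsens_idx = sens_idx, nonsens_idx
        self.sens_path = nn.Sequential(
            nn.Linear(len(sens_idx), hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1))
        self.nonsens_path = nn.Sequential(
            nn.Linear(len(nonsens_idx), hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1))

    def forward(self, x):
        # Each path sees only its own feature slice.
        return (self.sens_path(x[:, self.sens_idx]),
                self.nonsens_path(x[:, self.nonsens_idx]))


def noisy_clipped_step(params, lr=0.01, clip_norm=1.0, noise_mult=1.1):
    """Simplified DP-SGD-style update: clip the batch gradient and add
    Gaussian noise scaled to the clipping norm. (Real DP-SGD clips
    per-example gradients; this batch-level version is illustrative.)"""
    with torch.no_grad():
        total = torch.sqrt(sum((p.grad ** 2).sum() for p in params))
        scale = min(1.0, clip_norm / (float(total) + 1e-12))
        for p in params:
            noisy = (p.grad * scale
                     + torch.randn_like(p.grad) * noise_mult * clip_norm)
            p -= lr * noisy
```

Calling `noisy_clipped_step` separately on `sens_path.parameters()` and `nonsens_path.parameters()` with different `noise_mult` values mirrors the per-path privacy budgets; by the post-processing theorem, a generator trained against both scores inherits whatever guarantee each path's mechanism provides.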

Files

Original bundle

Name: Padalko_Kateryna.pdf
Size: 32.97 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission

Collections