Adaptive Differential Privacy Budgeting Strategy for Optimizing Synthetic Data Generation and Privacy–Utility Trade-offs
Advisor
Chen, Helen
Publisher
University of Waterloo
Abstract
Training generative models under differential privacy (DP) requires injecting calibrated noise into gradient updates, creating an inherent trade-off between privacy protection and data quality. In standard DP-CTGAN, a single discriminator processes all features under a shared privacy budget, so noise injected to protect sensitive demographic attributes equally degrades the learning signal for non-sensitive features. This coupling is an architectural limitation, not a mathematical one.
We propose the Dual-Path DP-CTGAN, a discriminator architecture that partitions features into sensitive and non-sensitive paths, each governed by its own DP-SGD mechanism and Rényi DP accountant. Gradient isolation confines privacy noise to its respective path, preserving the learning signal for non-sensitive features without relaxing the formal (ε, δ)-DP guarantee. By the post-processing theorem, the generator inherits the privacy guarantees of both paths without additional composition. We embed this architecture in a Bayesian multi-objective hyperparameter optimisation pipeline that jointly evaluates utility, distributional fidelity, and empirical privacy risk, using Pareto-dominance selection to surface non-dominated configurations.
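The gradient-isolation idea can be illustrated with a minimal sketch: each path runs its own DP-SGD step (per-example clipping plus Gaussian noise), so noise calibrated for the sensitive path never touches the non-sensitive path's update. This is an illustrative simplification, not the thesis's implementation; the function name, shapes, and parameter values below are assumptions.

```python
import numpy as np

def dp_sgd_noisy_grad(per_example_grads, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD aggregation step: clip each per-example
    gradient to clip_norm in L2, sum, and add isotropic Gaussian noise
    scaled by noise_multiplier * clip_norm."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_example_grads.shape[1])
    return clipped.sum(axis=0) + noise

rng = np.random.default_rng(0)
# Hypothetical per-example gradients for each discriminator path
# (batch of 32; parameter dimensions 8 and 16 are illustrative).
g_sens = rng.normal(size=(32, 8))    # sensitive-path parameters
g_nons = rng.normal(size=(32, 16))   # non-sensitive-path parameters

# Each path gets its own clip norm and noise multiplier; the sensitive
# path can carry heavier noise without contaminating the other update.
upd_sens = dp_sgd_noisy_grad(g_sens, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
upd_nons = dp_sgd_noisy_grad(g_nons, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

Because each path's DP-SGD mechanism is accounted separately (here, conceptually, by its own Rényi DP accountant), the two updates compose into a joint (ε, δ) guarantee while the generator, as a post-processing of both paths, inherits it for free.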
Experiments on the Adult Census Income benchmark demonstrate that Dual-Path at ε = 1 achieves a distributional-fidelity distance below even that of the non-private baseline and reduces the downstream utility gap by 79% relative to single-path DP-CTGAN at the same budget, exceeding single-path performance at ε = 5 while maintaining comparable empirical privacy risk. Per-feature analysis confirms that the fidelity gain concentrates in the feature group freed from cross-path noise contamination, providing direct evidence for the gradient-isolation mechanism. These results suggest that discriminator architecture, rather than the noise mechanism itself, is the primary bottleneck limiting utility in standard DP-GAN designs.
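The Pareto-dominance selection used in the hyperparameter-optimisation pipeline above can be sketched as follows. This is a generic minimal implementation, not the thesis's code; the three objective values per configuration (utility gap, fidelity distance, empirical privacy risk, all minimised) are illustrative.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated configurations, preserving input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Hypothetical (utility gap, fidelity distance, privacy risk) triples.
configs = [(0.10, 0.05, 0.30), (0.08, 0.07, 0.30), (0.12, 0.09, 0.40)]
print(pareto_front(configs))  # → [(0.1, 0.05, 0.3), (0.08, 0.07, 0.3)]
```

The third configuration is dominated (worse on every objective than the first), so only the first two surface as non-dominated trade-off candidates.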