Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning

Chew, Robert; Williams, Matthew R.; Segarra, Elan A.; Preiss, Alexander J.; Konet, Amanda; Savitsky, Terrance D.

arXiv:2503.21528 (stat)

[Submitted on 27 Mar 2025]

Abstract: Differential privacy (DP) is becoming increasingly important for deployed machine learning applications because it provides strong guarantees for protecting the privacy of individuals whose data are used to train models. However, DP mechanisms commonly used in machine learning tend to struggle on many real-world data distributions, including highly imbalanced or small labeled training sets. In this work, we propose SWAG-PPM, a new scalable DP mechanism for deep learning models that uses as its randomized mechanism a pseudo posterior distribution that downweights each record's likelihood contribution in proportion to its disclosure risk. As a motivating example from official statistics, we demonstrate SWAG-PPM on a workplace injury text classification task using a highly imbalanced public dataset published by the U.S. Occupational Safety and Health Administration (OSHA). We find that SWAG-PPM exhibits only modest utility degradation relative to a non-private comparator while greatly outperforming the industry-standard DP-SGD at a similar privacy budget.
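
The abstract does not give the weighting scheme, so the following is only a minimal sketch of the general idea of downweighting by-record likelihood contributions. It assumes each record has a weight in [0, 1] that shrinks as its disclosure risk grows, so that the log pseudo likelihood is a weighted sum of per-record log-likelihoods. The function names, toy data, and weight values are hypothetical and are not taken from the paper; how SWAG-PPM actually computes its weights is not shown here.

```python
import math


def bernoulli_log_lik(y, p):
    """Log-likelihood of a binary label y under predicted probability p."""
    return math.log(p) if y == 1 else math.log(1.0 - p)


def log_pseudo_likelihood(log_liks, weights):
    """Weighted sum of per-record log-likelihoods.

    Each weight must lie in [0, 1]; a weight of 1 leaves a record's
    contribution unchanged, and a weight of 0 removes it entirely.
    """
    if len(log_liks) != len(weights):
        raise ValueError("log_liks and weights must have the same length")
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("weights must lie in [0, 1]")
    return sum(w * ll for w, ll in zip(weights, log_liks))


# Toy, made-up data: one rare positive record and three negatives.
labels = [1, 0, 0, 0]
probs = [0.9, 0.2, 0.1, 0.3]

# Hypothetical weights: the rare record is assumed to carry the highest
# disclosure risk, so it receives the smallest weight.
weights = [0.25, 1.0, 1.0, 0.5]

log_liks = [bernoulli_log_lik(y, p) for y, p in zip(labels, probs)]
unweighted = sum(log_liks)
weighted = log_pseudo_likelihood(log_liks, weights)
```

Because every per-record log-likelihood here is non-positive and every weight is at most 1, the weighted total is never smaller than the unweighted one. Setting all weights to 1 recovers the ordinary log-likelihood.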

Subjects:	Machine Learning (stat.ML); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2503.21528 [stat.ML]
	(or arXiv:2503.21528v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2503.21528

Submission history

From: Robert Chew
[v1] Thu, 27 Mar 2025 14:17:05 UTC (388 KB)
