A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data

Annamalai, Meenatchi Sundaram Muthu Selva; Gadotti, Andrea; Rocher, Luc

arXiv:2301.10053 (cs)

[Submitted on 24 Jan 2023 (v1), last revised 9 May 2024 (this version, v3)]

Abstract: Recent advances in synthetic data generation (SDG) have been hailed as a solution to the difficult problem of sharing sensitive data while protecting privacy. SDG aims to learn statistical properties of real data in order to generate "artificial" data that are structurally and statistically similar to sensitive data. However, prior research suggests that inference attacks on synthetic data can undermine privacy, but only for specific outlier records. In this work, we introduce a new attribute inference attack against synthetic data. The attack is based on linear reconstruction methods for aggregate statistics, which target all records in the dataset, not only outliers. We evaluate our attack on state-of-the-art SDG algorithms, including Probabilistic Graphical Models, Generative Adversarial Networks, and recent differentially private SDG mechanisms. By defining a formal privacy game, we show that our attack can be highly accurate even on arbitrary records, and that this is the result of individual information leakage (as opposed to population-level inference). We then systematically evaluate the tradeoff between protecting privacy and preserving statistical utility. Our findings suggest that current SDG methods cannot consistently provide sufficient privacy protection against inference attacks while retaining reasonable utility. The best method evaluated, a differentially private SDG mechanism, can provide both protection against inference attacks and reasonable utility, but only in very specific settings. Lastly, we show that releasing a larger number of synthetic records can improve utility but at the cost of making attacks far more effective.
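
Illustrative sketch (not taken from the paper): the abstract describes an attack built on linear reconstruction from aggregate statistics. The snippet below shows only the generic idea behind such methods, under simplifying assumptions: the secret attribute is binary, the adversary sees noisy subset-sum answers (here standing in for statistics that would be computed from synthetic records), and the queries are random subsets. The function name, the query design, the noise model, and all sizes are hypothetical choices made for illustration; the authors' actual attack and evaluation setup may differ.

```python
import numpy as np


def linear_reconstruction(queries, answers):
    """Estimate a binary secret vector from noisy subset-sum answers."""
    # Relax the bits to real values and solve queries @ s ~= answers
    # in the least-squares sense.
    estimate, *_ = np.linalg.lstsq(queries, answers, rcond=None)
    # Round each coordinate back to a bit.
    return (estimate > 0.5).astype(int)


rng = np.random.default_rng(0)
n_records, n_queries = 50, 400  # hypothetical sizes

# One secret bit per record (the attribute the adversary wants to infer).
secret = rng.integers(0, 2, size=n_records)

# Each query counts the secret bits over a random subset of records.
queries = rng.integers(0, 2, size=(n_queries, n_records)).astype(float)

# Assumption: answers are exact counts plus Gaussian noise, a stand-in
# for the distortion introduced by a synthetic data generator.
noise = rng.normal(0.0, 1.0, size=n_queries)
answers = queries @ secret + noise

guess = linear_reconstruction(queries, answers)
accuracy = float((guess == secret).mean())
print(f"fraction of secret bits recovered: {accuracy:.2f}")
```

With many more queries than records and modest noise, the rounded least-squares solution tends to recover most bits for every record, not only outliers, which is the intuition the abstract points to.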

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as:	arXiv:2301.10053 [cs.LG]
	(or arXiv:2301.10053v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2301.10053
Journal reference:	Published in the Proceedings of the 33rd USENIX Security Symposium (USENIX Security 2024), please cite accordingly

Submission history

From: Meenatchi Sundaram Muthu Selva Annamalai
[v1] Tue, 24 Jan 2023 14:56:36 UTC (19,495 KB)
[v2] Mon, 12 Jun 2023 10:42:05 UTC (29,918 KB)
[v3] Thu, 9 May 2024 10:35:25 UTC (30,472 KB)
