Generalization in the Face of Adaptivity: A Bayesian Perspective

Shenfeld, Moshe; Ligett, Katrina


arXiv:2106.10761 (cs)

[Submitted on 20 Jun 2021 (v1), last revised 3 Apr 2024 (this version, v3)]

Abstract: Repeated use of a data sample via adaptively chosen queries can rapidly lead to overfitting, wherein the empirical evaluation of queries on the sample significantly deviates from their mean with respect to the underlying data distribution. Simple noise-addition algorithms suffice to prevent this issue, and differential privacy-based analysis of these algorithms shows that they can handle an asymptotically optimal number of queries. However, differential privacy's worst-case nature entails either scaling such noise to the range of the queries, even for highly concentrated queries, or introducing more complex algorithms.
In this paper, we prove that straightforward noise-addition algorithms already provide variance-dependent guarantees that also extend to unbounded queries. This improvement stems from a novel characterization that illuminates the core problem of adaptive data analysis. We show that the harm of adaptivity results from the covariance between the new query and a Bayes factor-based measure of how much information about the data sample was encoded in the responses given to past queries. We then leverage this characterization to introduce a new data-dependent stability notion that can bound this covariance.
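
The covariance characterization mentioned in the abstract can be illustrated with a small numerical sketch. The setup below is a hypothetical toy and is not taken from the paper: the "sample" is a single data point drawn from a four-element prior, one past query is answered with Gaussian noise, and the Bayes factor is taken to be K(x, v) = p(v | x) / p(v). Under these assumptions, the shift in a new query's expected value after seeing the response v equals the covariance, under the prior, between the new query and K(·, v). All names (`PRIOR`, `past_query`, `new_query`, `compare`) are illustrative, and the paper's own definitions and guarantees are more general.

```python
import math


def gaussian_pdf(v, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at v."""
    z = (v - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def bayes_factors(prior, past_query, response, sigma):
    """K(x, v) = p(v | x) / p(v) for each x, where v = past_query(x) + Gaussian noise."""
    likelihood = [gaussian_pdf(response, past_query(x), sigma) for x in range(len(prior))]
    marginal = sum(p * l for p, l in zip(prior, likelihood))
    return [l / marginal for l in likelihood]


def expectation(weights, values):
    """Weighted sum of values; weights are assumed to form a distribution."""
    return sum(w * v for w, v in zip(weights, values))


def posterior_shift(prior, new_values, factors):
    """E[q(x) | v] - E[q(x)], using posterior(x) = prior(x) * K(x, v)."""
    posterior = [p * k for p, k in zip(prior, factors)]
    return expectation(posterior, new_values) - expectation(prior, new_values)


def covariance(prior, f_values, g_values):
    """Cov(f, g) under the prior."""
    products = [f * g for f, g in zip(f_values, g_values)]
    return expectation(prior, products) - expectation(prior, f_values) * expectation(prior, g_values)


# Hypothetical toy setup (an assumption for illustration, not from the paper).
PRIOR = [0.1, 0.2, 0.3, 0.4]  # distribution over data points 0..3


def past_query(x):
    """Query already answered (with noise) by the mechanism."""
    return x / 3.0


def new_query(x):
    """Query the analyst asks after seeing the noisy response."""
    return 1.0 if x >= 2 else 0.0


NEW_VALUES = [new_query(x) for x in range(len(PRIOR))]


def compare(response, sigma):
    """Return (posterior shift, prior covariance with the Bayes factor)."""
    factors = bayes_factors(PRIOR, past_query, response, sigma)
    shift = posterior_shift(PRIOR, NEW_VALUES, factors)
    cov = covariance(PRIOR, NEW_VALUES, factors)
    return shift, cov


if __name__ == "__main__":
    for sigma in (0.1, 0.5, 5.0):
        shift, cov = compare(response=0.9, sigma=sigma)
        print(f"sigma={sigma}: shift={shift:.6f}, covariance={cov:.6f}")
```

In this toy, the two printed columns agree for every noise level, and both shrink toward zero as `sigma` grows, because a noisier response carries less information about the data point and the Bayes factor stays closer to 1.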

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2106.10761 [cs.LG]
	(or arXiv:2106.10761v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2106.10761
Journal reference:	Advances in Neural Information Processing Systems, 36 (2024)

Submission history

From: Moshe Shenfeld
[v1] Sun, 20 Jun 2021 22:06:44 UTC (40 KB)
[v2] Tue, 20 Jun 2023 20:24:58 UTC (45 KB)
[v3] Wed, 3 Apr 2024 19:39:10 UTC (55 KB)
