research-article

Characterizing the sample complexity of private learners

Authors:

Uri Stemmer

ITCS '13: Proceedings of the 4th conference on Innovations in Theoretical Computer Science

Pages 97 - 110

https://doi.org/10.1145/2422436.2422450

Published: 09 January 2013

Abstract

In 2008, Kasiviswanathan et al. defined private learning as a combination of PAC learning and differential privacy [16]. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. gave a generic construction of private learners for (finite) concept classes, with sample complexity logarithmic in the size of the concept class. This sample complexity is higher than what is needed for non-private learners, hence leaving open the possibility that the sample complexity of private learning may sometimes be significantly higher than that of non-private learning. We give a combinatorial characterization of the sample size sufficient and necessary to privately learn a class of concepts. This characterization is analogous to the well-known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of probabilistic representation of a concept class, and our new complexity measure RepDim corresponds to the size of the smallest probabilistic representation of the concept class. We show that any private learning algorithm for a concept class C with sample complexity m implies RepDim(C) = O(m), and that there exists a private learning algorithm with sample complexity m = O(RepDim(C)).

We further demonstrate that a similar characterization holds for the database size needed for privately computing a large class of optimization problems and also for the well-studied problem of private data release.
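As an illustration of the generic construction for finite concept classes mentioned in the abstract, the following is a minimal sketch, not the paper's own code. It assumes the construction is instantiated with the exponential mechanism of McSherry and Talwar [17], scoring each concept by its number of empirical errors. The function name `exponential_mechanism_learner`, the toy threshold class, and all parameter values are hypothetical and chosen only for the example.

```python
import math
import random


def exponential_mechanism_learner(sample, concepts, epsilon, rng=None):
    """Return the index of a concept, sampled with probability proportional
    to exp(-epsilon * empirical_errors / 2)."""
    rng = rng or random.Random()
    # Empirical error count of each concept. Changing one labeled example
    # changes each count by at most 1, so the score has sensitivity 1.
    errors = [sum(1 for x, y in sample if c(x) != y) for c in concepts]
    # Shift by the minimum so the largest weight is exp(0) = 1. This avoids
    # underflow and leaves the sampling distribution unchanged.
    best = min(errors)
    weights = [math.exp(-epsilon * (e - best) / 2.0) for e in errors]
    return rng.choices(range(len(concepts)), weights=weights, k=1)[0]


# Hypothetical toy class: thresholds c_t(x) = 1 iff x >= t, over {0, ..., 9}.
thresholds = [lambda x, t=t: int(x >= t) for t in range(11)]
target = 5
sample = [(x, int(x >= target)) for x in range(10)] * 50  # 500 labeled examples
chosen = exponential_mechanism_learner(
    sample, thresholds, epsilon=1.0, rng=random.Random(0)
)
print("chosen threshold:", chosen)
```

In this sketch every concept in the class receives nonzero probability, so the sample must be large enough for the low-error concepts to dominate the total weight of all the others. That requirement is what ties the sample size to the logarithm of the class size in this style of construction.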

References

[1] A. Beimel, H. Brenner, S. P. Kasiviswanathan, and K. Nissim. Bounds on the sample complexity for private learning and private data release. Full version of [3], in submission, 2012.

[2] A. Beimel, P. Carmi, K. Nissim, and E. Weinreb. Private approximation of search problems. SIAM J. Comput., 38(5):1728--1760, 2008.

[3] A. Beimel, S. P. Kasiviswanathan, and K. Nissim. Bounds on the sample complexity for private learning and private data release. In 7th Theory of Cryptography Conference, pages 437--454, 2010.

[4] A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: The SuLQ framework. In PODS, pages 128--138. ACM, 2005.

[5] A. Blum, K. Ligett, and A. Roth. A learning theory approach to non-interactive database privacy. In STOC, pages 609--618. ACM, 2008.

[6] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the Association for Computing Machinery, 36(4):929--965, 1989.

[7] K. Chaudhuri and D. Hsu. Sample complexity bounds for differentially private learning. Journal of Machine Learning Research -- COLT 2011 Proceedings, 19:155--186, 2011.

[8] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist., 23:493--507, 1952.

[9] C. Dwork. The differential privacy frontier. In O. Reingold, editor, TCC, volume 5444 of LNCS, pages 496--502. Springer, 2009.

[10] C. Dwork. A firm foundation for private data analysis. Commun. of the ACM, 54(1):86--95, 2011.

[11] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In S. Halevi and T. Rabin, editors, TCC, volume 3876 of LNCS, pages 265--284. Springer, 2006.

[12] C. Dwork, G. N. Rothblum, and S. P. Vadhan. Boosting and differential privacy. In 51st Annual IEEE Symposium on Foundations of Computer Science, pages 51--60, 2010.

[13] A. Ehrenfeucht, D. Haussler, M. J. Kearns, and L. G. Valiant. A general lower bound on the number of examples needed for learning. Inf. Comput., 82(3):247--261, 1989.

[14] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119--139, 1997.

[15] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13--30, 1963.

[16] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith. What can we learn privately? In 48th Annual IEEE Symposium on Foundations of Computer Science, pages 531--540. IEEE Computer Society, 2008.

[17] F. McSherry and K. Talwar. Mechanism design via differential privacy. In 48th Annual IEEE Symposium on Foundations of Computer Science, pages 94--103. IEEE, 2007.

[18] R. E. Schapire. The strength of weak learnability. Mach. Learn., 5(2):197--227, 1990.

[19] L. G. Valiant. A theory of the learnable. Communications of the ACM, 27:1134--1142, 1984.

[20] V. N. Vapnik and A. Y. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16:264, 1971.

Index Terms

Characterizing the sample complexity of private learners
1. Security and privacy
  1. Human and societal aspects of security and privacy
2. Social and professional topics
  1. Computing / technology policy
    1. Privacy policies

Published In

ITCS '13: Proceedings of the 4th conference on Innovations in Theoretical Computer Science

January 2013

594 pages

ISBN:9781450318594

DOI:10.1145/2422436

Program Chair:
Robert Kleinberg
Cornell University

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGACT: ACM Special Interest Group on Algorithms and Computation Theory

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ITCS '13

Sponsor:

SIGACT

ITCS '13: Innovations in Theoretical Computer Science

January 9 - 12, 2013

Berkeley, California, USA

Acceptance Rates

Overall Acceptance Rate 172 of 513 submissions, 34%

Bibliometrics

Total Citations: 25
Total Downloads: 195
Downloads (Last 12 months): 13
Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Oct 2024

Cited By

Pinto F, Hu Y, Yang F, Sanyal A (2024). PILLAR: How to make semi-private learning more effective. 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 110-139. Online publication date: 9-Apr-2024.
https://doi.org/10.1109/SaTML59370.2024.00014

Alghamdi W, Asoodeh S, Calmon F, Felipe Gomez J, Kosut O, Sankar L (2023). Optimal Multidimensional Differentially Private Mechanisms in the Large-Composition Regime. 2023 IEEE International Symposium on Information Theory (ISIT), pages 2195-2200. Online publication date: 25-Jun-2023.
https://doi.org/10.1109/ISIT54713.2023.10206658

Alghamdi W, Asoodeh S, Calmon F, Felipe Gomez J, Kosut O, Sankar L (2023). Schrödinger Mechanisms: Optimal Differential Privacy Mechanisms for Small Sensitivity. 2023 IEEE International Symposium on Information Theory (ISIT), pages 2201-2206. Online publication date: 25-Jun-2023.
https://doi.org/10.1109/ISIT54713.2023.10206616

Impagliazzo R, Lei R, Pitassi T, Sorrell J, Leonardi S, Gupta A (2022). Reproducibility in learning. Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 818-831. Online publication date: 9-Jun-2022.
https://dl.acm.org/doi/10.1145/3519935.3519973

Kaplan H, Mansour Y, Matias Y, Stemmer U (2022). Differentially Private Learning of Geometric Concepts. SIAM Journal on Computing, 51:4, pages 952-974. Online publication date: 1-Jan-2022.
https://dl.acm.org/doi/10.1137/21M1406428

Nikolakakis K, Kalogerias D, Sheffet O, Sarwate A (2021). Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme. IEEE Journal on Selected Areas in Information Theory, 2:2, pages 534-548. Online publication date: Jun-2021.
https://doi.org/10.1109/JSAIT.2021.3081525

Jung Y, Kim B, Tewari A, Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (2020). On the equivalence between online and private learnability beyond binary classification. Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 16701-16710. Online publication date: 6-Dec-2020.
https://dl.acm.org/doi/10.5555/3495724.3497125

Kaplan H, Mansour Y, Stemmer U, Tsfadia E, Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (2020). Private learning of halfspaces. Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 13976-13985. Online publication date: 6-Dec-2020.
https://dl.acm.org/doi/10.5555/3495724.3496896

Watson L, Mediratta A, Elahi T, Sarkar R (2020). Privacy Preserving Detection of Path Bias Attacks in Tor. Proceedings on Privacy Enhancing Technologies, 2020:4, pages 111-130. Online publication date: 17-Aug-2020.
https://doi.org/10.2478/popets-2020-0065

Gong M, Xie Y, Pan K, Feng K, Qin A (2020). A Survey on Differentially Private Machine Learning [Review Article]. IEEE Computational Intelligence Magazine, 15:2, pages 49-64. Online publication date: May-2020.
https://doi.org/10.1109/MCI.2020.2976185
