Heavy Hitters and the Structure of Local Privacy

Authors:

Uri Stemmer

ACM Transactions on Algorithms (TALG), Volume 15, Issue 4

Article No.: 51, Pages 1 - 40

https://doi.org/10.1145/3344722

Published: 04 October 2019

Abstract

We present a new locally differentially private algorithm for the heavy hitters problem that achieves optimal worst-case error as a function of all standardly considered parameters. Prior work obtained error rates that depend optimally on the number of users, the size of the domain, and the privacy parameter but depend sub-optimally on the failure probability.

We strengthen existing lower bounds on the error to incorporate the failure probability and show that our new upper bound is tight with respect to this parameter as well. Our lower bound is based on a new understanding of the structure of locally private protocols. We further develop these ideas to obtain the following general results beyond heavy hitters.

• Advanced Grouposition: In the local model, group privacy for k users degrades proportionally to ≈√k instead of linearly in k as in the central model. Stronger group privacy yields improved max-information guarantees, as well as stronger lower bounds (via “packing arguments”), over the central model.

• Building on a transformation of Bassily and Smith (STOC 2015), we give a generic transformation from any non-interactive approximate-private local protocol into a pure-private local protocol. Again in contrast with the central model, this shows that we cannot obtain more accurate algorithms by moving from pure to approximate local privacy.
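
The √k scaling in the Advanced Grouposition bullet can be made concrete with a small numerical comparison. The sketch below assumes the local-model bound takes the advanced-composition-style form ε' = kε²/2 + ε√(2k ln(1/δ)). The abstract itself states only the ≈√k scaling, so this exact expression, the function names, and the parameter values are assumptions for illustration, not the article's stated theorem.

```python
import math


def central_group_epsilon(eps: float, k: int) -> float:
    """Standard group privacy: the privacy parameter degrades linearly in k."""
    return k * eps


def local_group_epsilon(eps: float, k: int, delta: float) -> float:
    """Assumed advanced-composition-style bound for k users in the local model.

    eps' = k * eps^2 / 2 + eps * sqrt(2 * k * ln(1 / delta))
    The dominant term grows like sqrt(k) when eps is small.
    """
    return k * eps ** 2 / 2 + eps * math.sqrt(2 * k * math.log(1 / delta))


if __name__ == "__main__":
    eps, delta = 0.1, 1e-6  # illustrative values only
    for k in (1, 10, 100, 1000):
        linear = central_group_epsilon(eps, k)
        local = local_group_epsilon(eps, k, delta)
        print(f"k={k:5d}  linear={linear:8.3f}  sqrt-style={local:8.3f}")
```

Under these assumed parameters the √k-style expression is larger than kε for very small groups and becomes the smaller of the two only as k grows, which is the regime the bullet describes.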

References

[1]

N. Alon and F. R. K. Chung. 1988. Explicit construction of linear sized tolerant networks. Discr. Math. 72, 1--3 (1988), 15--19.

[2]

N. Alon and J. Spencer. 1992. The Probabilistic Method. John Wiley.

[3]

R. Bassily, K. Nissim, A. D. Smith, T. Steinke, U. Stemmer, and J. Ullman. 2016. Algorithmic stability for adaptive data analysis. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC’16). 1046--1059.

[4]

R. Bassily, K. Nissim, U. Stemmer, and A. Thakurta. 2017. Practical locally private heavy hitters. In Proceedings of the Annual Conference on Advances in Neural Information Processing Systems (NIPS’17).

[5]

R. Bassily and A. D. Smith. 2015. Local, private, efficient protocols for succinct histograms. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC’15). 127--135.

[6]

M. Bun and T. Steinke. 2016. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Proceedings of the 14th International Conference on Theory of Cryptography (TCC’16-B). 635--658.

[7]

T. H. Chan, E. Shi, and D. Song. 2012. Optimal lower bound for differentially private multi-party aggregation. In Proceedings of the 20th Annual European Symposium on Algorithms (ESA’12). 277--288.

[8]

J. C. Duchi, M. I. Jordan, and M. J. Wainwright. 2013. Local privacy and statistical minimax rates. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS’13). 429--438.

[9]

C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth. 2015. Generalization in adaptive data analysis and holdout reuse. In Proceedings of the Annual Conference on Advances in Neural Information Processing Systems (NIPS’15).

[10]

C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth. 2015. Preserving statistical validity in adaptive data analysis. In Proceedings of the ACM Symposium on the Theory of Computing (STOC’15). ACM.

[11]

C. Dwork, F. McSherry, K. Nissim, and A. Smith. 2006. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Third Theory of Cryptography Conference. 265--284.

[12]

C. Dwork and A. Roth. 2014. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9, 3-4 (2014), 211--407.

[13]

C. Dwork, G. N. Rothblum, and S. P. Vadhan. 2010. Boosting and differential privacy. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS’10). IEEE Computer Society, 51--60.

[14]

Ú. Erlingsson, V. Pihur, and A. Korolova. 2014. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the ACM Conference on Computer and Communications Security (CCS’14).

[15]

W. Feller. 1943. Generalization of a probability limit theorem of Cramér. Trans. Am. Math. Soc. 54, 3 (1943), 361--372.

[16]

A. C. Gilbert, Y. Li, E. Porat, and M. J. Strauss. 2014. For-all sparse recovery in near-optimal time. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming (ICALP’14). 538--550.

[17]

V. Guruswami. 2001. List Decoding of Error-Correcting Codes. Ph.D. Dissertation, Massachusetts Institute of Technology, 2001.

[18]

V. Guruswami and P. Indyk. 2001. Expander-based constructions of efficiently decodable codes. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science (FOCS’01). 658--667.

[19]

J. Hsu, S. Khanna, and A. Roth. 2012. Distributed private heavy hitters. In Proceedings of the 39th International Colloquium on Automata, Languages, and Programming (ICALP’12). 461--472.

[20]

D. M. Kane, J. Nelson, E. Porat, and D. P. Woodruff. 2011. Fast moment estimation in data streams in optimal space. In Proceedings of the Annual ACM Symposium on the Theory of Computing (STOC’11). ACM, 745--754.

[21]

H. Kaplan and U. Stemmer. 2018. Differentially private k-means with constant multiplicative error. In Proceedings of the Annual Conference on Neural Information Processing Systems 2018 (NeurIPS’18). 5436--5446.

[22]

S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith. 2011. What can we learn privately? SIAM J. Comput. 40, 3 (2011), 793--826.

[23]

S. P. Kasiviswanathan and A. D. Smith. 2008. A note on differential privacy: Defining resistance to arbitrary side information. CoRR (2008), abs/0803.3946.

[24]

P. N. Klein and N. E. Young. 2015. On the number of iterations for Dantzig-Wolfe optimization and packing-covering approximation algorithms. SIAM J. Comput. 44, 4 (2015), 1154--1172.

[25]

K. G. Larsen, J. Nelson, H. L. Nguyen, and M. Thorup. 2016. Heavy hitters via cluster-preserving clustering. In Proceedings of the 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS’16). 61--70.

[26]

J. Matoušek. 2001. Lower bound on the minus-domination number. Discr. Math. 233, 1 (2001), 361--370.

[27]

N. Mishra and M. Sandler. 2006. Privacy via pseudorandom sketches. In Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’06). ACM, New York, NY, 143--152.

[28]

M. Mitzenmacher and E. Upfal. 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York, NY.

[29]

K. Nissim and U. Stemmer. 2018. Clustering algorithms for the centralized and local models. In Proceedings of the Annual Conference on Algorithmic Learning Theory (ALT’18). 619--653.

[30]

Z. Qin, Y. Yang, T. Yu, I. Khalil, X. Xiao, and K. Ren. 2016. Heavy hitter estimation over set-valued data with local differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16). ACM, New York, NY, 192--203.

[31]

O. Reingold, S. P. Vadhan, and A. Wigderson. 2000. Entropy waves, the zig-zag graph product, and new constant-degree expanders and extractors. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS’00). IEEE Computer Society, 3--13.

[32]

R. M. Rogers, A. Roth, A. D. Smith, and O. Thakkar. 2016. Max-information, differential privacy, and post-selection hypothesis testing. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS’16). 487--494.

[33]

R. M. Rogers, A. Roth, A. D. Smith, and O. Thakkar. 2016. Max-information, differential privacy, and post-selection hypothesis testing. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS’16). 487--494.

[34]

J. P. Schmidt, A. Siegel, and A. Srinivasan. 1995. Chernoff-Hoeffding bounds for applications with limited independence. SIAM J. Discr. Math. 8, 2 (1995), 223--250.

[35]

A. Smith, A. Thakurta, and J. Upadhyay. 2017. Is interaction necessary for distributed private learning? In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP’17). 58--77.

[36]

D. A. Spielman. 1996. Linear-time encodable and decodable error-correcting codes. IEEE Trans. Inf. Theory 42, 6 (1996), 1723--1731.

[37]

U. Stemmer. 2019. Locally private k-means clustering. CoRR (2019), abs/1907.02513.

[38]

A. Thakurta, A. Vyrros, U. Vaishampayan, G. Kapoor, J. Freudiger, V. Sridhar, and D. Davidson. 2017. Learning new words. US Patent 9594741 (2017).

[39]

S. Vadhan. 2016. The Complexity of Differential Privacy. https://privacytools.seas.harvard.edu/publications/complexity-differential-privacy.

Index Terms

Heavy Hitters and the Structure of Local Privacy
1. Theory of computation
  1. Design and analysis of algorithms
    1. Streaming, sublinear and near linear time algorithms
      1. Sketching and sampling
  2. Theory and algorithms for application domains
    1. Database theory
      1. Theory of database privacy and security

Published In

ACM Transactions on Algorithms Volume 15, Issue 4

October 2019

297 pages

ISSN:1549-6325

EISSN:1549-6333

DOI:10.1145/3351875

Editor:
Aravind Srinivasan
University of Maryland, USA

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 October 2019

Accepted: 01 July 2019

Revised: 01 April 2019

Received: 01 June 2018

Published in TALG Volume 15, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

Google Research Fellowship
Alfred P. Sloan Research Fellowship
DORECG
Google Faculty Research Award
NSF
CAREER
ONR Young Investigator

Cited By

Li H, Navot S, Tessaro S, Balzarotti D, Xu W (2024). POPSTAR. Proceedings of the 33rd USENIX Conference on Security Symposium, 6939-6956. Online publication date: 14-Aug-2024.
https://dl.acm.org/doi/10.5555/3698900.3699288
Wei F, Bao E, Xiao X, Yang Y, Ding B (2024). AAA: An Adaptive Mechanism for Locally Differentially Private Mean Estimation. Proceedings of the VLDB Endowment 17:8, 1843-1855. Online publication date: 1-Apr-2024.
https://dl.acm.org/doi/10.14778/3659437.3659442
Wang S, Li Y, Zhong Y, Chen K, Wang X, Zhou Z, Peng F, Qian Y, Du J, Yang W (2024). Locally Private Set-Valued Data Analyses: Distribution and Heavy Hitters Estimation. IEEE Transactions on Mobile Computing 23:8, 8050-8065. Online publication date: 1-Aug-2024.
https://dl.acm.org/doi/10.1109/TMC.2023.3342056
Zhang Y, Zhu Y, Zhou Y, Yuan J (2024). Frequency Estimation Mechanisms Under (ϵ,δ)-Utility-Optimized Local Differential Privacy. IEEE Transactions on Emerging Topics in Computing 12:1, 316-327. Online publication date: Jan-2024.
https://doi.org/10.1109/TETC.2023.3238839
Rathee M, Zhang Y, Corrigan-Gibbs H, Ada Popa R (2024). Private Analytics via Streaming, Sketching, and Silently Verifiable Proofs. 2024 IEEE Symposium on Security and Privacy (SP), 3072-3090. Online publication date: 19-May-2024.
https://doi.org/10.1109/SP54263.2024.00245
Reshetova D, Chen W, Özgür A (2024). Training Generative Models From Privatized Data via Entropic Optimal Transport. IEEE Journal on Selected Areas in Information Theory 5, 221-235. Online publication date: 2024.
https://doi.org/10.1109/JSAIT.2024.3387463
Su J, Xu J, Wang D (2024). PAC learning halfspaces in non-interactive local differential privacy model with public unlabeled data. Journal of Computer and System Sciences 141, 103496. Online publication date: May-2024.
https://doi.org/10.1016/j.jcss.2023.103496
Tullii M, Gaucher S, Richard H, Diemert E, Perchet V, Rakotomamonjy A, Calauzènes C, Vono M (2024). Open Research Challenges for Private Advertising Systems Under Local Differential Privacy. Web Information Systems Engineering – WISE 2024, 107-122. Online publication date: 2-Dec-2024.
https://dl.acm.org/doi/10.1007/978-981-96-0576-7_9
Fichtenberger H, Henzinger M, Upadhyay J, Krause A, Brunskill E, Cho K, Engelhardt B, Sabato S, Scarlett J (2023). Constant matters. Proceedings of the 40th International Conference on Machine Learning, 10072-10092. Online publication date: 23-Jul-2023.
https://dl.acm.org/doi/10.5555/3618408.3618812
Lebeda C, Tetek J, Geerts F, Ngo H, Sintos S (2023). Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch. Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 79-88. Online publication date: 18-Jun-2023.
https://dl.acm.org/doi/10.1145/3584372.3588673