Heavy Hitters and the Structure of Local Privacy

Published: 04 October 2019

Abstract

We present a new locally differentially private algorithm for the heavy hitters problem that achieves optimal worst-case error as a function of all standardly considered parameters. Prior work obtained error rates that depend optimally on the number of users, the size of the domain, and the privacy parameter but depend sub-optimally on the failure probability.
We strengthen existing lower bounds on the error to incorporate the failure probability and show that our new upper bound is tight with respect to this parameter as well. Our lower bound is based on a new understanding of the structure of locally private protocols. We further develop these ideas to obtain the following general results beyond heavy hitters.
• Advanced Grouposition: In the local model, group privacy for k users degrades proportionally to ≈√k instead of linearly in k as in the central model. Stronger group privacy yields improved max-information guarantees, as well as stronger lower bounds (via “packing arguments”), over the central model.
• Building on a transformation of Bassily and Smith (STOC 2015), we give a generic transformation from any non-interactive approximate-private local protocol into a pure-private local protocol. Again in contrast with the central model, this shows that we cannot obtain more accurate algorithms by moving from pure to approximate local privacy.
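
To make the "≈√k instead of linearly in k" comparison concrete, the following minimal sketch contrasts the two growth rates numerically. The closed form used for the local-model bound, ε' = kε²/2 + ε√(2k ln(1/δ)), is an assumption modeled on the familiar advanced-composition shape; its constants should be checked against the theorem in the paper. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def basic_group_epsilon(eps: float, k: int) -> float:
    """Standard group privacy: the parameter grows linearly in the group size k."""
    return k * eps


def advanced_group_epsilon(eps: float, k: int, delta: float) -> float:
    """Assumed sqrt(k)-type bound: k*eps^2/2 + eps*sqrt(2*k*ln(1/delta)).

    The exact constants are an assumption, not taken from the abstract.
    The resulting guarantee is approximate, i.e. it holds with slack delta.
    """
    return k * eps ** 2 / 2 + eps * math.sqrt(2 * k * math.log(1 / delta))


if __name__ == "__main__":
    eps, delta = 0.1, 1e-6  # illustrative values only
    for k in (1, 10, 100, 1000, 10000):
        linear = basic_group_epsilon(eps, k)
        sqrt_like = advanced_group_epsilon(eps, k, delta)
        print(f"k={k:>5}  linear={linear:9.3f}  sqrt-type={sqrt_like:9.3f}")
```

Under these assumptions the √k-type bound is looser than kε for very small groups, because of the ln(1/δ) factor, and becomes the smaller of the two once k is moderately large.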

References

[1]
N. Alon and F. R. K. Chung. 1988. Explicit construction of linear sized tolerant networks. Discr. Math. 72, 1--3 (1988), 15--19.
[2]
N. Alon and J. Spencer. 1992. The Probabilistic Method. John Wiley.
[3]
R. Bassily, K. Nissim, A. D. Smith, T. Steinke, U. Stemmer, and J. Ullman. 2016. Algorithmic stability for adaptive data analysis. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC’16). 1046--1059.
[4]
R. Bassily, K. Nissim, U. Stemmer, and A. Thakurta. 2017. Practical locally private heavy hitters. In Proceedings of the Annual Conference on Advances in Neural Information Processing Systems (NIPS’17).
[5]
R. Bassily and A. D. Smith. 2015. Local, private, efficient protocols for succinct histograms. In Proceedings of the 47th Annual ACM on Symposium on Theory of Computing (STOC’15). 127--135.
[6]
M. Bun and T. Steinke. 2016. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Proceedings of the 14th International Conference Theory of Cryptography (TCC’16-B). 635--658.
[7]
T. H. Chan, E. Shi, and D. Song. 2012. Optimal lower bound for differentially private multi-party aggregation. In Proceedings of the 20th Annual European Symposium on Algorithms (ESA’12). 277--288.
[8]
J. C. Duchi, M. I. Jordan, and M. J. Wainwright. 2013. Local privacy and statistical minimax rates. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS’13). 429--438.
[9]
C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth. 2015. Generalization in adaptive data analysis and holdout reuse. In Proceedings of the Annual Conference on Advances in Neural Information Processing Systems (NIPS’15).
[10]
C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth. 2015. Preserving statistical validity in adaptive data analysis. In Proceedings of the ACM Symposium on the Theory of Computing (STOC’15). ACM.
[11]
C. Dwork, F. McSherry, K. Nissim, and A. Smith. 2006. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Third Theory of Cryptography Conference. 265--284.
[12]
C. Dwork and A. Roth. 2014. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9, 3-4 (2014), 211--407.
[13]
C. Dwork, G. N. Rothblum, and S. P. Vadhan. 2010. Boosting and differential privacy. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS’10). IEEE Computer Society, 51--60.
[14]
Ú. Erlingsson, V. Pihur, and A. Korolova. 2014. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the ACM Conference on Computer and Communications Security (CCS’14).
[15]
W. Feller. 1943. Generalization of a probability limit theorem of Cramér. Trans. Am. Math. Soc. 54, 3 (1943), 361--372.
[16]
A. C. Gilbert, Y. Li, E. Porat, and M. J. Strauss. 2014. For-all sparse recovery in near-optimal time. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming (ICALP’14). 538--550.
[17]
V. Guruswami. 2001. List Decoding of Error-Correcting Codes. Ph.D. Dissertation, Massachusetts Institute of Technology, 2001.
[18]
V. Guruswami and P. Indyk. 2001. Expander-based constructions of efficiently decodable codes. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science (FOCS’01). 658--667.
[19]
J. Hsu, S. Khanna, and A. Roth. 2012. Distributed private heavy hitters. In Proceedings of the 39th International Colloquium on Automata, Languages, and Programming (ICALP’12). 461--472.
[20]
D. M. Kane, J. Nelson, E. Porat, and D. P. Woodruff. 2011. Fast moment estimation in data streams in optimal space. In Proceedings of the Annual ACM Symposium on the Theory of Computing (STOC’11). ACM, 745--754.
[21]
H. Kaplan and U. Stemmer. 2018. Differentially private k-means with constant multiplicative error. In Proceedings of the Annual Conference on Neural Information Processing Systems 2018 (NeurIPS’18). 5436--5446.
[22]
S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith. 2011. What can we learn privately? SIAM J. Comput. 40, 3 (2011), 793--826.
[23]
S. P. Kasiviswanathan and A. D. Smith. 2008. A note on differential privacy: Defining resistance to arbitrary side information. CoRR (2008), abs/0803.3946.
[24]
P. N. Klein and N. E. Young. 2015. On the number of iterations for Dantzig-Wolfe optimization and packing-covering approximation algorithms. SIAM J. Comput. 44, 4 (2015), 1154--1172.
[25]
K. G. Larsen, J. Nelson, H. L. Nguyen, and M. Thorup. 2016. Heavy hitters via cluster-preserving clustering. In Proceedings of the 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS’16). 61--70.
[26]
J. Matoušek. 2001. Lower bound on the minus-domination number. Discr. Math. 233, 1 (2001), 361--370.
[27]
N. Mishra and M. Sandler. 2006. Privacy via pseudorandom sketches. In Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’06). ACM, New York, NY, 143--152.
[28]
M. Mitzenmacher and E. Upfal. 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York, NY.
[29]
K. Nissim and U. Stemmer. 2018. Clustering algorithms for the centralized and local models. In Proceedings of the Annual Conference on Algorithmic Learning Theory (ALT’18). 619--653.
[30]
Z. Qin, Y. Yang, T. Yu, I. Khalil, X. Xiao, and K. Ren. 2016. Heavy hitter estimation over set-valued data with local differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16). ACM, New York, NY, 192--203.
[31]
O. Reingold, S. P. Vadhan, and A. Wigderson. 2000. Entropy waves, the zig-zag graph product, and new constant-degree expanders and extractors. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS’00). IEEE Computer Society, 3--13.
[32]
R. M. Rogers, A. Roth, A. D. Smith, and O. Thakkar. 2016. Max-information, differential privacy, and post-selection hypothesis testing. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS’16). 487--494.
[33]
R. M. Rogers, A. Roth, A. D. Smith, and O. Thakkar. 2016. Max-information, differential privacy, and post-selection hypothesis testing. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS’16). 487--494.
[34]
J. P. Schmidt, A. Siegel, and A. Srinivasan. 1995. Chernoff-Hoeffding bounds for applications with limited independence. SIAM J. Discr. Math. 8, 2 (1995), 223--250.
[35]
A. Smith, A. Thakurta, and J. Upadhyay. 2017. Is interaction necessary for distributed private learning? In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP’17). 58--77.
[36]
D. A. Spielman. 1996. Linear-time encodable and decodable error-correcting codes. IEEE Trans. Inf. Theory 42, 6 (1996), 1723--1731.
[37]
U. Stemmer. 2019. Locally private k-means clustering. CoRR (2019), abs/1907.02513.
[38]
A. Thakurta, A. Vyrros, U. Vaishampayan, G. Kapoor, J. Freudiger, V. Sridhar, and D. Davidson. 2017. Learning new words. US Patent 9594741 (2017).
[39]
S. Vadhan. 2016. The Complexity of Differential Privacy. https://privacytools.seas.harvard.edu/publications/complexity-differential-privacy.

Information & Contributors

Information

Published In

ACM Transactions on Algorithms, Volume 15, Issue 4
October 2019
297 pages
ISSN:1549-6325
EISSN:1549-6333
DOI:10.1145/3351875
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 October 2019
Accepted: 01 July 2019
Revised: 01 April 2019
Received: 01 June 2018
Published in TALG Volume 15, Issue 4

Author Tags

  1. Differential privacy
  2. heavy hitters
  3. local model

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Google Research Fellowship
  • Alfred P. Sloan Research Fellowship
  • DORECG
  • Google Faculty Research Award
  • NSF
  • CAREER
  • ONR Young Investigator

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 362
  • Downloads (Last 6 weeks): 34
Reflects downloads up to 30 Aug 2024

Cited By

  • (2024) AAA: An Adaptive Mechanism for Locally Differentially Private Mean Estimation. Proceedings of the VLDB Endowment 17:8 (1843-1855). DOI: 10.14778/3659437.3659442. Online publication date: 1-Apr-2024
  • (2024) Locally Private Set-Valued Data Analyses: Distribution and Heavy Hitters Estimation. IEEE Transactions on Mobile Computing 23:8 (8050-8065). DOI: 10.1109/TMC.2023.3342056. Online publication date: 1-Aug-2024
  • (2024) Frequency Estimation Mechanisms Under (ϵ, δ)-Utility-Optimized Local Differential Privacy. IEEE Transactions on Emerging Topics in Computing 12:1 (316-327). DOI: 10.1109/TETC.2023.3238839. Online publication date: Jan-2024
  • (2024) Training Generative Models From Privatized Data via Entropic Optimal Transport. IEEE Journal on Selected Areas in Information Theory 5 (221-235). DOI: 10.1109/JSAIT.2024.3387463. Online publication date: 2024
  • (2024) PAC learning halfspaces in non-interactive local differential privacy model with public unlabeled data. Journal of Computer and System Sciences 141 (103496). DOI: 10.1016/j.jcss.2023.103496. Online publication date: May-2024
  • (2023) Constant matters. Proceedings of the 40th International Conference on Machine Learning (10072-10092). DOI: 10.5555/3618408.3618812. Online publication date: 23-Jul-2023
  • (2023) Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch. Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (79-88). DOI: 10.1145/3584372.3588673. Online publication date: 18-Jun-2023
  • (2022) STAR. Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (697-710). DOI: 10.1145/3548606.3560631. Online publication date: 7-Nov-2022
  • (2022) Randomize the Future: Asymptotically Optimal Locally Private Frequency Estimation Protocol for Longitudinal Data. Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (237-249). DOI: 10.1145/3517804.3526226. Online publication date: 12-Jun-2022
  • (2022) Frequent Itemset Mining with Local Differential Privacy. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (1146-1155). DOI: 10.1145/3511808.3557327. Online publication date: 17-Oct-2022
