Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch

Published: 18 June 2023

Abstract

We consider the problem of computing differentially private approximate histograms and heavy hitters in a stream of elements. In the non-private setting, this is often done using the sketch of Misra and Gries [Science of Computer Programming, 1982]. Chan, Li, Shi, and Xu [PETS 2012] describe a differentially private version of the Misra-Gries sketch, but the amount of noise it adds can be large and scales linearly with the size of the sketch: the more accurate the sketch is, the more noise this approach has to add. We present a better mechanism for releasing a Misra-Gries sketch under (ε,δ)-differential privacy. It adds noise with magnitude independent of the size of the sketch; in fact, the maximum error coming from the noise is the same as the best known in the private non-streaming setting, up to a constant factor. Our mechanism is simple and likely to be practical. We also give a simple post-processing step for the Misra-Gries sketch that does not increase the worst-case error guarantee. It is sufficient to add noise to this new sketch with less than twice the magnitude of the noise in the non-streaming setting. This improves on the previous result for ε-differential privacy, where the noise scales linearly with the size of the sketch.
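For readers unfamiliar with the underlying data structure, the following is a minimal sketch of the standard non-private Misra-Gries algorithm, written from the textbook description rather than taken from the paper. The function name `misra_gries` and the parameter `k` (the number of counters) are illustrative choices. It contains no noise addition or thresholding, so on its own it gives no privacy guarantee.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Non-private Misra-Gries sketch with at most k counters.

    Each stored count underestimates the true frequency by at most
    n / (k + 1), where n is the stream length. Elements that are not
    stored have an implicit estimate of 0.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    counters: Dict[Hashable, int] = {}
    for x in stream:
        if x in counters:
            # Element already tracked: increment its counter.
            counters[x] += 1
        elif len(counters) < k:
            # Free slot available: start tracking the element.
            counters[x] = 1
        else:
            # Sketch is full: decrement every counter and drop zeros.
            # The incoming element itself is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    stream = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
    print(misra_gries(stream, k=2))
```

In this example the stream has n = 10 elements and k = 2 counters, so each estimate may be below the true frequency by at most 10 / 3. The abstract's point is that a private release needs noise on top of these counters, and that the noise magnitude need not grow with k.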

References

[1]
Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi. 2013. Mergeable Summaries. ACM Trans. Database Syst., Vol. 38, 4, Article 26 (dec 2013), 28 pages. https://doi.org/10.1145/2500128
[2]
Apple. [n.d.]. Differential Privacy Overview - Apple. Retrieved April 13, 2023 from https://www.apple.com/privacy/docs/Differential_Privacy_Overview.pdf
[3]
Martin Aumüller, Christian Janos Lebeda, and Rasmus Pagh. 2022. Representing Sparse Vectors with Differential Privacy, Low Error, Optimal Space, and Fast Access. Journal of Privacy and Confidentiality, Vol. 12, 2 (Nov. 2022). https://doi.org/10.29012/jpc.809
[4]
Victor Balcer and Salil Vadhan. 2019. Differential Privacy on Finite Computers. Journal of Privacy and Confidentiality, Vol. 9, 2 (Sep. 2019). https://doi.org/10.29012/jpc.679
[5]
Raef Bassily, Kobbi Nissim, Uri Stemmer, and Abhradeep Guha Thakurta. 2017. Practical locally private heavy hitters. Advances in Neural Information Processing Systems, Vol. 30 (2017).
[6]
Jeremiah Blocki, Elena Grigorescu, Tamalika Mukherjee, and Samson Zhou. 2022. How to Make Your Approximation Algorithm Private: A Black-Box Differentially-Private Transformation for Tunable Approximation Algorithms of Functions with Low Sensitivity. arXiv preprint arXiv:2210.03831 (2022).
[7]
Jonas Böhler and Florian Kerschbaum. 2021. Secure Multi-party Computation of Differentially Private Heavy Hitters. In CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, November 15 - 19, 2021, Yongdae Kim, Jong Kim, Giovanni Vigna, and Elaine Shi (Eds.). ACM, 2361--2377. https://doi.org/10.1145/3460120.3484557
[8]
Prosenjit Bose, Evangelos Kranakis, Pat Morin, and Yihui Tang. 2003. Bounds for Frequency Estimation of Packet Streams. In SIROCCO 10: Proceedings of the 10th International Colloquium on Structural Information Complexity, June 18--20, 2003, Umeå, Sweden (Proceedings in Informatics, Vol. 17), Jop F. Sibeyn (Ed.). Carleton Scientific, 33--42.
[9]
Mark Bun, Jelani Nelson, and Uri Stemmer. 2019. Heavy hitters and the structure of local privacy. ACM Transactions on Algorithms (TALG), Vol. 15, 4 (2019), 1--40.
[10]
Ricardo Silva Carvalho, Ke Wang, Lovedeep Gondara, and Chunyan Miao. 2020. Differentially private top-k selection via stability on unknown domain. In Conference on Uncertainty in Artificial Intelligence. PMLR, 1109--1118.
[11]
T.-H. Hubert Chan, Mingfei Li, Elaine Shi, and Wenchang Xu. 2012. Differentially Private Continual Monitoring of Heavy Hitters from Distributed Streams. IACR Cryptol. ePrint Arch. (2012), 218. http://eprint.iacr.org/2012/218
[12]
Graham Cormode, Cecilia M. Procopiuc, Divesh Srivastava, and Thanh T. L. Tran. 2012. Differentially private summaries for sparse data. In ICDT. ACM, 299--311. https://doi.org/10.1145/2274576.2274608
[13]
David Durfee and Ryan M Rogers. 2019. Practical differentially private top-k selection with pay-what-you-get composition. Advances in Neural Information Processing Systems, Vol. 32 (2019).
[14]
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference. Springer, 265--284.
[15]
Cynthia Dwork and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Found. Trends Theor. Comput. Sci., Vol. 9, 3--4 (2014), 211--407. https://doi.org/10.1561/0400000042
[16]
Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. 2014. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. 1054--1067.
[17]
Quan Geng, Peter Kairouz, Sewoong Oh, and Pramod Viswanath. 2015. The Staircase Mechanism in Differential Privacy. IEEE Journal of Selected Topics in Signal Processing, Vol. 9, 7 (2015), 1176--1184. https://doi.org/10.1109/JSTSP.2015.2425831
[18]
Badih Ghazi, Noah Golowich, Ravi Kumar, Rasmus Pagh, and Ameya Velingker. 2019. On the Power of Multiple Anonymous Messages. IACR Cryptol. ePrint Arch. (2019), 1382. https://eprint.iacr.org/2019/1382
[19]
Arpita Ghosh, Tim Roughgarden, and Mukund Sundararajan. 2012. Universally utility-maximizing privacy mechanisms. SIAM J. Comput., Vol. 41, 6 (2012), 1673--1693.
[20]
Samuel Haney, Damien Desfontaines, Luke Hartman, Ruchit Shrestha, and Michael Hay. 2022. Precision-based attacks and interval refining: how to break, then fix, differential privacy on finite computers. CoRR, Vol. abs/2207.13793 (2022). https://doi.org/10.48550/arXiv.2207.13793
[21]
Aleksandra Korolova, Krishnaram Kenthapadi, Nina Mishra, and Alexandros Ntoulas. 2009. Releasing search queries and clicks privately. In WWW. ACM, 171--180. https://doi.org/10.1145/1526709.1526733
[22]
Darakhshan J. Mir, S. Muthukrishnan, Aleksandar Nikolov, and Rebecca N. Wright. 2011. Pan-private algorithms via statistics on sketches. In Proceedings of the 30th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2011, June 12--16, 2011, Athens, Greece, Maurizio Lenzerini and Thomas Schwentick (Eds.). ACM, 37--48. https://doi.org/10.1145/1989284.1989290
[23]
J. Misra and David Gries. 1982. Finding repeated elements. Science of Computer Programming, Vol. 2, 2 (1982), 143--152. https://doi.org/10.1016/0167-6423(82)90012-0
[24]
Rasmus Pagh and Mikkel Thorup. 2022. Improved Utility Analysis of Private CountSketch. In Advances in Neural Information Processing Systems, Vol. 35. 25631--25643. https://proceedings.neurips.cc/paper_files/paper/2022/file/a47f5cdff1469751597d78e803fc590f-Paper-Conference.pdf
[25]
Gang Qiao, Weijie Su, and Li Zhang. 2021. Oneshot differentially private top-k selection. In International Conference on Machine Learning. PMLR, 8672--8681.
[26]
Zhan Qin, Yin Yang, Ting Yu, Issa Khalil, Xiaokui Xiao, and Kui Ren. 2016. Heavy hitter estimation over set-valued data with local differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. 192--203.
[27]
Jakub Tětek. 2022. Additive Noise Mechanisms for Making Randomized Approximation Algorithms Differentially Private. arXiv preprint arXiv:2211.03695 (2022).
[28]
Tianhao Wang, Ninghui Li, and Somesh Jha. 2019. Locally differentially private heavy hitter identification. IEEE Transactions on Dependable and Secure Computing, Vol. 18, 2 (2019), 982--993.
[29]
Hao Wu and Anthony Wirth. 2022. Asymptotically Optimal Locally Private Heavy Hitters via Parameterized Sketches. In International Conference on Artificial Intelligence and Statistics. PMLR, 7766--7798.
[30]
Dan Zhao, Suyun Zhao, Hong Chen, Ruixuan Liu, Cuiping Li, and Wenjuan Liang. 2022b. Efficient protocols for heavy hitter identification with local differential privacy. Frontiers of Computer Science, Vol. 16, 5 (2022), 1--11.
[31]
Fuheng Zhao, Dan Qiao, Rachel Redberg, Divyakant Agrawal, Amr El Abbadi, and Yu-Xiang Wang. 2022a. Differentially Private Linear Sketches: Efficient Implementations and Applications. In Advances in Neural Information Processing Systems, Vol. 35. 12691--12704. https://proceedings.neurips.cc/paper_files/paper/2022/file/525338e0d98401a62950bc7c454eb83d-Paper-Conference.pdf
[32]
Wennan Zhu, Peter Kairouz, Brendan McMahan, Haicheng Sun, and Wei Li. 2020. Federated heavy hitters discovery with differential privacy. In International Conference on Artificial Intelligence and Statistics. PMLR, 3837--3847.

Cited By

  • (2024) Streaming Algorithms with Few State Changes. Proceedings of the ACM on Management of Data 2:2 (1-28). DOI: 10.1145/3651145. Online publication date: 14-May-2024
  • (2024) DPSW-Sketch: A Differentially Private Sketch Framework for Frequency Estimation over Sliding Windows. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3255-3266). DOI: 10.1145/3637528.3671694. Online publication date: 25-Aug-2024
  • (2024) Differential Private Histogram Publication with Background Knowledge. 2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC) (1148-1156). DOI: 10.1109/COMPSAC61105.2024.00154. Online publication date: 2-Jul-2024

Published In

PODS '23: Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2023
392 pages
ISBN:9798400701276
DOI:10.1145/3584372

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. approximate histograms
  2. differential privacy
  3. heavy hitters
  4. misra-gries sketch
  5. streaming algorithms

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '23

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%
