research-article
Open access

Continual Observation of Joins under Differential Privacy

Published: 30 May 2024

Abstract

The problem of continual observation under differential privacy has been studied extensively in the literature. However, all existing works, with the exception of [28,51], have only studied the simple counting query and its derivatives. Join queries, which are arguably the most important class of queries in relational databases, have only been considered in [28,51], but the solutions offered there have two limitations: First, they only support a few specific graph pattern queries, which are special cases of joins. Second, they require hard degree/frequency constraints on the graph/database instance, and the privatized query answers have errors proportional to these constraints.
In this paper, we propose a new differentially private mechanism for continual observation of joins that overcomes these two limitations. Our mechanism supports arbitrary joins and predicates, and does not require any constraints to be given in advance, even over an infinite stream. More importantly, it yields an error that is proportional to the actual maximum degree/frequencies in the graph/database instance at the current time of observation. Such an instance-specific utility guarantee is much preferred for the continual observation problem, where the database size and the query answer may change significantly over time.
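
For readers unfamiliar with the "simple counting query" baseline that the abstract contrasts against, the following is a minimal sketch of the classic binary (tree-based) mechanism for continual counting in the style of [10, 23]. It is not the join mechanism proposed in this paper. The function names, the fixed stream length, and the choice of Laplace scale (number of tree levels divided by epsilon) are illustrative assumptions.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def binary_mechanism(stream, epsilon, rng=None):
    """Release a noisy running count after every item of a 0/1 stream.

    Assumption: the stream length T is known up front. Each item is
    covered by at most `levels` dyadic partial sums, so adding
    Laplace(levels / epsilon) noise to each partial sum gives
    epsilon-DP for the whole output sequence.
    """
    rng = rng or random.Random()
    T = len(stream)
    levels = max(1, T.bit_length())   # levels in the dyadic tree
    scale = levels / epsilon
    exact = [0] * levels              # exact partial sums, one per level
    noisy = [0.0] * levels            # their noisy counterparts
    outputs = []
    for t, x in enumerate(stream, start=1):
        i = (t & -t).bit_length() - 1  # index of the lowest set bit of t
        # Merge all lower-level partial sums plus the new item into level i.
        exact[i] = sum(exact[:i]) + x
        for j in range(i):
            exact[j] = 0
            noisy[j] = 0.0
        noisy[i] = exact[i] + laplace(scale, rng)
        # The prefix [1, t] is the union of the blocks at the set bits of t.
        outputs.append(sum(noisy[j] for j in range(levels) if (t >> j) & 1))
    return outputs


if __name__ == "__main__":
    counts = binary_mechanism([1, 0, 1, 1, 0, 1, 1, 1], epsilon=1.0,
                              rng=random.Random(0))
    print([round(c, 2) for c in counts])
```

Each released count sums at most `levels` noisy terms, so the error grows only polylogarithmically in the stream length. For joins, a single tuple can change the query answer by an amount that depends on degrees/frequencies, which is why this counting baseline does not carry over directly.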

References

[1]
Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of databases. Vol. 8. Addison-Wesley Reading.
[2]
Mahmoud Abo Khamis, Hung Q Ngo, and Dan Suciu. 2017. What do Shannon-type inequalities, submodular width, and disjunctive datalog have to do with one another? In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 429--444.
[3]
Myrto Arapinis, Diego Figueira, and Marco Gaboardi. 2016. Sensitivity of Counting Queries. In International Colloquium on Automata, Languages, and Programming (ICALP).
[4]
Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. 2013. Differentially private data analysis of social networks via restricted sensitivity. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science. 87--96.
[5]
Jean Bolot, Nadia Fawaz, Shanmugavelayutham Muthukrishnan, Aleksandar Nikolov, and Nina Taft. 2013. Private decayed predicate sums on streams. In Proceedings of the 16th International Conference on Database Theory. 284--295.
[6]
Kuntai Cai, Xiaokui Xiao, and Graham Cormode. 2023. Privlava: synthesizing relational data with foreign keys under differential privacy. Proceedings of the ACM on Management of Data, Vol. 1, 2 (2023), 1--25.
[7]
Paris Carbone, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. 2015. Apache flink: Stream and batch processing in a single engine. The Bulletin of the Technical Committee on Data Engineering, Vol. 38, 4 (2015).
[8]
Adrian Rivera Cardoso and Ryan Rogers. 2022. Differentially private histograms under continual observation: Streaming selection into the unknown. In International Conference on Artificial Intelligence and Statistics. PMLR, 2397--2419.
[9]
T-H Hubert Chan, Mingfei Li, Elaine Shi, and Wenchang Xu. 2012. Differentially private continual monitoring of heavy hitters from distributed streams. In International Symposium on Privacy Enhancing Technologies Symposium. Springer, 140--159.
[10]
T.-H. Hubert Chan, Elaine Shi, and Dawn Song. 2011. Private and Continual Release of Statistics. ACM Transactions on Information and System Security (2011).
[11]
Badrish Chandramouli, Jonathan Goldstein, Mike Barnett, Robert DeLine, Danyel Fisher, John C Platt, James F Terwilliger, and John Wernsing. 2014. Trill: A high-performance incremental query processor for diverse analytics. Proceedings of the VLDB Endowment, Vol. 8, 4 (2014), 401--412.
[12]
Shixi Chen and Shuigeng Zhou. 2013. Recursive mechanism: towards node differential privacy and unrestricted joins. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. 653--664.
[13]
Yan Chen, Ashwin Machanavajjhala, Michael Hay, and Gerome Miklau. 2017. Pegasus: Data-adaptive differentially private stream processing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 1375--1388.
[14]
Rachel Cummings, Sara Krehbiel, Kevin A Lai, and Uthaipon Tantipongpipat. 2018. Differential privacy for growing databases. Advances in Neural Information Processing Systems, Vol. 31 (2018).
[15]
Sergey Denisov, Brendan McMahan, Keith Rush, Adam Smith, and Abhradeep Thakurta. 2022. Improved differential privacy for sgd via optimal private linear operators on adaptive streams. In NeurIPS.
[16]
Wei Dong, Juanru Fang, Ke Yi, Yuchao Tao, and Ashwin Machanavajjhala. 2022. R2T: Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys. In Proc. ACM SIGMOD International Conference on Management of Data.
[17]
Wei Dong, Qiyao Luo, and Ke Yi. 2023a. Continual Observation under User-level Differential Privacy. In 2023 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, 2190--2207.
[18]
Wei Dong, Dajun Sun, and Ke Yi. 2023b. Better than Composition: How to Answer Multiple Relational Queries under Differential Privacy. Proceedings of the ACM on Management of Data, Vol. 1, 2 (2023), 1--26.
[19]
Wei Dong and Ke Yi. 2021. Residual Sensitivity for Differentially Private Multi-Way Joins. In Proc. ACM SIGMOD International Conference on Management of Data.
[20]
Wei Dong and Ke Yi. 2022. A Nearly Instance-optimal Differentially Private Mechanism for Conjunctive Queries. In Proc. ACM Symposium on Principles of Database Systems.
[21]
Wei Dong and Ke Yi. 2023a. Query Evaluation under Differential Privacy. ACM SIGMOD Record, Vol. 52, 3 (2023), 6--17.
[22]
Wei Dong and Ke Yi. 2023b. Universal private estimators. In Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 195--206.
[23]
Cynthia Dwork, Moni Naor, Toniann Pitassi, and Guy N Rothblum. 2010. Differential privacy under continual observation. In Proceedings of the forty-second ACM symposium on Theory of computing. 715--724.
[24]
Cynthia Dwork, Moni Naor, Omer Reingold, and Guy N Rothblum. 2015. Pure differential privacy for rectangle queries via private partitions. In International Conference on the Theory and Application of Cryptology and Information Security. Springer, 735--751.
[25]
Cynthia Dwork, Moni Naor, Omer Reingold, Guy N Rothblum, and Salil Vadhan. 2009. On the complexity of differentially private data release: efficient algorithms and hardness results. In Proceedings of the forty-first annual ACM symposium on Theory of computing. 381--390.
[26]
Cynthia Dwork and Aaron Roth. 2014. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, Vol. 9, 3--4 (2014), 211--407.
[27]
Juanru Fang, Wei Dong, and Ke Yi. 2022. Shifted Inverse: A General Mechanism for Monotonic Functions under User Differential Privacy. (2022).
[28]
Hendrik Fichtenberger, Monika Henzinger, and Wolfgang Ost. 2021. Differentially Private Algorithms for Graphs Under Continual Observation. In 29th Annual European Symposium on Algorithms (ESA 2021). Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
[29]
Georg Gottlob, Stephanie Tien Lee, Gregory Valiant, and Paul Valiant. 2012. Size and treewidth bounds for conjunctive queries. Journal of the ACM (JACM), Vol. 59, 3 (2012), 1--35.
[30]
Monika Henzinger and Jalaj Upadhyay. 2022. Constant matters: Fine-grained Complexity of Differentially Private Continual Observation Using Completely Bounded Norms. arXiv preprint arXiv:2202.11205 (2022).
[31]
Monika Henzinger, Jalaj Upadhyay, and Sarvagya Upadhyay. 2022. Almost tight error bounds on differentially private continual counting. arXiv preprint arXiv:2211.05006 (2022).
[32]
Noah Johnson, Joseph P Near, and Dawn Song. 2018. Towards practical differential privacy for SQL queries. Proceedings of the VLDB Endowment, Vol. 11, 5 (2018), 526--539.
[33]
Peter Kairouz, Brendan McMahan, Shuang Song, Om Thakkar, Abhradeep Thakurta, and Zheng Xu. 2021. Practical and private (deep) learning without sampling or shuffling. In International Conference on Machine Learning. PMLR, 5213--5225.
[34]
Vishesh Karwa, Sofya Raskhodnikova, Adam Smith, and Grigory Yaroslavtsev. 2011. Private analysis of graph structure. Proceedings of the VLDB Endowment, Vol. 4, 11 (2011), 1146--1157.
[35]
Shiva Prasad Kasiviswanathan, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. 2013. Analyzing graphs with node differential privacy. In Theory of Cryptography Conference. Springer, 457--476.
[36]
Ios Kotsogiannis, Yuchao Tao, Xi He, Maryam Fanaeepour, Ashwin Machanavajjhala, Michael Hay, and Gerome Miklau. 2019. PrivateSQL: a differentially private SQL query engine. Proceedings of the VLDB Endowment, Vol. 12, 11 (2019), 1371--1384.
[37]
Jérôme Kunegis. 2013. Konect: the koblenz network collection. In Proceedings of the 22nd international conference on world wide web. 1343--1350.
[38]
Jure Leskovec and Andrej Krevl. 2014. SNAP: Stanford network analysis project.
[39]
Jure Leskovec and Andrej Krevl. 2016. SNAP datasets: Stanford large network dataset collection (2014). URL http://snap.stanford.edu/data (2016), 49.
[40]
Michael Ley. 2002. The DBLP computer science bibliography: Evolution, research issues, perspectives. In International symposium on string processing and information retrieval. Springer, 1--10.
[41]
Chao Li, Gerome Miklau, Michael Hay, Andrew McGregor, and Vibhor Rastogi. 2015. The matrix mechanism: optimizing linear counting queries under differential privacy. The VLDB journal, Vol. 24, 6 (2015), 757--781.
[42]
Frank D McSherry. 2009. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. 19--30.
[43]
Alan Mislove, Hema Swetha Koppula, Krishna P Gummadi, Peter Druschel, and Bobby Bhattacharjee. 2008. Growth of the flickr social network. In Proceedings of the first workshop on Online social networks. 25--30.
[44]
Arjun Narayan and Andreas Haeberlen. 2012. DJoin: Differentially private join queries over distributed databases. In USENIX Symposium on Operating Systems Design and Implementation. 149--162.
[45]
Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. 2007. Smooth sensitivity and sampling in private data analysis. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing. 75--84.
[46]
Catuscia Palamidessi and Marco Stronati. 2012. Differential Privacy for Relational Algebra: Improving the Sensitivity Bounds via Constraint Systems. In QAPL.
[47]
Victor Perrier, Hassan Jameel Asghar, and Dali Kaafar. 2019. Private continual release of real-valued data streams. In 26th Annual Network and Distributed System Security Symposium, NDSS 2019. Internet Society, 1--13.
[48]
Davide Proserpio, Sharon Goldberg, and Frank McSherry. 2014. Calibrating Data to Sensitivity in Private Data Analysis. Proceedings of the VLDB Endowment, Vol. 7, 8 (2014).
[49]
Yuan Qiu and Ke Yi. 2022. Differential Privacy on Dynamic Data. arXiv preprint arXiv:2209.01387 (2022).
[50]
Shuang Song, Susan Little, Sanjay Mehta, Staal Vinterbo, and Kamalika Chaudhuri. 2018. Differentially private continual release of graph statistics. arXiv preprint arXiv:1809.02575 (2018).
[51]
Dajun Sun, Wei Dong, and Ke Yi. 2023. Confidence Intervals for Private Query Processing. Proceedings of the VLDB Endowment, Vol. 17, 3 (2023), 373--385.
[52]
Yuchao Tao, Xi He, Ashwin Machanavajjhala, and Sudeepa Roy. 2020. Computing Local Sensitivities of Counting Queries with Joins. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 479--494.
[53]
Jalaj Upadhyay. 2019. Sublinear space private algorithms under the sliding window model. In International Conference on Machine Learning. PMLR, 6363--6372.
[54]
Qichen Wang, Xiao Hu, Binyang Dai, and Ke Yi. 2023. Change Propagation Without Joins. Proceedings of the VLDB Endowment, Vol. 16, 5 (2023), 1046--1058.
[55]
Tianhao Wang, Joann Qiongna Chen, Zhikun Zhang, Dong Su, Yueqiang Cheng, Zhou Li, Ninghui Li, and Somesh Jha. 2021. Continuous release of data streams under both centralized and local differential privacy. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 1237--1253.
[56]
Bing Zhang, Vadym Doroshenko, Peter Kairouz, Thomas Steinke, Abhradeep Thakurta, Ziyin Ma, Himani Apte, and Jodi Spacek. 2023. Differentially Private Stream Processing at Scale. arXiv preprint arXiv:2303.18086 (2023).
[57]
Jun Zhang, Graham Cormode, Cecilia M Procopiuc, Divesh Srivastava, and Xiaokui Xiao. 2015. Private release of graph statistics using ladder functions. In Proceedings of the 2015 ACM SIGMOD international conference on management of data. 731--745.

Cited By

  • (2024) DOP-SQL: A General-Purpose, High-Utility, and Extensible Private SQL System. Proceedings of the VLDB Endowment 17:12 (4385-4388). DOI: 10.14778/3685800.3685881. Online publication date: 8-Nov-2024
  • (2024) Instance-optimal Truncation for Differentially Private Query Evaluation with Foreign Keys. ACM Transactions on Database Systems 49:4 (1-40). DOI: 10.1145/3697831. Online publication date: 26-Sep-2024

Published In

Proceedings of the ACM on Management of Data  Volume 2, Issue 3
SIGMOD
June 2024
1953 pages
EISSN:2836-6573
DOI:10.1145/3670010
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 May 2024
Published in PACMMOD Volume 2, Issue 3

Author Tags

  1. continual observation
  2. differential privacy
  3. join query

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months): 392
  • Downloads (Last 6 weeks): 71
Reflects downloads up to 20 Jan 2025
