research-article

Secure Sampling for Approximate Multi-party Query Processing

Published: 13 November 2023
    Abstract

    We study the problem of random sampling in the secure multi-party computation (MPC) model. In MPC, taking a sample securely must cost Ω(n), irrespective of the sample size s. This is in stark contrast with the plaintext setting, where a sample can be taken trivially in O(s) time. Thus, the goal of approximate query processing (AQP) with sublinear costs seems unachievable under MPC. To get around this inherent barrier, we take a two-stage approach in this paper. In the offline stage, we generate a batch of n/s samples with Õ(n) total cost; these samples are then consumed to answer queries as they arrive online. This approach achieves an Õ(s) amortized cost per query, similar to the plaintext setting. Based on our secure batch sampling algorithms, we build MASQUE, an MPC-AQP system that achieves sublinear online query costs by running an MPC protocol to evaluate the queries on pre-generated samples. MASQUE achieves the strong security guarantee of the MPC model, i.e., nothing is revealed beyond the query result, which itself can be further protected by (amplified) differential privacy.
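
    The amortization argument above can be illustrated with a small plaintext sketch. It is not the paper's MPC protocol, which this page does not describe. It only assumes that one linear-cost pass over all n items (here an ordinary shuffle standing in for an oblivious one) can be cut into n/s chunks of size s, so the per-sample cost is the total cost divided by n/s. The names `batch_samples` and `estimate_sum` are hypothetical.

```python
import random


def batch_samples(data, s, rng):
    """Offline stage (plaintext stand-in): shuffle once, cut into n // s chunks.

    Assumption: in a real MPC system the shuffle would be an oblivious
    protocol over secret-shared data; random.Random.shuffle is used here
    only to show the cost accounting.
    """
    if s <= 0 or s > len(data):
        raise ValueError("need 1 <= s <= len(data)")
    shuffled = list(data)
    rng.shuffle(shuffled)  # single pass over all n items
    k = len(shuffled) // s
    return [shuffled[i * s:(i + 1) * s] for i in range(k)]


def estimate_sum(sample, n):
    """Online stage: scale the sample sum up to a table of n rows."""
    return n * sum(sample) / len(sample)


rng = random.Random(0)
data = list(range(1000))             # n = 1000
pool = batch_samples(data, 50, rng)  # n / s = 20 pre-generated samples
one_sample = pool.pop()              # each online query consumes one sample
approx = estimate_sum(one_sample, len(data))
```

    Each chunk taken alone is a uniform sample without replacement of size s, but the chunks are disjoint and therefore not independent of one another; whether that matters depends on how the samples are used.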



    Information

    Published In

    Proceedings of the ACM on Management of Data  Volume 1, Issue 3
    PACMMOD
    September 2023
    472 pages
    EISSN:2836-6573
    DOI:10.1145/3632968
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. approximate query processing
    2. sampling
    3. secure multi-party computation

    Qualifiers

    • Research-article

    Funding Sources

    • Hong Kong Research Grant Council
    • Alibaba Innovative Research Program
