research-article

Local dampening: differential privacy for non-numeric queries via local sensitivity

Published: 01 December 2020

Abstract

    Differential privacy is the state-of-the-art formal definition for data release under strong privacy guarantees. A variety of mechanisms have been proposed in the literature for releasing the noisy output of numeric queries (e.g., using the Laplace mechanism), based on the notions of global sensitivity and local sensitivity. However, although there has been some work on generic mechanisms for releasing the output of non-numeric queries using global sensitivity (e.g., the Exponential mechanism), the literature lacks generic mechanisms for releasing the output of non-numeric queries using local sensitivity to reduce the noise in the query output.
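As background for the two notions of sensitivity named above, the standard textbook definitions can be written as follows. The notation is assumed here for illustration and is not taken from the paper: $x$ and $x'$ are datasets, $d(x, x') = 1$ means they are neighbors, $f$ is a numeric query, and $u(x, r)$ is a utility score for a non-numeric output $r$.

```latex
% Global sensitivity: worst case over all pairs of neighboring datasets.
\Delta f = \max_{x,\,x' \,:\, d(x,x') = 1} \lvert f(x) - f(x') \rvert

% Local sensitivity: worst case over the neighbors of the actual dataset x only.
LS_f(x) = \max_{x' \,:\, d(x,x') = 1} \lvert f(x) - f(x') \rvert
\qquad\text{so that}\qquad LS_f(x) \le \Delta f

% Exponential mechanism (global sensitivity of the utility function u):
\Pr[\mathcal{M}(x) = r] \;\propto\; \exp\!\left(\frac{\varepsilon\, u(x, r)}{2\,\Delta u}\right),
\qquad
\Delta u = \max_{r}\; \max_{x,\,x' \,:\, d(x,x') = 1} \lvert u(x, r) - u(x', r) \rvert
```

Because $LS_f(x)$ can be much smaller than $\Delta f$ on a given dataset, a mechanism calibrated to local sensitivity can in principle add less noise, which is the gap the abstract describes for non-numeric queries.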
    In this work, we remedy this shortcoming and present the local dampening mechanism. We adapt the notion of local sensitivity for the non-numeric setting and leverage it to design a generic non-numeric mechanism. We illustrate the effectiveness of the local dampening mechanism by applying it to two diverse problems: (i) Influential node analysis. Given an influence metric, we release the top-k most influential nodes while preserving the privacy of the relationship between nodes in the network; (ii) Decision tree induction. We provide a private adaptation to the ID3 algorithm to build decision trees from a given tabular dataset. Experimental results show that we could reduce the use of privacy budget by 3 to 4 orders of magnitude for Influential node analysis and increase accuracy up to 12% for Decision tree induction when compared to global sensitivity based approaches.
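For context, the global-sensitivity baseline that the abstract compares against, the Exponential mechanism, can be sketched in a few lines. This is a minimal illustration of the standard mechanism, not the paper's local dampening mechanism; the function names `selection_probabilities` and `exponential_mechanism`, the toy candidates, and the utility scores are all hypothetical.

```python
import math
import random


def selection_probabilities(candidates, utility, sensitivity, epsilon):
    """Return the Exponential-mechanism probability of each candidate.

    Each candidate r gets weight exp(epsilon * utility(r) / (2 * sensitivity)),
    where `sensitivity` is the global sensitivity of the utility function.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scores = [epsilon * utility(c) / (2.0 * sensitivity) for c in candidates]
    # Subtract the maximum score before exponentiating for numerical stability;
    # this does not change the normalized probabilities.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng=random):
    """Sample one candidate according to the Exponential mechanism."""
    candidates = list(candidates)
    probs = selection_probabilities(candidates, utility, sensitivity, epsilon)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical influence scores for three nodes (illustrative values only).
    influence = {"a": 3.0, "b": 1.0, "c": 0.0}
    nodes = list(influence)
    print(selection_probabilities(nodes, influence.get, sensitivity=1.0, epsilon=1.0))
    print(exponential_mechanism(nodes, influence.get, sensitivity=1.0, epsilon=1.0))
```

The larger the global sensitivity, the flatter the resulting distribution becomes and the less the output reflects the true utilities. A mechanism that scales by a smaller, data-dependent sensitivity would concentrate more probability on high-utility outputs for the same privacy budget.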


    Cited By

    • (2023) Local dampening: differential privacy for non-numeric queries via local sensitivity. The VLDB Journal (The International Journal on Very Large Data Bases) 32:6 (1191-1214). DOI: 10.1007/s00778-022-00774-w. Online publication date: 10-Jan-2023

    Published In

    Proceedings of the VLDB Endowment  Volume 14, Issue 4
    December 2020
    263 pages
    ISSN:2150-8097

    Publisher

    VLDB Endowment
