DOI: 10.1145/3029806.3029817
research-article

Share a pie?: Privacy-Preserving Knowledge Base Export through Count-min Sketches

Published: 22 March 2017

Abstract

Knowledge base (KB) sharing among parties has proven beneficial in several scenarios. However, such sharing can raise considerable privacy concerns, depending on the sensitivity of the information stored in each party's KB. In this paper, we focus on the problem of exporting (part of) a party's KB to a receiving party. We introduce a novel solution that enables parties to export data in a privacy-preserving fashion, based on a probabilistic data structure, namely the count-min sketch. With this data structure, KBs can be exported in the form of key-value stores and inserted into a set of count-min sketches, where keys can be sensitive and values are counters. Count-min sketches can be tuned to achieve a given key collision probability, which enables a party to deny having certain keys in its own KB, and thus to preserve its privacy. We also introduce a metric, γ-deniability (novel for count-min sketches), to measure the privacy level obtainable with a count-min sketch. Furthermore, since the value associated with a key can expose a party to linkage attacks, noise can be added to a count-min sketch to ensure controlled error on retrieved values. Key collisions and noise alter the values contained in the exported KB and can negatively affect the accuracy of a computation performed on the exported KB. We explore the tradeoff between privacy preservation and computation accuracy through experimental evaluations in two scenarios related to malware detection.
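
To make the export step concrete, the following is a minimal sketch of a count-min sketch that ingests a key-value KB whose values are counters. It is an illustration under stated assumptions, not the authors' implementation: the class and function names (`CountMinSketch`, `export_kb`) are hypothetical, the hashing scheme (salted BLAKE2b) is one arbitrary choice, and the sizing rule (width ⌈e/ε⌉, depth ⌈ln(1/δ)⌉) is the standard one from the count-min sketch literature rather than anything stated in the abstract. Noise addition and the γ-deniability metric are left out because the abstract does not define them.

```python
import hashlib
import math


class CountMinSketch:
    """Illustrative count-min sketch; names and hashing are assumptions."""

    def __init__(self, epsilon, delta):
        # Standard sizing: a smaller width means more key collisions.
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _index(self, row, key):
        # One salted hash per row; the row number is used as the salt.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Increment one counter per row.
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # The minimum over rows never underestimates the true count.
        return min(
            self.table[row][self._index(row, key)]
            for row in range(self.depth)
        )


def export_kb(kb, epsilon, delta):
    """Insert a {key: counter} KB into a fresh sketch (hypothetical helper)."""
    sketch = CountMinSketch(epsilon, delta)
    for key, count in kb.items():
        sketch.add(key, count)
    return sketch


if __name__ == "__main__":
    # Made-up keys and counters, for illustration only.
    kb = {"evil.example": 7, "bad.example": 3}
    exported = export_kb(kb, epsilon=0.01, delta=0.01)
    print(exported.width, exported.depth)
    print(exported.estimate("evil.example"))
```

The receiver sees only the counter table, not the keys. A lookup for a key that was never inserted returns a nonzero value whenever that key collides with inserted keys in every row, and this ambiguity is what a narrower sketch trades against accuracy.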

    Published In

    CODASPY '17: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy
    March 2017
    382 pages
    ISBN: 9781450345231
    DOI: 10.1145/3029806

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. count-min sketches
    2. information sharing
    3. knowledge base export
    4. privacy metric

    Qualifiers

    • Research-article

    Funding Sources

    • Italian Presidency of Ministry Council
    • Filierasicura
    • Cybersecurity National Laboratory of CINI (Consorzio Interuniversitario Nazionale Informatica)

    Conference

    CODASPY '17

    Acceptance Rates

    CODASPY '17 Paper Acceptance Rate 21 of 134 submissions, 16%
    Overall Acceptance Rate 149 of 789 submissions, 19%