DOI: 10.1145/1754239.1754270

A practice-oriented framework for measuring privacy and utility in data sanitization systems

Published: 22 March 2010

Abstract

Published data is prone to privacy attacks. Sanitization methods aim to prevent these attacks while maintaining the usefulness of the data for legitimate users. Quantifying the trade-off between usefulness and privacy of published data has been the subject of much research in recent years. We propose a pragmatic framework for evaluating sanitization systems in real-life settings and use data mining utility as a universal measure of usefulness and privacy. We propose a definition of data mining utility that can be tuned to capture the needs of data users and the intentions of adversaries in a setting specified by a database, a candidate sanitization method, and the privacy and utility concerns of the data owner. We use this framework to evaluate and compare the privacy and utility offered by two well-known sanitization methods, namely k-anonymity and ε-differential privacy, using UCI's "Adult" dataset and the Weka data mining package, with utility and privacy measures defined for users and adversaries. In the case of k-anonymity, we compare our results with the recent work of Brickell and Shmatikov (KDD 2008), and show that using data mining algorithms increases their proposed adversarial gains.

References

[1] N. A. Adam and J. C. Wortman. Security-control methods for statistical databases. ACM Comput Surv, 21(4):515--556, 1989.
[2] R. Agrawal and R. Srikant. Privacy-Preserving Data Mining. In SIGMOD, pages 439--450, 2000.
[3] A. Asuncion and D. Newman. UCI Machine Learning Repository, 2007.
[4] J. Brickell and V. Shmatikov. The Cost of Privacy: Destruction of Data-Mining Utility in Anonymized Data Publishing. In KDD, pages 70--78, 2008.
[5] J.-W. Byun, A. Kamra, E. Bertino, and N. Li. Efficient k-Anonymization Using Clustering Techniques. In DASFAA, pages 188--200, 2007.
[6] V. Ciriani, S. D. C. di Vimercati, S. Foresti, and P. Samarati. k-Anonymity. In Secure Data Management in Decentralized Systems. Springer, 2007.
[7] C. Dwork. Differential Privacy. In ICALP, pages 1--12, 2006.
[8] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating Noise to Sensitivity in Private Data Analysis. In TCC, pages 265--284, 2006.
[9] A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data mining. In PODS, pages 211--222, 2003.
[10] V. S. Iyengar. Transforming data to satisfy privacy constraints. In KDD, pages 279--288, 2002.
[11] M. Kantarcioglu and C. Clifton. Privacy-preserving Distributed Mining of Association Rules on Horizontally Partitioned Data. IEEE T Knowl Data En, 16(9):1026--1037, 2004.
[12] D. Kifer and J. Gehrke. Injecting utility into anonymized datasets. In SIGMOD, pages 217--228, 2006.
[13] K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Workload-aware anonymization. In KDD, pages 277--286, 2006.
[14] N. Li, T. Li, and S. Venkatasubramanian. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In ICDE, pages 106--115, 2007.
[15] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. In ICDE, pages 24--35, 2006.
[16] D. J. Martin, D. Kifer, A. Machanavajjhala, J. Gehrke, and J. Y. Halpern. Worst-Case Background Knowledge for Privacy-Preserving Data Publishing. In ICDE, pages 126--135, 2007.
[17] G. Miklau and D. Suciu. A Formal Analysis of Information Disclosure in Data Exchange. In SIGMOD, pages 575--586, 2004.
[18] M. E. Nergiz and C. Clifton. Thoughts on k-Anonymization. In PDM, page 96, 2006.
[19] M. E. Nergiz, C. Clifton, and A. E. Nergiz. MultiRelational k-Anonymity. In ICDE, pages 1417--1421, 2007.
[20] H. Park and K. Shim. Approximate algorithms for K-anonymity. In SIGMOD, pages 67--78, 2007.
[21] V. Rastogi, S. Hong, and D. Suciu. The Boundary Between Privacy and Utility in Data Publishing. In VLDB, pages 531--542, 2007.
[22] P. Samarati. Protecting Respondents' Identities in Microdata Release. IEEE T Knowl Data En, 13(6):1010--1027, 2001.
[23] P. Samarati and L. Sweeney. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical Report SRI-CSL-98-04, SRI Computer Science Laboratory, 1998.
[24] M. Sramka, R. Safavi-Naini, and J. Denzinger. An Attack on the Privacy of Sanitized Data That Fuses the Outputs of Multiple Data Miners. In PADM, pages 130--137, 2009.
[25] M. Sramka, R. Safavi-Naini, J. Denzinger, M. Askari, and J. Gao. Utility of Knowledge Discovered from Sanitized Data. Technical Report 2008-910-23, University of Calgary, 2008.
[26] M. Sramka, R. Safavi-Naini, J. Denzinger, M. Askari, and J. Gao. Utility of Knowledge Extracted from Unsanitized Data when Applied to Sanitized Data. In PST, pages 227--231, 2008.
[27] L. Sweeney. k-anonymity: a model for protecting privacy. Int J Uncertainty, Fuzziness and Knowl-based Syst, 10(5):557--570, 2002.
[28] T. M. Truta and B. Vinay. Privacy Protection: p-Sensitive k-Anonymity Property. In PDM, pages 94--103, 2006.
[29] J. S. Vaidya and C. Clifton. Privacy preserving association rule mining in vertically partitioned data. In KDD, pages 639--644, 2002.
[30] V. S. Verykios, E. Bertino, I. N. Fovino, L. P. Provenza, Y. Saygin, and Y. Theodoridis. State-of-the-art in privacy preserving data mining. SIGMOD Record, 33(1), 2004.
[31] I. H. Witten and E. Frank. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, 2005.
[32] R. C.-W. Wong, J. Li, A. W.-C. Fu, and K. Wang. (α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing. In KDD, pages 754--759, 2006.

Published In

EDBT '10: Proceedings of the 2010 EDBT/ICDT Workshops
March 2010
290 pages
ISBN:9781605589909
DOI:10.1145/1754239
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

EDBT/ICDT '10: EDBT/ICDT '10 joint conference
March 22 - 26, 2010
Lausanne, Switzerland

Acceptance Rates

Overall Acceptance Rate 7 of 10 submissions, 70%

Cited By

  • (2018) Flexible Anonymization of Transactions with Sensitive Items. 2018 5th International Conference on Behavioral, Economic, and Socio-Cultural Computing (BESC), pages 201-206. DOI: 10.1109/BESC.2018.8697320. Online publication date: Nov-2018
  • (2016) Privacy and Utility Effects of k-anonymity on Association Rule Hiding. Proceedings of The 3rd Multidisciplinary International Social Networks Conference on SocialInformatics 2016, Data Science 2016, pages 1-6. DOI: 10.1145/2955129.2955169. Online publication date: 15-Aug-2016
  • (2016) Computational data privacy in wireless networks. Peer-to-Peer Networking and Applications, 10:4, pages 865-873. DOI: 10.1007/s12083-016-0435-6. Online publication date: 25-Jan-2016
  • (2015) k-anonymity. Proceedings of the 2015 IEEE Trustcom/BigDataSE/ISPA - Volume 01, pages 983-989. DOI: 10.1109/Trustcom.2015.473. Online publication date: 20-Aug-2015
  • (2015) A survey. Artificial Intelligence Review, 44:4, pages 547-569. DOI: 10.1007/s10462-015-9439-5. Online publication date: 1-Dec-2015
  • (2015) Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk. Combinatorial Algorithms, pages 24-36. DOI: 10.1007/978-3-319-19315-1_3. Online publication date: 7-Jun-2015
  • (2014) Privacy preserving data publishing. International Journal of Computational Intelligence Studies, 3:2/3, pages 196-220. DOI: 10.1504/IJCISTUDIES.2014.062733. Online publication date: 1-Jun-2014
  • (2014) A Privacy Risk Model for Trajectory Data. Trust Management VIII, pages 125-140. DOI: 10.1007/978-3-662-43813-8_9. Online publication date: 2014
  • (2012) An information theoretic privacy and utility measure for data sanitization mechanisms. Proceedings of the second ACM conference on Data and Application Security and Privacy, pages 283-294. DOI: 10.1145/2133601.2133637. Online publication date: 7-Feb-2012
  • (2012) Towards A Differential Privacy and Utility Preserving Machine Learning Classifier. Procedia Computer Science, 12, pages 176-181. DOI: 10.1016/j.procs.2012.09.050. Online publication date: 2012