Process-Driven Data Privacy

Published: 17 October 2015

Abstract

The quantity of personal data gathered by service providers via our daily activities continues to grow at a rapid pace. Sharing and subsequently analyzing such data can support a wide range of activities, but privacy concerns often prompt an organization to transform the data to satisfy a formal protection model (e.g., k-anonymity or ε-differential privacy). These models, however, are based on simplistic adversarial frameworks, which can lead to both under- and over-protection. For instance, such models often assume that an adversary attacks a protected record exactly once. We introduce a principled approach that explicitly models the attack process as a series of steps. Specifically, we engineer a factored Markov decision process (FMDP) to optimally plan an attack from the adversary's perspective and assess the privacy risk accordingly. The FMDP captures the uncertainty in the adversary's belief (e.g., the number of identified individuals that match the de-identified data) and enables the analysis of various real-world deterrence mechanisms beyond a traditional protection model, such as a penalty for committing an attack. We present an algorithm to solve the FMDP and illustrate its efficiency by simulating an attack on publicly accessible U.S. census records against a real identified resource of over 500,000 individuals in a voter registry. Our results demonstrate that while traditional privacy models commonly expect an adversary to attack exactly once per record, an optimal attack in our model may involve exploiting none, one, or more individuals in the pool of candidates, depending on context.
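To make the idea of planning an attack as a sequence of steps concrete, the following is a minimal sketch of a toy sequential decision problem. It is not the paper's FMDP or its solution algorithm. It assumes a single de-identified record with a pool of equally likely candidates, a fixed gain for a correct re-identification, a fixed cost per attempt, and an expected penalty per attempt; the function `plan_attack` and all of its parameters are hypothetical names chosen for illustration.

```python
def plan_attack(candidates, gain, cost, penalty=0.0, p_caught=0.0):
    """Toy backward induction over the number of remaining candidates.

    Assumptions (illustrative only, not taken from the paper):
      - each remaining candidate is equally likely to be the true match;
      - each attempt costs `cost` plus an expected penalty `p_caught * penalty`;
      - being caught does not end the attack, it only adds expected cost;
      - the adversary may stop at any time for a value of 0.

    Returns (expected_value, max_attempts), where max_attempts is how many
    attempts the optimal policy makes if every attempt fails.
    """
    per_attempt = cost + p_caught * penalty
    values = [0.0] * (candidates + 1)    # values[n]: optimal value with n candidates left
    attack = [False] * (candidates + 1)  # attack[n]: is attacking better than stopping?

    for n in range(1, candidates + 1):
        p_hit = 1.0 / n
        v_attack = -per_attempt + p_hit * gain + (1.0 - p_hit) * values[n - 1]
        if v_attack > 0.0:
            values[n] = v_attack
            attack[n] = True

    # Follow the policy along the path where every attempt fails.
    attempts, n = 0, candidates
    while n > 0 and attack[n]:
        attempts += 1
        n -= 1
    return values[candidates], attempts


if __name__ == "__main__":
    # No deterrent: the adversary keeps trying candidates.
    print(plan_attack(3, gain=10.0, cost=1.0))
    # A large expected penalty makes not attacking at all optimal.
    print(plan_attack(3, gain=10.0, cost=1.0, penalty=100.0, p_caught=0.1))
```

Even in this simplified setting, the optimal number of attempts depends on the gain, cost, and penalty, which echoes the observation that an optimal attack may involve none, one, or more candidates.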

Published In

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
October 2015
1998 pages
ISBN:9781450337946
DOI:10.1145/2806416
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. planning
  2. privacy
  3. re-identification

Qualifiers

  • Research-article

Conference

CIKM'15

Acceptance Rates

CIKM '15 Paper Acceptance Rate 165 of 646 submissions, 26%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 62
  • Downloads (Last 6 weeks): 23
Reflects downloads up to 27 Jan 2025

Cited By

  • (2024) Reidentification of Participants in Shared Clinical Data Sets: Experimental Study. JMIR AI, 3 (e52054). DOI: 10.2196/52054. Online publication date: 15-Mar-2024
  • (2024) Robin Hood: A De-identification Method to Preserve Minority Representation for Disparities Research. Privacy in Statistical Databases (67-83). DOI: 10.1007/978-3-031-69651-0_5. Online publication date: 13-Sep-2024
  • (2022) How Adversarial Assumptions Influence Re-identification Risk Measures: A COVID-19 Case Study. Privacy in Statistical Databases (361-374). DOI: 10.1007/978-3-031-13945-1_25. Online publication date: 14-Sep-2022
  • (2021) A Multi-Disciplinary Perspective for Conducting Artificial Intelligence-enabled Privacy Analytics. ACM Transactions on Management Information Systems, 12:1 (1-18). DOI: 10.1145/3447507. Online publication date: 17-Mar-2021
  • (2021) Using game theory to thwart multistage privacy intrusions when sharing data. Science Advances, 7:50. DOI: 10.1126/sciadv.abe9986. Online publication date: 10-Dec-2021
  • (2017) Building a Dossier on the Cheap. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (1549-1558). DOI: 10.1145/3132847.3132951. Online publication date: 6-Nov-2017
  • (2016) Privacy and Utility Effects of k-anonymity on Association Rule Hiding. Proceedings of the The 3rd Multidisciplinary International Social Networks Conference on SocialInformatics 2016, Data Science 2016 (1-6). DOI: 10.1145/2955129.2955169. Online publication date: 15-Aug-2016
