Research article
DOI: 10.1145/3411497.3420214

Preserving Genomic Privacy via Selective Sharing

Published: 09 November 2020

Abstract

Although genomic data has significant impact and widespread use in medical research, sharing it endangers individuals' privacy, even when they share it anonymously or only partially. To address this problem, we present a framework, inspired by differential privacy, for sharing individuals' genomic data while preserving their privacy. We assume an individual whose genome contains a sensitive portion (e.g., mutations or single nucleotide polymorphisms (SNPs) that reveal sensitive information about the individual) that she does not want to share. The individual's goals are to (i) preserve the privacy of her sensitive data (considering the correlations between the sensitive and non-sensitive parts), (ii) preserve the privacy of interdependent data (data belonging to other individuals that is correlated with her data), and (iii) share as much non-sensitive data as possible to maximize the utility of data sharing. Unlike traditional differential privacy-based data sharing schemes, the proposed scheme does not intentionally add noise to the data; it is based on selective sharing of data points. We observe that the traditional concept of differential privacy does not capture data sharing in such a setting, and hence we first introduce a privacy notion, ε-indirect privacy, that addresses it. We show that the proposed framework does not reveal sensitive information to the attacker while providing high data sharing utility. We also compare the proposed technique with previous ones and show its advantage in terms of both privacy and data sharing utility.
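
The idea of selective sharing can be illustrated with a toy sketch. The abstract does not define ε-indirect privacy formally, so the rule below is an assumption made purely for illustration and is not the paper's definition or algorithm: a non-sensitive SNP is shared only if observing it changes the attacker's belief about a single sensitive SNP by at most a factor of exp(ε). The function name `select_shareable`, the SNP names, and all probabilities are hypothetical. The check is pairwise only; it ignores joint inference over several shared SNPs, kinship correlations, and any information leaked by the act of withholding.

```python
import math


def select_shareable(prior, conditionals, genotype, epsilon):
    """Return the non-sensitive SNPs that pass a toy pairwise leakage check.

    prior:        {v: P(sensitive = v)}
    conditionals: {snp: {observed_value: {v: P(sensitive = v | snp = observed_value)}}}
    genotype:     {snp: observed_value} for the individual's non-sensitive SNPs
    epsilon:      leakage budget (assumed rule, not the paper's definition)
    """
    bound = math.exp(epsilon)
    shared = []
    for snp, value in genotype.items():
        posterior = conditionals[snp][value]
        # Share only if every posterior/prior ratio stays within [1/bound, bound].
        within_budget = all(
            1.0 / bound <= posterior[v] / prior[v] <= bound for v in prior
        )
        if within_budget:
            shared.append(snp)
    return shared


# Hypothetical numbers: the sensitive SNP takes values 0, 1, or 2.
prior = {0: 0.6, 1: 0.3, 2: 0.1}
conditionals = {
    "snp_a": {0: {0: 0.62, 1: 0.29, 2: 0.09}},  # weakly correlated
    "snp_b": {1: {0: 0.20, 1: 0.30, 2: 0.50}},  # strongly correlated
}
genotype = {"snp_a": 0, "snp_b": 1}

strict = select_shareable(prior, conditionals, genotype, epsilon=0.2)
loose = select_shareable(prior, conditionals, genotype, epsilon=2.0)
print(strict, loose)
```

In this toy setting a small ε withholds the strongly correlated SNP, while a large ε shares both, which mirrors the privacy and utility trade-off described in the abstract.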

Cited By

  • (2023) Ensuring Trust in Genomics Research. 2023 5th IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA), 1-12. DOI: 10.1109/TPS-ISA58951.2023.00011. Online publication date: 1-Nov-2023
  • (2022) Genomic Data Sharing under Dependent Local Differential Privacy. Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, 77-88. DOI: 10.1145/3508398.3511519. Online publication date: 14-Apr-2022
  • (2021) Using game theory to thwart multistage privacy intrusions when sharing data. Science Advances 7:50. DOI: 10.1126/sciadv.abe9986. Online publication date: 10-Dec-2021

Published In

WPES'20: Proceedings of the 19th Workshop on Privacy in the Electronic Society
November 2020
212 pages
ISBN:9781450380867
DOI:10.1145/3411497
  • General Chairs: Jay Ligatti, Xinming Ou
  • Program Chairs: Wouter Lueks, Paul Syverson
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data sharing
  2. differential privacy
  3. genomics
  4. privacy

Qualifiers

  • Research-article

Funding Sources

  • The National Library Of Medicine of the National Institutes of Health

Conference

CCS '20

Acceptance Rates

Overall Acceptance Rate 106 of 355 submissions, 30%
