research-article

Semantic Attack on Disassociated Transaction Data

Published: 20 April 2023

Abstract

The Internet has made accessing and sharing information, including personal data, easier and faster than ever. Businesses have therefore started to gather, analyse, and use individuals’ data for various purposes, such as developing data-driven products and services that can improve customer satisfaction and retention, and lead to better healthcare and well-being provision. However, analysing these data freely may violate individuals’ privacy. This has prompted the development of protection methods that deter potential privacy threats by anonymising data. Disassociation is one anonymisation approach used to protect transaction data. It works by dividing data into chunks to conceal sensitive links between the items in a transaction, but it does not account for semantic relationships that may exist among the items, which adversaries can exploit to reveal protected links. We show that our proposed de-anonymisation approach can break the privacy protection offered by the disassociation method by exploiting such semantic relationships. Our findings indicate that the disassociation method may not provide adequate protection for transactions: up to 60% of the disassociated items can be reassociated, thereby breaking the privacy of nearly 70% of the protected items. This paper extends our earlier work (AlShuhail and Shao, Semantic attack on disassociated transactions, in: Proceedings of the 8th International Conference on Information Systems Security and Privacy (ICISSP), INSTICC, SciTePress, pp. 60–72, 2022) by developing additional techniques to reconstruct transactions and by reporting additional experiments that illustrate the impact of our attack.
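To make the idea in the abstract concrete, the following is a minimal sketch of how semantic relatedness might be used to re-link items that were split into separate chunks. It is not the authors' method: the toy records, the hand-written `RELATEDNESS` table, and the functions `disassociate` and `reassociate` are all hypothetical, and real disassociation also clusters records and enforces anonymity guarantees that this sketch omits. The relatedness scores stand in for whatever semantic measure an adversary might use.

```python
import random
from itertools import permutations

# Hypothetical transactions, each already split into two parts that
# disassociation would place in different chunks.
original = [
    ("insulin", "glucose meter"),
    ("nappies", "baby formula"),
    ("dog food", "leash"),
]

# Made-up relatedness scores in [0, 1]; an adversary might derive such
# scores from web co-occurrence counts or word embeddings (assumption).
RELATEDNESS = {
    frozenset(("insulin", "glucose meter")): 0.9,
    frozenset(("nappies", "baby formula")): 0.8,
    frozenset(("dog food", "leash")): 0.7,
    frozenset(("insulin", "baby formula")): 0.2,
}


def relatedness(a, b):
    """Return the toy relatedness of two items (0.1 if the pair is unlisted)."""
    return RELATEDNESS.get(frozenset((a, b)), 0.1)


def disassociate(records, seed=0):
    """Split records into two chunks and shuffle each one independently,
    so the published data no longer shows which items belong together."""
    rng = random.Random(seed)
    left = [r[0] for r in records]
    right = [r[1] for r in records]
    rng.shuffle(left)
    rng.shuffle(right)
    return left, right


def reassociate(left, right):
    """Try every pairing of the two chunks and keep the one with the
    highest total relatedness (brute force, fine for tiny inputs only)."""
    best = max(
        permutations(right),
        key=lambda perm: sum(relatedness(a, b) for a, b in zip(left, perm)),
    )
    return list(zip(left, best))


left, right = disassociate(original)
guessed = reassociate(left, right)
recovered = sum(1 for pair in guessed if pair in original)
print(f"recovered {recovered} of {len(original)} links")
```

The brute-force search over permutations is chosen only for clarity; with realistic chunk sizes an adversary would need an assignment algorithm or a greedy heuristic, and noisy relatedness scores would recover only a fraction of the links.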

References

[1]
AlShuhail A, Shao J. Semantic attack on disassociated transactions. In: Proceedings of the 8th International Conference on information systems security and privacy-ICISSP, INSTICC. SciTePress, 2022; pp. 60–72.
[2]
Fung BC, Wang K, Chen R, and Yu PS Privacy-preserving data publishing: a survey of recent developments ACM Comput Surv (Csur) 2010 42 4 1-53
[3]
Rubinstein IS and Hartzog W Anonymization and risk Wash L Rev 2016 91 703
[4]
El Emam K and Dankar FK Protecting privacy using k-anonymity J Am Med Inform Assoc 2008 15 5 627-637
[5]
Hedegaard S, Houen S, Simonsen JG. Lair: a language for automated semantics-aware text sanitization based on frame semantics. In: 2009 IEEE International Conference on Semantic Computing. IEEE, 2009; pp. 47–52.
[6]
Terrovitis M, Mamoulis N, Kalnis P. Anonymity in unstructured data. In: Proc. of International Conference on Very Large Data Bases (VLDB), Citeseer. 2008.
[7]
Terrovitis M, Liagouris J, Mamoulis N, Skiadopoulos S. Privacy preservation by disassociation. arXiv preprint arXiv:1207.0135, 2012.
[8]
Cilibrasi RL and Vitanyi PM The Google similarity distance IEEE Trans Knowl Data Eng 2007 19 3 370-383
[9]
Pennington J, Socher R, Manning CD. Glove: global vectors for word representation. In: Proceedings of the 2014 Conference on empirical methods in natural language processing (EMNLP), 2014; pp. 1532–43.
[10]
Smith R and Shao J Privacy and e-commerce: a consumer-centric perspective Electron Commer Res 2007 7 2 89-116
[11]
Sweeney L k-anonymity: a model for protecting privacy Internat J Uncertain Fuzziness Knowl-Based Syst 2002 10 05 557-570
[12]
Wong RC-W, Fu AW-C, Wang K, Pei J. Minimality attack in privacy preserving data publishing. In: Proceedings of the 33rd International Conference on very large data bases, 2007; pp. 543–54.
[13]
Cormode G, Srivastava D, Li N, and Li T Minimizing minimality and maximizing utility: analyzing method-based attacks on anonymized data Proc VLDB Endow 2010 13 1–2 1045-1056
[14]
Zhang L, Jajodia S, Brodsky A. Information disclosure under realistic assumptions: privacy versus optimality. In: Proceedings of the 14th ACM Conference on computer and communications security, 2007; pp. 573–83.
[15]
Farkas C and Jajodia S The inference problem: a survey ACM SIGKDD Explorations Newsl 2002 4 2 6-11
[16]
Turkanovic M, Druzovec TW, and Hölbl M Inference attacks and control on database structures TEM J 2015 4 1 3
[17]
Clifton C, Marks D. Security and privacy implications of data mining. In: ACM SIGMOD Workshop on Research Issues on data mining and knowledge discovery. Citeseer, 1996; pp. 15–19.
[18]
Rastogi V, Suciu D, Hong S. The boundary between privacy and utility in data publishing. In: Proceedings of the 33rd International Conference on very large data bases. Citeseer, 2007; pp. 531–42.
[19]
Evfimievski A, Gehrke J, Srikant R. Limiting privacy breaches in privacy preserving data mining. In: Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on principles of database systems, 2003; pp. 211–22.
[20]
Kifer D. Attacks on privacy and deFinetti’s theorem. In: Proceedings of the 2009 ACM SIGMOD International Conference on management of data, 2009; pp. 127–38.
[21]
Basu T, Murthy C. Semantic relation between words with the web as information source. In: International Conference on pattern recognition and machine intelligence. Springer, 2009; pp. 267–72.
[22]
Gracia J, Mena E. Web-based measure of semantic relatedness. In: International Conference on web information systems engineering, vol 8. Springer, 2008; pp. 136–50.
[23]
Bouma G Normalized (pointwise) mutual information in collocation extraction Proc GSCL 2009 30 31-40
[24]
Sánchez D, Batet M, Viejo A. Detecting sensitive information from textual documents: an information-theoretic approach. In: International Conference on modeling decisions for artificial intelligence. Springer, 2012; pp. 173–84.
[25]
Staddon J, Golle P, Zimny B. Web-based inference detection. In: USENIX Security Symposium, 2007; pp. 1–16.
[26]
Chow R, Oberst I, Staddon J. Sanitization’s slippery slope: the design and study of a text revision assistant. In: Proceedings of the 5th Symposium on usable privacy and security, 2009; pp. 1–11.
[27]
Chow R, Golle P, Staddon J. Detecting privacy leaks using corpus-based association rules. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge discovery and data mining, 2008; pp. 893–901.
[28]
Sánchez D, Batet M, Viejo A. Detecting term relationships to improve textual document sanitization. In: PACIS, 2013; p. 105.
[29]
Shao J and Ong H Exploiting contextual information in attacking set-generalized transactions ACM Trans Internet Technol (TOIT) 2017 17 4 40
[30]
Loukides G, Gkoulalas-Divanis A, and Malin B Coat: constraint-based anonymization of transactions Knowl Inf Syst 2011 28 2 251-282
[31]
Pennington J, Socher R, Manning C. Glove: global vectors for word representation. In: Conference on empirical methods in natural language processing. Citeseer, 2014; pp. 1532–43.
[32]
Malvern D, Richards B. Measures of lexical richness. Encycl Appl Linguist. 2012.
[33]
Sra S, Dhillon I. Generalized nonnegative matrix approximations with Bregman divergences. In: Advances in neural information processing systems, vol. 18, 2005.
[34]
Kusner M, Sun Y, Kolkin N, Weinberger K. From word embeddings to document distances. In: International Conference on machine learning. PMLR, 2015; pp. 957–66.

Published In

SN Computer Science  Volume 4, Issue 4
Apr 2023
1389 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 20 April 2023
Accepted: 09 March 2023
Received: 17 October 2022

Author Tags

  1. Data privacy
  2. Semantic attack
  3. Transaction data
  4. Disassociation
