Abstract
Recommendation systems play an important role in the ongoing digital transformation, so the privacy of the individuals whose data they collect must be protected effectively. In this paper, we identify a vulnerability in the existing query framework for recommendation systems. To address it, we propose applying the well-known k-anonymity model to generalize a given recommendation database so that it satisfies the privacy preservation constraint. We show that the problem of generalizing the data while minimizing the impact on data utility is NP-hard. To tackle this problem, we propose an algorithm that preserves the privacy of the individuals in recommendation databases. The idea is to avoid excessive generalization by forming groups of similar tuples, so that the impact of generalizing each group on data utility is minimized. We evaluate our work with extensive experiments. The results show that the proposed algorithm is highly effective, i.e., its impact as quantified by the data utility metrics and the errors of its query results are lower than those of the compared algorithms, and highly efficient, i.e., it runs more than three times faster than the compared algorithm of comparable effectiveness.
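To make the grouping idea concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes numeric quasi-identifiers, uses plain sorting as a stand-in for tuple similarity, and generalizes each group to per-attribute (min, max) ranges. The names `group_similar`, `generalize`, `anonymize`, and `is_k_anonymous` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical record layout: a tuple of numeric quasi-identifiers,
# e.g. (age, zip_prefix). This layout is an assumption for illustration.
Record = Tuple[int, ...]
Generalized = Tuple[Tuple[int, int], ...]


def group_similar(records: List[Record], k: int) -> List[List[Record]]:
    """Partition records into groups of at least k tuples.

    Sorting is used here as a simple proxy for similarity; a real
    algorithm would use a utility-aware distance between tuples.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    ordered = sorted(records)
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # A short trailing group would violate k-anonymity, so merge it
    # into the group before it.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def generalize(group: List[Record]) -> Generalized:
    """Replace each attribute by its (min, max) range over the group."""
    return tuple((min(col), max(col)) for col in zip(*group))


def anonymize(records: List[Record], k: int) -> List[Generalized]:
    """Return one generalized tuple per input record."""
    result: List[Generalized] = []
    for group in group_similar(records, k):
        result.extend([generalize(group)] * len(group))
    return result


def is_k_anonymous(table: list, k: int) -> bool:
    """True if every distinct tuple occurs at least k times."""
    return all(count >= k for count in Counter(table).values())


records = [(25, 130), (27, 130), (26, 131), (52, 480), (55, 481)]
anonymized = anonymize(records, k=2)
```

In this sketch, tighter groups yield narrower ranges, which is the intuition behind forming groups of similar tuples: the less the tuples in a group differ, the less information generalization removes.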
Notes
The MAX and MIN functions each return a single value as the query result: MAX returns the maximum, and MIN the minimum, of the set of values that satisfy the query condition. They are designed to support various data domains such as \(\textit{numeric}\), \(\textit{character}\), \(\textit{unique}\)-\(\textit{identifier}\), and \(\textit{date}\)-\(\textit{time}\), as documented at https://msdn.microsoft.com/en-us/library/ms187751.aspx.
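The behavior described in this note can be mimicked in a few lines of Python. This is an illustrative sketch only: `max_where` and `min_where` are hypothetical helpers, the sample values are made up, and returning `None` for an empty match set is modeled on SQL returning NULL in that case.

```python
from datetime import date


def max_where(values, condition):
    """Return the single largest value that satisfies the condition."""
    matching = [v for v in values if condition(v)]
    return max(matching) if matching else None


def min_where(values, condition):
    """Return the single smallest value that satisfies the condition."""
    matching = [v for v in values if condition(v)]
    return min(matching) if matching else None


# Made-up sample columns covering three of the data domains in the note.
ratings = [3.5, 4.0, 2.0]                       # numeric
titles = ["Alien", "Brazil", "Casablanca"]      # character
rated_on = [date(2016, 6, 12), date(2015, 1, 3), date(2017, 9, 30)]  # date-time

highest_below_four = max_where(ratings, lambda r: r < 4.0)
first_title = min_where(titles, lambda t: True)
latest_date = max_where(rated_on, lambda d: d.year >= 2016)
```

Each call yields exactly one value regardless of how many values satisfy the condition.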
Cite this article
Riyana, S., Natwichai, J. Privacy preservation for recommendation databases. SOCA 12, 259–273 (2018). https://doi.org/10.1007/s11761-018-0248-y
DOI: https://doi.org/10.1007/s11761-018-0248-y