Scalable Privacy-Preserving Big Data Management and Analytics: Where We Are and Where We Are Going
Pages 52 - 56
Abstract
Although several research efforts on privacy-preserving big data management and analytics have appeared recently, significant challenges arise when such models, techniques, and algorithms must be delivered on top of massive, distributed big data repositories. This problem opens the door to the design of innovative models, techniques, and algorithms that, unlike current proposals, build scalability into the privacy-preserving big data management and analytics phase. Based on these considerations, this paper provides an overview of the current problems and limitations of state-of-the-art techniques and proposes an effective framework for supporting scalable privacy-preserving big data management and analytics.
Published In
August 2018
98 pages
ISBN:9781450364744
DOI:10.1145/3264560
Copyright © 2018 ACM.
In-Cooperation
- Brookes: Oxford Brookes University
- Northumbria University: University of Northumbria at Newcastle
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 03 August 2018
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
ICCBDC'18
ICCBDC'18: 2018 2nd International Conference on Cloud and Big Data Computing
August 3 - 5, 2018
Barcelona, Spain