DOI: 10.1145/3264560.3266429
research-article

Scalable Privacy-Preserving Big Data Management and Analytics: Where We Are and Where We Are Going

Published: 03 August 2018

Abstract

Although privacy-preserving big data management and analytics has recently attracted several research efforts, significant challenges arise when the resulting models, techniques and algorithms must be deployed on top of massive, distributed big data repositories. This problem calls for innovative models, techniques and algorithms that, unlike current proposals, build scalability into the privacy-preserving big data management and analytics phase. Based on these considerations, this paper provides an overview of the open problems and limitations of state-of-the-art techniques, and proposes an effective framework for supporting scalable privacy-preserving big data management and analytics.

Published In

ICCBDC '18: Proceedings of the 2018 2nd International Conference on Cloud and Big Data Computing
August 2018
98 pages
ISBN:9781450364744
DOI:10.1145/3264560
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • Brookes: Oxford Brookes University
  • Northumbria University: University of Northumbria at Newcastle

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Privacy-Preserving Big Data Frameworks
  2. Privacy-Preserving Big Data Management and Analysis
  3. Scalable Privacy-Preserving Big Data Management and Analysis

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICCBDC'18

Bibliometrics

  • Total Citations: 0
  • Total Downloads: 147
  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 12 Nov 2024
