DOI: 10.1145/2479787.2479806
research-article

Crafting a balance between big data utility and protection in the semantic data cloud

Published: 12 June 2013 Publication History

Abstract

Structured big data containing Personally Identifiable Information (PII) are collected from many sources and stored as microdata in statistical databases. Given a statistical disclosure control method, both analysis and protection of these data are enacted when data sources are outsourced. We flexibly extract data utility to support effective data-driven decision-making while still complying with privacy protection principles during analysis. In this paper, we propose three types of semantics-enabled policies, for controlling access, handling data, and releasing data, to craft a balance between data utility and protection. Structured big data are tagged with semantic metadata to enable direct processing and interpretation of the semantics-enabled policies. Finally, we demonstrate how these policy types, combined with various statistical disclosure control methods, craft a balance between data utility and protection.
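To make the idea of a data-release policy combined with a disclosure control check more concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not say which statistical disclosure control methods or policy language are used. The sketch assumes k-anonymity as the disclosure check, and the names `is_k_anonymous`, `release`, and the sample fields `age_band`, `zip3`, and `diagnosis` are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    occurs in at least k records (assumed disclosure check)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


def release(records, quasi_identifiers, k):
    """Hypothetical release policy: hand out microdata only when the
    k-anonymity check passes, otherwise deny the release."""
    if not is_k_anonymous(records, quasi_identifiers, k):
        raise PermissionError(f"release denied: data is not {k}-anonymous")
    return records


# Illustrative microdata; the values are made up for this sketch.
microdata = [
    {"age_band": "30-39", "zip3": "100", "diagnosis": "flu"},
    {"age_band": "30-39", "zip3": "100", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "112", "diagnosis": "flu"},
]
```

With this sample, the third record is the only one in its quasi-identifier group, so a release with `k=2` is denied. Generalizing or suppressing that record before release would trade some utility for protection, which is the kind of balance the abstract describes.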

Cited By

  • (2016) Big Data Benefits for the Software Measurement Community. 2016 Joint Conference of the International Workshop on Software Measurement and the International Conference on Software Process and Product Measurement (IWSM-MENSURA), pages 108-114. DOI: 10.1109/IWSM-Mensura.2016.025. Online publication date: Oct-2016
  • (2014) Privacy-Preserving WebID Analytics on the Decentralized Policy-Aware Social Web. Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) - Volume 02, pages 503-510. DOI: 10.1109/WI-IAT.2014.140. Online publication date: 11-Aug-2014

Published In

WIMS '13: Proceedings of the 3rd International Conference on Web Intelligence, Mining and Semantics
June 2013
408 pages
ISBN:9781450318501
DOI:10.1145/2479787
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • UAM: Autonomous University of Madrid

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. big data
  2. data protection
  3. data utility
  4. semantic data cloud
  5. semantics-enabled policy
  6. statistical disclosure control
  7. world wide web

Qualifiers

  • Research-article

Conference

WIMS '13
Sponsor:
  • UAM

Acceptance Rates

WIMS '13 Paper Acceptance Rate 28 of 72 submissions, 39%;
Overall Acceptance Rate 140 of 278 submissions, 50%
