DOI: 10.1145/2396761.2396845
research-article

Right-protected data publishing with hierarchical clustering preservation

Published: 29 October 2012

Abstract

The emergence of cloud-based storage services is opening up new avenues in data exchange and data dissemination. This has amplified the interest in right-protection mechanisms for establishing ownership in case of data leakage. Current right-protection technologies, however, rarely provide strong guarantees on the dataset utility after the protection process. This work presents techniques that explicitly address this shortcoming and provably preserve the outcome of certain mining operations. In particular, we take special care to guarantee that the outcome of hierarchical clustering operations remains the same before and after right protection. We encode data ownership using watermarking principles. In the process, we derive fundamental bounds on the distortion incurred by the watermarking. We leverage our theoretical analysis to design fast algorithms for right protection without exhaustively searching the vast design space.
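The preservation property described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the additive watermark `embed_watermark`, the sample `POINTS` and `BITS`, and the choice of single-linkage clustering are all assumptions made for illustration. The sketch only shows what it means for a hierarchical clustering to stay the same after embedding, and that a watermark that is too strong can change it.

```python
import itertools
import math


def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage_merges(points):
    """Return the merge sequence of single-linkage clustering.

    Each merge is recorded as the frozenset of point indices in the
    newly formed cluster, so two datasets have the same dendrogram
    topology exactly when their merge sequences are equal.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters whose closest members are nearest.
        a, b = min(
            itertools.combinations(clusters, 2),
            key=lambda pair: min(
                euclidean(points[i], points[j])
                for i in pair[0]
                for j in pair[1]
            ),
        )
        clusters = [c for c in clusters if c != a and c != b] + [a | b]
        merges.append(a | b)
    return merges


def embed_watermark(points, bits, power):
    """Hypothetical toy watermark (an assumption, not the paper's scheme).

    Shifts every coordinate of point i by +power or -power depending
    on the watermark bit assigned to that point.
    """
    return [
        tuple(x + power * (1 if bits[i % len(bits)] else -1) for x in p)
        for i, p in enumerate(points)
    ]


# Illustrative data, chosen so that pairwise distances are well separated.
POINTS = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 2.0), (12.0, 0.0)]
BITS = [1, 0, 1, 0, 1]

original = single_linkage_merges(POINTS)
weak = single_linkage_merges(embed_watermark(POINTS, BITS, power=0.01))
strong = single_linkage_merges(embed_watermark(POINTS, BITS, power=3.0))

print("weak watermark preserves merges:", weak == original)
print("strong watermark preserves merges:", strong == original)
```

On this data the weak watermark leaves the merge sequence unchanged, while the strong one alters the very first merge. Finding the largest embedding power that still preserves the dendrogram by brute force would mean repeating such a comparison over many candidate powers, which is the kind of exhaustive search the abstract says the proposed algorithms avoid.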

Published In

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
October 2012
2840 pages
ISBN:9781450311564
DOI:10.1145/2396761

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. watermarking

Conference

CIKM'12

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
