DOI: 10.1145/2046556.2046578
research-article

Data mining without data: a novel approach to privacy-preserving collaborative distributed data mining

Published: 17 October 2011

Abstract

Organizations increasingly collect many types of data independently, corporations and the public are increasingly aware of the need to keep sensitive data private, and government and corporate policy makers have an ever-growing need to learn the behavior of their customers. Together these trends create a clear demand for data mining services even when data owners refuse to provide their data directly. In the past, data owners applied techniques such as random perturbation before sharing data with a third-party data miner. However, even these techniques have been shown to be prone to privacy violations.
In this paper, we take a completely different approach: each data owner derives association rules locally, sanitizes them if necessary, and sends them to a third-party data miner. The data miner collects the local rules from all data owners, regenerates an estimate of the global data, and performs global data mining. Because generating data from rules can produce spurious rules, we suggest schemes to reduce them. The proposed method is illustrated using an example of association rule mining. We are currently formalizing some of the underlying techniques and making them more efficient.
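The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' algorithm: the function names (`itemset_support`, `local_rules`, `regenerate`), the thresholds, the brute-force rule search, and the naive regeneration step are all assumptions made here for illustration. Sanitization and spurious-rule filtering are omitted because the abstract does not specify them.

```python
from itertools import combinations


def itemset_support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def local_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Derive single-consequent rules X -> y from one owner's data.

    Brute-force enumeration (hypothetical helper); fine for toy data only.
    Returns tuples (antecedent, consequent, support, confidence).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            supp = itemset_support(transactions, itemset)
            if supp == 0 or supp < min_support:
                continue
            for consequent in itemset:
                antecedent = frozenset(itemset) - {consequent}
                conf = supp / itemset_support(transactions, antecedent)
                if conf >= min_confidence:
                    rules.append((antecedent, consequent, supp, conf))
    return rules


def regenerate(rules, n_rows):
    """Naive estimate of the data from rules alone (an assumption, not
    the paper's scheme): emit round(support * n_rows) rows per distinct
    itemset covered by the received rules.
    """
    itemsets = {}
    for antecedent, consequent, supp, _conf in rules:
        itemsets[antecedent | {consequent}] = supp
    rows = []
    for itemset, supp in itemsets.items():
        rows += [itemset] * round(supp * n_rows)
    return rows


# One data owner mines locally and shares only the rules.
owner_a = [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk", "eggs"}]
rules_a = local_rules(owner_a)

# The third-party miner sees rules, never owner_a's raw transactions.
estimate = regenerate(rules_a, n_rows=len(owner_a))
```

In this toy setup the miner would pool the rule lists from every owner before regenerating. Note that the regenerated rows only contain itemsets that passed the support threshold, which is one reason a real scheme has to deal with distorted or spurious rules downstream.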


Published In

WPES '11: Proceedings of the 10th annual ACM workshop on Privacy in the electronic society
October 2011
192 pages
ISBN:9781450310024
DOI:10.1145/2046556
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. association-rules
  2. collaborative data mining
  3. confidence
  4. coverage
  5. data merging
  6. distributed data
  7. privacy-preserving distributed data mining
  8. spurious rule filtering
  9. sub-itemsets
  10. support
  11. vertical data partitioning

Qualifiers

  • Research-article

Conference

CCS'11

Acceptance Rates

Overall Acceptance Rate 106 of 355 submissions, 30%

Cited By

  • (2023) Privacy-preserving data (stream) mining techniques and their impact on data mining accuracy: a systematic literature review. Artificial Intelligence Review, 56(9):10427-10464. DOI: 10.1007/s10462-023-10425-3. Online publication date: 22-Feb-2023
  • (2016) Prognosis using distributed data classification with privacy preserving: A novel approach. 2016 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), pages 32-35. DOI: 10.1109/WIECON-ECE.2016.8009081. Online publication date: Dec-2016
  • (2015) Dataless Data Mining. Proceedings of the 2015 12th International Conference on Information Technology - New Generations, pages 615-620. DOI: 10.1109/ITNG.2015.102. Online publication date: 13-Apr-2015
  • (2013) Introduction. Data Analysis and Pattern Recognition in Multiple Databases, pages 1-19. DOI: 10.1007/978-3-319-03410-2_1. Online publication date: 7-Dec-2013
