DOI: 10.1145/1244002.1244102

Mining multiple private databases using a kNN classifier

Published: 11 March 2007

Abstract

Modern electronic communication has collapsed geographical boundaries for global information sharing, but often at the expense of data security and privacy. Distributed privacy-preserving data mining tools are becoming increasingly critical for mining multiple databases with minimal information disclosure. We present a framework, comprising a general model and multi-round algorithms, for mining horizontally partitioned databases using a privacy-preserving k-nearest neighbor (kNN) classifier. A salient feature of our approach is that its multi-round protocols offer a trade-off among accuracy, efficiency, and privacy.
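
To make the problem setting concrete, the sketch below shows plain kNN classification over horizontally partitioned data, where each party holds different rows with the same attributes. It is not the paper's protocol and offers no privacy: every party reveals its local candidates in the clear, which is exactly the disclosure a privacy-preserving protocol would aim to limit. The function names, the Euclidean distance, the majority vote, and the toy data are all assumptions made for illustration.

```python
import heapq
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def local_top_k(rows, query, k):
    """Return the k (distance, label) pairs nearest to `query` in one party's rows."""
    return heapq.nsmallest(k, ((dist(x, query), y) for x, y in rows))


def knn_classify(partitions, query, k):
    """Non-private baseline: merge each party's local top-k, then take a majority vote.

    The global k nearest neighbors are always contained in the union of the
    local top-k lists, so merging those lists loses no accuracy.
    """
    candidates = []
    for rows in partitions:
        candidates.extend(local_top_k(rows, query, k))
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two parties, same schema, different rows.
partitions = [
    [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((5.0, 5.0), "b")],
    [((0.2, 0.1), "a"), ((4.8, 5.1), "b"), ((5.2, 4.9), "b")],
]

label_near_origin = knn_classify(partitions, (0.0, 0.0), k=3)
label_near_five = knn_classify(partitions, (5.0, 5.0), k=3)
```

Because the merged local lists always contain the true global neighbors, this baseline matches a centralized kNN exactly. A privacy-preserving variant would have to replace the plaintext merge step with a secure computation.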



    Published In

    SAC '07: Proceedings of the 2007 ACM Symposium on Applied Computing
    March 2007
    1688 pages
    ISBN: 1595934804
    DOI: 10.1145/1244002
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. k nearest neighbor
    2. classification
    3. distributed databases
    4. privacy

    Qualifiers

    • Article

    Conference

    SAC07

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
