Abstract
In this paper, we propose an improved pruning algorithm with memory, which we call the improved EDP algorithm. The method provides a better trade-off between data quality and privacy protection against classification attacks. It significantly reduces time complexity, especially for a complete binary tree, where the worst-case time complexity is of order O(M log M), with M the number of internal nodes of the complete tree. Experiments also show that the proposed algorithm is feasible and more efficient, especially for large, complex tree structures with many internal nodes. From a practical point of view, the improved EDP algorithm is more applicable and easy to implement.
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, M., Ge, N. (2011). An Improved EDP Algorithm to Privacy Protection in Data Mining. In: Hu, B., Liu, J., Chen, L., Zhong, N. (eds) Brain Informatics. BI 2011. Lecture Notes in Computer Science, vol 6889. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23605-1_27
DOI: https://doi.org/10.1007/978-3-642-23605-1_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23604-4
Online ISBN: 978-3-642-23605-1
eBook Packages: Computer Science (R0)