Abstract
Concept hierarchies are widely used in multiple fields to carry out data analysis. In data privacy, they are known as Value Generalization Hierarchies (VGHs), and generalization algorithms use them to dictate how data is anonymized. Their proper specification is therefore critical to obtaining anonymized data of good quality. Creating and evaluating VGHs requires expert knowledge and a significant amount of manual effort, which makes these tasks error-prone and time-consuming. In this paper we present AIKA, a knowledge-based framework that automatically constructs and evaluates VGHs for the anonymization of categorical data. AIKA integrates ontologies to create and evaluate VGHs objectively. It also implements a multi-dimensional reward function to tailor the VGH evaluation to different use cases. Our experiments show that AIKA generates VGHs of good quality in less time than manual construction. The results also show that the reward function properly captures the desired VGH properties.
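To make the role of a VGH concrete, the following minimal sketch shows how a generalization step might use one. It is an illustration only, not AIKA's implementation: the occupation values, the `PARENT` mapping, and the `generalize` and `anonymize` helpers are all hypothetical and assumed for this example.

```python
# Hypothetical toy VGH for a categorical "occupation" attribute.
# Each value maps to its parent; the root "*" has no parent entry.
PARENT = {
    "nurse": "medical",
    "surgeon": "medical",
    "teacher": "education",
    "professor": "education",
    "medical": "*",
    "education": "*",
}


def generalize(value, levels, parent=PARENT):
    """Climb `levels` steps up the hierarchy, stopping at the root."""
    for _ in range(levels):
        if value not in parent:  # already at the root
            break
        value = parent[value]
    return value


def anonymize(column, levels, parent=PARENT):
    """Generalize every value in a column to the same number of levels."""
    return [generalize(v, levels, parent) for v in column]
```

In this sketch, raising `levels` trades detail for privacy: one level maps `nurse` and `surgeon` to the same value, and enough levels collapse every value to the root. The shape of the hierarchy therefore determines how much information survives anonymization.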
Acknowledgments
This work was supported by Science Foundation Ireland grants 10/CE/I1855 and 13/RC/2094.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Ayala-Rivera, V., Murphy, L., Thorpe, C. (2016). Automatic Construction of Generalization Hierarchies for Publishing Anonymized Data. In: Lehner, F., Fteimi, N. (eds) Knowledge Science, Engineering and Management. KSEM 2016. Lecture Notes in Computer Science, vol 9983. Springer, Cham. https://doi.org/10.1007/978-3-319-47650-6_21
DOI: https://doi.org/10.1007/978-3-319-47650-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47649-0
Online ISBN: 978-3-319-47650-6
eBook Packages: Computer Science, Computer Science (R0)