Abstract
Knowledge graphs such as DBpedia, Freebase, or Wikidata always contain a taxonomic backbone that arranges and structures concepts according to the hypo-hypernym (“class-subclass”) relationship. With the rapid growth of domain-specific lexical resources, the problem of automatically extending existing knowledge bases with new words is becoming increasingly widespread. In this talk, the author addresses the problem of taxonomy enrichment, which aims at adding new words to an existing taxonomy.
The author presents a new method, described in [16], that achieves high results on this task with little effort. It uses resources that exist for the majority of languages, which makes the method universal. The method is extended by incorporating deep representations of graph structures, such as node2vec, Poincaré embeddings, and GCN, which have recently demonstrated promising results on various NLP tasks. Furthermore, combining these representations with word embeddings allows the method to surpass the state of the art.
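As a rough illustration of combining text and graph representations for this task, the following minimal sketch ranks candidate hypernyms for a new word. It is not the method from [16]: the toy vectors, the function names (`combine`, `rank_hypernyms`), the concatenation of normalised vectors, and the estimation of the new word's graph vector as the mean of its nearest text neighbours' graph vectors are all assumptions made for the example.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def combine(text_vec, graph_vec):
    """Hypothetical combination: concatenate the two normalised vectors."""
    return normalize(text_vec) + normalize(graph_vec)

def rank_hypernyms(query_text_vec, text_vecs, graph_vecs, hypernyms, k=2):
    """Rank hypernyms of existing taxonomy words as candidates for a new word."""
    # Step 1: find the k existing words closest to the new word in text space.
    neighbours = sorted(
        text_vecs,
        key=lambda w: cosine(query_text_vec, text_vecs[w]),
        reverse=True,
    )[:k]
    # Step 2 (assumption): the new word is absent from the graph, so estimate
    # its graph vector as the mean of its neighbours' graph vectors.
    dim = len(graph_vecs[neighbours[0]])
    estimate = [
        sum(graph_vecs[w][i] for w in neighbours) / len(neighbours)
        for i in range(dim)
    ]
    query = combine(query_text_vec, estimate)
    # Step 3: each existing word votes for its hypernyms, weighted by its
    # similarity to the new word in the combined space.
    scores = {}
    for w in text_vecs:
        sim = cosine(query, combine(text_vecs[w], graph_vecs[w]))
        for h in hypernyms.get(w, []):
            scores[h] = scores.get(h, 0.0) + max(sim, 0.0)
    return sorted(scores, key=scores.get, reverse=True)

# Toy data, invented for illustration only.
text_vecs = {"cat": [0.9, 0.1], "dog": [1.0, 0.2],
             "car": [0.1, 0.9], "truck": [0.0, 1.0]}
graph_vecs = {"cat": [1.0, 0.0], "dog": [0.9, 0.1],
              "car": [0.0, 1.0], "truck": [0.1, 0.9]}
hypernyms = {"cat": ["animal"], "dog": ["animal"],
             "car": ["vehicle"], "truck": ["vehicle"]}

# Text vector of a new word (say, "puppy") that is not yet in the taxonomy.
ranking = rank_hypernyms([0.95, 0.15], text_vecs, graph_vecs, hypernyms)
print(ranking)
```

In practice the toy dictionaries would be replaced by pretrained word embeddings and graph embeddings trained on the taxonomy; the voting scheme here is only one of several plausible ways to turn nearest neighbours into hypernym candidates.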
References
Arefyev, N., Fedoseev, M., Kabanov, A., Zizov, V.: Word2vec not dead: predicting hypernyms of co-hyponyms is better than reading definitions. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)
Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. Trans. Assoc. Comput. Linguist. 5, 135–146 (2017)
Camacho-Collados, J., et al.: SemEval-2018 task 9: hypernym discovery. In: Proceedings of The 12th International Workshop on Semantic Evaluation, pp. 712–724. Association for Computational Linguistics, New Orleans, Louisiana (2018). www.aclweb.org/anthology/S18-1115, https://doi.org/10.18653/v1/S18-1115
Dale, D.: A simple solution for the taxonomy enrichment task: discovering hypernyms using nearest neighbor search. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)
Espinosa-Anke, L., Ronzano, F., Saggion, H.: TALN at SemEval-2016 task 14: semantic taxonomy enrichment via sense-based embeddings. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1332–1336. Association for Computational Linguistics, San Diego, California (2016). www.aclweb.org/anthology/S16-1208, https://doi.org/10.18653/v1/S16-1208
Fares, M., Kutuzov, A., Oepen, S., Velldal, E.: Word vectors, reuse, and replicability: Towards a community repository of large-text resources. In: Proceedings of the 21st Nordic Conference on Computational Linguistics, pp. 271–276. Association for Computational Linguistics, Gothenburg, Sweden (2017). www.aclweb.org/anthology/W17-0237
Goel, R., Kazemi, S.M., Brubaker, M., Poupart, P.: Diachronic embedding for temporal knowledge graph completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 04, pp. 3988–3995 (2020). http://ojs.aaai.org/index.php/AAAI/article/view/5815, https://doi.org/10.1609/aaai.v34i04.5815
Jurgens, D., Pilehvar, M.T.: SemEval-2016 task 14: semantic taxonomy enrichment. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1092–1102. Association for Computational Linguistics, San Diego, California (2016). www.aclweb.org/anthology/S16-1169, https://doi.org/10.18653/v1/S16-1169
Kunilovskaya, M., Kutuzov, A., Plum, A.: Taxonomy enrichment: linear hyponym-hypernym projection vs synset ID classification. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)
Kutuzov, A., Kuzmenko, E.: WebVectors: a toolkit for building web interfaces for vector semantic models. In: Ignatov, D.I., et al. (eds.) AIST 2016. CCIS, vol. 661, pp. 155–161. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52920-2_15
Loukachevitch, N.V., Lashevich, G., Gerasimova, A.A., Ivanov, V.V., Dobrov, B.V.: Creating Russian wordnet by conversion. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue”, pp. 405–415 (2016)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119. Curran Associates, Inc. (2013)
Miller, G.A.: WordNet: An electronic lexical database. MIT Press, Cambridge (1998)
Nikishina, I., Logacheva, V., Panchenko, A., Loukachevitch, N.: RUSSE’2020: findings of the first taxonomy enrichment task for the Russian language. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)
Nikishina, I., Panchenko, A., Logacheva, V., Loukachevitch, N.: Studying taxonomy enrichment on diachronic wordnet versions. In: Proceedings of the 28th International Conference on Computational Linguistics. Association for Computational Linguistics, Barcelona, Spain (2020)
Nikishina, I., Tikhomirov, M., Logacheva, V., Nazarov, Y., Panchenko, A., Loukachevitch, N.: Taxonomy enrichment with text and graph vector representations. Semantic Web, pp. 1–35 (2022). https://doi.org/10.3233/SW-212955
Pennington, J., Socher, R., Manning, C.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543. Association for Computational Linguistics, Doha, Qatar (2014). www.aclweb.org/anthology/D14-1162, https://doi.org/10.3115/v1/D14-1162
Ryabinin, M., Popov, S., Prokhorenkova, L., Voita, E.: Embedding words in non-vector space with unsupervised graph learning. arXiv preprint arXiv:2010.02598 (2020)
Straka, M., Straková, J.: Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 88–99. Association for Computational Linguistics, Vancouver, Canada (2017). www.aclweb.org/anthology/K/K17/K17-3009.pdf
Tanev, H., Rotondi, A.: Deftor at SemEval-2016 task 14: Taxonomy enrichment using definition vectors. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1342–1345. Association for Computational Linguistics, San Diego, California (2016). www.aclweb.org/anthology/S16-1210, https://doi.org/10.18653/v1/S16-1210
Tikhomirov, M., Loukachevitch, N., Parkhomenko, E.: Combined approach to hypernym detection for thesaurus enrichment. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nikishina, I. (2022). Taxonomy Enrichment with Text and Graph Vector Representation. In: Burnaev, E., et al. Analysis of Images, Social Networks and Texts. AIST 2021. Lecture Notes in Computer Science, vol 13217. Springer, Cham. https://doi.org/10.1007/978-3-031-16500-9_2
DOI: https://doi.org/10.1007/978-3-031-16500-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16499-6
Online ISBN: 978-3-031-16500-9
eBook Packages: Computer Science (R0)