Taxonomy Enrichment with Text and Graph Vector Representation

  • Conference paper
Analysis of Images, Social Networks and Texts (AIST 2021)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13217)

Abstract

Knowledge graphs such as DBpedia, Freebase or Wikidata always contain a taxonomic backbone that arranges and structures concepts according to the hypo-hypernym (“class-subclass”) relationship. With the rapid growth of lexical resources for specific domains, the problem of automatically extending existing knowledge bases with new words is becoming increasingly widespread. This talk addresses the problem of taxonomy enrichment, which aims at adding new words to an existing taxonomy.
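To make the task concrete, a taxonomy can be pictured as a mapping from each word to its direct hypernyms; enrichment then amounts to attaching a new word under one or more existing nodes. The following is a minimal sketch under that assumption. The `Taxonomy` class, its method names, and the toy words are hypothetical illustrations and do not come from the paper.

```python
class Taxonomy:
    """Toy taxonomy: maps each word to the set of its direct hypernyms."""

    def __init__(self):
        self.hypernyms = {}

    def add_relation(self, hyponym, hypernym):
        """Record that `hyponym` is a subclass of `hypernym`."""
        self.hypernyms.setdefault(hyponym, set()).add(hypernym)
        # Make sure the hypernym itself is a known node.
        self.hypernyms.setdefault(hypernym, set())

    def ancestors(self, word):
        """Return all direct and indirect hypernyms of `word`."""
        seen = set()
        stack = list(self.hypernyms.get(word, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.hypernyms[node])
        return seen


# Existing taxonomy (hypothetical toy data).
taxonomy = Taxonomy()
taxonomy.add_relation("dog", "canine")
taxonomy.add_relation("canine", "mammal")

# Enrichment step: attach a word that was missing from the taxonomy.
taxonomy.add_relation("labradoodle", "dog")
print(sorted(taxonomy.ancestors("labradoodle")))
```

The hard part of the task, which this sketch leaves out, is deciding automatically which existing node the new word should be attached to.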

The author presents a new method, described in [16], that achieves high results on this task with little effort. It relies on resources that exist for the majority of languages, which makes the method universal. The method is extended with deep representations of graph structures, such as node2vec, Poincaré embeddings and GCN, that have recently demonstrated promising results on various NLP tasks. Furthermore, combining these graph representations with word embeddings allows the method to beat the state of the art.
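One way to picture the combination of text and graph representations is sketched below. This is an illustration under stated assumptions, not the method of [16]: the abstract does not specify how the vectors are combined, so the sketch assumes concatenation of length-normalized vectors, nearest-neighbour search by cosine similarity, and a graph vector for the new word estimated as the mean of its text neighbours' graph vectors. All function names and the toy vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def combine(text_vec, graph_vec):
    """Assumed combination: concatenate the two normalized vectors."""
    return normalize(text_vec) + normalize(graph_vec)


def rank_hypernyms(new_text_vec, text_vecs, graph_vecs, hypernyms, k=2):
    """Rank candidate hypernyms for a word that is not yet in the taxonomy."""
    # Step 1: find the k taxonomy words closest to the new word in text space.
    neighbours = sorted(
        text_vecs,
        key=lambda w: cosine(new_text_vec, text_vecs[w]),
        reverse=True,
    )[:k]
    # Step 2 (assumption): the new word has no graph vector yet, so estimate
    # one as the mean of its neighbours' graph vectors.
    dim = len(graph_vecs[neighbours[0]])
    est_graph = [
        sum(graph_vecs[w][i] for w in neighbours) / len(neighbours)
        for i in range(dim)
    ]
    query = combine(new_text_vec, est_graph)
    # Step 3: each neighbour votes for its hypernyms, weighted by its
    # similarity to the new word in the combined space.
    scores = {}
    for w in neighbours:
        sim = cosine(query, combine(text_vecs[w], graph_vecs[w]))
        for h in hypernyms.get(w, ()):
            scores[h] = scores.get(h, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: two-dimensional text and graph vectors.
text_vecs = {"dog": [1.0, 0.1], "cat": [0.9, 0.3], "car": [0.0, 1.0]}
graph_vecs = {"dog": [1.0, 0.0], "cat": [0.8, 0.2], "car": [0.0, 1.0]}
hypernyms = {"dog": ["canine"], "cat": ["feline"], "car": ["vehicle"]}

ranked = rank_hypernyms([1.0, 0.15], text_vecs, graph_vecs, hypernyms)
print(ranked)
```

In practice the text vectors would come from pretrained word embeddings and the graph vectors from a model trained on the taxonomy itself; the toy dictionaries here only stand in for those resources.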

Notes

  1. https://github.com/skoltech-nlp/diachronic-wordnets

  2. https://fasttext.cc/docs/en/crawl-vectors.html

  3. http://vectors.nlpl.eu/repository/20/29.zip

  4. http://vectors.nlpl.eu/repository/20/185.zip

  5. https://nlp.stanford.edu/projects/glove/

References

  1. Arefyev, N., Fedoseev, M., Kabanov, A., Zizov, V.: Word2vec not dead: predicting hypernyms of co-hyponyms is better than reading definitions. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)

  2. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. Trans. Assoc. Comput. Linguist. 5, 135–146 (2017)

  3. Camacho-Collados, J., et al.: SemEval-2018 task 9: hypernym discovery. In: Proceedings of The 12th International Workshop on Semantic Evaluation, pp. 712–724. Association for Computational Linguistics, New Orleans, Louisiana (2018). www.aclweb.org/anthology/S18-1115, https://doi.org/10.18653/v1/S18-1115

  4. Dale, D.: A simple solution for the taxonomy enrichment task: discovering hypernyms using nearest neighbor search. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)

  5. Espinosa-Anke, L., Ronzano, F., Saggion, H.: TALN at SemEval-2016 task 14: semantic taxonomy enrichment via sense-based embeddings. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1332–1336. Association for Computational Linguistics, San Diego, California (2016). www.aclweb.org/anthology/S16-1208, https://doi.org/10.18653/v1/S16-1208

  6. Fares, M., Kutuzov, A., Oepen, S., Velldal, E.: Word vectors, reuse, and replicability: Towards a community repository of large-text resources. In: Proceedings of the 21st Nordic Conference on Computational Linguistics, pp. 271–276. Association for Computational Linguistics, Gothenburg, Sweden (2017). www.aclweb.org/anthology/W17-0237

  7. Goel, R., Kazemi, S.M., Brubaker, M., Poupart, P.: Diachronic embedding for temporal knowledge graph completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 04, pp. 3988–3995 (2020). http://ojs.aaai.org/index.php/AAAI/article/view/5815, https://doi.org/10.1609/aaai.v34i04.5815

  8. Jurgens, D., Pilehvar, M.T.: SemEval-2016 task 14: semantic taxonomy enrichment. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1092–1102. Association for Computational Linguistics, San Diego, California (2016). www.aclweb.org/anthology/S16-1169, https://doi.org/10.18653/v1/S16-1169

  9. Kunilovskaya, M., Kutuzov, A., Plum, A.: Taxonomy enrichment: linear hyponym-hypernym projection vs synset ID classification. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)

  10. Kutuzov, A., Kuzmenko, E.: WebVectors: a toolkit for building web interfaces for vector semantic models. In: Ignatov, D.I., et al. (eds.) AIST 2016. CCIS, vol. 661, pp. 155–161. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52920-2_15

  11. Loukachevitch, N.V., Lashevich, G., Gerasimova, A.A., Ivanov, V.V., Dobrov, B.V.: Creating Russian wordnet by conversion. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue”, pp. 405–415 (2016)

  12. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119. Curran Associates, Inc (2013)

  13. Miller, G.A.: WordNet: An electronic lexical database. MIT Press, Cambridge (1998)

  14. Nikishina, I., Logacheva, V., Panchenko, A., Loukachevitch, N.: RUSSE’2020: findings of the first taxonomy enrichment task for the Russian language. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)

  15. Nikishina, I., Panchenko, A., Logacheva, V., Loukachevitch, N.: Studying taxonomy enrichment on diachronic wordnet versions. In: Proceedings of the 28th International Conference on Computational Linguistics. Association for Computational Linguistics, Barcelona, Spain (2020)

  16. Nikishina, I., Tikhomirov, M., Logacheva, V., Nazarov, Y., Panchenko, A., Loukachevitch, N.: Taxonomy enrichment with text and graph vector representations. Semantic Web, pp. 1–35 (2022). https://doi.org/10.3233/SW-212955

  17. Pennington, J., Socher, R., Manning, C.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543. Association for Computational Linguistics, Doha, Qatar (2014). www.aclweb.org/anthology/D14-1162, https://doi.org/10.3115/v1/D14-1162

  18. Ryabinin, M., Popov, S., Prokhorenkova, L., Voita, E.: Embedding words in non-vector space with unsupervised graph learning. arXiv preprint arXiv:2010.02598 (2020)

  19. Straka, M., Straková, J.: Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 88–99. Association for Computational Linguistics, Vancouver, Canada (2017). www.aclweb.org/anthology/K/K17/K17-3009.pdf

  20. Tanev, H., Rotondi, A.: Deftor at SemEval-2016 task 14: Taxonomy enrichment using definition vectors. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1342–1345. Association for Computational Linguistics, San Diego, California (2016). www.aclweb.org/anthology/S16-1210, https://doi.org/10.18653/v1/S16-1210

  21. Tikhomirov, M., Loukachevitch, N., Parkhomenko, E.: Combined approach to hypernym detection for thesaurus enrichment. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual Conference “Dialogue” (2020)

Author information

Corresponding author

Correspondence to Irina Nikishina.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Nikishina, I. (2022). Taxonomy Enrichment with Text and Graph Vector Representation. In: Burnaev, E., et al. Analysis of Images, Social Networks and Texts. AIST 2021. Lecture Notes in Computer Science, vol 13217. Springer, Cham. https://doi.org/10.1007/978-3-031-16500-9_2

  • DOI: https://doi.org/10.1007/978-3-031-16500-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16499-6

  • Online ISBN: 978-3-031-16500-9

  • eBook Packages: Computer Science (R0)
