Abstract
In this paper, we study the polysemy of function words. We adopt the framework of distributional semantics, which characterizes word meaning by observing the contexts in which words occur in large corpora and which is, in principle, well suited to modeling polysemy. Nevertheless, function words have traditionally been considered impossible to analyze distributionally because of their highly flexible usage patterns.
We show that contextualized word embeddings, the most recent generation of distributional methods, offer hope in this regard. Using the German reflexive pronoun sich as an example, we find that contextualized word embeddings capture theoretically motivated word senses for sich to the extent that these senses are mirrored systematically in linguistic usage.
Notes
1. We found a comparable, but slightly lower, performance for the sentential contexts in the classification experiments reported below. These experiments are part of the companion Jupyter notebook to this article.
2. We also experimented with fine-tuning the embeddings, but did not obtain competitive results, presumably due to the small size of the training set.
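The general setup these notes allude to, a sense classifier trained on frozen (not fine-tuned) contextualized embeddings, can be sketched roughly as follows. This is only an illustration under stated assumptions: the classifier actually used in the paper is not described in this excerpt, so a nearest-centroid classifier with cosine similarity stands in for it, and the three-dimensional vectors and the labels `sense_a` and `sense_b` are hypothetical toy values rather than real embeddings or the paper's sense inventory.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fit_centroids(vectors, labels):
    """Average the embedding vectors of each sense label into one centroid."""
    grouped = defaultdict(list)
    for vec, label in zip(vectors, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def predict(centroids, vector):
    """Return the sense label whose centroid is most similar to the vector."""
    return max(centroids, key=lambda label: cosine(centroids[label], vector))


# Hypothetical toy data: each vector stands in for the frozen contextualized
# embedding of one occurrence of the target word; real embeddings would have
# hundreds of dimensions and come from a pre-trained model.
train_vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
]
train_labels = ["sense_a", "sense_a", "sense_b", "sense_b"]

centroids = fit_centroids(train_vectors, train_labels)
prediction = predict(centroids, [0.7, 0.3, 0.1])
print(prediction)
```

Because the embeddings stay fixed and only the centroids are estimated, a classifier of this kind has very few parameters, which is one reason such setups can remain usable when the labeled training set is small.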
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Padó, S., Hole, D. (2022). Distributional Analysis of Polysemous Function Words. In: Özgün, A., Zinova, Y. (eds) Language, Logic, and Computation. TbiLLC 2019. Lecture Notes in Computer Science, vol 13206. Springer, Cham. https://doi.org/10.1007/978-3-030-98479-3_6
DOI: https://doi.org/10.1007/978-3-030-98479-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98478-6
Online ISBN: 978-3-030-98479-3
eBook Packages: Computer Science, Computer Science (R0)