Valency Lexicon of Czech Verbs VALLEX: Recent Experiments with Frame Disambiguation

  • Conference paper
Text, Speech and Dialogue (TSD 2005)

Abstract

VALLEX is a linguistically annotated lexicon that aims to describe syntactic information expected to be useful for NLP. As of summer 2005, the lexicon contains roughly 2500 manually annotated Czech verbs with over 6000 valency frames. In this paper we introduce VALLEX and describe an experiment in which VALLEX frames were assigned to 10,000 corpus instances of 100 Czech verbs; the pairwise inter-annotator agreement reaches 75%. The part of the data on which three human annotators agreed was used for an automatic word sense disambiguation task, in which we achieved a precision of 78.5%.
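
The abstract does not spell out how the two figures were computed. As a minimal sketch of one common reading, pairwise agreement can be taken as the share of annotator pairs that chose the same frame, averaged over all instances, and precision as the share of correct answers among the instances the system actually labeled. The function names, the frame labels `f1`–`f3`, and the toy data below are hypothetical illustrations, not the authors' code or data.

```python
from itertools import combinations

def pairwise_agreement(instances):
    """Share of annotator pairs that assigned the same frame.

    `instances` is a list of tuples, one tuple per corpus instance,
    holding one frame label per annotator (assumed layout).
    """
    agree = total = 0
    for labels in instances:
        for a, b in combinations(labels, 2):
            total += 1
            agree += int(a == b)
    return agree / total if total else 0.0

def unanimous_gold(instances):
    """Keep only instances where every annotator chose the same frame."""
    return [(i, labels[0])
            for i, labels in enumerate(instances)
            if len(set(labels)) == 1]

def precision(predicted, gold):
    """Correct predictions divided by predictions made.

    A prediction of None means the system abstained (an assumption
    of this sketch); abstentions are left out of the denominator.
    """
    made = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    if not made:
        return 0.0
    return sum(p == g for p, g in made) / len(made)

# Toy data: four instances, three annotators each (hypothetical labels).
instances = [
    ("f1", "f1", "f1"),
    ("f1", "f2", "f1"),
    ("f2", "f2", "f2"),
    ("f3", "f1", "f2"),
]

gold = [label for _, label in unanimous_gold(instances)]
print(f"pairwise agreement: {pairwise_agreement(instances):.2f}")
print(f"gold instances kept: {len(gold)}")
```

Restricting the gold data to unanimous instances, as the abstract describes, trades corpus size for label reliability, so a precision measured on that subset says nothing directly about the instances on which annotators disagreed.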

The research reported in this paper has been partially supported by grant No. 405/04/0243 of the Grant Agency of the Czech Republic and by the Information Society projects No. 1ET100300517 and 1ET101470416.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lopatková, M., Bojar, O., Semecký, J., Benešová, V., Žabokrtský, Z. (2005). Valency Lexicon of Czech Verbs VALLEX: Recent Experiments with Frame Disambiguation. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_13

  • DOI: https://doi.org/10.1007/11551874_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28789-6

  • Online ISBN: 978-3-540-31817-0

  • eBook Packages: Computer Science, Computer Science (R0)
