DOI: 10.3115/1073083.1073156

Shallow parsing on the basis of words only: a case study

Published: 06 July 2002

Abstract

We describe a case study in which a memory-based learning algorithm is trained to simultaneously chunk sentences and assign grammatical function tags to these chunks. We compare the algorithm's performance on this parsing task with varying training set sizes (yielding learning curves) and different input representations. In particular, we compare input consisting of words only, a variant that includes word form information for low-frequency words, gold-standard POS only, and combinations of these. The word-based shallow parser displays an apparently log-linear increase in performance, and surpasses the flatter POS-based curve at about 50,000 sentences of training data. The low-frequency variant performs even better, and the combination is best. Comparative experiments with a real POS tagger produce lower results. We argue that we might not need an explicit intermediate POS-tagging step for parsing when a sufficient amount of training material is available and word form information is used for low-frequency words.
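The setup described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: a nearest-neighbour classifier with a plain overlap distance stands in for the memory-based learner, each word is represented by a window of itself and its immediate neighbours, and low-frequency words are replaced by a hypothetical word-form signature (capitalization, digits, last two letters). The function names, the `min_freq` threshold, and the combined labels such as `NP-SBJ` are illustrative only.

```python
from collections import Counter

def attenuate(word, counts, min_freq=2):
    """Keep frequent words; map rare ones to a coarse word-form signature.

    The signature scheme here is a hypothetical illustration, not the paper's.
    """
    if counts[word] >= min_freq:
        return word
    sig = "CAP-" if word[:1].isupper() else ""
    if any(ch.isdigit() for ch in word):
        sig += "NUM-"
    return "LOW-" + sig + word[-2:].lower()

def windows(tokens, size=1, pad="_"):
    """One fixed-width feature tuple per token: left context, focus, right context."""
    padded = [pad] * size + list(tokens) + [pad] * size
    return [tuple(padded[i:i + 2 * size + 1]) for i in range(len(tokens))]

def overlap_distance(a, b):
    """Number of feature positions on which two instances disagree."""
    return sum(x != y for x, y in zip(a, b))

def classify(instance, memory, k=1):
    """Majority label among the k stored instances closest to `instance`."""
    ranked = sorted(memory, key=lambda item: overlap_distance(instance, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def tag(words, memory, counts):
    """Assign one combined chunk/function label to each word of a sentence."""
    feats = windows([attenuate(w, counts) for w in words])
    return [classify(f, memory) for f in feats]

# Toy training data with invented labels (chunk type plus function tag).
train = [
    (["The", "dog", "barks"], ["NP-SBJ", "NP-SBJ", "VP"]),
    (["The", "cat", "sleeps"], ["NP-SBJ", "NP-SBJ", "VP"]),
]
counts = Counter(w for sent, _ in train for w in sent)

# "Training" only stores the instances in memory.
memory = []
for sent, labels in train:
    memory.extend(zip(windows([attenuate(w, counts) for w in sent]), labels))

print(tag(["The", "bird", "sings"], memory, counts))
```

Because learning here amounts to storing instances, a learning curve of the kind the abstract mentions could be obtained by rebuilding `memory` from progressively larger subsets of the training sentences and scoring `tag` on held-out data each time.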

Published In

ACL '02: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
July 2002
543 pages

Publisher

Association for Computational Linguistics, United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%

Cited By

  • (2023) Large Language Models versus Natural Language Understanding and Generation. Proceedings of the 27th Pan-Hellenic Conference on Progress in Computing and Informatics, pages 278-290. DOI: 10.1145/3635059.3635104. Online publication date: 24-Nov-2023
  • (2006) A mission for computational natural language learning. Proceedings of the Tenth Conference on Computational Natural Language Learning, pages 1-5. DOI: 10.5555/1596276.1596278. Online publication date: 8-Jun-2006
  • (2005) New meta-grammar constructs in czech language parser synt. Proceedings of the 8th international conference on Text, Speech and Dialogue, pages 85-92. DOI: 10.1007/11551874_11. Online publication date: 12-Sep-2005
  • (2003) Preposition semantic classification via Penn Treebank and FrameNet. Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4, pages 79-86. DOI: 10.3115/1119176.1119187. Online publication date: 31-May-2003
