Shallow parsing on the basis of words only: a case study

Authors:

Antal van den Bosch,

Sabine Buchholz

ACL '02: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

Pages 433 - 440

https://doi.org/10.3115/1073083.1073156

Published: 06 July 2002

Abstract

We describe a case study in which a memory-based learning algorithm is trained to simultaneously chunk sentences and assign grammatical function tags to these chunks. We compare the algorithm's performance on this parsing task with varying training set sizes (yielding learning curves) and different input representations. In particular, we compare input consisting of words only, a variant that includes word form information for low-frequency words, gold-standard POS only, and combinations of these. The word-based shallow parser displays an apparently log-linear increase in performance, and surpasses the flatter POS-based curve at about 50,000 sentences of training data. The low-frequency variant performs even better, and the combination is best. Comparative experiments with a real POS tagger produce lower results. We argue that we might not need an explicit intermediate POS-tagging step for parsing when a sufficient amount of training material is available and word form information is used for low-frequency words.
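To make the words-only setup in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 1-nearest-neighbour classifier with a plain feature-overlap metric, a one-word context window on each side, and combined chunk-plus-function tags such as `B-NP-SBJ`. The names `attenuate`, `windows`, and `MemoryBasedTagger`, the frequency threshold of 2, and the word-form signature (capitalisation, digit, hyphen, last two letters) are all hypothetical choices made for illustration.

```python
from collections import Counter


def attenuate(word, freq, threshold=2):
    """Keep frequent words; map rare ones to a coarse word-form signature.

    The signature scheme here is an assumption for illustration only.
    """
    if freq[word] >= threshold:
        return word
    parts = ["LOWFREQ"]
    if word[:1].isupper():
        parts.append("CAP")
    if any(ch.isdigit() for ch in word):
        parts.append("NUM")
    if "-" in word:
        parts.append("HYPH")
    parts.append(word[-2:].lower())  # last two letters as a crude suffix cue
    return "_".join(parts)


def windows(tokens, width=1, pad="_"):
    """Return one fixed-width context window (a tuple) per token."""
    padded = [pad] * width + list(tokens) + [pad] * width
    return [tuple(padded[i:i + 2 * width + 1]) for i in range(len(tokens))]


class MemoryBasedTagger:
    """Stores every training instance and labels new ones by best overlap."""

    def __init__(self, threshold=2, width=1):
        self.threshold = threshold
        self.width = width
        self.freq = Counter()
        self.memory = []  # list of (feature tuple, tag)

    def _features(self, tokens):
        words = [attenuate(w, self.freq, self.threshold) for w in tokens]
        return windows(words, self.width)

    def fit(self, sentences):
        """sentences: iterable of (tokens, tags) pairs of equal length."""
        sentences = list(sentences)
        self.freq = Counter(w for tokens, _ in sentences for w in tokens)
        self.memory = []
        for tokens, tags in sentences:
            self.memory.extend(zip(self._features(tokens), tags))
        return self

    def predict(self, tokens):
        """Tag each token with the label of its most similar stored instance."""
        tags = []
        for feats in self._features(tokens):
            best = max(
                self.memory,
                key=lambda inst: sum(a == b for a, b in zip(inst[0], feats)),
            )
            tags.append(best[1])
        return tags


# Toy, made-up training data; not taken from the paper's corpus.
train = [
    (["The", "cat", "sleeps"], ["B-NP-SBJ", "I-NP-SBJ", "B-VP"]),
    (["The", "dog", "sleeps"], ["B-NP-SBJ", "I-NP-SBJ", "B-VP"]),
]
tagger = MemoryBasedTagger().fit(train)
print(tagger.predict(["The", "rat", "sleeps"]))
```

In this toy run the unseen word "rat" is mapped to the same signature as the rare training word "cat", so the classifier can still find an exact match in memory without any POS tags. This is the intuition behind using word form information for low-frequency words; the real system's features, metric, and tag set may differ.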

Published In

ACL '02: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, July 2002, 543 pages

General Chair: Pierre Isabelle

Publisher: Association for Computational Linguistics, United States

Overall Acceptance Rate: 85 of 443 submissions, 19%
