DOI: 10.3115/1075812.1075869

A report of recent progress in transformation-based error-driven learning

Published: 08 March 1994

Abstract

Most recent research in trainable part of speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, linguistic information is captured indirectly, typically in tens of thousands of lexical and contextual probabilities. In [Brill 92], a trainable rule-based tagger was described that obtained performance comparable to that of stochastic taggers, but captured relevant linguistic information in a small number of simple non-stochastic rules. In this paper, we describe a number of extensions to this rule-based tagger. First, we describe a method for expressing lexical relations in tagging that stochastic taggers are currently unable to express. Next, we show a rule-based approach to tagging unknown words. Finally, we show how the tagger can be extended into a k-best tagger, where multiple tags can be assigned to words in some cases of uncertainty.
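
To make the idea of "a small number of simple non-stochastic rules" concrete, the following is a minimal sketch of transformation-based error-driven learning, not the paper's implementation. It assumes a single hypothetical rule template ("change tag A to B when the previous tag is Z"), an initial tagging from a hypothetical word-to-tag lexicon, and rule triggers that are evaluated against the tags as they stood before the rule fired. The names `Rule`, `initial_tags`, `apply_rule`, `tag`, `errors`, and `learn` are illustrative only.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Assumed template: change from_tag to to_tag if the previous tag is prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(words, lexicon, default="NN"):
    # Initial-state annotation: look each word up in the lexicon,
    # falling back to a default tag for words not listed there.
    return [lexicon.get(w, default) for w in words]


def apply_rule(tags, rule):
    # Triggers are checked against the input tags (a snapshot),
    # so a change made at position i does not affect position i + 1.
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
            out[i] = rule.to_tag
    return out


def tag(words, lexicon, rules):
    # Tag new text: initial tagging, then the learned rules in order.
    tags = initial_tags(words, lexicon)
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags


def errors(tags, gold):
    # Number of positions where the current tagging disagrees with the truth.
    return sum(t != g for t, g in zip(tags, gold))


def learn(tags, gold, tagset, max_rules=10):
    # Greedy error-driven loop: repeatedly pick the rule that leaves the
    # fewest errors, and stop when no rule improves on the current tagging.
    learned = []
    for _ in range(max_rules):
        candidates = [Rule(a, b, z)
                      for a in tagset for b in tagset if a != b
                      for z in tagset]
        best = min(candidates, key=lambda r: errors(apply_rule(tags, r), gold))
        if errors(apply_rule(tags, best), gold) >= errors(tags, gold):
            break
        tags = apply_rule(tags, best)
        learned.append(best)
    return learned
```

Given a toy training sentence "to race the race" whose initial tagging marks both occurrences of "race" as nouns, this loop would learn one rule that retags a noun as a verb after "to". The learned model is just the ordered rule list, which is why it stays small and readable compared with a table of probabilities.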

References

[Brill 92] E. Brill. 1992. A simple rule-based part of speech tagger. In Proceedings of the Third Conference on Applied Natural Language Processing, Trento, Italy.
[Brill 93] E. Brill. 1993. Automatic grammar induction and parsing free text: a transformation-based approach. In Proceedings of the 31st Meeting of the Association for Computational Linguistics, Columbus, Ohio.
[Brill 93a] E. Brill. 1993. A corpus-based approach to language learning. Ph.D. Dissertation, Department of Computer and Information Science, University of Pennsylvania.
[Charniak et al. 93] E. Charniak, C. Hendrickson, N. Jacobson, and M. Perkowitz. 1993. Equations for part-of-speech tagging. In Proceedings of Conference of the American Association for Artificial Intelligence (AAAI), Washington, D.C.
[Church 88] K. Church. 1988. A stochastic parts program and noun phrase parser for unrestricted text. In Proceedings of the Second Conference on Applied Natural Language Processing, Austin, Texas.
[Cutting et al. 92] D. Cutting, J. Kupiec, J. Pedersen, and P. Sibun. 1992. A practical part-of-speech tagger. In Proceedings of the Third Conference on Applied Natural Language Processing, Trento, Italy.
[DeRose 88] S. DeRose. 1988. Grammatical category disambiguation by statistical optimization. Computational Linguistics, Volume 14.
[DeMarcken 90] C. DeMarcken. 1990. Parsing the LOB corpus. In Proceedings of the 1990 Conference of the Association for Computational Linguistics.
[Harris 62] Z. Harris. 1962. String Analysis of Language Structure. Mouton and Co., The Hague.
[Klein and Simmons 63] S. Klein and R. Simmons. 1963. A computational approach to grammatical coding of English words. JACM, Volume 10.
[Jelinek 85] F. Jelinek. 1985. Markov source modeling of text generation. In Impact of Processing Techniques on Communication. J. Skwirzinski, ed., Dordrecht.
[Kupiec 92] J. Kupiec. 1992. Robust part-of-speech tagging using a hidden Markov model. Computer Speech and Language.
[Marcus et al. 93] M. Marcus, B. Santorini, and M. Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, Volume 19.
[Merialdo 91] B. Merialdo. 1991. Tagging text with a probabilistic model. In IEEE International Conference on Acoustics, Speech and Signal Processing.
[Miller 90] G. Miller. 1990. WordNet: an on-line lexical database. International Journal of Lexicography.
[Su et al. 92] K. Su, M. Wu, and J. Chang. 1992. A new quantitative quality measure for machine translation systems. In Proceedings of COLING-92, Nantes, France.
[Weischedel et al. 93] R. Weischedel, M. Meteer, R. Schwartz, L. Ramshaw, and J. Palmucci. 1993. Coping with ambiguity and unknown words through probabilistic models. Computational Linguistics, Volume 19.

Published In

HLT '94: Proceedings of the workshop on Human Language Technology
March 1994
479 pages
ISBN:1558603573

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

Overall Acceptance Rate 240 of 768 submissions, 31%

Cited By

  • (2019) Urdu part of speech tagging using conditional random fields. Language Resources and Evaluation 53:3 (331-362). DOI: 10.1007/s10579-018-9439-6. Online publication date: 1-Sep-2019
  • (2006) Empirical merging of ontologies. Proceedings of the 3rd European conference on The Semantic Web: research and applications (65-79). DOI: 10.1007/11762256_8. Online publication date: 11-Jun-2006
  • (2004) Retrieving NASA problem reports. Data & Knowledge Engineering 48:2 (231-246). DOI: 10.1016/S0169-023X(03)00106-X. Online publication date: 1-Feb-2004
  • (2001) Activity detection for information access to oral communication. Proceedings of the first international conference on Human language technology research (1-6). DOI: 10.3115/1072133.1072134. Online publication date: 18-Mar-2001
