DOI: 10.3115/1075527.1075553

A simple rule-based part of speech tagger

Published: 23 February 1992

Abstract

Automatic part of speech tagging is an area of natural language processing where statistical techniques have been more successful than rule-based methods. In this paper, we present a simple rule-based part of speech tagger which automatically acquires its rules and tags with accuracy comparable to stochastic taggers. The rule-based tagger has many advantages over these taggers, including: a vast reduction in stored information required, the perspicuity of a small set of meaningful rules, ease of finding and implementing improvements to the tagger, and better portability from one tag set, corpus genre or language to another. Perhaps the biggest contribution of this work is in demonstrating that the stochastic method is not the only viable method for part of speech tagging. The fact that a simple rule-based tagger that automatically learns its rules can perform so well should offer encouragement for researchers to further explore rule-based tagging, searching for a better and more expressive set of rule templates and other variations on the simple but effective theme described below.
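
The abstract does not spell out what a rule looks like, so the following is only a rough sketch of the general idea, assuming each word first receives a single default tag and that learned rules then patch tags based on local context. The rule shape (`PatchRule`), the tiny lexicon, the tag names, and the single rule shown are all hypothetical illustrations, not the paper's actual templates or data.

```python
from typing import NamedTuple


class PatchRule(NamedTuple):
    """Hypothetical rule shape: retag from_tag as to_tag when the
    immediately preceding tag is prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(words, lexicon, default="NN"):
    """Give every word one starting tag from a lookup table (assumed here),
    falling back to a default tag for unknown words."""
    return [lexicon.get(word.lower(), default) for word in words]


def apply_rules(tags, rules):
    """Apply each rule in order, scanning left to right and patching
    any tag whose left neighbour matches the rule's context."""
    tags = list(tags)  # work on a copy so the input is not mutated
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Illustrative data only: a toy lexicon and one hand-written rule.
LEXICON = {"she": "PRP", "wants": "VBZ", "to": "TO", "race": "NN", "the": "DT"}
RULES = [PatchRule(from_tag="NN", to_tag="VB", prev_tag="TO")]

words = "She wants to race".split()
start = initial_tags(words, LEXICON)
final = apply_rules(start, RULES)
print(list(zip(words, start)))
print(list(zip(words, final)))
```

In this toy run, "race" starts out tagged as a noun and is patched to a verb because it follows "to". A rule list of this kind is small and readable, which is the sort of perspicuity the abstract claims for the rule-based approach. How the rules are acquired automatically is not shown here.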

Published In

HLT '91: Proceedings of the workshop on Speech and Natural Language
February 1992
487 pages
ISBN: 1558602720

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

Overall Acceptance Rate 240 of 768 submissions, 31%
