DOI: 10.1007/11562214_16
Article

Linguistically-motivated grammar extraction, generalization and adaptation

Published: 11 October 2005

Abstract

To obtain a grammar with both high precision and high coverage, we proposed a model for measuring grammar coverage and designed a PCFG parser for measuring the efficiency of the grammar. To generalize the grammar, we proposed a binarization method that increases the coverage of a probabilistic context-free grammar (PCFG). At the same time, linguistically motivated feature constraints were added to the grammar rules to maintain precision. The generalized grammar increases grammar coverage from 93% to 99% and bracketing F-score from 87% to 91% when parsing Chinese sentences. To cope with error propagation caused by word segmentation and part-of-speech (POS) tagging errors, we also proposed a grammar blending method that adapts to such errors. The blended grammar reduces about 20-30% of the parsing errors caused by incorrect POS assignments made by a word segmentation system.
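The abstract does not spell out the binarization scheme or the coverage model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple right-binarization in which each intermediate symbol remembers only the parent and the child just consumed, and it assumes coverage is the fraction of test rules that the training grammar can derive. The names `Rule`, `binarize`, and `coverage`, as well as the toy rules, are hypothetical.

```python
from typing import List, Set, Tuple

# A rule is (left-hand side, right-hand side); hypothetical representation.
Rule = Tuple[str, Tuple[str, ...]]


def binarize(rule: Rule) -> List[Rule]:
    """Right-binarize one rule. Each intermediate symbol is named after the
    parent and the child just consumed, so longer histories are forgotten."""
    lhs, rhs = rule
    if len(rhs) <= 2:
        return [rule]
    pieces: List[Rule] = []
    current = lhs
    for child in rhs[:-2]:
        nxt = f"{lhs}|{child}"          # assumed naming scheme
        pieces.append((current, (child, nxt)))
        current = nxt
    pieces.append((current, rhs[-2:]))
    return pieces


def coverage(train: List[Rule], test: List[Rule], binarized: bool) -> float:
    """Fraction of test rules whose pieces all occur in the training grammar."""
    expand = binarize if binarized else (lambda r: [r])
    grammar: Set[Rule] = {p for r in train for p in expand(r)}
    covered = sum(all(p in grammar for p in expand(r)) for r in test)
    return covered / len(test)


# Toy data (invented for illustration only).
train_rules: List[Rule] = [
    ("NP", ("Det", "Adj", "N", "N")),
    ("NP", ("Adj", "Adj", "N")),
]
test_rules: List[Rule] = [
    ("NP", ("Det", "Adj", "Adj", "N")),  # unseen as a whole rule
    ("NP", ("Adj", "Adj", "N")),
]

raw_cov = coverage(train_rules, test_rules, binarized=False)
bin_cov = coverage(train_rules, test_rules, binarized=True)
print(raw_cov, bin_cov)  # 0.5 1.0
```

In this toy setting the unseen four-child rule becomes derivable once both grammars are binarized, because its pieces were each observed in other rules. The same forgetting of context also lets the grammar derive sequences that never occurred, which is consistent with the abstract's point that extra feature constraints are needed to keep precision from dropping.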


Cited By

  • (2006) Predicting prosody from text. Proceedings of the 5th International Conference on Chinese Spoken Language Processing, pp. 179-188. DOI: 10.1007/11939993_22. Online publication date: 13-Dec-2006
      Published In

      IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
      October 2005
      1031 pages
      ISBN:3540291725
      • Editors:
      • Robert Dale,
      • Kam-Fai Wong,
      • Jian Su,
      • Oi Yee Kwong

      Sponsors

      • KAIST: Korea Advanced Institute of Science and Technology
      • ETRI: Electronics and Telecommunications Research Institute
      • Jeju Province Local Government
      • Microsoft Korea
      • KISTI

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Author Tags

      1. ambiguity
      2. grammar coverage
      3. grammar extraction
      4. sentence parsing
