DOI: 10.3115/1073083.1073157

An integrated architecture for shallow and deep processing

Published: 06 July 2002

Abstract

We present an architecture for integrating shallow and deep NLP components, aimed at the flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods and show how information extraction and language checking applications for real-world German text benefit from a deep grammatical analysis.
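
The idea of several components adding annotation layers to one shared structure can be illustrated with a minimal sketch. It is not the paper's implementation: the class names `TextChart` and `Annotation`, the layer names, the labels, the XML element names, and the example sentence are all assumptions made for illustration. The sketch stores each layer as character-offset spans over the unchanged text and serializes everything to XML.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Annotation:
    start: int  # character offset where the span begins
    end: int    # character offset just past the span
    label: str  # illustrative label such as "PERSON" or "NP"


@dataclass
class TextChart:
    """Hypothetical shared structure: raw text plus named annotation layers."""
    text: str
    layers: dict = field(default_factory=dict)

    def add(self, layer: str, start: int, end: int, label: str) -> None:
        # Reject spans that do not lie inside the text.
        if not (0 <= start < end <= len(self.text)):
            raise ValueError("span outside text")
        self.layers.setdefault(layer, []).append(Annotation(start, end, label))

    def to_xml(self) -> str:
        # One <layer> element per component, one <span> per annotation.
        root = ET.Element("chart")
        ET.SubElement(root, "text").text = self.text
        for name, annotations in self.layers.items():
            layer_el = ET.SubElement(root, "layer", name=name)
            for a in sorted(annotations, key=lambda a: (a.start, a.end)):
                ET.SubElement(layer_el, "span", start=str(a.start),
                              end=str(a.end), label=a.label)
        return ET.tostring(root, encoding="unicode")


# Two assumed components (named entity recognizer, chunker) write to one chart.
chart = TextChart("Peter Müller arbeitet in Berlin.")
chart.add("ner", 0, 12, "PERSON")
chart.add("ner", 25, 31, "LOCATION")
chart.add("chunk", 0, 12, "NP")
print(chart.to_xml())
```

Because each layer only refers to offsets in the original text, a later component (for example a deep parser) could read earlier layers without the text itself being modified; whether the paper's text chart works this way is not stated in the abstract.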

Published In

ACL '02: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
July 2002
543 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%

Cited By

  • (2011) Towards open ontology learning and filtering. Information Systems 36:7 (1064-1081). DOI: 10.1016/j.is.2011.03.005. Online publication date: 1-Nov-2011
  • (2010) Creating and exploiting a resource of parallel parses. Proceedings of the Fourth Linguistic Annotation Workshop (166-171). DOI: 10.5555/1868720.1868745. Online publication date: 15-Jul-2010
  • (2010) Towards robust multi-tool tagging. An OWL/DL-based approach. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (659-670). DOI: 10.5555/1858681.1858749. Online publication date: 11-Jul-2010
  • (2009) Towards effective sentence simplification for automatic processing of biomedical text. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers (177-180). DOI: 10.5555/1620853.1620902. Online publication date: 31-May-2009
  • (2008) Hybrid processing for grammar and style checking. Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1 (153-160). DOI: 10.5555/1599081.1599101. Online publication date: 18-Aug-2008
  • (2008) Ontology-based information extraction and integration from heterogeneous data sources. International Journal of Human-Computer Studies 66:11 (759-788). DOI: 10.1016/j.ijhcs.2008.07.007. Online publication date: 1-Nov-2008
  • (2007) Webpage understanding. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (903-912). DOI: 10.1145/1281192.1281288. Online publication date: 12-Aug-2007
  • (2006) UCSG shallow parser. Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing (156-167). DOI: 10.1007/11671299_18. Online publication date: 19-Feb-2006
  • (2005) On the need to bootstrap ontology learning with extraction grammar learning. Proceedings of the 13th international conference on Conceptual Structures: common Semantics for Sharing Knowledge (119-135). DOI: 10.1007/11524564_8. Online publication date: 17-Jul-2005
  • (2003) SDL. Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems - Volume 8 (83-90). DOI: 10.3115/1119226.1119238. Online publication date: 31-May-2003
