Preprocessing for Unification Parsing of Spoken Language

  • Conference paper
Natural Language Processing — NLP 2000 (NLP 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1835)

Abstract

Wordgraphs are structures that may be output by speech recognisers. We discuss various methods for turning wordgraphs into smaller structures. One of these methods is novel; this method relies on a new kind of determinization of acyclic weighted finite automata that is language-preserving but not fully weight-preserving, and results in smaller automata than in the case of traditional determinization of weighted finite automata. We present empirical data comparing the respective methods.

The methods are relevant for systems in which wordgraphs form the input to kinds of syntactic analysis that are very time consuming, such as unification parsing.
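To make the idea of a language-preserving but not weight-preserving reduction concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details the abstract does not give. It assumes a wordgraph is a list of `(source, word, weight, target)` arcs with lower weights being better, applies the ordinary subset construction, and, as a purely illustrative choice, labels each merged arc with the smallest weight among the arcs it replaces. The names `determinize` and `language` are hypothetical.

```python
from collections import defaultdict


def determinize(arcs, start, finals):
    """Subset construction over an acyclic wordgraph (illustrative only).

    arcs: list of (src, word, weight, dst); lower weight = better (assumption).
    Returns (det_arcs, det_start, det_finals); new states are frozensets.
    """
    out = defaultdict(list)
    for src, word, weight, dst in arcs:
        out[src].append((word, weight, dst))

    det_start = frozenset([start])
    det_arcs, det_finals = [], set()
    seen, stack = {det_start}, [det_start]
    while stack:
        subset = stack.pop()
        if subset & finals:
            det_finals.add(subset)
        # Group outgoing arcs of all member states by word.
        by_word = defaultdict(lambda: (float("inf"), set()))
        for q in subset:
            for word, weight, dst in out[q]:
                best, targets = by_word[word]
                targets.add(dst)
                # Keeping only the minimum weight loses information:
                # this is where weight preservation is given up.
                by_word[word] = (min(best, weight), targets)
        for word, (best, targets) in sorted(by_word.items()):
            target = frozenset(targets)
            det_arcs.append((subset, word, best, target))
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return det_arcs, det_start, det_finals


def language(arcs, start, finals):
    """Enumerate all accepted word sequences (finite: the graph is acyclic)."""
    out = defaultdict(list)
    for src, word, _weight, dst in arcs:
        out[src].append((word, dst))
    result = set()

    def walk(state, prefix):
        if state in finals:
            result.add(prefix)
        for word, dst in out[state]:
            walk(dst, prefix + (word,))

    walk(start, ())
    return result
```

In a wordgraph where two arcs labelled with the same word leave the same state, the sketch merges them into one arc. The set of accepted word sequences is unchanged, but a path that originally used the heavier arc now receives a lower total weight, so the result is smaller at the cost of exact weights.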

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nederhof, MJ. (2000). Preprocessing for Unification Parsing of Spoken Language. In: Christodoulakis, D.N. (eds) Natural Language Processing — NLP 2000. NLP 2000. Lecture Notes in Computer Science, vol 1835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45154-4_11

  • DOI: https://doi.org/10.1007/3-540-45154-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67605-8

  • Online ISBN: 978-3-540-45154-9

  • eBook Packages: Springer Book Archive
