Large Vocabulary Search Space Reduction Employing Directed Acyclic Word Graphs and Phonological Rules

Georgila, Kallirroi; Fakotakis, Nikos; Kokkinakis, George

doi:10.1023/A:1020965126094

Published: November 2002

Volume 5, pages 355–370, (2002)

International Journal of Speech Technology

Kallirroi Georgila¹,
Nikos Fakotakis¹ &
George Kokkinakis¹

Abstract

Some applications of speech recognition, such as automatic directory information services, require very large vocabularies. In this paper, we focus on the task of recognizing surnames in an Interactive telephone-based Directory Assistance Services (IDAS) system, which exceeds other large vocabulary applications in complexity and vocabulary size. We present a method for building compact networks that reduce the search space in very large vocabularies using Directed Acyclic Word Graphs (DAWGs). Furthermore, trees, graphs and full-forms (whole words with no merging of nodes) are compared directly under the same conditions, using the same decoder and the same vocabularies. Experimental results showed that, as we move from full-form lexicons to trees and then to graphs, both the size of the recognition network and the recognition time are reduced, while recognition accuracy is retained because the same phoneme combinations are involved. Subsequently, we refine the N-best hypothesis list provided by the speech recognizer by applying context-dependent phonological rules. Thus, a small N produces multiple solutions sufficient to retain high accuracy while achieving real-time response. Recognition tests with a vocabulary of 88,000 surnames that correspond to 123,313 distinct pronunciations demonstrated the efficiency of the approach. For N = 3 (a value that ensures fast performance), recognition accuracy was 70.27% before the application of rules and rose to 86.75% after applying phonological rules.
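The progression from full-form lexicons to trees and then to graphs can be illustrated with a minimal Python sketch. It is not the construction algorithm used in the paper: it builds a prefix tree and then merges states that accept the same set of suffixes, which is one simple way to obtain a DAWG. The four toy pronunciations, the use of single letters as phoneme symbols, and all function names are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pron = Tuple[str, ...]  # a pronunciation as a sequence of phoneme symbols


class Node:
    """One state of a lexicon network; edges are labelled with phonemes."""

    def __init__(self) -> None:
        self.edges: Dict[str, "Node"] = {}
        self.final = False  # True if a pronunciation ends at this state


def full_form_size(prons: Iterable[Pron]) -> int:
    # Full-form lexicon: a shared start node plus one private chain per
    # pronunciation, with no merging of nodes at all.
    return 1 + sum(len(p) for p in prons)


def build_trie(prons: Iterable[Pron]) -> Node:
    # Tree lexicon: pronunciations share nodes along common prefixes.
    root = Node()
    for pron in prons:
        node = root
        for ph in pron:
            node = node.edges.setdefault(ph, Node())
        node.final = True
    return root


def minimize(root: Node) -> Node:
    # Graph lexicon (DAWG): bottom-up, replace every state by a canonical
    # representative with the same finality and the same outgoing edges,
    # so that common suffixes are shared as well. Mutates the trie in place.
    registry: Dict[tuple, Node] = {}

    def canon(node: Node) -> Node:
        for ph in list(node.edges):
            node.edges[ph] = canon(node.edges[ph])
        key = (node.final,
               tuple(sorted((ph, id(ch)) for ph, ch in node.edges.items())))
        return registry.setdefault(key, node)

    return canon(root)


def count_nodes(root: Node) -> int:
    # Count distinct states reachable from the root.
    seen: Set[int] = set()
    stack: List[Node] = [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


def language(root: Node) -> Set[Pron]:
    # Enumerate every pronunciation the network accepts.
    out: Set[Pron] = set()
    stack: List[Tuple[Node, Pron]] = [(root, ())]
    while stack:
        node, prefix = stack.pop()
        if node.final:
            out.add(prefix)
        for ph, child in node.edges.items():
            stack.append((child, prefix + (ph,)))
    return out


if __name__ == "__main__":
    # Hypothetical toy lexicon, not taken from the paper's vocabulary.
    toy = [tuple("papas"), tuple("papados"), tuple("kapas"), tuple("kados")]
    trie = build_trie(toy)
    n_full, n_tree = full_form_size(toy), count_nodes(trie)
    dawg = minimize(trie)  # count the tree first: minimize() rewires it
    print(n_full, n_tree, count_nodes(dawg), language(dawg) == set(toy))
```

On this toy lexicon the network shrinks from 23 nodes (full-form) to 17 (tree) to 11 (graph), while the set of accepted pronunciations stays the same. That unchanged set is the property the abstract points to when it says accuracy is retained.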

Author information

Authors and Affiliations

Wire Communications Laboratory, Electrical and Computer Engineering Department, University of Patras, Greece
Kallirroi Georgila, Nikos Fakotakis & George Kokkinakis

About this article

Cite this article

Georgila, K., Fakotakis, N. & Kokkinakis, G. Large Vocabulary Search Space Reduction Employing Directed Acyclic Word Graphs and Phonological Rules. International Journal of Speech Technology 5, 355–370 (2002). https://doi.org/10.1023/A:1020965126094

Issue Date: November 2002
DOI: https://doi.org/10.1023/A:1020965126094
