Self-Organizing Feature Maps for HMM Based Lip-Reading

  • Conference paper
Knowledge-Based Intelligent Information and Engineering Systems (KES 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2774)

Abstract

Audio-visual dialogue is an appealing tool for natural interaction with computers, and lip-reading is an important part of it. This paper proposes using a self-organizing feature map (SOM) and a hierarchical SOM, the hypercolumn model (HCM), as modules for constructing the phoneme feature space of an HMM-based lip-reading system. These SOMs alleviate many of the difficulties associated with feature space construction. On-line systems, however, must reduce the feature extraction time to the range of normal video camera frame rates. To achieve this, a randomization technique is introduced. Experimental results show the performance of the SOMs on Japanese lip-reading.
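
As a rough illustration of the general idea, the sketch below trains a small, generic Kohonen SOM on feature vectors and then maps each frame to the index of its best-matching unit, which is one plausible way to obtain discrete observation symbols for an HMM. It is not the authors' implementation: it does not attempt to reproduce the hypercolumn model or the randomization technique, and the names `train_som`, `best_matching_unit`, and `quantize`, the grid size, and the decay schedules are all assumptions made for this example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position (row, col) whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda rc: sum((wi - xi) ** 2 for wi, xi in zip(weights[rc], x)),
    )


def train_som(samples, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a generic 2-D SOM; returns a dict mapping (row, col) -> weight vector.

    Hypothetical helper for illustration only (assumed linear decay schedules).
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Random initial weights in [0, 1); assumes inputs are scaled similarly.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull each node toward x, weighted by grid distance to the winner.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantize(weights, frames, cols):
    """Map each frame vector to a single integer symbol (row * cols + col)."""
    symbols = []
    for x in frames:
        r, c = best_matching_unit(weights, x)
        symbols.append(r * cols + c)
    return symbols
```

In such a setup, each video frame's feature vector would be passed through `quantize`, and the resulting symbol sequence could serve as the observation sequence for a discrete HMM. Whether the paper uses the map output in exactly this way is not stated in the abstract.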


Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsuruta, N., Iuchi, H., El Sagheer, A., El Tobely, T. (2003). Self-Organizing Feature Maps for HMM Based Lip-Reading. In: Palade, V., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2003. Lecture Notes in Computer Science, vol 2774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45226-3_23

  • DOI: https://doi.org/10.1007/978-3-540-45226-3_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40804-8

  • Online ISBN: 978-3-540-45226-3

  • eBook Packages: Springer Book Archive
