Self-Organizing Feature Maps for HMM Based Lip-Reading

Tsuruta, Naoyuki; Iuchi, Hirotaka; El Sagheer, Alaa; El Tobely, Tarek

doi:10.1007/978-3-540-45226-3_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2774)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

Audio-visual dialogue is an appealing tool for natural interaction with computers, and lip-reading is an important part of it. This paper proposes using a self-organizing feature map (SOM) and a hierarchical SOM, the Hypercolumn model (HCM), as the module that constructs the phoneme feature space for an HMM-based lip-reading system. These SOMs alleviate many of the difficulties associated with feature space construction. On-line systems, however, must reduce the feature extraction time to within normal video camera frame rates. To achieve this, a randomization technique is introduced. Experimental results show the performance of the SOMs for Japanese lip-reading.
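The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It trains a small standard Kohonen SOM on flattened mouth-region frames and uses the index of the winning unit for each frame as a discrete observation symbol that an HMM could consume. The optional pixel subsampling in `encode` is one possible reading of a "randomization technique" and is an assumption, as are all function names, parameters, and the synthetic "closed"/"opened" toy frames. The hierarchical HCM is not modelled here.

```python
import math
import random


def best_unit(weights, x, dims=None):
    """Return the index of the unit closest to x (squared Euclidean distance).

    If `dims` is given, only those vector components are compared.
    """
    idx = range(len(x)) if dims is None else dims
    best, best_d = 0, float("inf")
    for u, w in enumerate(weights):
        d = sum((x[i] - w[i]) ** 2 for i in idx)
        if d < best_d:
            best, best_d = u, d
    return best


def train_som(frames, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols SOM on frames given as equal-length lists of floats."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Initialise every unit from a randomly chosen training frame.
    weights = [list(rng.choice(frames)) for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(frames)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(frames, len(frames)):  # shuffled pass over the data
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            wr, wc = divmod(best_unit(weights, x), cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                d2 = (ur - wr) ** 2 + (uc - wc) ** 2
                h = lr * math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += h * (x[i] - w[i])
            step += 1
    return weights


def encode(frames, weights, n_sample=None, seed=0):
    """Map frames to winning-unit indices (a discrete observation sequence).

    ASSUMPTION: when n_sample is set, the winner search uses a fixed random
    subset of pixels to cut the per-frame cost. This is an illustrative guess,
    not the randomization technique described in the paper.
    """
    dims = None
    if n_sample is not None:
        dims = random.Random(seed).sample(range(len(frames[0])), n_sample)
    return [best_unit(weights, f, dims) for f in frames]


# Synthetic 16-pixel toy frames standing in for two mouth shapes.
rng = random.Random(1)
closed = [[0.1 + rng.uniform(-0.05, 0.05) for _ in range(16)] for _ in range(20)]
opened = [[0.9 + rng.uniform(-0.05, 0.05) for _ in range(16)] for _ in range(20)]

weights = train_som(closed + opened, rows=2, cols=2)
full = encode(closed[:3] + opened[:3], weights)              # full-vector search
fast = encode(closed[:3] + opened[:3], weights, n_sample=4)  # subsampled search
print(full, fast)
```

In a complete system, symbol sequences like `full` would be passed to a discrete HMM toolkit for training and decoding; that stage is omitted here.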

Author information

Authors and Affiliations

Department of Electronics Engineering and Computer Science, Fukuoka University, 8-19-1, Nanakuma, Jonan, Fukuoka, 814-0180, Japan
Naoyuki Tsuruta & Hirotaka Iuchi
Department of Information Science and Electrical Engineering, Kyushu University, 6-1, Kasuga-Koen, Kasuga, Fukuoka, 816-8580, Japan
Alaa El Sagheer & Tarek El Tobely

Editor information

Editors and Affiliations

Computing Laboratory, Oxford University, Parks Road, OX1 3QD, Oxford, United Kingdom
Vasile Palade
Centre for SMART Systems, School of Environment and Technology, University of Brighton, BN2 4GJ, Brighton, UK
Robert J. Howlett
Knowledge-Based Intelligent Engineering Systems Centre, University of South Australia, Mawson Lakes, SA 5095, Adelaide, Australia
Lakhmi Jain

About this paper

Cite this paper

Tsuruta, N., Iuchi, H., El Sagheer, A., El Tobely, T. (2003). Self-Organizing Feature Maps for HMM Based Lip-Reading. In: Palade, V., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2003. Lecture Notes in Computer Science, vol 2774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45226-3_23

DOI: https://doi.org/10.1007/978-3-540-45226-3_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40804-8
Online ISBN: 978-3-540-45226-3
eBook Packages: Springer Book Archive
