Silent speech interfaces

Published: 01 April 2010

Abstract

The possibility of speech processing in the absence of an intelligible acoustic signal has given rise to the idea of a 'silent speech' interface, to be used as an aid for the speech-handicapped, or as part of a communications system operating in silence-required or high-background-noise environments. The article first outlines the emergence of the silent speech interface from the fields of speech production, automatic speech processing, speech pathology research, and telecommunications privacy issues, and then follows with a presentation of demonstrator systems based on seven different types of technologies. A concluding section underlining some of the common challenges faced by silent speech interface researchers, and ideas for possible future directions, is also provided.
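
To make the central idea concrete, below is a minimal, self-contained sketch (not taken from the article) of the kind of processing a silent speech interface performs: mapping frames of a non-acoustic sensor signal to discrete speech units. Here synthetic multi-channel noise stands in for surface-EMG recordings, per-channel RMS energy stands in for real feature extraction, and a nearest-centroid rule stands in for a real recognizer; the sampling rate, channel count, and word patterns are all illustrative assumptions, and the systems surveyed in the article use far richer features and statistical models.

```python
# Illustrative sketch of silent-speech recognition from a non-acoustic signal.
# All signals here are synthetic stand-ins; nothing below comes from the article.
import numpy as np

RATE = 1000      # assumed sensor sampling rate, Hz
FRAME = 100      # frame length in samples (100 ms)
CHANNELS = 4     # assumed number of surface-EMG channels

def rms_features(signal):
    """Split a (samples, channels) signal into frames; return per-channel RMS."""
    n_frames = signal.shape[0] // FRAME
    frames = signal[:n_frames * FRAME].reshape(n_frames, FRAME, -1)
    return np.sqrt((frames ** 2).mean(axis=1))   # shape: (n_frames, channels)

rng = np.random.default_rng(0)

# Synthetic "training data": two silently articulated words, each producing a
# characteristic activation level on each electrode channel (pure assumption).
patterns = {"yes": np.array([1.0, 0.2, 0.8, 0.1]),
            "no":  np.array([0.1, 0.9, 0.3, 1.0])}
centroids = {}
for word, p in patterns.items():
    sig = rng.normal(0.0, p, size=(RATE, CHANNELS))   # 1 s of word-like signal
    centroids[word] = rms_features(sig).mean(axis=0)  # mean feature vector

# "Recognition": classify an unseen utterance by its nearest centroid.
test = rng.normal(0.0, patterns["yes"], size=(RATE, CHANNELS))
feat = rms_features(test).mean(axis=0)
best = min(centroids, key=lambda w: np.linalg.norm(feat - centroids[w]))
print("recognized:", best)   # expected: "yes"
```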




Published In

Speech Communication, Volume 52, Issue 4, April 2010, 113 pages

Publisher

Elsevier Science Publishers B.V., Netherlands


Author Tags

  1. Cellular telephones
  2. Silent speech
  3. Speech pathologies
  4. Speech recognition
  5. Speech synthesis



Cited By

  • (2024) Continuous lipreading based on acoustic temporal alignments. EURASIP Journal on Audio, Speech, and Music Processing, 2024:1. DOI: 10.1186/s13636-024-00345-7. Online publication date: 6-May-2024.
  • (2024) Micro-Gesture Recognition of Tongue via Bone Conduction Sound. Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, pp. 1-3. DOI: 10.1145/3672539.3686336. Online publication date: 13-Oct-2024.
  • (2024) Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 6559-6568. DOI: 10.1145/3664647.3680770. Online publication date: 28-Oct-2024.
  • (2024) Lipwatch: Enabling Silent Speech Recognition on Smartwatches using Acoustic Sensing. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 8:2, pp. 1-29. DOI: 10.1145/3659614. Online publication date: 15-May-2024.
  • (2024) WhisperMask: a noise suppressive mask-type microphone for whisper speech. Proceedings of the Augmented Humans International Conference 2024, pp. 1-14. DOI: 10.1145/3652920.3652925. Online publication date: 4-Apr-2024.
  • (2024) GlassMail: Towards Personalised Wearable Assistant for On-the-Go Email Creation on Smart Glasses. Proceedings of the 2024 ACM Designing Interactive Systems Conference, pp. 372-390. DOI: 10.1145/3643834.3660683. Online publication date: 1-Jul-2024.
  • (2024) ReactGenie: A Development Framework for Complex Multimodal Interactions Using Large Language Models. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-23. DOI: 10.1145/3613904.3642517. Online publication date: 11-May-2024.
  • (2024) ReHEarSSE: Recognizing Hidden-in-the-Ear Silently Spelled Expressions. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-16. DOI: 10.1145/3613904.3642095. Online publication date: 11-May-2024.
  • (2024) Human-inspired computational models for European Portuguese: a review. Language Resources and Evaluation, 58:1, pp. 43-72. DOI: 10.1007/s10579-023-09648-1. Online publication date: 1-Mar-2024.
  • (2023) Multi-stage Multi-modalities Fusion of Lip, Tongue and Acoustics Information for Speech Recognition. Proceedings of the 2023 6th Artificial Intelligence and Cloud Computing Conference, pp. 226-231. DOI: 10.1145/3639592.3639623. Online publication date: 16-Dec-2023.
