DOI: 10.1145/3544548.3581465
Research Article

LipLearner: Customizable Silent Speech Interactions on Mobile Devices

Published: 19 April 2023

Abstract

A silent speech interface is a promising technology that enables private communication in natural language. However, previous approaches support only a small and inflexible vocabulary, which limits expressiveness. We leverage contrastive learning to learn efficient lipreading representations, enabling few-shot command customization with minimal user effort. Our model exhibits high robustness to different lighting, posture, and gesture conditions on an in-the-wild dataset. For 25-command classification, an F1-score of 0.8947 is achievable using only one shot, and performance can be further boosted by adaptively learning from more data. This generalizability allowed us to develop a mobile silent speech interface empowered with on-device fine-tuning and visual keyword spotting. A user study demonstrated that with LipLearner, users could define their own commands with high reliability, guaranteed by an online incremental learning scheme. Subjective feedback indicated that our system provides essential functionality for customizable silent speech interactions with high usability and learnability.
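The abstract only names the techniques. As an illustration of how few-shot command customization on top of a contrastively trained lipreading encoder can work, the sketch below classifies an utterance by cosine similarity against per-command prototype embeddings built from a handful of enrollment clips. This is a minimal sketch under stated assumptions: the `embed` function, embedding size, and rejection threshold are placeholders, not the paper's actual model or training code.

```python
import numpy as np

EMB_DIM = 256  # assumed embedding size, not taken from the paper


def embed(clip: np.ndarray) -> np.ndarray:
    """Stand-in for a contrastively trained lipreading encoder.

    A real encoder would map a lip-movement clip to a fixed-length
    embedding; here we only produce an L2-normalized vector of the
    right shape so the few-shot logic below is runnable.
    """
    v = np.resize(clip.astype(np.float32).ravel(), EMB_DIM)
    return v / (np.linalg.norm(v) + 1e-8)


class FewShotCommandClassifier:
    """Nearest-prototype classifier over encoder embeddings.

    Each user-defined command is represented by the mean embedding of
    its enrollment samples ("shots"); prediction picks the prototype
    with the highest cosine similarity.
    """

    def __init__(self) -> None:
        self.samples: dict[str, list[np.ndarray]] = {}

    def enroll(self, command: str, clip: np.ndarray) -> None:
        # Incremental learning: every new sample refines the prototype.
        self.samples.setdefault(command, []).append(embed(clip))

    def predict(self, clip: np.ndarray, threshold: float = 0.5):
        """Return (command, score), or (None, score) when the best
        similarity falls below a rejection threshold (a crude stand-in
        for keyword spotting / open-set rejection)."""
        q = embed(clip)
        best_cmd, best_score = None, -1.0
        for cmd, embs in self.samples.items():
            proto = np.mean(embs, axis=0)
            proto = proto / (np.linalg.norm(proto) + 1e-8)
            score = float(np.dot(q, proto))  # cosine similarity of unit vectors
            if score > best_score:
                best_cmd, best_score = cmd, score
        return (best_cmd, best_score) if best_score >= threshold else (None, best_score)


# One-shot enrollment of two hypothetical commands, then prediction.
clf = FewShotCommandClassifier()
clf.enroll("open camera", np.random.rand(30, 64, 64))  # one clip = one shot
clf.enroll("play music", np.random.rand(30, 64, 64))
print(clf.predict(np.random.rand(30, 64, 64)))
```

With a single enrollment clip per command this reduces to one-shot matching; adding more clips tightens each prototype, which is one simple way to realize the abstract's claim that accuracy improves as the system adaptively learns from more data.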

Supplementary Material

Supplemental Materials (3544548.3581465-supplemental-materials.zip)
MP4 File (3544548.3581465-video-figure.mp4)
Video Figure
MP4 File (3544548.3581465-video-preview.mp4)
Video Preview
MP4 File (3544548.3581465-talk-video.mp4)
Pre-recorded Video Presentation



      Published In

      CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
      April 2023
      14911 pages
      ISBN:9781450394215
      DOI:10.1145/3544548

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 19 April 2023

      Badges

      • Best Paper

      Author Tags

      1. Customization
      2. Few-shot Learning
      3. Lipreading
      4. Silent Speech Interface

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Funding Sources

      • JST CREST
      • JST Moonshot R&D

      Conference

      CHI '23

      Acceptance Rates

      Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

      Article Metrics

• Downloads (last 12 months): 1,218
• Downloads (last 6 weeks): 195
Reflects downloads up to 06 Oct 2024

Cited By
• StethoSpeech: Speech Generation Through a Clinical Stethoscope Attached to the Skin. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 8, 3 (2024), 1–21. https://doi.org/10.1145/3678515
• DisMouse: Disentangling Information from Mouse Movement Data. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (2024), 1–13. https://doi.org/10.1145/3654777.3676411
• WhisperMask: A Noise Suppressive Mask-Type Microphone for Whisper Speech. In Proceedings of the Augmented Humans International Conference 2024, 1–14. https://doi.org/10.1145/3652920.3652925
• Enabling Hands-Free Voice Assistant Activation on Earphones. In Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services (2024), 155–168. https://doi.org/10.1145/3643832.3661890
• TouchEditor. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 7, 4 (2024), 1–29. https://doi.org/10.1145/3631454
• MELDER: The Design and Evaluation of a Real-time Silent Speech Recognizer for Mobile Devices. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–23. https://doi.org/10.1145/3613904.3642348
• Mouse2Vec: Learning Reusable Semantic Representations of Mouse Behaviour. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–17. https://doi.org/10.1145/3613904.3642141
• ReHEarSSE: Recognizing Hidden-in-the-Ear Silently Spelled Expressions. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–16. https://doi.org/10.1145/3613904.3642095
• Watch Your Mouth: Silent Speech Recognition with Depth Sensing. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–15. https://doi.org/10.1145/3613904.3642092
• KuchiNavi: Lip-Reading-Based Navigation App. In Fifteenth International Conference on Graphics and Image Processing (ICGIP 2023), 2024. https://doi.org/10.1117/12.3021118
