DOI: 10.1145/3379337.3415816
research-article
Open access

Decoding Surface Touch Typing from Hand-Tracking

Published: 20 October 2020

Abstract

We propose a novel text decoding method that enables touch typing on an uninstrumented flat surface. Rather than relying on a physical keyboard or capacitive touch sensing, our method takes as input the typist's hand motion, obtained through hand-tracking, and decodes this motion directly into text. We use a temporal convolutional network as a motion model that maps hand motion, represented as a sequence of hand pose features, to text characters. Without the haptic feedback of a physical keyboard, the fingers drift and typing motion becomes more erratic. To address this, we incorporate a language model as a text prior and use beam search to efficiently combine the motion and language models when decoding text from erratic or ambiguous hand motion. We collected a dataset from 20 touch typists and evaluated our model against several baselines, including contact-based text decoding and typing on a physical keyboard. Our method leverages continuous hand pose information to decode text more accurately than contact-based methods, and an offline study shows parity with typing on a physical keyboard (73 WPM, 2.38% UER). Our results show that hand-tracking has the potential to enable rapid text entry in mobile environments.
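To illustrate how a beam search can combine a motion model with a language-model prior, the following is a minimal sketch and not the authors' implementation. It assumes the motion model has already produced one character distribution per keystroke step, and it stands in a toy bigram table for the language model. The names `beam_search`, `toy_lm_logprob`, `TOY_BIGRAMS`, and the parameters `beam_width` and `lm_weight` are hypothetical, as are all probabilities.

```python
import math

# Hypothetical toy bigram prior: P(next_char | previous_char).
# "" as the previous character marks the start of the text.
TOY_BIGRAMS = {
    ("", "t"): 0.6, ("", "r"): 0.4,
    ("t", "h"): 0.9, ("t", "j"): 0.05,
    ("h", "e"): 0.9, ("h", "w"): 0.05,
}

def toy_lm_logprob(prefix, ch):
    """Log-probability of ch given the last character of prefix (assumed bigram LM)."""
    prev = prefix[-1:]  # empty string when prefix is empty
    return math.log(TOY_BIGRAMS.get((prev, ch), 0.01))  # small floor for unseen pairs

def beam_search(motion_steps, lm_logprob, beam_width=3, lm_weight=1.0):
    """Decode text from per-step motion probabilities and a language-model prior.

    motion_steps: list of dicts mapping char -> P(char | hand motion at that step).
    lm_logprob:   function (prefix, char) -> log P(char | prefix).
    Each hypothesis is scored by the sum of motion log-probabilities plus
    lm_weight times the language-model log-probabilities.
    """
    beams = [("", 0.0)]  # (decoded prefix, cumulative log score)
    for step in motion_steps:
        candidates = []
        for prefix, score in beams:
            for ch, p_motion in step.items():
                total = score + math.log(p_motion) + lm_weight * lm_logprob(prefix, ch)
                candidates.append((prefix + ch, total))
        # Keep only the highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

# Ambiguous motion: the second keystroke slightly favours "j" over "h",
# as might happen when a finger drifts on a flat surface.
steps = [
    {"t": 0.6, "r": 0.4},
    {"h": 0.45, "j": 0.55},
    {"e": 0.5, "w": 0.5},
]
decoded = beam_search(steps, toy_lm_logprob)
print(decoded)
```

In this toy input the motion evidence alone prefers "j" at the second step, while the prior steers the search toward the more plausible character sequence. Setting `lm_weight=0.0` disables the prior, so the result follows the motion evidence only.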

Supplementary Material

  • VTT File (ufp1807pv.vtt)
  • VTT File (ufp1807vf.vtt)
  • VTT File (3379337.3415816.vtt)
  • SRT File (ufp1807pvc.srt): Preview video captions
  • SRT File (ufp1807vfc.srt): Video figure captions
  • MP4 File (ufp1807pv.mp4): Preview video
  • MP4 File (ufp1807vf.mp4): Video figure
  • MP4 File (3379337.3415816.mp4): Presentation video


Published In

UIST '20: Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology
October 2020
1297 pages
ISBN:9781450375146
DOI:10.1145/3379337
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. augmented reality
  2. hand-tracking
  3. text input
  4. virtual reality

Qualifiers

  • Research-article

Conference

UIST '20

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%

Cited By

  • (2024) MagDesk: Interactive Tabletop Workspace Based on Passive Magnetic Tracking. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:4 (1-31). DOI: 10.1145/3699756. Online publication date: 21-Nov-2024
  • (2024) StegoType: Surface Typing from Egocentric Cameras. Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-14). DOI: 10.1145/3672539.3686762. Online publication date: 13-Oct-2024
  • (2024) StegoType: Surface Typing from Egocentric Cameras. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-14). DOI: 10.1145/3654777.3676343. Online publication date: 13-Oct-2024
  • (2024) TouchInsight: Uncertainty-aware Rapid Touch and Text Input for Mixed Reality from Egocentric Vision. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-16). DOI: 10.1145/3654777.3676330. Online publication date: 13-Oct-2024
  • (2024) STMG: A Machine Learning Microgesture Recognition System for Supporting Thumb-Based VR/AR Input. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-15). DOI: 10.1145/3613904.3642702. Online publication date: 11-May-2024
  • (2024) Kine-Appendage: Enhancing Freehand VR Interaction Through Transformations of Virtual Appendages. IEEE Transactions on Visualization and Computer Graphics 30:7 (3298-3313). DOI: 10.1109/TVCG.2022.3230746. Online publication date: Jul-2024
  • (2023) From 2D to 3D. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:1 (1-25). DOI: 10.1145/3580829. Online publication date: 28-Mar-2023
  • (2023) CAFI-AR. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6:4 (1-23). DOI: 10.1145/3569499. Online publication date: 11-Jan-2023
  • (2023) Ultrasonic Keyboard: A Mid-Air Virtual Qwerty with Ultrasonic Feedback for Virtual Reality. Proceedings of the Seventeenth International Conference on Tangible, Embedded, and Embodied Interaction (1-8). DOI: 10.1145/3569009.3573117. Online publication date: 26-Feb-2023
  • (2023) ResType: Invisible and Adaptive Tablet Keyboard Leveraging Resting Fingers. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-14). DOI: 10.1145/3544548.3581055. Online publication date: 19-Apr-2023
