research-article

FingerTrak: Continuous 3D Hand Pose Tracking by Deep Learning Hand Silhouettes Captured by Miniature Thermal Cameras on Wrist

Published: 15 June 2020

Abstract

In this paper, we present FingerTrak, a minimally obtrusive wristband that enables continuous 3D finger tracking and hand pose estimation using four miniature thermal cameras mounted closely on a form-fitting wristband. FingerTrak explores the feasibility of continuously reconstructing the entire hand posture (20 finger joint positions) without needing to see all fingers. We demonstrate that our system can estimate the entire hand posture by observing only the outline of the hand, i.e., hand silhouettes captured from the wrist with low-resolution (32 × 24) thermal cameras. A customized deep neural network learns to "stitch" these multi-view images and estimate the 20 joint positions in 3D space. Our user study with 11 participants shows that the system achieves an average angular error of 6.46° when tested under the same background and 8.06° when tested under a different background. FingerTrak also shows encouraging results after re-mounting of the device and has the potential to reconstruct some complicated poses. We conclude this paper with further discussion of the opportunities and challenges of this technology.
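
The abstract reports accuracy as an average angular error over the estimated hand posture but does not define the metric. As a rough illustration only, the sketch below shows one plausible way to compute such a number from 20 predicted and 20 ground-truth 3D joint positions: it compares the direction of each finger bone in the two poses and averages the angles. The joint layout (5 fingers × 4 joints, stored finger by finger) and the function names `bone_vectors` and `mean_angular_error_deg` are assumptions made for this example, not details taken from the paper.

```python
import math

# Assumed layout (hypothetical, not from the paper): 5 fingers x 4 joints,
# stored finger by finger, each joint an (x, y, z) tuple.
NUM_FINGERS = 5
JOINTS_PER_FINGER = 4
NUM_JOINTS = NUM_FINGERS * JOINTS_PER_FINGER  # 20


def angle_deg(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-length bone vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


def bone_vectors(joints):
    """Vectors between consecutive joints of each finger (15 bones total)."""
    if len(joints) != NUM_JOINTS:
        raise ValueError("expected 20 joints")
    bones = []
    for finger in range(NUM_FINGERS):
        base = finger * JOINTS_PER_FINGER
        for k in range(JOINTS_PER_FINGER - 1):
            parent = joints[base + k]
            child = joints[base + k + 1]
            bones.append(tuple(c - p for p, c in zip(parent, child)))
    return bones


def mean_angular_error_deg(predicted, ground_truth):
    """Average angle between corresponding predicted and true bones."""
    pairs = zip(bone_vectors(predicted), bone_vectors(ground_truth))
    angles = [angle_deg(u, v) for u, v in pairs]
    return sum(angles) / len(angles)


# Toy poses: fingers pointing along x versus fingers pointing along y.
straight_x = [(float(k), float(f), 0.0)
              for f in range(NUM_FINGERS) for k in range(JOINTS_PER_FINGER)]
straight_y = [(float(f), float(k), 0.0)
              for f in range(NUM_FINGERS) for k in range(JOINTS_PER_FINGER)]
```

With these toy poses, comparing `straight_x` against itself gives an error of zero, and comparing it against `straight_y` gives a right angle for every bone. The paper's own metric may be defined over joint angles rather than bone directions.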

Supplementary Material

hu (hu.zip)
Supplemental movie, appendix, image, and software files for FingerTrak: Continuous 3D Hand Pose Tracking by Deep Learning Hand Silhouettes Captured by Miniature Thermal Cameras on Wrist




    Published In

    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies  Volume 4, Issue 2
    June 2020
    771 pages
    EISSN:2474-9567
    DOI:10.1145/3406789
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 June 2020
    Published in IMWUT Volume 4, Issue 2


    Author Tags

    1. hand reconstruction
    2. pose recognition

    Qualifiers

    • Research-article
    • Research
    • Refereed


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 292
    • Downloads (Last 6 weeks): 27
    Reflects downloads up to 10 Nov 2024

    Citations

    Cited By

    • (2024) Advanced systems and technologies for the enhancement of user experience in cultural spaces: an overview. Heritage Science 12:1. DOI: 10.1186/s40494-024-01186-5. Online publication date: 28-Feb-2024
    • (2024) GestureGPT: Toward Zero-Shot Free-Form Hand Gesture Understanding with Large Language Model Agents. Proceedings of the ACM on Human-Computer Interaction 8:ISS (462-499). DOI: 10.1145/3698145. Online publication date: 24-Oct-2024
    • (2024) Towards Smartphone-based 3D Hand Pose Reconstruction Using Acoustic Signals. ACM Transactions on Sensor Networks 20:5 (1-32). DOI: 10.1145/3677122. Online publication date: 26-Aug-2024
    • (2024) SoundScroll: Robust Finger Slide Detection Using Friction Sound and Wrist-Worn Microphones. Proceedings of the 2024 ACM International Symposium on Wearable Computers (63-70). DOI: 10.1145/3675095.3676614. Online publication date: 5-Oct-2024
    • (2024) Demo of EITPose: Wearable and Practical Electrical Impedance Tomography for Continuous Hand Pose Estimation. Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-3). DOI: 10.1145/3672539.3686770. Online publication date: 13-Oct-2024
    • (2024) Demonstrating Z-Band: Enabling Subtle Hand Interactions with Bio-impedance Sensing on the Wrist. Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-2). DOI: 10.1145/3672539.3686766. Online publication date: 13-Oct-2024
    • (2024) EchoWrist: Continuous Hand Pose Tracking and Hand-Object Interaction Recognition Using Low-Power Active Acoustic Sensing On a Wristband. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-21). DOI: 10.1145/3613904.3642910. Online publication date: 11-May-2024
    • (2024) EITPose: Wearable and Practical Electrical Impedance Tomography for Continuous Hand Pose Estimation. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-10). DOI: 10.1145/3613904.3642663. Online publication date: 11-May-2024
    • (2024) Finger-Tapping Motion Recognition Based on Skin Surface Deformation Using Wrist-Mounted Piezoelectric Film Sensors. IEEE Sensors Journal 24:11 (17876-17884). DOI: 10.1109/JSEN.2024.3386333. Online publication date: 1-Jun-2024
    • (2024) Beyond-Voice: Towards Continuous 3D Hand Pose Tracking on Commercial Home Assistant Devices. 2024 23rd ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN) (151-162). DOI: 10.1109/IPSN61024.2024.00017. Online publication date: 13-May-2024
