Attention Based Dual Branches Fingertip Detection Network and Virtual Key System

Authors:

Chong Mou,

Xin Zhang

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

Pages 2159 - 2165

https://doi.org/10.1145/3394171.3413685

Published: 12 October 2020

Abstract

Gestures and fingertips are becoming increasingly important media for human-computer interaction (HCI), and algorithms for gesture recognition and fingertip detection have therefore been investigated extensively. Two problems remain, however: how to balance speed and accuracy, and how to handle complex interaction environments. To address these problems, this paper proposes an attention-based dual-branch network that efficiently performs both fingertip detection and gesture recognition. To handle complex interaction environments, we combine channel-wise and spatial-wise attention in the fingertip detection model. Extensive experiments demonstrate that the model is both effective and efficient. In our experiments, the proposed model achieves an average fingertip detection error of around 2.8 pixels on 640×480 video frames, and the average recognition accuracy over eight gestures reaches 99%. Moreover, the average forward time is about 8 ms. Owing to its lightweight design, the model also runs efficiently on a CPU. In addition, we design a virtual key system based on the proposed model that allows users to perform the "clicking" operation naturally in a virtual environment. The proposed system works well with a single ordinary RGB camera and without any pre-processing (e.g., image segmentation or contour extraction), which significantly reduces the complexity of the interaction system.
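
To put the reported figures in context, the short sketch below converts the 2.8-pixel error into a fraction of the frame diagonal and the 8 ms forward time into an upper bound on frame rate. Both conversions are illustrative assumptions and not metrics reported in the paper: normalizing by the diagonal is only one possible choice, the function names are hypothetical, and the frame-rate bound ignores camera capture and any other per-frame cost.

```python
import math


def relative_error(pixel_error, width, height):
    """Pixel error as a fraction of the frame diagonal (illustrative normalization)."""
    return pixel_error / math.hypot(width, height)


def frames_per_second(forward_ms):
    """Upper bound on throughput if the forward pass were the only per-frame cost."""
    return 1000.0 / forward_ms


# Figures taken from the abstract: 2.8 px error, 640x480 frames, 8 ms forward time.
rel = relative_error(2.8, 640, 480)
fps = frames_per_second(8.0)

print(f"error relative to frame diagonal: {rel:.2%}")    # 0.35%
print(f"forward-pass throughput bound: {fps:.0f} frames/s")  # 125 frames/s
```

Under these assumptions, the error is about 0.35% of the 800-pixel diagonal, and the forward pass alone would permit at most about 125 frames per second.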

Supplementary Material

MP4 File (3394171.3413685.mp4)

This video is a brief introduction to the paper titled "Attention Based Dual Branches Fingertip Detection Network and Virtual Key System". We begin with the background and motivation of the paper. We then introduce the architecture of the proposed model, which addresses the weaknesses mentioned earlier, and compare its performance with that of existing competitive models to demonstrate its advantages. Finally, we present two gesture-based interactive applications (Air Writing and Virtual Clicking) built on the proposed model. In summary, the video presents the idea of performing both fingertip detection and gesture recognition with a single model and demonstrates the advantages of such a model in fingertip detection and in practical applications.
