MonoEye: Multimodal Human Motion Capture System Using A Single Ultra-Wide Fisheye Camera

Authors: Dong-Hyun Hwang, Kohei Aso, Ye Yuan, Kris Kitani, Hideki Koike

UIST '20: Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology

Pages 98 - 111

https://doi.org/10.1145/3379337.3415856

Published: 20 October 2020

Abstract

We present MonoEye, a multimodal human motion capture system that uses a single RGB camera with an ultra-wide fisheye lens mounted on the user's chest. Existing optical motion capture systems rely on multiple synchronized cameras that require calibration. These systems also impose usability constraints that limit the user's movement and operating space. Because MonoEye is based on a single wearable RGB camera, it can capture the wearer's 3D body pose without restrictions on space or environment. The body pose captured by our system is aware of the camera orientation, which makes it possible to recognize various motions that existing egocentric motion capture systems cannot. Furthermore, the proposed system captures not only the wearer's body motion but also their viewport, using head pose estimation and the ultra-wide image. To implement robust multimodal motion capture, we design three deep neural networks, BodyPoseNet, HeadPoseNet, and CameraPoseNet, which estimate 3D body pose, head pose, and camera pose, respectively, in real time. We train these networks on our new, extensive synthetic dataset, which provides 680K frames of rendered people with a wide range of body shapes, clothing, actions, backgrounds, and lighting conditions. To demonstrate the interactive potential of MonoEye, we present several example applications, ranging from common body-gesture interactions to context-aware interactions.
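
The statement that the captured body pose is "aware of the camera orientation" can be illustrated with a minimal sketch. It assumes, purely for illustration, that a body-pose estimator returns 3D joints in the chest camera's coordinate frame and that a camera-pose estimator returns a rotation from that frame to a gravity-aligned world frame with the y-axis pointing up. The function names, joint names, coordinates, and axis convention are hypothetical and are not taken from the paper.

```python
import math

def rotation_about_x(angle_rad):
    """Return a 3x3 rotation matrix (nested lists) for a rotation about the x-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]

def rotate(matrix, point):
    """Apply a 3x3 rotation matrix to a 3D point."""
    return tuple(sum(matrix[r][k] * point[k] for k in range(3)) for r in range(3))

def joints_to_world(camera_to_world, joints_camera):
    """Rotate camera-frame joints into the (assumed) gravity-aligned world frame."""
    return {name: rotate(camera_to_world, p) for name, p in joints_camera.items()}

# Hypothetical camera-frame joint positions in metres (illustrative values only).
joints_camera = {"head": (0.0, 0.3, 0.1), "left_foot": (0.1, -1.2, 0.1)}

# The same camera-relative pose under two assumed camera orientations.
upright = joints_to_world(rotation_about_x(0.0), joints_camera)
lying = joints_to_world(rotation_about_x(math.pi / 2), joints_camera)

def head_height_above_foot(joints_world):
    """Vertical (world y) distance between head and left foot."""
    return joints_world["head"][1] - joints_world["left_foot"][1]

print(round(head_height_above_foot(upright), 3))
print(round(head_height_above_foot(lying), 3))
```

In this sketch the two cases share identical camera-relative joints, so a system that ignores camera orientation could not tell them apart. Once the assumed camera rotation is applied, the head sits well above the foot in one case and level with it in the other, which is the kind of distinction (standing versus lying down, for example) that orientation awareness makes possible.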

Supplementary Material

VTT File (ufp5705pv.vtt), 0.50 KB
VTT File (ufp5705vf.vtt), 3.91 KB
VTT File (3379337.3415856.vtt), 5.55 KB
SRT File (ufp5705pvc.srt), Preview video captions, 0.50 KB
SRT File (ufp5705vfc.srt), Video figure captions, 4.01 KB
MP4 File (ufp5705pv.mp4), Preview video, 36.81 MB
MP4 File (ufp5705vf.mp4), Video figure, 47.15 MB
MP4 File (3379337.3415856.mp4), Presentation Video, 56.16 MB
