Generative Adversarial Network for Future Hand Segmentation from Egocentric Video

Jia, Wenqi; Liu, Miao; Rehg, James M.

arXiv:2203.11305 (cs)

[Submitted on 21 Mar 2022 (v1), last revised 20 Jul 2022 (this version, v2)]

Abstract: We introduce the novel problem of anticipating a time series of future hand masks from egocentric video. A key challenge is to model the stochasticity of future head motions, which globally affect the analysis of video from a head-worn camera. To this end, we propose a novel deep generative model, EgoGAN, which uses a 3D Fully Convolutional Network to learn a spatio-temporal video representation for pixel-wise visual anticipation, generates future head motion using a Generative Adversarial Network (GAN), and then predicts the future hand masks from the video representation and the generated future head motion. We evaluate our method on both the EPIC-Kitchens and EGTEA Gaze+ datasets and conduct detailed ablation studies to validate its design choices. Furthermore, we compare our method with previous state-of-the-art methods for future image segmentation and show that it predicts future hand masks more accurately.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.11305 [cs.CV]
	(or arXiv:2203.11305v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.11305

Submission history

From: Wenqi Jia
[v1] Mon, 21 Mar 2022 19:41:44 UTC (2,519 KB)
[v2] Wed, 20 Jul 2022 23:19:13 UTC (2,514 KB)
