Motion Segmentation using Frequency Domain Transformer Networks

Farazi, Hafez; Behnke, Sven

arXiv:2004.08638 (cs)

[Submitted on 18 Apr 2020]

Abstract: Self-supervised prediction is a powerful mechanism to learn representations that capture the underlying structure of the data. Despite recent progress, the self-supervised video prediction task is still challenging. One of the critical factors that make the task hard is motion segmentation, which is segmenting individual objects and the background and estimating their motion separately. In video prediction, the shape, appearance, and transformation of each object should be understood only by predicting the next frame in pixel space. To address this task, we propose a novel end-to-end learnable architecture that predicts the next frame by modeling foreground and background separately while simultaneously estimating and predicting the foreground motion using Frequency Domain Transformer Networks. Experimental evaluations show that this yields interpretable representations and that our approach can outperform some widely used video prediction methods like Video Ladder Network and Predictive Gated Pyramids on synthetic data.
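
The abstract does not describe the internals of a Frequency Domain Transformer Network. Purely as an illustration of the general idea of estimating and extrapolating motion in the Fourier domain, the NumPy sketch below uses classical phase correlation. It is not the authors' architecture: it assumes a single object undergoing pure circular translation, has no learnable parts and no foreground/background split, and the function names are hypothetical.

```python
import numpy as np

def estimate_phase_shift(prev_frame, curr_frame, eps=1e-8):
    """Normalized cross-power spectrum of two frames.

    For a pure (circular) translation, each entry is approximately a unit
    complex number whose phase encodes the shift at that frequency.
    """
    f_prev = np.fft.fft2(prev_frame)
    f_curr = np.fft.fft2(curr_frame)
    cross = f_curr * np.conj(f_prev)
    return cross / (np.abs(cross) + eps)

def peak_translation(phase_shift):
    """Recover the integer (dy, dx) shift as the peak of the correlation surface."""
    corr = np.real(np.fft.ifft2(phase_shift))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)

def predict_next(curr_frame, phase_shift):
    """Extrapolate one step by applying the estimated phase shift again."""
    f_next = np.fft.fft2(curr_frame) * phase_shift
    return np.real(np.fft.ifft2(f_next))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev_frame = rng.random((16, 16))
    # Hypothetical motion: 2 pixels down, 3 pixels right, with wrap-around.
    curr_frame = np.roll(prev_frame, shift=(2, 3), axis=(0, 1))

    shift = estimate_phase_shift(prev_frame, curr_frame)
    print(peak_translation(shift))
    next_frame = predict_next(curr_frame, shift)
    print(np.allclose(next_frame, np.roll(curr_frame, (2, 3), axis=(0, 1)), atol=1e-5))
```

Under these assumptions, multiplying the current frame's spectrum by the estimated phase shift moves the content by the same displacement once more, which is the sense in which motion can be both estimated and predicted in the frequency domain.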

Comments:	28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, 2020
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2004.08638 [cs.CV]
	(or arXiv:2004.08638v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2004.08638

Submission history

From: Hafez Farazi
[v1] Sat, 18 Apr 2020 15:05:11 UTC (3,544 KB)
