Neural Allocentric Intuitive Physics Prediction from Real Videos

Wang, Zhihua; Rosa, Stefano; Miao, Yishu; Lai, Zihang; Xie, Linhai; Markham, Andrew; Trigoni, Niki

arXiv:1809.03330 (cs)

[Submitted on 7 Sep 2018 (v1), last revised 17 Sep 2018 (this version, v2)]

Abstract: Humans can make rich predictions about the future dynamics of physical objects at a glance. In contrast, most existing computer vision approaches require strong assumptions about the underlying system, ad-hoc modeling, or annotated datasets to carry out even simple predictions. To close this gap, we propose a new perspective on the problem of learning intuitive physics, inspired by the spatial memory representation of objects and spaces in the human brain, and in particular by the co-existence of egocentric and allocentric spatial representations. We present a generic framework that learns a layered representation of the physical world using a cascade of invertible modules. In this framework, real images are first converted to a synthetic domain representation that reduces the complexity arising from lighting and texture. Then, an allocentric viewpoint transformer removes viewpoint complexity by projecting images to a canonical view. Finally, a novel Recurrent Latent Variation Network (RLVN) architecture learns the dynamics of the objects interacting with the environment and predicts future motion, leveraging the availability of unlimited synthetic simulations. Predicted frames are then projected back to the original camera view and translated back to the real-world domain. Experimental results show that the framework can consistently and accurately predict several frames into the future and can adapt to real images.
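
The cascade described in the abstract can be pictured as a chain of invertible stages wrapped around a dynamics model: observed frames pass forward through each stage, a predictor rolls the simplified sequence ahead, and the predictions pass back through the inverses in reverse order. The following is a minimal sketch of that control flow only, not the authors' implementation. Every name in it (`InvertibleStage`, `predict_future`, `shift_right`) is hypothetical, the two stages are toy arithmetic stand-ins for the learned domain-transfer and viewpoint modules, and a trivial column shift stands in for the RLVN.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = List[List[float]]  # toy stand-in for an image: a 2D grid of intensities


@dataclass
class InvertibleStage:
    """A stage with a forward map and its inverse (hypothetical stand-in)."""
    name: str
    forward: Callable[[Frame], Frame]
    inverse: Callable[[Frame], Frame]


def transpose(frame: Frame) -> Frame:
    """Swap rows and columns; applying it twice returns the original frame."""
    return [list(col) for col in zip(*frame)]


def shift_right(history: Sequence[Frame]) -> Frame:
    """Toy dynamics (assumption, NOT the RLVN): cyclically shift the last frame one column."""
    return [row[-1:] + row[:-1] for row in history[-1]]


def predict_future(frames: Sequence[Frame],
                   stages: Sequence[InvertibleStage],
                   dynamics: Callable[[Sequence[Frame]], Frame],
                   horizon: int) -> List[Frame]:
    # 1. Map every observed frame into the simplified representation.
    history = []
    for frame in frames:
        for stage in stages:
            frame = stage.forward(frame)
        history.append(frame)

    # 2. Roll the dynamics model forward, feeding predictions back in.
    predicted = []
    for _ in range(horizon):
        nxt = dynamics(history)
        predicted.append(nxt)
        history.append(nxt)

    # 3. Map predictions back through the inverses, last stage first.
    decoded = []
    for frame in predicted:
        for stage in reversed(stages):
            frame = stage.inverse(frame)
        decoded.append(frame)
    return decoded


# Toy stages: an intensity offset stands in for real-to-synthetic translation,
# and a transpose stands in for projection to a canonical view.
stages = [
    InvertibleStage("to_synthetic",
                    forward=lambda f: [[v - 1.0 for v in row] for row in f],
                    inverse=lambda f: [[v + 1.0 for v in row] for row in f]),
    InvertibleStage("to_canonical", forward=transpose, inverse=transpose),
]

observed = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
future = predict_future(observed, stages, shift_right, horizon=2)
```

Applying the inverses in reverse order is what lets the prediction step operate entirely in the simplified space while the output remains in the original camera view and image domain.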

Comments:	Added references, minor changes. arXiv admin note: text overlap with arXiv:1506.02025 by other authors
Subjects:	Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:1809.03330 [cs.NE]
	(or arXiv:1809.03330v2 [cs.NE] for this version)
	https://doi.org/10.48550/arXiv.1809.03330

Submission history

From: Stefano Rosa
[v1] Fri, 7 Sep 2018 10:33:56 UTC (1,903 KB)
[v2] Mon, 17 Sep 2018 12:05:28 UTC (1,903 KB)
