
Learning character-agnostic motion for motion retargeting in 2D

Published: 12 July 2019

Abstract

Analyzing human motion is a challenging task with a wide variety of applications in computer vision and in graphics. One such application, of particular importance in computer animation, is the retargeting of motion from one performer to another. While humans move in three dimensions, the vast majority of human motions are captured using video, requiring 2D-to-3D pose and camera recovery, before existing retargeting approaches may be applied. In this paper, we present a new method for retargeting video-captured motion between different human performers, without the need to explicitly reconstruct 3D poses and/or camera parameters.
In order to achieve our goal, we learn to extract, directly from a video, a high-level latent motion representation, which is invariant to the skeleton geometry and the camera view. Our key idea is to train a deep neural network to decompose temporal sequences of 2D poses into three components: motion, skeleton, and camera view-angle. Having extracted such a representation, we are able to re-combine motion with novel skeletons and camera views, and decode a retargeted temporal sequence, which we compare to a ground truth from a synthetic dataset.
We demonstrate that our framework can be used to robustly extract human motion from videos, bypassing 3D reconstruction, and that it outperforms existing retargeting methods when applied to videos in the wild. It also enables additional applications, such as performance cloning, video-driven cartoons, and motion retrieval.
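The decomposition described above (motion, skeleton, and camera view-angle codes extracted from a 2D pose sequence, then re-combined and decoded) can be illustrated with a minimal PyTorch sketch. The sketch below is not the authors' network; the layer sizes, latent dimensions, joint count, and temporal-pooling choices are illustrative assumptions only, and the actual implementation is available in the supplementary repository listed below.

# Minimal sketch of the decompose-and-recombine idea described in the abstract.
# NOT the authors' network; all sizes below are illustrative assumptions.

import torch
import torch.nn as nn


class Encoder(nn.Module):
    """1D temporal conv encoder over a sequence of flattened 2D joints."""

    def __init__(self, in_channels, latent_channels, static=False):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 128, kernel_size=7, padding=3),
            nn.LeakyReLU(0.2),
            nn.Conv1d(128, latent_channels, kernel_size=7, padding=3),
        )
        # Static codes (skeleton, view) are pooled over time; the motion code
        # keeps its temporal axis.
        self.static = static

    def forward(self, x):                      # x: (batch, 2*joints, frames)
        z = self.net(x)
        if self.static:
            z = z.mean(dim=-1, keepdim=True)   # collapse time -> one code
        return z


class Decoder(nn.Module):
    """Maps concatenated (motion, skeleton, view) codes back to 2D poses."""

    def __init__(self, latent_channels, out_channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(latent_channels, 128, kernel_size=7, padding=3),
            nn.LeakyReLU(0.2),
            nn.Conv1d(128, out_channels, kernel_size=7, padding=3),
        )

    def forward(self, motion, skeleton, view):
        frames = motion.shape[-1]
        # Broadcast the static codes along time before concatenation.
        skeleton = skeleton.expand(-1, -1, frames)
        view = view.expand(-1, -1, frames)
        return self.net(torch.cat([motion, skeleton, view], dim=1))


if __name__ == "__main__":
    joints, frames = 15, 64                    # assumed joint count / clip length
    in_ch = 2 * joints
    enc_motion = Encoder(in_ch, 64)
    enc_skel = Encoder(in_ch, 32, static=True)
    enc_view = Encoder(in_ch, 16, static=True)
    dec = Decoder(64 + 32 + 16, in_ch)

    clip_a = torch.randn(1, in_ch, frames)     # performer A (drives the motion)
    clip_b = torch.randn(1, in_ch, frames)     # performer B (target skeleton/view)

    # Retargeting: motion code from A, skeleton and view codes from B.
    out = dec(enc_motion(clip_a), enc_skel(clip_b), enc_view(clip_b))
    print(out.shape)                           # torch.Size([1, 30, 64])

In this toy setup, retargeting amounts to feeding the motion code of one clip together with the skeleton and view codes of another into a shared decoder, which mirrors the re-combination step described in the abstract.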

Supplementary Material

ZIP File (repository.zip)
PyTorch implementation for the paper "Learning Character-Agnostic Motion for Motion Retargeting in 2D", presented at SIGGRAPH 2019.
The code is also available on GitHub: https://github.com/ChrisWu1997/2D-Motion-Retargeting

Published In

ACM Transactions on Graphics, Volume 38, Issue 4
August 2019, 1480 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3306346
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 12 July 2019
Published in TOG Volume 38, Issue 4

Author Tags

  1. autoencoder
  2. motion analysis
  3. motion retargeting

Qualifiers

  • Research-article

Article Metrics

  • Downloads (last 12 months): 61
  • Downloads (last 6 weeks): 3
Reflects downloads up to 27 Aug 2024

Cited By

  • (2024) AI-based Real-time Online Random-play Dance Platform. Journal of Digital Contents Society 25(3), 685-693. https://doi.org/10.9728/dcs.2024.25.3.685. Online publication date: 31-Mar-2024.
  • (2024) Pose-Aware Attention Network for Flexible Motion Retargeting by Body Part. IEEE Transactions on Visualization and Computer Graphics 30(8), 4792-4808. https://doi.org/10.1109/TVCG.2023.3277918. Online publication date: Aug-2024.
  • (2024) Deep Learning for Visual Speech Analysis: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(9), 6001-6022. https://doi.org/10.1109/TPAMI.2024.3376710. Online publication date: Sep-2024.
  • (2024) An Identity-Preserved Framework for Human Motion Transfer. IEEE Transactions on Information Forensics and Security 19, 3495-3509. https://doi.org/10.1109/TIFS.2024.3364018. Online publication date: 8-Feb-2024.
  • (2024) Decomposing Task-Relevant Information From Surface Electromyogram for User-Generic Dexterous Finger Force Decoding. IEEE Journal of Biomedical and Health Informatics 28(7), 3907-3917. https://doi.org/10.1109/JBHI.2024.3383598. Online publication date: Jul-2024.
  • (2024) Noise-Factorized Disentangled Representation Learning for Generalizable Motor Imagery EEG Classification. IEEE Journal of Biomedical and Health Informatics 28(2), 765-776. https://doi.org/10.1109/JBHI.2023.3337072. Online publication date: Feb-2024.
  • (2024) Motion Retargeting from Human in Video to 3D Characters with Different Skeleton Topology. 2024 4th International Conference on Consumer Electronics and Computer Engineering (ICCECE), 124-128. https://doi.org/10.1109/ICCECE61317.2024.10504182. Online publication date: 12-Jan-2024.
  • (2024) A System for Retargeting Human Motion to Robot with Augmented Feedback via a Digital Twin Setup. 2024 10th International Conference on Control, Automation and Robotics (ICCAR), 95-100. https://doi.org/10.1109/ICCAR61844.2024.10569840. Online publication date: 27-Apr-2024.
  • (2024) Correspondence-Free Online Human Motion Retargeting. 2024 International Conference on 3D Vision (3DV), 707-716. https://doi.org/10.1109/3DV62453.2024.00032. Online publication date: 18-Mar-2024.
  • (2024) View-Invariant Skeleton Action Representation Learning via Motion Retargeting. International Journal of Computer Vision 132(7), 2351-2366. https://doi.org/10.1007/s11263-023-01967-8. Online publication date: 16-Jan-2024.
