research-article

Open access

Driving-signal aware full-body avatars

Authors: Timur Bagautdinov, Takaaki Shiratori, Jason Saragih

ACM Transactions on Graphics (TOG), Volume 40, Issue 4

Article No.: 143, Pages 1 - 17

https://doi.org/10.1145/3450626.3459850

Published: 19 July 2021

Abstract

We present a learning-based method for building driving-signal aware full-body avatars. Our model is a conditional variational autoencoder that can be animated with incomplete driving signals, such as human pose and facial keypoints, and produces a high-quality representation of human geometry and view-dependent appearance. The core intuition behind our method is that better drivability and generalization can be achieved by disentangling the driving signals and remaining generative factors, which are not available during animation. To this end, we explicitly account for information deficiency in the driving signal by introducing a latent space that exclusively captures the remaining information, thus enabling the imputation of the missing factors required during full-body animation, while remaining faithful to the driving signal. We also propose a learnable localized compression for the driving signal which promotes better generalization, and helps minimize the influence of global chance-correlations often found in real datasets. For a given driving signal, the resulting variational model produces a compact space of uncertainty for missing factors that allows for an imputation strategy best suited to a particular application. We demonstrate the efficacy of our approach on the challenging problem of full-body animation for virtual telepresence with driving signals acquired from minimal sensors placed in the environment and mounted on a VR-headset.
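To make the idea in the abstract concrete, here is a minimal sketch of a conditional variational autoencoder in which the decoder receives both a driving signal `c` and a latent code `z` that stands for the factors the driving signal does not carry. It is an illustration under stated assumptions, not the authors' implementation: the random linear maps stand in for trained networks, the toy dimensions, the standard normal prior, the squared-error reconstruction term, and all function names are hypothetical, and the learnable localized compression of the driving signal is not modeled.

```python
import math
import random

# Toy sizes, chosen only for illustration (assumptions, not from the paper).
D_OBS, D_DRIVE, D_LATENT = 6, 3, 2

_rng = random.Random(0)


def _random_matrix(rows, cols):
    """Small random weights standing in for a trained network (hypothetical)."""
    return [[_rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def _matvec(vec, mat):
    """Multiply a row vector by a matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


W_ENC_MU = _random_matrix(D_OBS + D_DRIVE, D_LATENT)
W_ENC_LOGVAR = _random_matrix(D_OBS + D_DRIVE, D_LATENT)
W_DEC = _random_matrix(D_DRIVE + D_LATENT, D_OBS)


def encode(x, c):
    """Approximate posterior q(z | x, c): sees the full observation x and the driving signal c."""
    h = list(x) + list(c)
    return _matvec(h, W_ENC_MU), _matvec(h, W_ENC_LOGVAR)


def decode(c, z):
    """Decoder: reconstructs the observation from the driving signal and the latent code."""
    return _matvec(list(c) + list(z), W_DEC)


def kl_to_standard_normal(mu, logvar):
    """KL divergence from a diagonal Gaussian N(mu, exp(logvar)) to N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def training_loss(x, c, beta=1.0):
    """Negative ELBO for one sample: squared reconstruction error plus a weighted KL term."""
    mu, logvar = encode(x, c)
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * _rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]
    recon = decode(c, z)
    recon_err = sum((r - xi) ** 2 for r, xi in zip(recon, x))
    return recon_err + beta * kl_to_standard_normal(mu, logvar)


def animate(c, sample=False):
    """Animation time: x is unavailable, so z is imputed from the prior (its mean or a random draw)."""
    if sample:
        z = [_rng.gauss(0.0, 1.0) for _ in range(D_LATENT)]
    else:
        z = [0.0] * D_LATENT
    return decode(c, z)
```

At animation time only `animate(c)` is called. Using the prior mean gives a deterministic output for a given driving signal, while `sample=True` draws one of several plausible completions; this is one simple way to read the abstract's remark that the imputation strategy can be chosen per application.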

Supplementary Material

VTT File (3450626.3459850.vtt), 20.27 KB

ZIP File (a143-bagautdinov.zip), 76.57 MB

MP4 File (a143-bagautdinov.mp4), 429.01 MB

MP4 File (3450626.3459850.mp4), presentation, 942.11 MB

Index Terms

Driving-signal aware full-body avatars
1. Computing methodologies
  1. Computer graphics
    1. Animation
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

ACM Transactions on Graphics Volume 40, Issue 4

August 2021

2170 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/3450626

Editor:
Sylvain Paris
Adobe Inc.

Copyright © 2021 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

Xu Y, Ye K, Shao T, Weng Y (2025). Animatable 3D Gaussians for modeling dynamic humans. Frontiers of Computer Science: Selected Publications from Chinese Universities 19:9. DOI: 10.1007/s11704-024-40497-5. Online publication date: 1-Sep-2025.
https://dl.acm.org/doi/10.1007/s11704-024-40497-5
Zhu H, Zhan F, Theobalt C, Habermann M (2024). TriHuman: A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis. ACM Transactions on Graphics 44:1 (1-17). DOI: 10.1145/3697140. Online publication date: 24-Sep-2024.
https://dl.acm.org/doi/10.1145/3697140
Zhang W, Yan Y, Liu Y, Sheng X, Yang X, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (2024). E3Gen: Efficient, Expressive and Editable Avatars Generation. Proceedings of the 32nd ACM International Conference on Multimedia (6860-6869). DOI: 10.1145/3664647.3681409. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681409
Lin S, Li Z, Su Z, Zheng Z, Zhang H, Liu Y (2024). LayGA: Layered Gaussian Avatars for Animatable Clothing Transfer. ACM SIGGRAPH 2024 Conference Papers (1-11). DOI: 10.1145/3641519.3657501. Online publication date: 13-Jul-2024.
https://dl.acm.org/doi/10.1145/3641519.3657501
Bashirov R, Larionov A, Ustinova E, Sidorenko M, Svitov D, Zakharkin I, Lempitsky V (2024). MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (3533-3543). DOI: 10.1109/WACV57701.2024.00351. Online publication date: 3-Jan-2024.
https://doi.org/10.1109/WACV57701.2024.00351
Hu T, Xu H, Luo L, Yu T, Zheng Z, Zhang H, Liu Y, Zwicker M (2024). HVTR++: Image and Pose Driven Human Avatars Using Hybrid Volumetric-Textural Rendering. IEEE Transactions on Visualization and Computer Graphics 30:8 (5478-5492). DOI: 10.1109/TVCG.2023.3297721. Online publication date: 1-Aug-2024.
https://dl.acm.org/doi/10.1109/TVCG.2023.3297721
Chen Y, Saha A, Chapiro A, Häne C, Bazin J, Qiu B, Zanetti S, Katsavounidis I, Bovik A (2024). Subjective and Objective Quality Assessment of Rendered Human Avatar Videos in Virtual Reality. IEEE Transactions on Image Processing 33 (5740-5754). DOI: 10.1109/TIP.2024.3468881. Online publication date: 1-Jan-2024.
https://dl.acm.org/doi/10.1109/TIP.2024.3468881
Jiang Y, Liao Q, Wang Z, Lin X, Lu Z, Zhao Y, Wei H, Ye J, Zhang Y, Shao Z (2024). SMPLX-Lite: A Realistic and Drivable Avatar Benchmark with Rich Geometry and Texture Annotations. 2024 IEEE International Conference on Multimedia and Expo (ICME) (1-6). DOI: 10.1109/ICME57554.2024.10687388. Online publication date: 15-Jul-2024.
https://doi.org/10.1109/ICME57554.2024.10687388
Xie H, Zhu Z, Zhang X (2024). VR-Recon: Visual Refiner for Fine-Grained Human Reconstruction. 2024 IEEE 4th International Conference on Digital Twins and Parallel Intelligence (DTPI) (1-6). DOI: 10.1109/DTPI61353.2024.10778915. Online publication date: 18-Oct-2024.
https://doi.org/10.1109/DTPI61353.2024.10778915
Li Z, Zheng Z, Wang L, Liu Y (2024). Animatable Gaussians: Learning Pose-Dependent Gaussian Maps for High-Fidelity Human Avatar Modeling. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (19711-19722). DOI: 10.1109/CVPR52733.2024.01864. Online publication date: 16-Jun-2024.
https://doi.org/10.1109/CVPR52733.2024.01864