Automated Blendshape Personalization for Faithful Face Animations Using Commodity Smartphones

Published: 29 November 2022

Abstract

The digital reconstruction of humans has a variety of compelling use cases. Animated virtual humans, avatars and agents alike, are the central entities in virtual embodied human-computer and human-human encounters in social XR. Here, a faithful reconstruction of facial expressions becomes paramount due to their prominent role in non-verbal behavior and social interaction. Current XR platforms, such as Unity 3D or the Unreal Engine, integrate recent smartphone technologies to animate the faces of virtual humans via facial motion capture. Using the same technology, this article presents an optimization-based approach to generate personalized blendshapes as animation targets for facial expressions. The proposed method combines a position-based optimization with a seamless partial deformation transfer, which is necessary for a faithful reconstruction. Our method is fully automated, considerably outperforms existing solutions based on example-based facial rigging or deformation transfer, and overall yields a much lower reconstruction error. It also integrates neatly with recent smartphone-based reconstruction pipelines for mesh generation and automated rigging, further paving the way for a widespread application of human-like and personalized avatars and agents in various use cases.
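For readers less familiar with the underlying machinery, the sketch below illustrates the general family of techniques the abstract builds on: the standard linear blendshape model and a position-based least-squares fit of blendshape weights to a captured expression. It is a minimal NumPy illustration under stated assumptions, not the paper's actual method or code; the function names, the 52-shape basis, and the regularization term are hypothetical choices made for this example.

    # Minimal, hypothetical sketch: linear blendshape evaluation and a
    # position-based least-squares fit of weights to an expression scan.
    # Names and constants are illustrative, not taken from the paper.
    import numpy as np

    def evaluate_blendshapes(neutral, deltas, weights):
        """Standard blendshape model: neutral + weighted sum of deltas.

        neutral: (V, 3) neutral face vertices
        deltas:  (K, V, 3) per-blendshape vertex offsets from the neutral
        weights: (K,) activation weights, typically in [0, 1]
        """
        return neutral + np.tensordot(weights, deltas, axes=1)

    def fit_weights_to_scan(neutral, deltas, scan, reg=1e-3):
        """Least-squares blendshape weights reproducing a captured scan,
        with Tikhonov regularization for numerical stability."""
        K = deltas.shape[0]
        A = deltas.reshape(K, -1).T            # (3V, K) basis matrix
        b = (scan - neutral).reshape(-1)       # (3V,) observed offsets
        AtA = A.T @ A + reg * np.eye(K)
        return np.linalg.solve(AtA, A.T @ b)

    # Toy usage: recover known weights from a synthetic expression.
    rng = np.random.default_rng(0)
    neutral = rng.standard_normal((100, 3))
    deltas = 0.1 * rng.standard_normal((52, 100, 3))  # e.g. 52 ARKit-style shapes
    w_true = rng.uniform(0.0, 1.0, 52)
    scan = evaluate_blendshapes(neutral, deltas, w_true)
    w_est = fit_weights_to_scan(neutral, deltas, scan)
    print(np.allclose(w_true, w_est, atol=1e-2))      # weights approximately recovered

The paper's contribution goes the other way around: rather than only solving for weights given fixed blendshapes, it optimizes the blendshape targets themselves (combined with a partial deformation transfer) so that a personalized basis faithfully reproduces the tracked expressions.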

Supplementary Material

PDF file (supplementary_material.pdf): additional test results
MP4 file (supplementary_video.mp4): animation video


Cited By

  • (2024) Beyond Text and Speech in Conversational Agents: Mapping the Design Space of Avatars. In Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1875–1894. https://doi.org/10.1145/3643834.3661563


Information

      Published In

      VRST '22: Proceedings of the 28th ACM Symposium on Virtual Reality Software and Technology
      November 2022, 466 pages
      ISBN: 9781450398893
      DOI: 10.1145/3562939
      This work is licensed under a Creative Commons Attribution 4.0 International License.


      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. blendshapes
      2. deformation transfer
      3. face animation
      4. facial rigging
      5. personalization
      6. virtual humans

      Qualifiers

      • Research-article
      • Research
      • Refereed limited


      Conference

      VRST '22

      Acceptance Rates

      Overall Acceptance Rate 66 of 254 submissions, 26%

      Article Metrics

      • Downloads (Last 12 months): 667
      • Downloads (Last 6 weeks): 80
      Reflects downloads up to 20 Feb 2025
