Automated Blendshape Personalization for Faithful Face Animations Using Commodity Smartphones

Published: 29 November 2022

Abstract

The digital reconstruction of humans has a variety of compelling use cases. Animated virtual humans, avatars and agents alike, are the central entities in virtual embodied human-computer and human-human encounters in social XR. Here, a faithful reconstruction of facial expressions becomes paramount due to their prominent role in non-verbal behavior and social interaction. Current XR platforms, such as Unity 3D or the Unreal Engine, integrate recent smartphone technologies to animate the faces of virtual humans via facial motion capture. Using the same technology, this article presents an optimization-based approach to generate personalized blendshapes as animation targets for facial expressions. The proposed method combines a position-based optimization with a seamless partial deformation transfer, which is necessary for a faithful reconstruction. Our method is fully automated, considerably outperforms existing solutions based on example-based facial rigging or deformation transfer, and overall yields a much lower reconstruction error. It also integrates neatly with recent smartphone-based reconstruction pipelines for mesh generation and automated rigging, further paving the way for a widespread application of human-like and personalized avatars and agents in various use cases.
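For readers less familiar with the underlying machinery, the sketch below illustrates the general family of techniques the abstract builds on: the standard linear blendshape model and a position-based least-squares fit of blendshape weights to a captured expression. It is a minimal NumPy illustration under stated assumptions, not the paper's actual method or code; the function names, the 52-shape basis, and the regularization term are hypothetical choices made for this example.

    # Minimal, hypothetical sketch: linear blendshape evaluation and a
    # position-based least-squares fit of weights to an expression scan.
    # Names and constants are illustrative, not taken from the paper.
    import numpy as np

    def evaluate_blendshapes(neutral, deltas, weights):
        """Standard blendshape model: neutral + weighted sum of deltas.

        neutral: (V, 3) neutral face vertices
        deltas:  (K, V, 3) per-blendshape vertex offsets from the neutral
        weights: (K,) activation weights, typically in [0, 1]
        """
        return neutral + np.tensordot(weights, deltas, axes=1)

    def fit_weights_to_scan(neutral, deltas, scan, reg=1e-3):
        """Least-squares blendshape weights reproducing a captured scan,
        with Tikhonov regularization for numerical stability."""
        K = deltas.shape[0]
        A = deltas.reshape(K, -1).T            # (3V, K) basis matrix
        b = (scan - neutral).reshape(-1)       # (3V,) observed offsets
        AtA = A.T @ A + reg * np.eye(K)
        return np.linalg.solve(AtA, A.T @ b)

    # Toy usage: recover known weights from a synthetic expression.
    rng = np.random.default_rng(0)
    neutral = rng.standard_normal((100, 3))
    deltas = 0.1 * rng.standard_normal((52, 100, 3))  # e.g. 52 ARKit-style shapes
    w_true = rng.uniform(0.0, 1.0, 52)
    scan = evaluate_blendshapes(neutral, deltas, w_true)
    w_est = fit_weights_to_scan(neutral, deltas, scan)
    print(np.allclose(w_true, w_est, atol=1e-2))      # weights approximately recovered

The paper's contribution goes the other way around: rather than only solving for weights given fixed blendshapes, it optimizes the blendshape targets themselves (combined with a partial deformation transfer) so that a personalized basis faithfully reproduces the tracked expressions.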

Supplementary Material

PDF file (supplementary_material.pdf): additional test results
MP4 file (supplementary_video.mp4): animation video


Cited By

  • (2024) Beyond Text and Speech in Conversational Agents: Mapping the Design Space of Avatars. In Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1875–1894. https://doi.org/10.1145/3643834.3661563


Information

      Published In

      VRST '22: Proceedings of the 28th ACM Symposium on Virtual Reality Software and Technology
      November 2022, 466 pages
      ISBN: 9781450398893
      DOI: 10.1145/3562939
      This work is licensed under a Creative Commons Attribution 4.0 International License.


      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. blendshapes
      2. deformation transfer
      3. face animation
      4. facial rigging
      5. personalization
      6. virtual humans

      Qualifiers

      • Research-article
      • Research
      • Refereed limited


      Conference

      VRST '22

      Acceptance Rates

      Overall Acceptance Rate 66 of 254 submissions, 26%

      Article Metrics

      • Downloads (Last 12 months): 667
      • Downloads (Last 6 weeks): 80
      Reflects downloads up to 20 Feb 2025
