High Fidelity Makeup via 2D and 3D Identity Preservation Net

Published: 13 June 2024
Abstract

    In this article, we address the challenging makeup transfer task, aiming to transfer makeup from a reference image to a source image while preserving facial geometry and background consistency. Existing deep neural network-based methods have shown promising results in aligning facial parts and transferring makeup textures. However, they often neglect the facial geometry of the source image, leading to two adverse effects: (1) alterations in geometrically relevant facial features, causing face flattening and loss of personality, and (2) difficulties in maintaining background consistency, as networks cannot clearly determine the face-background boundary. To jointly tackle these issues, we propose IP23-Net, a High Fidelity Makeup via two-dimensional (2D) and 3D Identity Preservation Network; to the best of our knowledge, it is the first framework that leverages facial geometry information to generate more realistic results. Our method comprises a 3D Shape Identity Encoder, which extracts identity and 3D shape features. We incorporate a 3D face reconstruction model to ensure the three-dimensional effect of face makeup, thereby preserving the subject's facial depth and natural appearance. To preserve background consistency, our Background Correction Decoder automatically predicts an adaptive mask for the source image, distinguishing the foreground from the background. In addition to popular benchmarks, we introduce a new large-scale High Resolution Synthetic Makeup Dataset containing 335,230 diverse high-resolution face images to evaluate our method's generalization ability. Experiments demonstrate that IP23-Net achieves high-fidelity makeup transfer while effectively preserving background consistency. The code will be made publicly available.
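    The abstract describes the Background Correction Decoder only at a high level: it predicts an adaptive mask that separates the made-up face region from the untouched background. As an illustration of that idea (not the paper's actual implementation; the function and array names below are hypothetical), the predicted mask can be applied as a per-pixel convex combination of the generated image and the source image:

    ```python
    import numpy as np

    def blend_with_mask(generated: np.ndarray, source: np.ndarray,
                        mask: np.ndarray) -> np.ndarray:
        """Composite a generated (makeup-transferred) image with the source.

        mask holds per-pixel values in [0, 1]: 1 keeps the generated
        foreground (the face), 0 keeps the original background pixel.
        """
        mask = mask[..., np.newaxis]  # broadcast the H x W mask over RGB channels
        return mask * generated + (1.0 - mask) * source

    # Toy 2x2 RGB images: the generated result differs everywhere,
    # but the mask keeps the background row identical to the source.
    source = np.zeros((2, 2, 3))
    generated = np.ones((2, 2, 3))
    mask = np.array([[1.0, 1.0],    # face region: take the makeup result
                     [0.0, 0.0]])   # background: keep the source pixels
    out = blend_with_mask(generated, source, mask)
    ```

    A soft (continuous-valued) mask of this form keeps the compositing step differentiable, which is what allows a decoder to learn the face-background boundary end-to-end rather than relying on a fixed segmentation.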



    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 8
    August 2024, 698 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3618074

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 June 2024
    Online AM: 08 April 2024
    Accepted: 22 March 2024
    Revised: 20 February 2024
    Received: 21 August 2023
    Published in TOMM Volume 20, Issue 8

    Author Tags

    1. Makeup transfer
    2. 3D stereoscopic loss
    3. High resolution synthetic makeup dataset
    4. Multimedia application

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Fundamental Research Funds for the Central Universities

