xCos: An Explainable Cosine Metric for Face Verification Task

Authors:

Ya-Liang Chang,

Winston H. Hsu

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 17, Issue 3s

Article No.: 112, Pages 1 - 16

https://doi.org/10.1145/3469288

Published: 15 November 2021

Abstract

We study explainable AI (XAI) for face recognition, focusing on face verification. Face verification has become a crucial task in recent years and has been deployed in many applications, such as access control, surveillance, and automatic personal log-on for mobile devices. With the increasing amount of data, deep convolutional neural networks can achieve very high accuracy on the face verification task. Beyond exceptional performance, deep face verification models need more interpretability so that we can trust the results they generate. In this article, we propose a novel similarity metric, called explainable cosine (xCos), that comes with a learnable module that can be plugged into most verification models to provide meaningful explanations. With the help of xCos, we can see which parts of the two input faces are similar, where the model pays attention, and how the local similarities are weighted to form the output xCos score. We demonstrate the effectiveness of our proposed method on LFW and various competitive benchmarks: it provides novel and desirable model interpretability for face verification while preserving accuracy when plugged into existing face recognition models.
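The idea of forming one score from weighted local similarities can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy patch features, and the use of a softmax to normalize attention are assumptions made for illustration. In the article the attention comes from a learnable module, whereas here it is passed in as plain numbers.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn arbitrary scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_local_cosine(
    patches_a: Sequence[Vector],
    patches_b: Sequence[Vector],
    attention_logits: Sequence[float],
) -> Tuple[float, List[float], List[float]]:
    """Hypothetical helper: combine per-patch cosine similarities.

    patches_a / patches_b hold one feature vector per face region,
    in the same region order for both faces. attention_logits holds
    one unnormalized importance score per region (an assumption; a
    real model would predict these).
    """
    # Local similarity map: which regions of the two faces look alike.
    local = [cosine(p, q) for p, q in zip(patches_a, patches_b)]
    # Attention map: how much each region counts toward the final score.
    weights = softmax(attention_logits)
    # Final score: attention-weighted sum of the local similarities.
    score = sum(w * s for w, s in zip(weights, local))
    return score, local, weights


# Toy usage: three regions with two-dimensional features each.
face_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
face_b = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
score, local, weights = weighted_local_cosine(face_a, face_b, [0.0, 0.0, 0.0])
print(score, local, weights)
```

Returning the local similarities and the weights alongside the score is what makes the result inspectable: a low score can be traced to the regions that disagree and to how heavily those regions were weighted.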

Index Terms

xCos: An Explainable Cosine Metric for Face Verification Task
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Biometrics
  2. Machine learning
    1. Machine learning algorithms

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 17, Issue 3s

October 2021

324 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3492435

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 November 2021

Accepted: 01 June 2021

Revised: 01 May 2021

Received: 01 December 2020

Published in TOMM Volume 17, Issue 3s

Qualifiers

Research-article
Refereed

Funding Sources

Ministry of Science and Technology, Taiwan
Qualcomm Technologies, Inc.
