
xCos: An Explainable Cosine Metric for Face Verification Task

Published: 15 November 2021

Abstract

We study explainable AI (XAI) for face recognition, particularly face verification. Face verification has become a crucial task in recent years and has been deployed in many applications, such as access control, surveillance, and automatic log-on for personal mobile devices. With the increasing amount of data, deep convolutional neural networks can achieve very high accuracy on the face verification task. Beyond exceptional performance, deep face verification models need more interpretability so that we can trust the results they generate. In this article, we propose a novel similarity metric, called explainable cosine (xCos), that comes with a learnable module that can be plugged into most verification models to provide meaningful explanations. With the help of xCos, we can see which parts of the two input faces are similar, where the model pays attention, and how the local similarities are weighted to form the output xCos score. We demonstrate the effectiveness of the proposed method on LFW and various competitive benchmarks: it provides novel and desirable model interpretability for face verification while preserving accuracy when plugged into existing face recognition models.
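The idea of weighting local similarities to form one score can be illustrated with a small sketch. It assumes each face is represented by a grid of local feature vectors and that the attention weights are non-negative and normalized to sum to one. The function names, the flattened grid layout, and the toy numbers are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weighted_local_similarity(grid_a, grid_b, weights):
    """Combine per-cell cosine similarities using normalized weights.

    grid_a, grid_b: lists of local feature vectors, one per grid cell
                    (hypothetical layout, flattened row by row).
    weights:        non-negative attention weights, one per cell.
    Returns (score, local_sims) so the per-cell terms can be inspected.
    """
    total = sum(weights)
    local_sims = [cosine(a, b) for a, b in zip(grid_a, grid_b)]
    score = sum(w / total * s for w, s in zip(weights, local_sims))
    return score, local_sims


# Toy 2x2 grid with 2-D local features (made-up numbers).
face_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
face_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 1.0]]
attention = [1.0, 1.0, 2.0, 0.0]  # the last cell gets zero weight

score, local_sims = weighted_local_similarity(face_a, face_b, attention)
print(score)       # overall weighted similarity
print(local_sims)  # per-cell similarities that explain the score
```

In this toy case the only dissimilar cell carries zero weight, so it does not lower the overall score. Returning the per-cell similarities alongside the score is what makes the result inspectable: both the similarity map and the weight map can be visualized over the face grid.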


Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 3s
October 2021
324 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3492435

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 November 2021
Accepted: 01 June 2021
Revised: 01 May 2021
Received: 01 December 2020
Published in TOMM Volume 17, Issue 3s

Author Tags

  1. XAI
  2. xCos
  3. face verification
  4. face recognition
  5. explainable AI
  6. explainable artificial intelligence

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • Ministry of Science and Technology, Taiwan
  • Qualcomm Technologies, Inc.

Article Metrics

  • Downloads (Last 12 months): 103
  • Downloads (Last 6 weeks): 5

Reflects downloads up to 15 Oct 2024

Cited By

  • (2024) Recent Applications of Explainable AI (XAI): A Systematic Literature Review. Applied Sciences 14:19 (8884). DOI: 10.3390/app14198884. Online publication date: 2-Oct-2024
  • (2024) Subjective performance assessment protocol for visual explanations-based face verification explainability. EURASIP Journal on Image and Video Processing 2024:1. DOI: 10.1186/s13640-024-00645-0. Online publication date: 27-Sep-2024
  • (2024) OrchLoc: In-Orchard Localization via a Single LoRa Gateway and Generative Diffusion Model-based Fingerprinting. Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services (304-317). DOI: 10.1145/3643832.3661876. Online publication date: 3-Jun-2024
  • (2024) RAST: Restorable Arbitrary Style Transfer. ACM Transactions on Multimedia Computing, Communications, and Applications 20:5 (1-21). DOI: 10.1145/3638770. Online publication date: 22-Jan-2024
  • (2024) Efficient Explainable Face Verification based on Similarity Score Argument Backpropagation. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4724-4733). DOI: 10.1109/WACV57701.2024.00467. Online publication date: 3-Jan-2024
  • (2024) Towards Visual Saliency Explanations of Face Verification. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4714-4723). DOI: 10.1109/WACV57701.2024.00466. Online publication date: 3-Jan-2024
  • (2024) Explainable Face Verification via Feature-Guided Gradient Backpropagation. 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG) (1-5). DOI: 10.1109/FG59268.2024.10581925. Online publication date: 27-May-2024
  • (2024) PCaLDI: Explainable Similarity and Distance Metrics Using Principal Component Analysis Loadings for Feature Importance. IEEE Access 12 (52623-52640). DOI: 10.1109/ACCESS.2024.3387547. Online publication date: 2024
  • (2024) Developing an explainable diagnosis system utilizing deep learning model: a case study of spontaneous pneumothorax. Physics in Medicine & Biology 69:14 (145017). DOI: 10.1088/1361-6560/ad5e31. Online publication date: 15-Jul-2024
  • (2024) Explainable biometrics: a systematic literature review. Journal of Ambient Intelligence and Humanized Computing. DOI: 10.1007/s12652-024-04856-1. Online publication date: 19-Sep-2024
