research-article

Learning to name faces: a multimodal learning scheme for search-based face annotation

Authors:

Steven C.H. Hoi,

Chunyan Miao

SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Pages 443 - 452

https://doi.org/10.1145/2484028.2484040

Published: 28 July 2013

Abstract

Automated face annotation aims to detect human faces in a photo and name each face with the corresponding person's name. In this paper, we tackle this open problem by investigating a search-based face annotation (SBFA) paradigm that mines the large amounts of web facial images freely available on the WWW. Given a query facial image for annotation, SBFA first searches a web facial image database for the top-n most similar facial images and then exploits these top-ranked images and their weak labels to name the query image. To fully mine this information, this paper proposes a novel framework, Learning to Name Faces (L2NF), a unified multimodal learning approach for search-based face annotation that consists of the following major components: (i) we enhance the weak labels of the top-ranked similar images by exploiting the "label smoothness" assumption; (ii) we construct multimodal representations of a facial image by extracting different types of features; (iii) we optimize the distance measure for each type of feature using distance metric learning techniques; and finally (iv) we learn the optimal combination of the multiple modalities for annotation through a learning-to-rank scheme. We conduct an extensive set of empirical studies on two real-world facial image databases, in which encouraging results show that the proposed algorithms significantly boost the naming accuracy of the search-based face annotation task.
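The search-then-vote idea behind SBFA can be illustrated with a minimal sketch. It is not the L2NF method: it assumes plain Euclidean distances in place of learned metrics, takes the modality weights as given rather than learning them with a ranking scheme, and omits the label enhancement step. The function names (`euclidean`, `annotate`), the modality names (`lbp`, `gabor`), and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(query, database, n=3, weights=None):
    """Rank candidate names for a query face by voting over its top-n neighbours.

    query:    dict mapping modality name -> feature vector
    database: list of (features, weak_label) pairs, features shaped like `query`
    weights:  dict mapping modality name -> combination weight
              (assumed given here; uniform if omitted)
    """
    if weights is None:
        weights = {m: 1.0 for m in query}

    # Fuse the per-modality distances into one score per database image.
    scored = []
    for features, label in database:
        dist = sum(weights[m] * euclidean(query[m], features[m]) for m in query)
        scored.append((dist, label))

    # Keep the n most similar images (smallest fused distance).
    top = sorted(scored, key=lambda pair: pair[0])[:n]

    # Closer neighbours cast larger votes for their (possibly noisy) weak label.
    votes = defaultdict(float)
    for dist, label in top:
        votes[label] += 1.0 / (1.0 + dist)

    # Return (name, score) pairs, best candidate first.
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)


# Toy database: two feature types per face, with one noisy weak label.
database = [
    ({"lbp": [0.0, 0.0], "gabor": [1.0]}, "alice"),
    ({"lbp": [0.1, 0.0], "gabor": [1.0]}, "alice"),
    ({"lbp": [5.0, 5.0], "gabor": [9.0]}, "bob"),
    ({"lbp": [0.2, 0.1], "gabor": [1.2]}, "bob"),  # mislabelled neighbour
]
query = {"lbp": [0.0, 0.1], "gabor": [1.0]}

ranked = annotate(query, database, n=3)
print(ranked[0][0])
```

In this sketch the mislabelled neighbour still receives a vote, which is the kind of noise that components (i), (iii), and (iv) of the abstract are meant to reduce.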

Index Terms

Learning to name faces: a multimodal learning scheme for search-based face annotation
1. Information systems
  1. Information retrieval

Published In

SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

July 2013

1188 pages

ISBN: 9781450320344

DOI: 10.1145/2484028

General Chairs:
Gareth J.F. Jones, Dublin City University, Ireland
Páraic Sheridan, Dublin City University, Ireland

Program Chairs:
Diane Kelly, University of North Carolina, Chapel Hill, USA
Maarten de Rijke, University of Amsterdam, The Netherlands
Tetsuya Sakai, Microsoft Research Asia, China

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '13

Sponsor:

SIGIR

SIGIR '13: The 36th International ACM SIGIR conference on research and development in Information Retrieval

July 28 - August 1, 2013

Dublin, Ireland

Acceptance Rates

SIGIR '13 Paper Acceptance Rate: 73 of 366 submissions, 20%

Overall Acceptance Rate: 792 of 3,983 submissions, 20%

Bibliometrics

Article Metrics

Total Citations: 14
Total Downloads: 316
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 1

Reflects downloads up to 19 Feb 2025

Cited By

Aman E, Rawat A, Giri A, Gothwal H (2019). Content-Based Image Retrieval: A Comprehensive Study. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, pages 1073-1081. Online publication date: 20-Mar-2019.
https://doi.org/10.32628/CSEIT1952275
Tang F, Dong W, Meng Y, Mei X, Huang F, Zhang X, Deussen O (2018). Animated Construction of Chinese Brush Paintings. IEEE Transactions on Visualization and Computer Graphics, 24(12), pages 3019-3031. Online publication date: 1-Dec-2018.
https://doi.org/10.1109/TVCG.2017.2774292
Bracamonte T, Bustos B, Poblete B, Schreck T (2018). Extracting semantic knowledge from web context for multimedia IR. Multimedia Tools and Applications, 77(11), pages 13853-13889. Online publication date: 1-Jun-2018.
https://dl.acm.org/doi/10.1007/s11042-017-4997-y
Feng F, Nie L, Wang X, Hong R, Chua T, Kando N, Sakai T, Joho H, Li H, de Vries A, White R (2017). Computational Social Indicators. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 455-464. Online publication date: 7-Aug-2017.
https://dl.acm.org/doi/10.1145/3077136.3080773
Moreira-Matias L, Farah H (2017). On Developing a Driver Identification Methodology Using In-Vehicle Data Recorders. IEEE Transactions on Intelligent Transportation Systems, 18(9), pages 2387-2396. Online publication date: Sep-2017.
https://doi.org/10.1109/TITS.2016.2639361
Kasthuri A, Suruliandi A (2017). A survey on face annotation techniques. 2017 4th International Conference on Advanced Computing and Communication Systems (ICACCS), pages 1-9. Online publication date: Jan-2017.
https://doi.org/10.1109/ICACCS.2017.8014641
Chang J, Juang H, Chen Y, Chang C (2017). Safe binary particle swam algorithm for an enhanced unsupervised label refinement in automatic face annotation. Multimedia Tools and Applications, 76(18), pages 18339-18359. Online publication date: 1-Sep-2017.
https://dl.acm.org/doi/10.1007/s11042-016-4058-y
Kundert-Gibbs J (2017). Image-Based Content Retrieval via Class-Based Histogram Comparisons. IT Convergence and Security 2017, pages 3-10. Online publication date: 31-Aug-2017.
https://doi.org/10.1007/978-981-10-6451-7_1
Wu P, Hoi S, Zhao P, Miao C, Liu Z (2016). Online Multi-Modal Distance Metric Learning with Application to Image Retrieval. IEEE Transactions on Knowledge and Data Engineering, 28(2), pages 454-467. Online publication date: 1-Feb-2016.
https://dl.acm.org/doi/10.1109/TKDE.2015.2477296
Changn J, Juang H (2016). An Effective Machine Learning Approach for Refining the Labels of Web Facial Images. Frontier Computing, pages 1073-1083. Online publication date: 20-Apr-2016.
https://doi.org/10.1007/978-981-10-0539-8_106
