DOI: 10.1145/2484028.2484040
Research article

Learning to name faces: a multimodal learning scheme for search-based face annotation

Published: 28 July 2013

Abstract

Automated face annotation aims to detect human faces in a photo and name each face with the corresponding person's name. In this paper, we tackle this open problem by investigating a search-based face annotation (SBFA) paradigm that mines the large amounts of web facial images freely available on the WWW. Given a query facial image for annotation, SBFA first searches a web facial image database for the top-n most similar facial images and then exploits these top-ranked images and their weak labels to name the query facial image. To fully exploit this information, this paper proposes a novel framework, Learning to Name Faces (L2NF), a unified multimodal learning approach for search-based face annotation that consists of the following major components: (i) we enhance the weak labels of top-ranked similar images by exploiting the "label smoothness" assumption; (ii) we construct multimodal representations of a facial image by extracting different types of features; (iii) we optimize the distance measure for each type of feature using distance metric learning techniques; and finally (iv) we learn the optimal combination of multiple modalities for annotation through a learning to rank scheme. We conduct a set of extensive empirical studies on two real-world facial image databases, in which encouraging results show that the proposed algorithms significantly boost the naming accuracy of the search-based face annotation task.
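
As a rough illustration of the search-based paradigm described above, the following minimal sketch retrieves the top-n nearest images under a weighted combination of per-modality distances and names the query by similarity-weighted voting over their weak labels. It is not the L2NF method: the modality names (`lbp`, `gabor`), the fixed combination weights, the plain Euclidean distance, and the `1 / (1 + d)` vote weighting are all assumptions made for this example, whereas the paper learns the per-modality metrics and the combination through distance metric learning and learning to rank.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Plain Euclidean distance (stand-in for a learned per-modality metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_distance(query, candidate, weights):
    """Weighted sum of per-modality distances.

    `weights` maps a modality name to a fixed weight; in this sketch the
    weights are hand-picked assumptions rather than learned values.
    """
    return sum(w * euclidean(query[m], candidate[m]) for m, w in weights.items())


def annotate(query, database, weights, top_n=3):
    """Rank candidate names for `query` from its top-n nearest weakly labeled images."""
    scored = [(combined_distance(query, feats, weights), name)
              for feats, name in database]
    scored.sort(key=lambda pair: pair[0])
    votes = defaultdict(float)
    for dist, name in scored[:top_n]:
        # Closer neighbors contribute larger votes (hypothetical weighting).
        votes[name] += 1.0 / (1.0 + dist)
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)


# Toy database of (multimodal features, weak label); the third entry
# plays the role of a noisy web label sitting near the "alice" cluster.
database = [
    ({"lbp": [0.0, 0.0], "gabor": [1.0]}, "alice"),
    ({"lbp": [0.1, 0.0], "gabor": [1.1]}, "alice"),
    ({"lbp": [0.2, 0.1], "gabor": [0.9]}, "bob"),
    ({"lbp": [5.0, 5.0], "gabor": [4.0]}, "bob"),
    ({"lbp": [5.1, 4.9], "gabor": [4.2]}, "bob"),
]
query = {"lbp": [0.05, 0.0], "gabor": [1.0]}
weights = {"lbp": 0.5, "gabor": 0.5}

ranking = annotate(query, database, weights, top_n=3)
print(ranking[0][0])
```

In this toy setup the single noisy neighbor is outvoted by the two closer, consistently labeled images, which is the intuition behind exploiting several top-ranked results instead of only the nearest one.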


    Published In

    SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
    July 2013
    1188 pages
    ISBN:9781450320344
    DOI:10.1145/2484028
    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. auto face annotation
    2. supervised learning
    3. web facial images

    Acceptance Rates

    SIGIR '13 Paper Acceptance Rate 73 of 366 submissions, 20%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%
