research-article

iQIYI Celebrity Video Identification Challenge

Author: Danming Xie

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 2516 - 2520

https://doi.org/10.1145/3343031.3356081

Published: 15 October 2019

Abstract

We held the iQIYI Celebrity Video Identification Challenge at ACM Multimedia 2019. Its purpose was to encourage research on video-based person identification. We released the iQIYI-VID-2019 dataset, which contains 200K videos of 10K celebrities. In this paper, we describe the organization of the challenge, the dataset, the evaluation process, and the results.

Index Terms

Information systems → Information retrieval → Specialized information retrieval → Multimedia and multimodal retrieval → Video search


Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019, 2794 pages

ISBN: 9781450368896

DOI: 10.1145/3343031

General Chairs: Laurent Amsaleg (CNRS-IRISA, France), Benoit Huet (EURECOM, France), Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs: Guillaume Gravier (CNRS-IRISA, France), Hayley Hung (Delft University of Technology, Netherlands), Chong-Wah Ngo (City University of Hong Kong, Hong Kong), Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

MM '19: The 27th ACM International Conference on Multimedia, October 21-25, 2019, Nice, France

Sponsor: SIGMM

Acceptance Rates

MM '19 Paper Acceptance Rate: 252 of 936 submissions, 27%

Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%

Article Metrics

Total citations: 10
Total downloads: 258
Downloads (last 12 months): 13
Downloads (last 6 weeks): 1

Reflects downloads up to 03 Feb 2025

Cited By

Xie Q, Lu Z, Zhou W, Li H (2023). Improving Person Re-Identification With Multi-Cue Similarity Embedding and Propagation. IEEE Transactions on Multimedia 25 (6384-6396). Online publication date: 2023.
https://doi.org/10.1109/TMM.2022.3207949
Chen J, Sun C, Zhang S, Zeng J (2023). Cross-modal dynamic sentiment annotation for speech sentiment analysis. Computers and Electrical Engineering 106 (108598). Online publication date: Mar-2023.
https://doi.org/10.1016/j.compeleceng.2023.108598
Hu Y, Yan C, Cao C, Wang H, Wu B (2023). Social Relation Graph Generation on Untrimmed Video. MultiMedia Modeling (739-744). Online publication date: 31-Mar-2023.
https://doi.org/10.1007/978-3-031-27818-1_61
Liao Z, Di D, Hao J, Zhang J, Zhu S, Yin J (2023). MMM-GCN: Multi-Level Multi-Modal Graph Convolution Network for Video-Based Person Identification. MultiMedia Modeling (3-15). Online publication date: 29-Mar-2023.
https://doi.org/10.1007/978-3-031-27077-2_1
Chanlongrat W, Apichanapong T, Sinngam P, Chaisangmongkon W (2022). A semi-automated system for person re-identification adaptation to cross-outfit and cross-posture scenarios. Applied Intelligence 52:8 (9501-9520). Online publication date: 6-Jan-2022.
https://doi.org/10.1007/s10489-021-02896-0
Ciaparrone G, Chiariglione L, Tagliaferri R (2022). A comparison of deep learning models for end-to-end face-based video retrieval in unconstrained videos. Neural Computing and Applications 34:10 (7489-7506). Online publication date: 5-Jan-2022.
https://doi.org/10.1007/s00521-021-06875-x
Zhang S, Chen J, Li M, Li T, Lu P, Wang Z (2021). Segment-Level Cross-Modal Knowledge Transfer for Speech Sentiment Analysis. 2021 IEEE 4th International Conference on Computer and Communication Engineering Technology (CCET) (243-247). Online publication date: 13-Aug-2021.
https://doi.org/10.1109/CCET52649.2021.9544303
Zhang S, Chen M, Chen J, Li Y, Wu Y, Li M, Zhu C (2021). Combining cross-modal knowledge transfer and semi-supervised learning for speech emotion recognition. Knowledge-Based Systems (107340). Online publication date: Jul-2021.
https://doi.org/10.1016/j.knosys.2021.107340
Li F, Wang W, Liu Z, Wang H, Yan C, Wu B (2021). Frame Aggregation and Multi-modal Fusion Framework for Video-Based Person Recognition. MultiMedia Modeling (75-86). Online publication date: 21-Jan-2021.
https://doi.org/10.1007/978-3-030-67832-6_7
Wang W, Wu B, Li F, Liu Z (2020). Multi-Cue and Temporal Attention for Person Recognition in Videos. Pattern Recognition and Computer Vision (369-380). Online publication date: 15-Oct-2020.
https://doi.org/10.1007/978-3-030-60639-8_31
