research-article

CA³Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification

Authors:

Yongdong Zhang

MM '18: Proceedings of the 26th ACM international conference on Multimedia

Pages 737 - 745

https://doi.org/10.1145/3240508.3240585

Published: 15 October 2018

Abstract

Person re-identification aims to identify the same pedestrian across non-overlapping camera views. Deep learning techniques have recently been applied to person re-identification to learn representations of pedestrian appearance. This paper presents a novel Contextual-Attentional Attribute-Appearance Network ($\rm CA^3Net$) for person re-identification. $\rm CA^3Net$ simultaneously exploits the complementarity between semantic attributes and visual appearance, the semantic context among attributes, visual attention on attributes, and spatial dependencies among body parts, leading to a discriminative and robust pedestrian representation. Specifically, the attribute network within $\rm CA^3Net$ is designed with an Attention-LSTM module, which focuses the network on the latent image regions related to each attribute and exploits the semantic context among attributes through an LSTM. The appearance network learns appearance features from the full body and from horizontal and vertical body parts of pedestrians, modeling the spatial dependencies among body parts. $\rm CA^3Net$ jointly learns the attribute and appearance features in a multi-task learning manner, generating a comprehensive representation of pedestrians. Extensive experiments on two challenging benchmarks, Market-1501 and DukeMTMC-reID, demonstrate the effectiveness of the proposed approach.
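
The abstract names two ideas that a small example can make concrete: attending to image regions per attribute, and training attribute and identity objectives jointly. The following is a minimal sketch under stated assumptions, not the paper's implementation. The dot-product scoring, the function names (`attend`, `joint_loss`), and the trade-off parameter `attr_weight` are all hypothetical, and the LSTM that models context among attributes is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, query):
    """Soft attention over image regions for one attribute.

    Assumption: each region is scored by a dot product with an
    attribute-specific query vector; the paper may score regions differently.
    Returns the attention weights and the weighted sum of region features.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, pooled


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[target])


def joint_loss(id_probs, id_target, attr_probs, attr_targets, attr_weight=1.0):
    """Multi-task objective: identity loss plus weighted mean attribute loss.

    attr_weight is a hypothetical trade-off parameter, not taken from the paper.
    """
    id_loss = cross_entropy(id_probs, id_target)
    attr_loss = sum(
        cross_entropy(p, t) for p, t in zip(attr_probs, attr_targets)
    ) / len(attr_targets)
    return id_loss + attr_weight * attr_loss


# Three toy regions with 2-D features and one attribute query.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, pooled = attend(regions, query=[2.0, 0.0])
loss = joint_loss([0.5, 0.5], 0, [[0.5, 0.5]], [1], attr_weight=2.0)
```

In this toy setup the first region aligns best with the query and therefore receives the largest attention weight, so the pooled feature leans toward it. The joint loss simply sums the two task losses, which is one common way to realize multi-task training.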

Index Terms

CA³Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object identification
      2. Computer vision tasks
        Visual content-based indexing and retrieval
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia

October 2018

2167 pages

ISBN:9781450356657

DOI:10.1145/3240508

General Chairs:
Susanne Boll, University of Oldenburg, Germany
Kyoung Mu Lee, Seoul National University, Korea
Jiebo Luo, University of Rochester, USA
Wenwu Zhu, Tsinghua University, China

Program Chairs:
Hyeran Byun, Yonsei University, Korea
Chang Wen Chen, State Univ. of New York at Buffalo, USA
Rainer Lienhart, University of Augsburg, Germany
Tao Mei, JD AI, China

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Key R&D Program of China
Fundamental Research Funds for the Central Universities
National Natural Science Foundation of China

Conference

MM '18

Sponsor:

SIGMM

MM '18: ACM Multimedia Conference

October 22 - 26, 2018

Seoul, Republic of Korea

Acceptance Rates

MM '18 Paper Acceptance Rate 209 of 757 submissions, 28%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 19
Total Downloads: 416
Downloads (Last 12 months): 14
Downloads (Last 6 weeks): 11

Reflects downloads up to 21 Jan 2025

Cited By

Eom C, Lee G, Cho K, Jung H, Jin M, Ham B. (2025). Cerberus: Attribute-based person re-identification using semantic IDs. Expert Systems with Applications 259 (125320). Online publication date: Jan-2025.
https://doi.org/10.1016/j.eswa.2024.125320
Weng J, Hu K, Yao T, Wang J, Wang Z. (2023). Federated Unsupervised Cluster-Contrastive learning for person Re-identification: A coarse-to-fine approach. Computer Vision and Image Understanding 237 (103831). Online publication date: Dec-2023.
https://doi.org/10.1016/j.cviu.2023.103831
Wang W, Chen Y, Wang D, Tie Z, Tao L, Ke W. (2023). Joint attribute soft-sharing and contextual local: a multi-level features learning network for person re-identification. The Visual Computer 40:4 (2251-2264). Online publication date: 12-Jun-2023.
https://doi.org/10.1007/s00371-023-02914-x
Tang Z, Huang J. (2022). Harmonious Multi-branch Network for Person Re-identification with Harder Triplet Loss. ACM Transactions on Multimedia Computing, Communications, and Applications 18:4 (1-21). Online publication date: 4-Mar-2022.
https://dl.acm.org/doi/10.1145/3501405
Pan S, Feng W, Chong Y. (2022). Attribute-Guided Global and Part-Level Identity Network for Person Re-Identification. International Journal of Pattern Recognition and Artificial Intelligence. Online publication date: 12-May-2022.
https://doi.org/10.1142/S0218001422500112
Fu M, Sun S, Gao H, Wang D, Tong X, Liu Q, Liang Q. (2022). Improving Person Reidentification Using a Self-Focusing Network in Internet of Things. IEEE Internet of Things Journal 9:12 (9342-9353). Online publication date: 15-Jun-2022.
https://doi.org/10.1109/JIOT.2021.3084978
Peng Y, Li W, Li Y, Pei Y, Guo Y. (2022). Multi-task person re-identification via attribute and part-based learning. Multimedia Tools and Applications 81:8 (11221-11237). Online publication date: 17-Feb-2022.
https://doi.org/10.1007/s11042-022-12124-7
Qu X, Liu L, Zhu L, Zhang H. (2022). Attribute-aware style adaptation for person re-identification. Multimedia Systems 29:2 (469-485). Online publication date: 29-Nov-2022.
https://doi.org/10.1007/s00530-022-01024-3
Sabri S, Randhawa Z, Doretto G. (2022). Joint Discriminative and Metric Embedding Learning for Person Re-identification. Advances in Visual Computing (165-178). Online publication date: 10-Dec-2022.
https://doi.org/10.1007/978-3-031-20716-7_13
Zhu Y, Zha Z, Zhang T, Liu J, Luo J, Wen Chen C, Cucchiara R, Hua X, Qi G, Ricci E, Zhang Z, Zimmermann R. (2020). A Structured Graph Attention Network for Vehicle Re-Identification. Proceedings of the 28th ACM International Conference on Multimedia (646-654). Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3394171.3413607
