Article

Region Graph Embedding Network for Zero-Shot Learning

Authors:

Ling Shao

Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IV

August 2020

Pages 562 - 580

https://doi.org/10.1007/978-3-030-58548-8_33

Published: 23 August 2020

Abstract

Most existing Zero-Shot Learning (ZSL) approaches learn direct embeddings from global features or image parts (regions) to the semantic space. Such embeddings, however, fail to capture the appearance relationships between different local regions within a single image. In this paper, to model the relations among local image regions, we incorporate region-based relation reasoning into ZSL. Our method, termed Region Graph Embedding Network (RGEN), is trained end-to-end from raw image data. Specifically, RGEN consists of two branches: the Constrained Part Attention (CPA) branch and the Parts Relation Reasoning (PRR) branch. The CPA branch is built upon attention and produces the image regions. To exploit the progressive interactions among these regions, we represent them as a region graph, on which parts relation reasoning is performed with graph convolutions; this forms our PRR branch. To train our model, we introduce a transfer loss and a balance loss, which respectively contrast class similarities and pursue maximum response consistency among seen and unseen outputs. Extensive experiments on four datasets validate the effectiveness of the proposed method under both ZSL and generalized ZSL settings.
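
The abstract does not say how the region graph or its graph convolutions are built. The snippet below is only a minimal illustrative sketch of the general idea: treat each attended region as a graph node and apply one graph-convolution layer over the nodes. Everything in it is an assumption made for illustration and is not taken from the paper: the cosine-similarity adjacency, the row normalisation (a simplification of the symmetric normalisation common in the graph-convolution literature), the function names, and the toy feature values. It uses only the Python standard library.

```python
import math


def cosine_adjacency(regions):
    """Dense region graph: edge weight = cosine similarity, clipped at 0 (assumed choice)."""
    norms = [math.sqrt(sum(v * v for v in r)) or 1.0 for r in regions]
    n = len(regions)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(regions[i], regions[j]))
            adj[i][j] = max(0.0, dot / (norms[i] * norms[j]))
    return adj


def normalize_rows(adj):
    """Row-normalise so each node takes a weighted average over its neighbours."""
    return [[w / (sum(row) or 1.0) for w in row] for row in adj]


def graph_conv(regions, adj, weight):
    """One graph-convolution layer: ReLU(A_hat @ X @ W)."""
    n, d_in, d_out = len(regions), len(weight), len(weight[0])
    # Aggregate neighbour features: A_hat @ X
    agg = [[sum(adj[i][k] * regions[k][d] for k in range(n)) for d in range(d_in)]
           for i in range(n)]
    # Project with W, then apply ReLU
    return [[max(0.0, sum(row[d] * weight[d][o] for d in range(d_in)))
             for o in range(d_out)] for row in agg]


# Toy input: three hypothetical region descriptors of dimension 2.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix

a_hat = normalize_rows(cosine_adjacency(regions))
out = graph_conv(regions, a_hat, identity)
```

After one layer, each region's feature becomes a similarity-weighted mix of the features of related regions, which is the sense in which graph convolutions let regions "interact". A trained model would learn the weight matrix and could stack several such layers.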


Published In

Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IV

Aug 2020

857 pages

ISBN:978-3-030-58547-1

DOI:10.1007/978-3-030-58548-8

Editors:
Andrea Vedaldi (University of Oxford, Oxford, UK)
Horst Bischof (Graz University of Technology, Graz, Austria)
Thomas Brox (University of Freiburg, Freiburg im Breisgau, Germany)
Jan-Michael Frahm (University of North Carolina at Chapel Hill, Chapel Hill, NC, USA)

© Springer Nature Switzerland AG 2020.

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

Yu D, Liu X, Yang B, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024) Zero-shot Image Classification with Logic Adapter and Rule Prompt. Proceedings of the ACM on Web Conference 2024, pp. 2075-2084. Online publication date: 13-May-2024
https://dl.acm.org/doi/10.1145/3589334.3645554
Sarma S, Sur A (2023) DiRaC-I: Identifying Diverse and Rare Training Classes for Zero-Shot Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 20:1, pp. 1-23. Online publication date: 31-May-2023
https://dl.acm.org/doi/10.1145/3603147
Zhang Y, Feng S, El Saddik A, Mei T, Cucchiara R, Bertini M, Tobon Vallejo D, Atrey P, Hossain M (2023) Enhancing Domain-Invariant Parts for Generalized Zero-Shot Learning. Proceedings of the 31st ACM International Conference on Multimedia, pp. 6283-6291. Online publication date: 26-Oct-2023
https://dl.acm.org/doi/10.1145/3581783.3611764
Wu L, Li Z, Zhao H, Wang Z, Liu Q, Huai B, Yuan N, Chen E, Singh A, Sun Y, Akoglu L, Gunopulos D, Yan X, Kumar R, Ozcan F, Ye J (2023) Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 2618-2628. Online publication date: 6-Aug-2023
https://dl.acm.org/doi/10.1145/3580305.3599486
Hu Y, Chapman A, Wen G, Hall D (2022) What Can Knowledge Bring to Machine Learning?—A Survey of Low-shot Learning for Structured Data. ACM Transactions on Intelligent Systems and Technology 13:3, pp. 1-45. Online publication date: 3-Mar-2022
https://dl.acm.org/doi/10.1145/3510030
Alamri F, Dutta A (2021) Implicit and Explicit Attention for Zero-Shot Learning. Pattern Recognition, pp. 467-483. Online publication date: 28-Sep-2021
https://dl.acm.org/doi/10.1007/978-3-030-92659-5_30
Zhao F, Liao S, Xie G, Zhao J, Zhang K, Shao L (2020) Unsupervised Domain Adaptation with Noise Resistible Mutual-Training for Person Re-identification. Computer Vision – ECCV 2020, pp. 526-544. Online publication date: 23-Aug-2020
https://dl.acm.org/doi/10.1007/978-3-030-58621-8_31
