DOI: 10.1145/3343031.3351092
research-article

Domain-Specific Embedding Network for Zero-Shot Recognition

Published: 15 October 2019

Abstract

Zero-Shot Learning (ZSL) seeks to recognize a sample from either the seen or the unseen domain by projecting image data and semantic labels into a joint embedding space. However, most existing methods directly adapt a well-trained projection from one domain to the other, thereby ignoring the serious bias caused by domain differences. To address this issue, we propose a novel Domain-Specific Embedding Network (DSEN) that applies specific projections to different domains for unbiased embedding, together with several domain constraints. In contrast to previous methods, DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between the two domains. To prevent the two specific projections from breaking the semantic relationship, a semantic reconstruction constraint is proposed that applies the same decoder function to both in a cycle-consistent manner. Furthermore, a domain division constraint is developed to directly penalize the margin between real and pseudo image features in the respective seen and unseen domains, which enlarges the inter-domain difference of visual features. Extensive experiments on four public benchmarks demonstrate the effectiveness of DSEN, with an average improvement of 9.2% in harmonic mean. The code is available at https://github.com/mboboGO/DSEN-for-GZSL.
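
As a rough illustration of two ideas in the abstract, the sketch below shows (a) a projection split into one domain-invariant part and two domain-specific parts, and (b) the harmonic mean of seen and unseen accuracy that is commonly reported for generalized zero-shot learning. It is not the authors' implementation: the helper names (`linear`, `make_projection`, `harmonic_mean`), the toy weights, and in particular the choice to sum the invariant and specific outputs are assumptions made for this example, since the abstract does not state how the sub-functions are combined.

```python
from typing import Callable, Dict, List

Vector = List[float]
Projection = Callable[[Vector], Vector]


def linear(weights: List[Vector]) -> Projection:
    """Return a function applying a weight matrix (one row per output dim)."""
    def apply(x: Vector) -> Vector:
        return [sum(w * v for w, v in zip(row, x)) for row in weights]
    return apply


def make_projection(shared: Projection,
                    specific: Dict[str, Projection]) -> Callable[[Vector, str], Vector]:
    """Combine a domain-invariant map with per-domain maps.

    ASSUMPTION: outputs are combined by addition. The abstract only says the
    projection is decomposed into one invariant and two specific sub-functions.
    """
    def project(x: Vector, domain: str) -> Vector:
        base = shared(x)               # part common to both domains
        delta = specific[domain](x)    # part used only for this domain
        return [b + d for b, d in zip(base, delta)]
    return project


def harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean H = 2 * S * U / (S + U) of seen and unseen accuracy."""
    total = seen_acc + unseen_acc
    if total == 0:
        return 0.0
    return 2 * seen_acc * unseen_acc / total


# Toy, hand-picked weights (hypothetical, not learned).
project = make_projection(
    shared=linear([[1.0, 0.0], [0.0, 1.0]]),
    specific={
        "seen": linear([[0.5, 0.0], [0.0, 0.0]]),
        "unseen": linear([[0.0, 0.0], [0.0, -0.5]]),
    },
)

seen_embedding = project([2.0, 4.0], "seen")
unseen_embedding = project([2.0, 4.0], "unseen")
```

The same input lands at different points in the embedding space depending on the domain it is routed through, which is the behaviour the domain-specific sub-functions are meant to allow. The harmonic mean is low whenever either accuracy is low, so a model biased toward seen classes cannot score well on it.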


Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019
2794 pages
ISBN:9781450368896
DOI:10.1145/3343031
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. categorization
  2. joint embedding
  3. neural networks
  4. zero-shot learning

Qualifiers

  • Research-article

Funding Sources

  • National Defense Science and Technology Fund for Distinguished Young Scholars
  • National Natural Science Foundation of China
  • Youth Innovation Promotion Association, Chinese Academy of Sciences
  • National Postdoctoral Programme for Innovative Talents
  • National Key Research and Development Program of China

Conference

MM '19
Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;
Overall Acceptance Rate 995 of 4,171 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 14
  • Downloads (last 6 weeks): 1
Reflects downloads up to 18 Aug 2024

Cited By

  • (2024) Differential Refinement Network for Zero-Shot Learning. IEEE Transactions on Neural Networks and Learning Systems 35:3 (4164-4178). DOI: 10.1109/TNNLS.2022.3201883. Online publication date: Mar-2024
  • (2023) Frequency-based Zero-Shot Learning with Phase Augmentation. Proceedings of the 31st ACM International Conference on Multimedia (3181-3189). DOI: 10.1145/3581783.3611990. Online publication date: 26-Oct-2023
  • (2023) Multiscale Visual-Attribute Co-Attention for Zero-Shot Image Recognition. IEEE Transactions on Neural Networks and Learning Systems 34:9 (6003-6014). DOI: 10.1109/TNNLS.2021.3132366. Online publication date: Sep-2023
  • (2023) Uni3DA: Universal 3D Domain Adaptation for Object Recognition. IEEE Transactions on Circuits and Systems for Video Technology 33:1 (379-392). DOI: 10.1109/TCSVT.2022.3202213. Online publication date: Jan-2023
  • (2022) Relation-Aware Compositional Zero-Shot Learning for Attribute-Object Pair Recognition. IEEE Transactions on Multimedia 24 (3652-3664). DOI: 10.1109/TMM.2021.3104411. Online publication date: 1-Jan-2022
  • (2022) Content-Attribute Disentanglement for Generalized Zero-Shot Learning. IEEE Access 10 (58320-58331). DOI: 10.1109/ACCESS.2022.3178800. Online publication date: 2022
  • (2021) Domain-Oriented Semantic Embedding for Zero-Shot Learning. IEEE Transactions on Multimedia 23 (3919-3930). DOI: 10.1109/TMM.2020.3033124. Online publication date: 1-Jan-2021
  • (2021) Towards Efficient Multiview Object Detection with Adaptive Action Prediction. 2021 IEEE International Conference on Robotics and Automation (ICRA) (13423-13429). DOI: 10.1109/ICRA48506.2021.9561388. Online publication date: 30-May-2021
  • (2020) Generalized Zero-Shot Learning using Generated Proxy Unseen Samples and Entropy Separation. Proceedings of the 28th ACM International Conference on Multimedia (4262-4270). DOI: 10.1145/3394171.3413657. Online publication date: 12-Oct-2020
  • (2020) Self-Adaptive Embedding For Few-Shot Classification By Hierarchical Attention. 2020 IEEE International Conference on Multimedia and Expo (ICME) (1-6). DOI: 10.1109/ICME46284.2020.9102830. Online publication date: Jul-2020
