DOI:10.1145/2733373.2806216

Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation

Published: 13 October 2015

Abstract

In recent years, deep networks have been successfully applied to model image concepts and have achieved competitive performance on many data sets. Despite this impressive performance, conventional deep networks suffer degraded performance when training examples are insufficient. The problem becomes especially severe for deep networks with powerful representation structures, which are prone to overfitting by capturing nonessential or noisy information in a small data set. In this paper, to address this challenge, we develop a novel deep network structure capable of transferring labeling information across heterogeneous domains, especially from the text domain to the image domain. These weakly-shared Deep Transfer Networks (DTNs) can effectively mitigate the problem of insufficient image training data by bringing in rich labels from the text domain.
Specifically, we present a novel DTN architecture that translates cross-domain information from text to image. To share labels between the two domains, we build multiple weakly-shared layers of features, which can represent both shared inter-domain features and domain-specific features. This makes the structure more flexible and powerful at jointly capturing complex data from different domains than strongly-shared layers. Experiments on a real-world data set show competitive performance compared with other state-of-the-art methods.
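
To make the idea above concrete, the following is a minimal sketch (in PyTorch, with hypothetical names such as WeaklySharedDTN and weak_sharing_penalty) of one common way weak sharing can be realized: lower layers stay domain-specific, while an upper layer keeps separate text and image weights that a penalty term pulls toward each other, so labels learned by a shared classifier on abundant text data can propagate to the image branch. This is an illustration under those assumptions, not the authors' implementation.

    # Illustrative sketch only; layer sizes and the sigmoid nonlinearity are
    # assumptions, not taken from the paper.
    import torch
    import torch.nn as nn

    class WeaklySharedDTN(nn.Module):
        def __init__(self, text_dim, image_dim, common_dim, num_labels):
            super().__init__()
            # Domain-specific encoders map each modality to a common width.
            self.text_enc = nn.Linear(text_dim, common_dim)
            self.image_enc = nn.Linear(image_dim, common_dim)
            # Weakly-shared layer: same shape, separate parameters per domain.
            self.text_ws = nn.Linear(common_dim, common_dim)
            self.image_ws = nn.Linear(common_dim, common_dim)
            # Label predictor reused by both domains.
            self.classifier = nn.Linear(common_dim, num_labels)

        def forward(self, x_text=None, x_image=None):
            logits = {}
            if x_text is not None:
                h = torch.sigmoid(self.text_ws(torch.sigmoid(self.text_enc(x_text))))
                logits["text"] = self.classifier(h)
            if x_image is not None:
                h = torch.sigmoid(self.image_ws(torch.sigmoid(self.image_enc(x_image))))
                logits["image"] = self.classifier(h)
            return logits

        def weak_sharing_penalty(self):
            # Penalizes divergence between the per-domain weights so the two
            # branches stay similar without being forced to be identical,
            # unlike a strongly-shared layer that reuses one weight matrix.
            return ((self.text_ws.weight - self.image_ws.weight).pow(2).sum()
                    + (self.text_ws.bias - self.image_ws.bias).pow(2).sum())

    # Usage: labeled text supervises the shared classifier, and the penalty
    # propagates that structure to the image branch (all sizes arbitrary).
    model = WeaklySharedDTN(text_dim=1000, image_dim=4096, common_dim=256, num_labels=81)
    loss_fn = nn.BCEWithLogitsLoss()
    x_text = torch.randn(8, 1000)
    y_text = torch.randint(0, 2, (8, 81)).float()
    out = model(x_text=x_text)
    loss = loss_fn(out["text"], y_text) + 0.1 * model.weak_sharing_penalty()
    loss.backward()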





    Published In

    MM '15: Proceedings of the 23rd ACM international conference on Multimedia
    October 2015
    1402 pages
    ISBN:9781450334594
    DOI:10.1145/2733373
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Permissions

    Request permissions for this article.


    Author Tags

    1. cross-domain label transfer
    2. deep transfer network
    3. heterogeneous-domain knowledge propagation
    4. image classification

    Qualifiers

    • Research-article


    Conference

    MM '15
    MM '15: ACM Multimedia Conference
    October 26 - 30, 2015
    Brisbane, Australia

    Acceptance Rates

    MM '15 Paper Acceptance Rate 56 of 252 submissions, 22%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%




    Article Metrics

    • Downloads (Last 12 months)49
    • Downloads (Last 6 weeks)6
    Reflects downloads up to 03 Feb 2025

    Cited By
    • (2025) GPT4Ego: Unleashing the Potential of Pre-Trained Models for Zero-Shot Egocentric Action Recognition. IEEE Transactions on Multimedia, 27, 401-413. DOI: 10.1109/TMM.2024.3521658. Online publication date: 2025.
    • (2025) Generalization in neural networks: A broad survey. Neurocomputing, 611, 128701. DOI: 10.1016/j.neucom.2024.128701. Online publication date: Jan-2025.
    • (2024) Multi-Source and Multi-modal Deep Network Embedding for Cross-Network Node Classification. ACM Transactions on Knowledge Discovery from Data, 18(6), 1-26. DOI: 10.1145/3653304. Online publication date: 26-Apr-2024.
    • (2024) Deep Hierarchical Multimodal Metric Learning. IEEE Transactions on Neural Networks and Learning Systems, 35(11), 15787-15799. DOI: 10.1109/TNNLS.2023.3289971. Online publication date: Nov-2024.
    • (2024) Cross-Modal Clustering With Deep Correlated Information Bottleneck Method. IEEE Transactions on Neural Networks and Learning Systems, 35(10), 13508-13522. DOI: 10.1109/TNNLS.2023.3269789. Online publication date: Oct-2024.
    • (2024) Holistic-Guided Disentangled Learning With Cross-Video Semantics Mining for Concurrent First-Person and Third-Person Activity Recognition. IEEE Transactions on Neural Networks and Learning Systems, 1-15. DOI: 10.1109/TNNLS.2022.3202835. Online publication date: 2024.
    • (2024) Relation-Aggregated Cross-Graph Correlation Learning for Fine-Grained Image–Text Retrieval. IEEE Transactions on Neural Networks and Learning Systems, 35(2), 2194-2207. DOI: 10.1109/TNNLS.2022.3188569. Online publication date: Feb-2024.
    • (2024) Estimating the Semantics via Sector Embedding for Image-Text Retrieval. IEEE Transactions on Multimedia, 26, 10342-10353. DOI: 10.1109/TMM.2024.3407664. Online publication date: 2024.
    • (2024) Boosting Entity-Aware Image Captioning With Multi-Modal Knowledge Graph. IEEE Transactions on Multimedia, 26, 2659-2670. DOI: 10.1109/TMM.2023.3301279. Online publication date: 1-Jan-2024.
    • (2024) A Coarse-to-Fine Cell Division Approach for Hyperspectral Remote Sensing Image Classification. IEEE Transactions on Circuits and Systems for Video Technology, 34(6), 4928-4941. DOI: 10.1109/TCSVT.2023.3339135. Online publication date: Jun-2024.
