DOI:10.1145/2733373.2806216

Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation

Published: 13 October 2015

Abstract

In recent years, deep networks have been successfully applied to model image concepts and have achieved competitive performance on many data sets. Despite this impressive performance, conventional deep networks suffer degraded performance when training examples are insufficient. The problem becomes especially severe for deep networks with powerful representation structures, which are prone to overfitting by capturing nonessential or noisy information in a small data set. In this paper, to address this challenge, we develop a novel deep network structure capable of transferring labeling information across heterogeneous domains, especially from the text domain to the image domain. These weakly-shared Deep Transfer Networks (DTNs) can effectively mitigate the problem of insufficient image training data by bringing in rich labels from the text domain.
Specifically, we present a novel DTN architecture that translates cross-domain information from text to image. To share labels between the two domains, we build multiple weakly-shared layers of features, which can represent both shared inter-domain features and domain-specific features. This makes the structure more flexible and powerful at jointly capturing complex data from different domains than strongly-shared layers. Experiments on a real-world data set show competitive performance compared with other state-of-the-art methods.
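
To make the idea above concrete, the following is a minimal sketch (in PyTorch, with hypothetical names such as WeaklySharedDTN and weak_sharing_penalty) of one common way weak sharing can be realized: lower layers stay domain-specific, while an upper layer keeps separate text and image weights that a penalty term pulls toward each other, so labels learned by a shared classifier on abundant text data can propagate to the image branch. This is an illustration under those assumptions, not the authors' implementation.

    # Illustrative sketch only; layer sizes and the sigmoid nonlinearity are
    # assumptions, not taken from the paper.
    import torch
    import torch.nn as nn

    class WeaklySharedDTN(nn.Module):
        def __init__(self, text_dim, image_dim, common_dim, num_labels):
            super().__init__()
            # Domain-specific encoders map each modality to a common width.
            self.text_enc = nn.Linear(text_dim, common_dim)
            self.image_enc = nn.Linear(image_dim, common_dim)
            # Weakly-shared layer: same shape, separate parameters per domain.
            self.text_ws = nn.Linear(common_dim, common_dim)
            self.image_ws = nn.Linear(common_dim, common_dim)
            # Label predictor reused by both domains.
            self.classifier = nn.Linear(common_dim, num_labels)

        def forward(self, x_text=None, x_image=None):
            logits = {}
            if x_text is not None:
                h = torch.sigmoid(self.text_ws(torch.sigmoid(self.text_enc(x_text))))
                logits["text"] = self.classifier(h)
            if x_image is not None:
                h = torch.sigmoid(self.image_ws(torch.sigmoid(self.image_enc(x_image))))
                logits["image"] = self.classifier(h)
            return logits

        def weak_sharing_penalty(self):
            # Penalizes divergence between the per-domain weights so the two
            # branches stay similar without being forced to be identical,
            # unlike a strongly-shared layer that reuses one weight matrix.
            return ((self.text_ws.weight - self.image_ws.weight).pow(2).sum()
                    + (self.text_ws.bias - self.image_ws.bias).pow(2).sum())

    # Usage: labeled text supervises the shared classifier, and the penalty
    # propagates that structure to the image branch (all sizes arbitrary).
    model = WeaklySharedDTN(text_dim=1000, image_dim=4096, common_dim=256, num_labels=81)
    loss_fn = nn.BCEWithLogitsLoss()
    x_text = torch.randn(8, 1000)
    y_text = torch.randint(0, 2, (8, 81)).float()
    out = model(x_text=x_text)
    loss = loss_fn(out["text"], y_text) + 0.1 * model.weak_sharing_penalty()
    loss.backward()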





    Published In

    MM '15: Proceedings of the 23rd ACM international conference on Multimedia
    October 2015
    1402 pages
    ISBN:9781450334594
    DOI:10.1145/2733373
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Permissions

    Request permissions for this article.


    Author Tags

    1. cross-domain label transfer
    2. deep transfer network
    3. heterogeneous-domain knowledge propagation
    4. image classification

    Qualifiers

    • Research-article


    Conference

    MM '15
    MM '15: ACM Multimedia Conference
    October 26 - 30, 2015
    Brisbane, Australia

    Acceptance Rates

    MM '15 Paper Acceptance Rate 56 of 252 submissions, 22%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%




    Article Metrics

    • Downloads (Last 12 months)49
    • Downloads (Last 6 weeks)6
    Reflects downloads up to 03 Feb 2025

    Cited By
    • (2025) GPT4Ego: Unleashing the Potential of Pre-Trained Models for Zero-Shot Egocentric Action Recognition. IEEE Transactions on Multimedia, 27, 401-413. DOI: 10.1109/TMM.2024.3521658. Online publication date: 2025.
    • (2025) Generalization in neural networks: A broad survey. Neurocomputing, 611, 128701. DOI: 10.1016/j.neucom.2024.128701. Online publication date: Jan-2025.
    • (2024) Multi-Source and Multi-modal Deep Network Embedding for Cross-Network Node Classification. ACM Transactions on Knowledge Discovery from Data, 18(6), 1-26. DOI: 10.1145/3653304. Online publication date: 26-Apr-2024.
    • (2024) Deep Hierarchical Multimodal Metric Learning. IEEE Transactions on Neural Networks and Learning Systems, 35(11), 15787-15799. DOI: 10.1109/TNNLS.2023.3289971. Online publication date: Nov-2024.
    • (2024) Cross-Modal Clustering With Deep Correlated Information Bottleneck Method. IEEE Transactions on Neural Networks and Learning Systems, 35(10), 13508-13522. DOI: 10.1109/TNNLS.2023.3269789. Online publication date: Oct-2024.
    • (2024) Holistic-Guided Disentangled Learning With Cross-Video Semantics Mining for Concurrent First-Person and Third-Person Activity Recognition. IEEE Transactions on Neural Networks and Learning Systems, 1-15. DOI: 10.1109/TNNLS.2022.3202835. Online publication date: 2024.
    • (2024) Relation-Aggregated Cross-Graph Correlation Learning for Fine-Grained Image–Text Retrieval. IEEE Transactions on Neural Networks and Learning Systems, 35(2), 2194-2207. DOI: 10.1109/TNNLS.2022.3188569. Online publication date: Feb-2024.
    • (2024) Estimating the Semantics via Sector Embedding for Image-Text Retrieval. IEEE Transactions on Multimedia, 26, 10342-10353. DOI: 10.1109/TMM.2024.3407664. Online publication date: 2024.
    • (2024) Boosting Entity-Aware Image Captioning With Multi-Modal Knowledge Graph. IEEE Transactions on Multimedia, 26, 2659-2670. DOI: 10.1109/TMM.2023.3301279. Online publication date: 1-Jan-2024.
    • (2024) A Coarse-to-Fine Cell Division Approach for Hyperspectral Remote Sensing Image Classification. IEEE Transactions on Circuits and Systems for Video Technology, 34(6), 4928-4941. DOI: 10.1109/TCSVT.2023.3339135. Online publication date: Jun-2024.
