Few-Shot Text and Image Classification via Analogical Transfer Learning

Published: 29 October 2018

    Abstract

    Learning from very few samples is a challenge for machine learning tasks such as text and image classification. Performance on such tasks can be enhanced by transferring helpful knowledge from related domains, which is referred to as transfer learning. In previous work, instance transfer learning algorithms mostly focus on selecting source domain instances that are similar to the target domain instances for transfer. However, the selected instances usually do not directly contribute to learning performance in the target domain. Hypothesis transfer learning algorithms focus on model- or parameter-level transfer. They treat the source hypotheses as well trained and transfer their knowledge, in the form of parameters, to learn the target hypothesis. Such algorithms directly optimize the target hypothesis according to observable performance improvements. However, they overlook a problem identified by instance transfer learning: instances that contribute to the source hypotheses may be harmful to the target hypothesis. To alleviate these problems, we propose a novel transfer learning algorithm that follows an analogical strategy. Specifically, the proposed algorithm first learns a revised source hypothesis using only the instances that contribute to the target hypothesis. It then transfers both the revised source hypothesis and the target hypothesis (trained with only a few samples) to learn an analogical hypothesis. We call our algorithm Analogical Transfer Learning. Extensive experiments on one synthetic dataset and three real-world benchmark datasets demonstrate the superior performance of the proposed algorithm.
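
The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the objective, so everything below is an assumption made for illustration. It assumes linear hypotheses fitted by ridge regression on labels in {-1, +1}, a hypothetical selection rule (keep source instances whose labels the few-shot target hypothesis agrees with), and a hypothetical mixing weight `beta` that combines the revised source and target hypotheses into a prior for the final fit. The function names `ridge_fit` and `analogical_transfer_sketch` are also invented here.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0, prior=None):
    """Closed-form minimizer of ||X w - y||^2 + lam * ||w - prior||^2."""
    d = X.shape[1]
    if prior is None:
        prior = np.zeros(d)
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * prior
    return np.linalg.solve(A, b)


def analogical_transfer_sketch(Xs, ys, Xt, yt, lam=1.0, beta=0.5):
    """Illustrative three-step pipeline (assumed, not the paper's method)."""
    # Step 0: target hypothesis trained on the few labeled target samples.
    w_target = ridge_fit(Xt, yt, lam)

    # Step 1: revised source hypothesis. As a hypothetical stand-in for
    # "instances contributing to the target hypothesis", keep the source
    # instances whose labels the target hypothesis already agrees with.
    keep = np.sign(Xs @ w_target) == ys
    if not keep.any():
        keep[:] = True  # fall back to all source instances
    w_source_revised = ridge_fit(Xs[keep], ys[keep], lam)

    # Step 2: combine both hypotheses into a prior and refit on the
    # target samples, regularizing toward that prior.
    prior = beta * w_source_revised + (1.0 - beta) * w_target
    w_analogical = ridge_fit(Xt, yt, lam, prior=prior)
    return w_target, w_source_revised, w_analogical, keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -1.0, 0.5])
    Xs = rng.normal(size=(200, 3))           # plentiful source data
    ys = np.sign(Xs @ (w_true + 0.3))        # related but shifted concept
    Xt = rng.normal(size=(6, 3))             # few-shot target data
    yt = np.sign(Xt @ w_true)
    _, _, w_ana, keep = analogical_transfer_sketch(Xs, ys, Xt, yt)
    print("kept source instances:", int(keep.sum()), "of", len(keep))
    print("analogical weights:", w_ana)
```

Regularizing toward a prior hypothesis is a common device in hypothesis transfer learning; the selection rule and the mixing weight are the parts most likely to differ from what the paper actually does.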

    Cited By

    • (2024) Unified Uncertainty Estimation for Cognitive Diagnosis Models. Proceedings of the ACM on Web Conference 2024, 3545-3554. DOI: 10.1145/3589334.3645488. Online publication date: 13-May-2024
    • (2024) A Bayesian transfer sparse identification method for nonlinear ARX systems. International Journal of Adaptive Control and Signal Processing. DOI: 10.1002/acs.3884. Online publication date: 8-Aug-2024
    • (2023) Deep convolutional neural networks for multiple histologic types of ovarian tumors classification in ultrasound images. Frontiers in Oncology 13. DOI: 10.3389/fonc.2023.1154200. Online publication date: 23-Jun-2023

      Published In

      ACM Transactions on Intelligent Systems and Technology  Volume 9, Issue 6
      Regular Papers
      November 2018
      290 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/3289398
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 29 October 2018
      Accepted: 01 May 2018
      Revised: 01 May 2018
      Received: 01 January 2018
      Published in TIST Volume 9, Issue 6

      Author Tags

      1. Transfer learning
      2. Classification

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 36
      • Downloads (last 6 weeks): 2
      Reflects downloads up to 12 Aug 2024.

      Citations

      Cited By

      • (2023) Federated User Modeling from Hierarchical Information. ACM Transactions on Information Systems 41:2 (1-33). DOI: 10.1145/3560485. Online publication date: 9-Feb-2023
      • (2023) Leveraging Transferable Knowledge Concept Graph Embedding for Cold-Start Cognitive Diagnosis. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 983-992. DOI: 10.1145/3539618.3591774. Online publication date: 19-Jul-2023
      • (2023) Brain-regulated learning for classifying on-site hazards with small datasets. Computer-Aided Civil and Infrastructure Engineering. DOI: 10.1111/mice.13078. Online publication date: Aug-2023
      • (2023) Few-shot and meta-learning methods for image understanding: a survey. International Journal of Multimedia Information Retrieval 12:2. DOI: 10.1007/s13735-023-00279-4. Online publication date: 29-Jun-2023
      • (2022) Few-Shot Image Classification: Current Status and Research Trends. Electronics 11:11 (1752). DOI: 10.3390/electronics11111752. Online publication date: 31-May-2022
      • (2022) Knowledge-Sensed Cognitive Diagnosis for Intelligent Education Platforms. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 1451-1460. DOI: 10.1145/3511808.3557372. Online publication date: 17-Oct-2022
      • (2022) Cognitive Diagnosis Focusing on Knowledge Concepts. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 3272-3281. DOI: 10.1145/3511808.3557096. Online publication date: 17-Oct-2022
