DOI: 10.1145/3394486.3403079

TranSlider: Transfer Ensemble Learning from Exploitation to Exploration

Published: 20 August 2020

Abstract

In transfer learning, what and where to transfer have been widely studied. Nevertheless, learned transfer strategies are at high risk of over-fitting, especially when only a few annotated instances are available in the target domain. In this paper, we introduce the concept of transfer ensemble learning, a new direction for tackling the over-fitting of transfer strategies. Intuitively, models with different transfer strategies offer various perspectives on what and where to transfer; a core problem is therefore to search over these diversely transferred models for an ensemble that achieves better generalization. Towards this end, we propose the Transferability Slider (TranSlider) for transfer ensemble learning. By gradually decreasing the transferability, we obtain a spectrum of base models ranging from pure exploitation of the source model to unconstrained exploration of the target domain. Furthermore, decreasing transferability with parameter sharing guarantees fast optimization at no additional training cost. Finally, we conduct extensive experiments with various analyses, which demonstrate that TranSlider achieves state-of-the-art performance on comprehensive benchmark datasets.
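As a reading aid, here is a minimal, hypothetical sketch (in PyTorch) of how such a transferability slider could be realized; it is an assumption in the style of L2-SP regularization toward the source weights, not the authors' implementation. The coefficient `lam` pulls the fine-tuned parameters toward the frozen source model: a large `lam` corresponds to pure exploitation of the source model, `lam = 0` to unconstrained exploration. Snapshots taken along a decreasing sweep of `lam` share one continuous training run, and their softmax outputs are averaged at test time. The names `translider_sketch` and `ensemble_predict` are invented for illustration.

```python
# Hypothetical sketch of a "transferability slider" (not the paper's code).
# Sweep lam from high (exploitation: stay near the source model) to zero
# (exploration: unconstrained fine-tuning), snapshotting a base model at
# each level and ensembling the snapshots by averaging their predictions.
import copy
import torch
import torch.nn as nn

def translider_sketch(source_model: nn.Module, loader,
                      lambdas=(10.0, 1.0, 0.1, 0.0),
                      epochs_per_stage=1, lr=1e-3, device="cpu"):
    """Return one snapshot model per transferability level in `lambdas`."""
    source_params = [p.detach().clone().to(device)
                     for p in source_model.parameters()]
    model = copy.deepcopy(source_model).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    snapshots = []
    for lam in lambdas:  # decreasing transferability
        model.train()
        for _ in range(epochs_per_stage):
            for x, y in loader:  # labeled target-domain batches
                x, y = x.to(device), y.to(device)
                optimizer.zero_grad()
                loss = criterion(model(x), y)
                # Quadratic pull toward the source parameters, scaled by lam.
                for p, p_src in zip(model.parameters(), source_params):
                    loss = loss + 0.5 * lam * (p - p_src).pow(2).sum()
                loss.backward()
                optimizer.step()
        # Parameter sharing: the next stage warm-starts from this one,
        # so the whole spectrum costs roughly a single training run.
        snapshots.append(copy.deepcopy(model).eval())
    return snapshots

def ensemble_predict(snapshots, x):
    """Average the softmax outputs of all snapshot base models."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in snapshots])
    return probs.mean(dim=0)
```

Under this reading, the warm-starting across stages is what the abstract means by parameter sharing: the base models are not trained from scratch, so the ensemble comes at essentially no additional training cost.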




Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. ensemble learning
  2. exploitation
  3. exploration
  4. transfer learning

Qualifiers

  • Research-article

Funding Sources

  • Joint Research Center of Tencent and Tsinghua
  • SZSTI
  • NSFC

Conference

KDD '20

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)

Cited By

  • (2023) Beyond Fine-Tuning: Efficient and Effective Fed-Tuning for Mobile/Web Users. In Proceedings of the ACM Web Conference 2023, 2863-2873. DOI: 10.1145/3543507.3583212
  • (2022) A Robust Computerized Adaptive Testing Approach in Educational Question Retrieval. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 416-426. DOI: 10.1145/3477495.3531928
  • (2022) What is Market Talking about? Market-oriented Prospect Analysis for Entrepreneur Fundraising. IEEE Transactions on Knowledge and Data Engineering. DOI: 10.1109/TKDE.2022.3174336
  • (2022) A Transfer Ensemble Learning Method for Evaluating Power Transformer Health Conditions with Limited Measurement Data. IEEE Transactions on Instrumentation and Measurement. DOI: 10.1109/TIM.2022.3175268
  • (2022) Deep Domain Adaptation for Power Transformer Fault Diagnosis Based on Transfer Convolutional Neural Network. In 2022 4th International Conference on Electrical Engineering and Control Technologies (CEECT), 25-29. DOI: 10.1109/CEECT55960.2022.10030508
  • (2021) Neural Prototype Trees for Interpretable Fine-grained Image Recognition. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14928-14938. DOI: 10.1109/CVPR46437.2021.01469
  • (2021) TransJury: Towards Explainable Transfer Learning through Selection of Layers from Deep Neural Networks. In 2021 IEEE International Conference on Big Data (Big Data), 978-984. DOI: 10.1109/BigData52589.2021.9671723
