On handling negative transfer and imbalanced distributions in multiple source transfer learning

Published: 01 August 2014

Abstract

Transfer learning has benefited many real-world applications where labeled data are abundant in source domains but scarce in the target domain. As there are usually multiple relevant domains from which knowledge can be transferred, multiple source transfer learning (MSTL) has recently attracted much attention. However, two major challenges arise when applying MSTL. First, without knowledge about the difference between source and target domains, negative transfer occurs when knowledge is transferred from highly irrelevant sources. Second, imbalanced class distributions, where examples in one class dominate, can lead to improper judgement of the source domains' relevance to the target task. Since existing MSTL methods are usually designed to transfer from relevant sources with balanced distributions, they will fail in applications where these two challenges persist. In this article, we propose a novel two-phase framework to effectively transfer knowledge from multiple sources even when there exist irrelevant sources and imbalanced class distributions. First, an effective supervised local weight scheme is proposed to assign a proper weight to each source domain's classifier based on its ability to predict accurately on each local region of the target domain. The second phase then learns a classifier for the target domain by solving an optimization problem that concerns both training error minimization and consistency with the weighted predictions gained from source domains. A theoretical analysis shows that as the number of source domains increases, the probability that the proposed approach has an error greater than a bound becomes exponentially small. We further extend the proposed approach to an online processing scenario to conduct transfer learning on continuously arriving data. Extensive experiments on disease prediction, spam filtering and intrusion detection datasets demonstrate that (i) the proposed two-phase approach outperforms existing MSTL approaches due to its ability to tackle the negative transfer and imbalanced distribution challenges, and (ii) the proposed online approach achieves comparable performance to the offline scheme.
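
The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how local regions are formed, how weights are computed, or the exact objective. The sketch assumes that regions are given as integer labels, that a source's weight in a region is its balanced accuracy above chance on the few labeled target points there (a simple stand-in for imbalance-aware weighting), and that the target classifier is a linear model fitted by a ridge-style least-squares objective. All function names, parameters (`lam`, `mu`) and the toy data are hypothetical.

```python
import numpy as np

def local_source_weights(src_preds_lab, y_lab, regions_lab, n_regions):
    """Phase 1 (sketch): weight each source in each region by its balanced
    accuracy above chance on the labeled target points of that region."""
    n_src = src_preds_lab.shape[0]
    W = np.zeros((n_regions, n_src))
    for r in range(n_regions):
        in_r = regions_lab == r
        for s in range(n_src):
            recalls = []
            for c in (-1, 1):  # per-class recall limits majority-class bias
                mask = in_r & (y_lab == c)
                if mask.any():
                    recalls.append(np.mean(src_preds_lab[s, mask] == c))
            bal_acc = np.mean(recalls) if recalls else 0.5
            # Sources at or below chance get zero weight in this region.
            W[r, s] = max(bal_acc - 0.5, 0.0)
        total = W[r].sum()
        if total > 0:
            W[r] /= total
    return W

def fit_target(X_lab, y_lab, X_unl, pseudo, lam=1.0, mu=1e-3):
    """Phase 2 (sketch): linear model minimizing squared training error on
    labeled points plus lam times squared disagreement with the weighted
    source predictions on unlabeled points, with a small ridge term mu."""
    d = X_lab.shape[1]
    A = X_lab.T @ X_lab + lam * (X_unl.T @ X_unl) + mu * np.eye(d)
    b = X_lab.T @ y_lab + lam * (X_unl.T @ pseudo)
    return np.linalg.solve(A, b)

# Toy target domain: a 10x10 grid, label = sign of the first coordinate.
g = np.linspace(-1, 1, 10)
X = np.array([(a, b) for a in g for b in g])
y = np.where(X[:, 0] > 0, 1, -1)
regions = (X[:, 1] > 0).astype(int)      # assumed region assignment
lab = np.arange(0, 100, 5)               # 20 labeled target points
unl = np.setdiff1d(np.arange(100), lab)  # remaining points are unlabeled

# Two hypothetical source classifiers: one relevant, one that inverts labels.
src_preds = np.stack([y, -y])

W = local_source_weights(src_preds[:, lab], y[lab], regions[lab], n_regions=2)
# Region-specific weighted vote of the sources on each unlabeled point.
pseudo = np.einsum("ns,sn->n", W[regions[unl]], src_preds[:, unl])
w = fit_target(X[lab], y[lab], X[unl], pseudo)
acc = np.mean(np.sign(X @ w) == y)
print(W, acc)
```

In this toy setting the label-inverting source receives zero weight in both regions, so it cannot pull the target classifier in the wrong direction. That is the intuition behind using local weights to avoid negative transfer.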

Published In

Statistical Analysis and Data Mining, Volume 7, Issue 4
August 2014
95 pages

Publisher

John Wiley & Sons, Inc.

United States

Author Tags

  1. imbalanced distribution
  2. multiple source
  3. negative transfer
  4. transfer learning

Qualifiers

  • Article
