On handling negative transfer and imbalanced distributions in multiple source transfer learning

Authors: Liang Ge, Jing Gao, Hung Ngo, Kang Li, Aidong Zhang

Statistical Analysis and Data Mining, Volume 7, Issue 4

Pages 254-271

https://doi.org/10.1002/sam.11217

Published: 01 August 2014

Abstract

Transfer learning has benefited many real-world applications where labeled data are abundant in source domains but scarce in the target domain. Because there are usually multiple relevant domains from which knowledge can be transferred, multiple source transfer learning (MSTL) has recently attracted much attention. However, applying MSTL raises two major challenges. First, without knowledge of the difference between source and target domains, negative transfer occurs when knowledge is transferred from highly irrelevant sources. Second, imbalanced class distributions, where examples of one class dominate, can lead to improper judgement of the source domains' relevance to the target task. Since existing MSTL methods are usually designed to transfer from relevant sources with balanced distributions, they fail in applications where these two challenges persist. In this article, we propose a novel two-phase framework that transfers knowledge effectively from multiple sources even when there exist irrelevant sources and imbalanced class distributions. In the first phase, a supervised local weight scheme assigns a weight to each source domain's classifier based on its ability to predict accurately in each local region of the target domain. The second phase then learns a classifier for the target domain by solving an optimization problem that accounts for both training error minimization and consistency with the weighted predictions obtained from the source domains. A theoretical analysis shows that, as the number of source domains increases, the probability that the proposed approach has an error greater than a bound becomes exponentially small. We further extend the proposed approach to an online processing scenario to conduct transfer learning on continuously arriving data. Extensive experiments on disease prediction, spam filtering, and intrusion detection datasets demonstrate that (i) the proposed two-phase approach outperforms existing MSTL approaches owing to its ability to tackle the negative transfer and imbalanced distribution challenges, and (ii) the proposed online approach achieves performance comparable to that of the offline scheme.
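The two-phase idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the weighting rule or the optimization problem, so everything below is an assumption made for illustration. Phase 1 is approximated by each source's accuracy on the k nearest labeled target points, and phase 2 by a ridge-style least-squares fit that balances error on the labeled target points against agreement with the weighted source predictions. The names `local_weights`, `fit_target`, `k`, `lam`, and `mu` are hypothetical, and the sketch does not include any special handling of class imbalance beyond what local weighting provides.

```python
import numpy as np


def local_weights(X, labeled_idx, y_labeled, source_preds, k=2):
    """Phase 1 (illustrative assumption): per-point weights for each source.

    A source's weight at a target point is its accuracy on the k labeled
    target points nearest to that point, normalized across sources.
    """
    X_lab = X[labeled_idx]
    # Distance from every target point to every labeled target point.
    dist = np.linalg.norm(X[:, None, :] - X_lab[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]          # (n_target, k)
    # 1.0 where a source is right on a labeled point, else 0.0.
    correct = (source_preds[:, labeled_idx] == y_labeled).astype(float)
    acc = correct[:, nearest].mean(axis=2)             # (n_sources, n_target)
    total = acc.sum(axis=0, keepdims=True)
    safe_total = np.where(total > 0, total, 1.0)
    # Fall back to uniform weights where no source is locally correct.
    return np.where(total > 0, acc / safe_total, 1.0 / len(source_preds))


def fit_target(X, labeled_idx, y_labeled, z, lam=1.0, mu=1e-3):
    """Phase 2 (illustrative assumption): linear model w minimizing
    ||X_lab w - y||^2 + lam * ||X w - z||^2 + mu * ||w||^2,
    where z holds the weighted source predictions."""
    X_lab = X[labeled_idx]
    A = X_lab.T @ X_lab + lam * (X.T @ X) + mu * np.eye(X.shape[1])
    b = X_lab.T @ y_labeled + lam * (X.T @ z)
    return np.linalg.solve(A, b)


# Toy target domain: the true label is the sign of the first feature.
X = np.array([[x0, x1] for x0 in (-2, -1, 1, 2) for x1 in (-1, 1)], dtype=float)
y_true = np.sign(X[:, 0])
labeled_idx = np.array([0, 1, 6, 7])      # only four target labels are known
y_labeled = y_true[labeled_idx]

# Source 0 is relevant; source 1 always predicts the negative class,
# so it is right only in the region where negatives live.
source_preds = np.vstack([y_true, -np.ones(len(X))])

weights = local_weights(X, labeled_idx, y_labeled, source_preds, k=2)
z = (weights * source_preds).sum(axis=0)  # weighted source consensus
w = fit_target(X, labeled_idx, y_labeled, z)
pred = np.sign(X @ w)

print(weights)
print(pred == y_true)
```

In this toy setup the constant source shares the weight with the relevant source where negatives dominate and receives zero weight elsewhere, which is the kind of local down-weighting the first phase is meant to achieve.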

