DOI: 10.5555/2976456.2976462

Article

Multi-task feature learning

Published: 04 December 2006
Abstract

We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the well-known 1-norm regularization problem, using a new regularizer which controls the number of learned features common to all the tasks. We show that this problem is equivalent to a convex optimization problem and develop an iterative algorithm for solving it. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step. In the latter step we learn common-across-tasks representations, and in the former step we learn task-specific functions using these representations. We report experiments on a simulated and a real data set which demonstrate that the proposed method dramatically improves performance relative to learning each task independently. Our algorithm can also be used, as a special case, to simply select, rather than learn, a few common features across the tasks.
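The alternating supervised/unsupervised scheme described in the abstract can be sketched as follows. This is a minimal NumPy illustration, assuming squared loss and a trace-normalized shared-metric update of the kind used in the multi-task feature learning literature; the function name, parameter names, and default values (`gamma`, `n_iter`, `eps`) are illustrative, not the authors' reference implementation.

```python
import numpy as np

def multi_task_feature_learning(X, Y, gamma=0.1, n_iter=50, eps=1e-6):
    """Alternating scheme sketch for shared feature learning.

    X : list of (n_t, d) design matrices, one per task (shared feature dim d)
    Y : list of (n_t,) target vectors, one per task
    Returns the (d, T) stacked weight matrix W and the shared metric D.
    """
    d = X[0].shape[1]      # shared feature dimension
    T = len(X)             # number of tasks
    D = np.eye(d) / d      # shared metric, initialized with trace(D) = 1
    W = np.zeros((d, T))
    for _ in range(n_iter):
        Dinv = np.linalg.pinv(D)
        # Supervised step: for each task, a ridge-style solve in which the
        # shared metric D couples the tasks through the regularizer.
        for t in range(T):
            A = X[t].T @ X[t] + gamma * Dinv
            W[:, t] = np.linalg.solve(A, X[t].T @ Y[t])
        # Unsupervised step: re-estimate the shared metric from the stacked
        # weights, D <- (W W^T)^{1/2} / trace((W W^T)^{1/2}).
        U, s, _ = np.linalg.svd(W, full_matrices=True)
        sv = np.concatenate([s, np.zeros(d - len(s))]) + eps  # eigenvalues of (W W^T)^{1/2}
        D = (U * sv) @ U.T
        D /= np.trace(D)
    return W, D
```

After convergence, the spectrum of `D` indicates how many directions in feature space are actually shared: a few large eigenvalues correspond to common learned features, while near-zero eigenvalues mark directions the tasks do not use.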



Published In

NIPS'06: Proceedings of the 19th International Conference on Neural Information Processing Systems
December 2006, 1632 pages

Publisher: MIT Press, Cambridge, MA, United States


