DOI: 10.5555/2976456.2976462

Article

Multi-task feature learning

Published: 04 December 2006
Abstract

We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the well-known 1-norm regularization problem, using a new regularizer which controls the number of learned features common to all the tasks. We show that this problem is equivalent to a convex optimization problem and develop an iterative algorithm for solving it. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step. In the latter step we learn common-across-tasks representations, and in the former step we learn task-specific functions using these representations. We report experiments on a simulated and a real data set which demonstrate that the proposed method dramatically improves performance relative to learning each task independently. Our algorithm can also be used, as a special case, to simply select, rather than learn, a few common features across the tasks.
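The alternating supervised/unsupervised scheme described in the abstract can be sketched as follows. This is a minimal NumPy illustration, assuming squared loss and a trace-normalized shared-metric update of the kind used in the multi-task feature learning literature; the function name, parameter names, and default values (`gamma`, `n_iter`, `eps`) are illustrative, not the authors' reference implementation.

```python
import numpy as np

def multi_task_feature_learning(X, Y, gamma=0.1, n_iter=50, eps=1e-6):
    """Alternating scheme sketch for shared feature learning.

    X : list of (n_t, d) design matrices, one per task (shared feature dim d)
    Y : list of (n_t,) target vectors, one per task
    Returns the (d, T) stacked weight matrix W and the shared metric D.
    """
    d = X[0].shape[1]      # shared feature dimension
    T = len(X)             # number of tasks
    D = np.eye(d) / d      # shared metric, initialized with trace(D) = 1
    W = np.zeros((d, T))
    for _ in range(n_iter):
        Dinv = np.linalg.pinv(D)
        # Supervised step: for each task, a ridge-style solve in which the
        # shared metric D couples the tasks through the regularizer.
        for t in range(T):
            A = X[t].T @ X[t] + gamma * Dinv
            W[:, t] = np.linalg.solve(A, X[t].T @ Y[t])
        # Unsupervised step: re-estimate the shared metric from the stacked
        # weights, D <- (W W^T)^{1/2} / trace((W W^T)^{1/2}).
        U, s, _ = np.linalg.svd(W, full_matrices=True)
        sv = np.concatenate([s, np.zeros(d - len(s))]) + eps  # eigenvalues of (W W^T)^{1/2}
        D = (U * sv) @ U.T
        D /= np.trace(D)
    return W, D
```

After convergence, the spectrum of `D` indicates how many directions in feature space are actually shared: a few large eigenvalues correspond to common learned features, while near-zero eigenvalues mark directions the tasks do not use.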



Published In

NIPS'06: Proceedings of the 19th International Conference on Neural Information Processing Systems
December 2006, 1632 pages

Publisher: MIT Press, Cambridge, MA, United States


