DOI: 10.1145/1835804.1835952

Learning incoherent sparse and low-rank patterns from multiple tasks

Published: 25 July 2010
    Abstract

    We consider the problem of learning incoherent sparse and low-rank patterns from multiple tasks. Our approach is based on a linear multi-task learning formulation, in which the sparse and low-rank patterns are induced by a cardinality regularization term and a low-rank constraint, respectively. This formulation is non-convex; we convert it into its convex surrogate, which can be routinely solved via semidefinite programming for small-scale problems. We propose to employ the general projected gradient scheme to efficiently solve this convex surrogate; however, in the optimization formulation, the objective function is non-differentiable and the feasible domain is non-trivial. We present procedures for computing the projected gradient and ensuring the global convergence of the projected gradient scheme. The computation of the projected gradient involves a constrained optimization problem; we show that its optimal solution can be obtained by solving an unconstrained optimization subproblem and a Euclidean projection subproblem. In addition, we present two projected gradient algorithms and discuss their rates of convergence. Experimental results on benchmark data sets demonstrate the effectiveness of the proposed multi-task learning formulation and the efficiency of the proposed projected gradient algorithms.
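
    To make the above concrete, the following Python snippet is a minimal sketch of one projected-gradient step for a convex surrogate of this flavor, assuming the cardinality term is relaxed to the ℓ1 norm and the rank constraint to a trace-norm ball. It is an illustration under those assumptions, not the authors' implementation; the names X, Y, P, Q, gamma, tau, and step are placeholders. The two blocks loosely mirror the subproblems mentioned in the abstract: a closed-form soft-thresholding solution for the ℓ1 part and a Euclidean projection onto the trace-norm ball for the low-rank part.

    ```python
    # Hypothetical sketch: one projected-gradient step for a convex surrogate of the
    # sparse + low-rank multi-task model, assumed here to be
    #   min_{P,Q}  0.5 * ||X (P + Q) - Y||_F^2 + gamma * ||P||_1   s.t.  ||Q||_* <= tau,
    # where the task weight matrix is W = P + Q (sparse component P, low-rank component Q).
    import numpy as np

    def soft_threshold(A, t):
        """Entrywise soft-thresholding: closed-form solution of the l1 (unconstrained) subproblem."""
        return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

    def project_nonneg_l1_ball(v, radius):
        """Euclidean projection of a nonnegative vector v onto {x >= 0 : sum(x) <= radius}."""
        if v.sum() <= radius:
            return v
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - radius)[0][-1]
        theta = (css[rho] - radius) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def project_trace_norm_ball(Q, tau):
        """Euclidean projection onto {Q : ||Q||_* <= tau}: project the singular values."""
        U, s, Vt = np.linalg.svd(Q, full_matrices=False)
        return (U * project_nonneg_l1_ball(s, tau)) @ Vt

    def projected_gradient_step(P, Q, X, Y, gamma, tau, step):
        """Gradient step on the smooth loss, then prox / projection on each block."""
        G = X.T @ (X @ (P + Q) - Y)                         # shared gradient of the squared loss
        P_new = soft_threshold(P - step * G, step * gamma)  # unconstrained subproblem (closed form)
        Q_new = project_trace_norm_ball(Q - step * G, tau)  # Euclidean projection subproblem
        return P_new, Q_new
    ```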

    Supplementary Material

    JPG File (kdd2010_chen_lislrpmt_01.jpg)
    MOV File (kdd2010_chen_lislrpmt_01.mov)


      Published In

      KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
      July 2010
      1240 pages
      ISBN:9781450300551
      DOI:10.1145/1835804

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. multi-task learning
      2. sparse and low-rank patterns
      3. trace norm

      Qualifiers

      • Research-article

      Conference

      KDD '10

      Acceptance Rates

      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
