DOI: 10.1145/1102351.1102382
Article

Hierarchic Bayesian models for kernel learning

Published: 07 August 2005

    Abstract

    Integrating diverse forms of informative data by learning an optimal combination of base kernels in classification or regression problems can improve on the performance obtained from any single data source. We present a Bayesian hierarchical model that enables kernel learning, together with effective variational Bayes estimators for regression and classification. Illustrative experiments demonstrate the utility of the proposed method. MATLAB code replicating the reported results is available at http://www.dcs.gla.ac.uk/~srogers/kernel_comb.html.
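To make the idea of a "combination of base kernels" concrete, here is a minimal sketch in Python. It assumes the combination is a convex one (non-negative weights that sum to one), which is a common choice but is not stated in the abstract. The kernel functions, the `gamma` value, the sample points and the hand-picked weights are all illustrative assumptions; the paper infers the weights with a variational Bayes procedure that this sketch does not implement.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative width parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(kernel, points):
    """Evaluate a kernel on every pair of points."""
    return [[kernel(p, q) for q in points] for p in points]


def combine_kernels(gram_matrices, weights):
    """Weighted sum of Gram matrices, with weights on the probability simplex."""
    if len(gram_matrices) != len(weights):
        raise ValueError("need one weight per base kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(gram_matrices[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, gram_matrices)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three points in the plane.
points = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
base = [gram_matrix(linear_kernel, points), gram_matrix(rbf_kernel, points)]

# Weights are fixed by hand here; a kernel-learning method would estimate them.
weights = [0.3, 0.7]
K = combine_kernels(base, weights)
```

Because each base Gram matrix is symmetric positive semidefinite and the weights are non-negative, the combined matrix `K` is also a valid kernel matrix, so it can be passed to any kernel regression or classification method in place of a single-source kernel.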



    Published In

    ICML '05: Proceedings of the 22nd international conference on Machine learning
    August 2005
    1113 pages
    ISBN: 1595931805
    DOI: 10.1145/1102351
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Overall Acceptance Rate 140 of 548 submissions, 26%
