
MKEL: Multiple Kernel Ensemble Learning via Unified Ensemble Loss for Image Classification

Published: 08 June 2021
Abstract

    In this article, a novel ensemble model, called Multiple Kernel Ensemble Learning (MKEL), is developed by introducing a unified ensemble loss. Unlike previous multiple kernel learning (MKL) methods, which seek a linear combination of basis kernels as a single unified kernel, our MKEL model aims to find multiple solutions in the corresponding Reproducing Kernel Hilbert Spaces (RKHSs) simultaneously. To achieve this goal, the individual kernel losses are integrated into a unified ensemble loss, so that the models are co-optimized and each learns its optimal parameters by minimizing this ensemble loss across multiple RKHSs. Furthermore, we extend the proposed ensemble loss to the deep network paradigm by treating each sub-network as a kernel mapping from the original input space into a feature space, yielding Deep-MKEL (D-MKEL). D-MKEL combines diverse individual sub-networks into a single unified network to improve classification performance. With this unified loss design, D-MKEL yields a much wider network than traditional deep kernel networks, so more parameters are learned and optimized. Experimental results on several medium-scale UCI classification and computer vision datasets demonstrate that MKEL achieves the best classification performance among the compared MKL methods, such as SimpleMKL, GMKL, SpicyMKL, and Matrix-Regularized MKL. In addition, experimental results on the large-scale CIFAR-10 and SVHN datasets show the advantages and potential of the proposed D-MKEL approach compared to state-of-the-art deep kernel methods.
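
    To make the loss design concrete, the following is a minimal illustrative sketch rather than the article's exact objective. Assuming M basis kernels k_1, ..., k_M with decision functions f_m in their respective RKHSs \mathcal{H}_{k_m}, regularization weights \lambda_m > 0, and a per-sample loss \ell (e.g., the hinge loss), a unified ensemble loss that sums the individual kernel losses could take the form

    $$
    \min_{\{f_m \in \mathcal{H}_{k_m}\}_{m=1}^{M}} \;\; \sum_{m=1}^{M} \left[ \frac{\lambda_m}{2}\, \lVert f_m \rVert_{\mathcal{H}_{k_m}}^{2} \;+\; \sum_{i=1}^{n} \ell\big( y_i,\, f_m(x_i) \big) \right],
    $$

    so that all M predictors are optimized jointly under a single objective and their outputs are aggregated (for example, by weighted averaging or voting) at prediction time. In the D-MKEL setting described above, each implicit kernel mapping would be replaced by a deep sub-network whose features feed its own loss term; the exact formulation, aggregation rule, and regularization are those defined in the article itself.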

    References

    [1]
    Fabio Aiolli and Michele Donini. 2015. EasyMKL: A scalable multiple kernel learning algorithm. Neurocomputing 169 (2015), 215–224.
    [2]
    Francis R. Bach. 2008. Consistency of the group lasso and multiple kernel learning. Journal of Machine Learning Research 9 (2008), 1179–1225.
    [3]
    Francis R. Bach, Gert R.G. Lanckriet, and Michael I. Jordan. 2004. Multiple kernel learning, conic duality, and the SMO algorithm. In Proceedings of the 21st International Conference on Machine Learning. ACM, 6.
    [4]
    Francis R. Bach, Romain Thibaux, and Michael I. Jordan. 2005. Computing regularization paths for learning multiple kernels. In Advances in Neural Information Processing Systems. 73–80.
    [5]
    Feng Bao, Yue Deng, Mulong Du, Zhiquan Ren, Sen Wan, Kenny Ye Liang, Shaohua Liu, Bo Wang, Junyi Xin, Feng Chen, et al. 2020. Explaining the genetic causality for complex phenotype via deep association kernel learning. Patterns 1, 6 (2020), 100057.
    [6]
    Sijia Cai, Wangmeng Zuo, and Lei Zhang. 2017. Higher-order integration of hierarchical convolutional activations for fine-grained visual categorization. In Proceedings of the IEEE International Conference on Computer Vision. 511–520.
    [7]
    Gong Cheng, Junwei Han, and Xiaoqiang Lu. 2017. Remote sensing image scene classification: Benchmark and state of the art. Proceedings of the IEEE 105, 10 (2017), 1865–1883.
    [8]
    Gong Cheng, Ceyuan Yang, Xiwen Yao, Lei Guo, and Junwei Han. 2018. When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs. IEEE Transactions on Geoscience and Remote Sensing 56, 5 (2018), 2811–2821.
    [9]
    Koby Crammer, Joseph Keshet, and Yoram Singer. 2003. Kernel design using boosting. In Advances in Neural Information Processing Systems. 553–560.
    [10]
    Nello Cristianini and John Shawe-Taylor. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press.
    [11]
    Tri Dao, Albert Gu, Alexander Ratner, Virginia Smith, Chris De Sa, and Christopher Ré. 2019. A kernel theory of modern data augmentation. In International Conference on Machine Learning. PMLR, 1528–1537.
    [12]
    Peter Gehler and Sebastian Nowozin. 2009. On feature combination for multiclass object classification. In 2009 IEEE 12th International Conference on Computer Vision. IEEE, 221–228.
    [13]
    Mehmet Gönen and Ethem Alpaydın. 2011. Multiple kernel learning algorithms. Journal of Machine Learning Research 12 (2011), 2211–2268.
    [14]
    Yanfeng Gu, Tianzhu Liu, Xiuping Jia, Jón Atli Benediktsson, and Jocelyn Chanussot. 2016. Nonlinear multiple kernel learning with multiple-structure-element extended morphological profiles for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 54, 6 (2016), 3235–3247.
    [15]
    Yina Han, Kunde Yang, Yixin Yang, and Yuanliang Ma. 2016. Localized multiple kernel learning with dynamical clustering and matrix regularization. IEEE Transactions on Neural Networks and Learning Systems 29, 2 (2016), 486–499.
    [16]
    Yina Han, Yixin Yang, Xuelong Li, Qingyu Liu, and Yuanliang Ma. 2018. Matrix-regularized multiple kernel learning via L1 norms. IEEE Transactions on Neural Networks and Learning Systems 29, 10 (2018), 4997–5007.
    [17]
    Zakria Hussain and John Shawe-Taylor. 2011. Improved loss bounds for multiple kernel learning. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. 370–377.
    [18]
    Zakria Hussain and John Shawe-Taylor. 2011. A note on improved loss bounds for multiple kernel learning. arXiv preprint arXiv:1106.6258 (2011), 1–11.
    [19]
    Ashesh Jain, Swaminathan V.N. Vishwanathan, and Manik Varma. 2012. SPF-GMKL: Generalized multiple kernel learning with a million kernels. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 750–758.
    [20]
    Pratik Jawanpuria, Saketha N. Jagarlapudi, and Ganesh Ramakrishnan. 2011. Efficient rule ensemble learning using hierarchical kernels. In Proceedings of the 28th International Conference on Machine Learning (ICML’11). 161–168.
    [21]
    Marius Kloft, Ulf Brefeld, Pavel Laskov, Klaus-Robert Müller, Alexander Zien, and Sören Sonnenburg. 2009. Efficient and accurate lp-norm multiple kernel learning. In Advances in Neural Information Processing Systems. 997–1005.
    [22]
    Marius Kloft, Ulf Brefeld, Sören Sonnenburg, and Alexander Zien. 2011. Lp-norm multiple kernel learning. Journal of Machine Learning Research 12 (2011), 953–997.
    [23]
    Marius Kloft, Ulrich Rückert, and Peter L. Bartlett. 2010. A unifying view of multiple kernel learning. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 66–81.
    [24]
    Ivano Lauriola, Claudio Gallicchio, and Fabio Aiolli. 2020. Enhancing deep neural networks via multiple kernel learning. Pattern Recognition 101 (2020), 107194.
    [25]
    Yunwen Lei, Alexander Binder, Ürün Dogan, and Marius Kloft. 2015. Theory and algorithms for the localized setting of learning kernels. In Feature Extraction: Modern Questions and Challenges. 173–195.
    [26]
    Chun-Liang Li, Wei-Cheng Chang, Youssef Mroueh, Yiming Yang, and Barnabás Póczos. 2019. Implicit kernel learning. In The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 2007–2016.
    [27]
    Xiang Li, Wenhai Wang, Xiaolin Hu, and Jian Yang. 2019. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 510–519.
    [28]
    Yujian Li and Ting Zhang. 2017. Deep neural mapping support vector machines. Neural Networks 93 (2017), 185–194.
    [29]
    Zhiqiang Li, Xiao-Yuan Jing, Xiaoke Zhu, and Hongyu Zhang. 2017. Heterogeneous defect prediction through multiple kernel learning and ensemble learning. In 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME’17). IEEE, 91–102.
    [30]
    Stefano Melacci and Mikhail Belkin. 2011. Laplacian support vector machines trained in the primal. Journal of Machine Learning Research 12 (2011), 1149–1184.
    [31]
    Hieu V. Nguyen and Li Bai. 2010. Cosine similarity metric learning for face verification. In Asian Conference on Computer Vision. Springer, 709–720.
    [32]
    Alain Rakotomamonjy, Francis R. Bach, Stéphane Canu, and Yves Grandvalet. 2008. SimpleMKL. Journal of Machine Learning Research 9 (2008), 2491–2521.
    [33]
    B. Schölkopf, Michael G. Akritas, and Dimitris N. Politis. 2003. An introduction to support vector machines. In International Conference on Recent Advances and Trends in Nonparametric Statistics. 3–17.
    [34]
    Bernhard Schölkopf and Alexander J. Smola. 2001. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
    [35]
    Xiang-Jun Shen, Si-Xing Liu, Bing-Kun Bao, Chun-Hong Pan, Zheng-Jun Zha, and Jianping Fan. 2020. A generalized least-squares approach regularized with graph embedding for dimensionality reduction. Pattern Recognition 98 (2020), 107023.
    [36]
    Xiang-Jun Shen, ChengGong Ni, Liangjun Wang, and Zheng-Jun Zha. 2021. SLiKER: Sparse loss induced kernel ensemble regression. Pattern Recognition 109 (2021), 107587.
    [37]
    Ashish Shrivastava, Vishal M. Patel, and Rama Chellappa. 2014. Multiple kernel learning for sparse representation-based classification. IEEE Transactions on Image Processing 23, 7 (2014), 3013–3024.
    [38]
    Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. 2006. Large scale multiple kernel learning. The Journal of Machine Learning Research 7 (2006), 1531–1565.
    [39]
    Eric V. Strobl and Shyam Visweswaran. 2013. Deep multiple kernel learning. In 12th International Conference on Machine Learning and Applications, Vol. 1. IEEE, 414–417.
    [40]
    Hongwei Sun. 2005. Mercer theorem for RKHS on noncompact sets. Journal of Complexity 21, 3 (2005), 337–349.
    [41]
    Tao Sun, Licheng Jiao, Fang Liu, Shuang Wang, and Jie Feng. 2013. Selective multiple kernel learning for classification with ensemble strategy. Pattern Recognition 46, 11 (2013), 3081–3090.
    [42]
    Taiji Suzuki and Ryota Tomioka. 2011. SpicyMKL: A fast algorithm for multiple kernel learning with thousands of kernels. Machine Learning 85, 1–2 (2011), 77–108.
    [43]
    Yichuan Tang. 2013. Deep learning using linear support vector machines. arXiv preprint (2013).
    [44]
    Ryota Tomioka and Masashi Sugiyama. 2009. Dual-augmented Lagrangian method for efficient sparse reconstruction. IEEE Signal Processing Letters 16, 12 (2009), 1067–1070.
    [45]
    Gerrit J.J. Van Den Burg and Patrick J.F. Groenen. 2016. GenSVM: A generalized multiclass support vector machine. Journal of Machine Learning Research 17, 1 (2016), 7964–8005.
    [46]
    Manik Varma and Bodla Rakesh Babu. 2009. More generality in efficient multiple kernel learning. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 1065–1072.
    [47]
    Hao Wang, Qilong Wang, Mingqi Gao, Peihua Li, and Wangmeng Zuo. 2018. Multi-scale location-aware kernel representation for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1248–1257.
    [48]
    Hao Wang, Qilong Wang, Peihua Li, and Wangmeng Zuo. 2021. Multi-scale structural kernel representation for object detection. Pattern Recognition 110 (2021), 107593.
    [49]
    Qi Wang, Xiang He, and Xuelong Li. 2018. Locality and structure regularized low rank representation for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 57, 2 (2018), 911–923.
    [50]
    Qi Wang, Shaoteng Liu, Jocelyn Chanussot, and Xuelong Li. 2018. Scene classification with recurrent attention of VHR remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 57, 2 (2018), 1155–1167.
    [51]
    Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, and Eric P. Xing. 2016. Deep kernel learning. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. PMLR, 370–378.
    [52]
    Hao Xia and Steven C.H. Hoi. 2012. MKBoost: A framework of multiple kernel boosting. IEEE Transactions on Knowledge and Data Engineering 25, 7 (2012), 1574–1586.
    [53]
    Jie Xu, Xianglong Liu, Zhouyuan Huo, Cheng Deng, Feiping Nie, and Heng Huang. 2017. Multi-class support vector machine via maximizing multi-class margins. In The 26th International Joint Conference on Artificial Intelligence (IJCAI’17).
    [54]
    Zheng-Jun Zha, Chong Wang, Dong Liu, Hongtao Xie, and Yongdong Zhang. 2020. Robust deep co-saliency detection with group semantic and pyramid attention. IEEE Transactions on Neural Networks and Learning Systems 31, 7 (2020), 2398–2408.
    [55]
    Hanwang Zhang, Zheng-Jun Zha, Yang Yang, Shuicheng Yan, and Tat-Seng Chua. 2014. Robust (semi) nonnegative graph embedding. IEEE Transactions on Image Processing 23, 7 (2014), 2996–3012.
    [56]
    Yan-Tao Zheng, Yiqun Li, Zheng-Jun Zha, and Tat-Seng Chua. 2011. Mining travel patterns from GPS-tagged photos. In International Conference on Multimedia Modeling. Springer, 262–272.
    [57]
    Jinfeng Zhuang, Ivor W. Tsang, and Steven C.H. Hoi. 2011. Two-layer multiple kernel learning. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. 909–917.


      Published In

      ACM Transactions on Intelligent Systems and Technology  Volume 12, Issue 4
      August 2021
      368 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/3468075
      • Editor: Huan Liu
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 08 June 2021
      Accepted: 01 March 2021
      Revised: 01 March 2021
      Received: 01 July 2020
      Published in TIST Volume 12, Issue 4


      Author Tags

      1. Ensemble loss
      2. multiple kernel learning
      3. ensemble learning
      4. deep networks

      Qualifiers

      • Research-article
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
      • National Key R&D Program of China

      Cited By

      • (2024) SignSense: AI Framework for Sign Language Recognition. International Journal of Advanced Research in Science, Communication and Technology, 372–385. https://doi.org/10.48175/IJARSCT-17257. Online publication date: 14-Apr-2024.
      • (2023) Diagnose Like Doctors: Weakly Supervised Fine-Grained Classification of Breast Cancer. ACM Transactions on Intelligent Systems and Technology 14, 2, 1–17. https://doi.org/10.1145/3572033. Online publication date: 16-Feb-2023.
      • (2023) Ensemble Multifeatured Deep Learning Models and Applications: A Survey. IEEE Access 11, 107194–107217. https://doi.org/10.1109/ACCESS.2023.3320042. Online publication date: 2023.
      • (2023) SignExplainer: An Explainable AI-Enabled Framework for Sign Language Recognition With Ensemble Learning. IEEE Access 11, 47410–47419. https://doi.org/10.1109/ACCESS.2023.3274851. Online publication date: 2023.
      • (2023) MKL-SING. Information Processing and Management 60, 3. https://doi.org/10.1016/j.ipm.2022.103243. Online publication date: 1-May-2023.
      • (2023) Multi-view Representation Induced Kernel Ensemble Support Vector Machine. Neural Processing Letters 55, 6, 7035–7056. https://doi.org/10.1007/s11063-023-11250-z. Online publication date: 3-Apr-2023.
      • (2022) Robust classification via clipping-based kernel recursive least lncosh of error. Expert Systems with Applications 198, 116811. https://doi.org/10.1016/j.eswa.2022.116811. Online publication date: Jul-2022.
