Abstract
This paper presents a parallel acceleration method for optimizing Gaussian kernel hyper-parameters by maximum likelihood estimation. Optimizing the hyper-parameters requires repeated inversions of the kernel matrix, and the computational burden grows rapidly with the matrix size. To improve efficiency, we introduce a decomposing and iterative (DI) algorithm, which divides the large-scale kernel matrix into four blocks and computes the inverse in a constant number of iterations. Because the sub-matrix blocks can be computed independently of one another, the computation maps naturally onto a graphics processing unit (GPU); this yields the parallel decomposing and iterative (DIP) algorithm. Experiments on the inverted pendulum and ball-plate systems confirm the effectiveness of the DI and DIP algorithms. The simulation results suggest that both algorithms are promising for real engineering applications, and the paper provides a practical and feasible approach to accelerating hyper-parameter optimization under maximum likelihood estimation.
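As background on why the inversion dominates the cost: for a Gaussian process with kernel matrix $K_\theta$ built from hyper-parameters $\theta$, maximum likelihood estimation maximizes the log marginal likelihood

$$\log p(\mathbf{y} \mid X, \theta) = -\tfrac{1}{2}\,\mathbf{y}^{\top} K_\theta^{-1} \mathbf{y} - \tfrac{1}{2}\log\lvert K_\theta \rvert - \tfrac{n}{2}\log 2\pi,$$

so every evaluation of the objective (and of its gradient with respect to $\theta$) involves $K_\theta^{-1}$. This is the standard Gaussian process formulation, stated here only as background rather than as the paper's own derivation.

The following is a minimal NumPy sketch of the classical four-block (Schur-complement) inversion identity that a decomposition of this kind builds on. The paper's constant-iteration DI scheme and the DIP GPU mapping are not reproduced here, so the function name and the single-level, recursion-free structure below are illustrative assumptions only.

```python
import numpy as np

def block_inverse(K):
    """Invert a symmetric positive-definite kernel matrix by splitting it
    into four blocks (illustrative sketch, not the paper's DI iteration)."""
    m = K.shape[0] // 2
    A, B = K[:m, :m], K[:m, m:]
    C, D = K[m:, :m], K[m:, m:]

    A_inv = np.linalg.inv(A)        # invert the top-left block
    S = D - C @ A_inv @ B           # Schur complement of A
    S_inv = np.linalg.inv(S)        # invert the (smaller) complement

    # Assemble the four blocks of K^{-1}. The block products are mutually
    # independent, which is what makes a GPU mapping attractive.
    return np.block([
        [A_inv + A_inv @ B @ S_inv @ C @ A_inv, -A_inv @ B @ S_inv],
        [-S_inv @ C @ A_inv,                     S_inv],
    ])

# Quick self-check on a random SPD kernel-like matrix.
X = np.random.randn(200, 50)
K = X @ X.T + 200.0 * np.eye(200)
assert np.allclose(block_inverse(K), np.linalg.inv(K))
```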
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 61473316, the Hubei Provincial Natural Science Foundation of China under Grant Nos. 2017CFA030 and 2015CFA010, and the 111 Project under Grant No. B17040.
Cite this article
Li, L., Chen, X. Optimization of kernel learning algorithm based on parallel architecture. Computing 102, 1881–1907 (2020). https://doi.org/10.1007/s00607-019-00760-1