Abstract
Stochastic gradient descent (SGD) is a prevalent algorithm for optimizing convolutional neural networks (CNNs). However, it has several disadvantages, such as becoming trapped in local optima and suffering from vanishing gradients, that need to be overcome. In this paper, we propose a hybrid learning algorithm that aims to tackle these drawbacks by integrating particle swarm optimization (PSO) with SGD. To take advantage of the strong global search capability of PSO, we introduce a velocity update formula that is combined with gradient descent. In addition, owing to the cooperation among particles, the proposed algorithm helps the CNN dampen overfitting and obtain better results. The German Traffic Sign Recognition Benchmark (GTSRB) is used as the dataset to evaluate performance, and experimental results demonstrate that the proposed method outperforms the standard SGD and conjugate gradient (CG) based approaches.
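The abstract does not state the exact update rule, so the following is only a minimal sketch of how a PSO velocity update might be combined with a gradient step. It assumes the standard PSO velocity formula (inertia, personal-best and global-best attraction) with an extra negative-gradient term. The toy quadratic loss stands in for a CNN training loss, and every name and hyperparameter (`hybrid_pso_sgd`, `inertia`, `c1`, `c2`, `lr`) is hypothetical rather than taken from the paper.

```python
import random


def loss(x):
    # Toy objective standing in for a CNN training loss (assumption):
    # a quadratic bowl with its minimum at x = (1, 1, ..., 1).
    return sum((xi - 1.0) ** 2 for xi in x)


def grad(x):
    # Analytic gradient of the toy loss above.
    return [2.0 * (xi - 1.0) for xi in x]


def hybrid_pso_sgd(n_particles=5, dim=3, steps=100, lr=0.05,
                   inertia=0.5, c1=0.5, c2=0.5, seed=0):
    """Run a PSO-style search whose velocity also includes a gradient step.

    Returns (initial best loss, final best loss, best position found).
    """
    rng = random.Random(seed)
    # Each particle is one candidate parameter vector.
    xs = [[rng.uniform(-3.0, 3.0) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_loss = [loss(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    initial_best = gbest_loss

    for _ in range(steps):
        for i in range(n_particles):
            gi = grad(xs[i])
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Assumed hybrid rule: standard PSO velocity terms
                # plus a gradient-descent term (-lr * gradient).
                vs[i][d] = (inertia * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d])
                            - lr * gi[d])
                xs[i][d] += vs[i][d]
            li = loss(xs[i])
            # Track personal and global bests, as in plain PSO.
            if li < pbest_loss[i]:
                pbest[i], pbest_loss[i] = xs[i][:], li
                if li < gbest_loss:
                    gbest, gbest_loss = xs[i][:], li
    return initial_best, gbest_loss, gbest


if __name__ == "__main__":
    start, end, best = hybrid_pso_sgd()
    print("best loss before:", start)
    print("best loss after: ", end)
```

In this sketch the swarm terms pull each particle toward the best positions found so far, while the gradient term supplies local descent. When applied to a CNN, `grad` would be replaced by a mini-batch gradient and each particle would hold a full set of network weights.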
Acknowledgement
This work was supported by the National Natural Science Foundation of China under grant 61472284.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zang, D., Ding, J., Cheng, J., Zhang, D., Tang, K. (2017). A Hybrid Learning Algorithm for the Optimization of Convolutional Neural Network. In: Huang, DS., Hussain, A., Han, K., Gromiha, M. (eds) Intelligent Computing Methodologies. ICIC 2017. Lecture Notes in Computer Science, vol 10363. Springer, Cham. https://doi.org/10.1007/978-3-319-63315-2_61
DOI: https://doi.org/10.1007/978-3-319-63315-2_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63314-5
Online ISBN: 978-3-319-63315-2
eBook Packages: Computer Science, Computer Science (R0)