Abstract
Neural networks stand out in artificial intelligence because they can complete challenging tasks, such as image classification. However, designing a neural network for a particular problem requires experience and tedious trial and error. Automating this process defines a research field that usually relies on population-based meta-heuristics. This kind of optimizer generally needs numerous function evaluations, which are computationally demanding in this context because they involve building, training, and evaluating different neural networks. Fortunately, these algorithms are also well suited for parallel computing. This work describes how the teaching-learning-based optimization algorithm has been adapted for designing neural networks by exploiting a multi-GPU high-performance computing environment. The optimizer, not applied before for this purpose to the best of the authors' knowledge, has been selected because it lacks algorithm-specific parameters and is compatible with large-scale optimization. Thus, its configuration does not become yet another problem to solve, and it can design architectures with many layers. The parallelization scheme is decoupled from the optimizer: it can be seen as an external evaluation service that manages multiple GPUs, even on different machines, for promising neural network designs, and multiple CPUs for low-performing solutions. This strategy has been tested by designing a neural network for image classification on the CIFAR-10 dataset. The architectures found outperform human designs, and thanks to parallelization, the sequential process is accelerated 4.2 times with 4 GPUs and 96 cores, the ideal speedup being 4.39 in this case.
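The teaching-learning-based optimization algorithm mentioned in the abstract alternates a teacher phase, which pulls learners toward the current best solution, and a learner phase, in which learners interact pairwise. The sketch below is a minimal, generic TLBO loop for continuous minimization, not the authors' NAS-specific implementation: in the paper, `objective` would build, train, and evaluate a candidate network (the step dispatched to GPU or CPU workers by the external evaluation service), whereas here it is a toy sphere function. All names (`tlbo`, `sphere`) are illustrative assumptions.

```python
import numpy as np

def tlbo(objective, bounds, pop_size=20, iterations=100, seed=0):
    """Minimal teaching-learning-based optimization (minimization)."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    dim = low.size
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])
    for _ in range(iterations):
        # Teacher phase: move each learner toward the best solution and
        # away from the class mean; the teaching factor TF is 1 or 2.
        teacher = pop[np.argmin(fitness)]
        mean = pop.mean(axis=0)
        for i in range(pop_size):
            tf = rng.integers(1, 3)
            cand = np.clip(pop[i] + rng.random(dim) * (teacher - tf * mean),
                           low, high)
            f = objective(cand)
            if f < fitness[i]:  # greedy acceptance
                pop[i], fitness[i] = cand, f
        # Learner phase: each learner moves relative to a random peer,
        # toward it if the peer is better, away from it otherwise.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            direction = pop[i] - pop[j] if fitness[i] < fitness[j] \
                else pop[j] - pop[i]
            cand = np.clip(pop[i] + rng.random(dim) * direction, low, high)
            f = objective(cand)
            if f < fitness[i]:
                pop[i], fitness[i] = cand, f
    best = np.argmin(fitness)
    return pop[best], fitness[best]

# Toy usage: minimize the 5-dimensional sphere function.
sphere = lambda x: float(np.sum(x ** 2))
x_best, f_best = tlbo(sphere, (np.full(5, -5.0), np.full(5, 5.0)))
```

Note that, apart from population size and iteration budget (common to all population-based methods), the loop has no algorithm-specific parameters to tune, which is the property highlighted in the abstract.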
Data availability
The source code of the Oracle is available on GitHub: https://github.com/marcoslupion/NAS_MultiGPU.
Notes
https://docs.oracle.com/cd/E19061-01/hpc.cluster30/806-0296-10/6j9llte66/index.html.
References
Sharma N, Sharma R, Jindal N (2021) Machine learning and deep learning applications-a vision. Global Trans Proc 2(1):24–28. https://doi.org/10.1016/j.gltp.2021.01.004
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 779–788. https://doi.org/10.48550/ARXIV.1506.02640
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778. https://doi.org/10.48550/ARXIV.1512.03385
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations, pp 1–14. https://doi.org/10.48550/ARXIV.1409.1556
Liu Y, Sun Y, Xue B, Zhang M, Yen GG, Tan KC (2021) A survey on evolutionary neural architecture search. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2021.3100554
Zoph B, Le QV (2017) Neural architecture search with reinforcement learning. In: Proceedings of the 5th International Conference on Learning Representations, pp 1–16. https://doi.org/10.48550/ARXIV.1611.01578
Real E, Moore S, Selle A, Saxena S, Suematsu YL, Tan J, Le QV, Kurakin A (2017) Large-scale evolution of image classifiers. In: International Conference on Machine Learning, pp 2902–2911. https://doi.org/10.48550/ARXIV.1703.01041. PMLR
Wang B, Sun Y, Xue B, Zhang M (2018) Evolving deep convolutional neural networks by variable-length particle swarm optimization for image classification. In: 2018 IEEE Congress on Evolutionary Computation, pp 1–8. https://doi.org/10.1109/CEC.2018.8477735. IEEE
Byla E, Pang W (2019) Deepswarm: optimising convolutional neural networks using swarm intelligence. In: UK Workshop on Computational Intelligence, pp 119–130. https://doi.org/10.1007/978-3-030-29933-0_10. Springer
Rao RV, Savsani VJ, Vakharia DP (2012) Teaching-learning-based optimization: an optimization method for continuous non-linear large scale problems. Inform Sci 183(1):1–15. https://doi.org/10.1016/j.ins.2011.08.006
Yang Z, Li K, Guo Y, Ma H, Zheng M (2018) Compact real-valued teaching-learning based optimization with the applications to neural network training. Knowl-Based Syst 159:51–62. https://doi.org/10.1016/j.knosys.2018.06.004
Jameel SM, Hashmani MA, Rehman M, Budiman A (2020) An adaptive deep learning framework for dynamic image classification in the internet of things environment. Sensors. https://doi.org/10.3390/s20205811
Orts F, Ortega G, Puertas AM, García I, Garzón EM (2020) On solving the unrelated parallel machine scheduling problem: active microrheology as a case study. J Supercomput 76(11):8494–8509. https://doi.org/10.1007/s11227-019-03121-z
Augonnet C, Thibault S, Namyst R, Wacrenier P-A (2009) StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. In: Sips H, Epema D, Lin H-X (eds) Euro-Par 2009 Parallel Processing. Springer, Berlin, Heidelberg, pp 863–874
Luk C-K, Hong S, Kim H (2009) Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping. In: 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp 45–55
McCormick P, Inman J, Ahrens J, Mohd-Yusof J, Roth G, Cummins S (2007) Scout: a data-parallel programming language for graphics processors. Parallel Comput 33(10):648–662. https://doi.org/10.1016/j.parco.2007.09.001
Martinez D, Brewer W, Behm G, Strelzoff A, Wilson A, Wade D (2018) Deep learning evolutionary optimization for regression of rotorcraft vibrational spectra. In: 2018 IEEE/ACM Machine Learning in HPC Environments (MLHPC), pp 57–66. https://doi.org/10.1109/MLHPC.2018.8638645
Patton RM, Johnston JT, Young SR, Schuman CD, Potok TE, Rose DC, Lim S, Chae J, Hou L, Abousamra S, Samaras D, Saltz J (2019) Exascale deep learning to accelerate cancer research. In: 2019 IEEE International Conference on Big Data (Big Data), pp 1488–1496. https://doi.org/10.1109/BigData47090.2019.9006467
Balaprakash P, Salim M, Uram TD, Vishwanath V, Wild SM (2018) DeepHyper: asynchronous hyperparameter search for deep neural networks. In: 2018 IEEE 25th International Conference on High Performance Computing (HiPC), pp 42–51. https://doi.org/10.1109/HiPC.2018.00014
Salim MA, Uram TD, Childers JT, Balaprakash P, Vishwanath V, Papka ME (2019) Balsam: automated scheduling and execution of dynamic, data-intensive HPC workflows. https://doi.org/10.48550/ARXIV.1909.08704
Cruz NC, Redondo JL, Álvarez JD, Berenguel M, Ortigosa PM (2017) A parallel teaching-learning-based optimization procedure for automatic heliostat aiming. J Supercomput 73(1):591–606. https://doi.org/10.1007/s11227-016-1914-5
Cruz NC, Marín M, Redondo M, Ortigosa EM, Ortigosa PM (2021) A comparative study of stochastic optimizers for fitting neuron models application to the cerebellar granule cell. Informatica 32(3):477–498
Torres-Moreno JL, Cruz NC, Álvarez JD, Redondo JL, Giménez-Fernandez A (2022) An open-source tool for path synthesis of four-bar mechanisms. Mech Mach Theory 169:104604
Boussaïd I, Lepagnot J, Siarry P (2013) A survey on optimization metaheuristics. Inform Sci 237:82–117
van Geit W, De Schutter E, Achard P (2008) Automated neuron model optimization techniques: a review. Biol Cyber 99(4):241–251
Cruz NC, Álvarez JD, Redondo JL, Berenguel M, Ortigosa PM (2018) A two-layered solution for automatic heliostat aiming. Eng Appl Artif Intell 72:253–266. https://doi.org/10.1016/j.engappai.2018.04.014
Yeniay Ö (2005) Penalty function methods for constrained optimization with genetic algorithms. Math Comput Appl 10(1):45–56. https://doi.org/10.3390/mca10010045
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Funding
This work has been funded by the R+D+i projects RTI2018-095993-B-I00 and PID2021-123278OB-I00 from MCIN/AEI/10.13039/501100011033/ and ERDF funds; by the Andalusian regional government through the project P18-RT-119; by the University of Almería through the project UAL18-TIC-A020; and by the Department of Informatics of the University of Almería. M. Lupión is a fellow of the FPU program of the Spanish Ministry of Education (FPU19/02756). N.C. Cruz is funded by the Ministry of Digital Transformation, Industry, Knowledge and Universities of the Andalusian regional government.
Author information
Contributions
ML, NCC, JFS, BP and PMO contributed to the conceptualization; ML, NCC, JFS, BP and PMO were involved in the methodology; ML and NCC assisted in the implementation; ML and NCC contributed to the experimentation; ML, NCC, JFS and PMO helped in writing (original draft preparation); JFS, BP and PMO were involved in writing (review and editing). All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lupión, M., Cruz, N.C., Sanjuan, J.F. et al. Accelerating neural network architecture search using multi-GPU high-performance computing. J Supercomput 79, 7609–7625 (2023). https://doi.org/10.1007/s11227-022-04960-z