A Fuzzy Neural Network Based Dynamic Data Allocation Model on Heterogeneous Multi-GPUs for Large-scale Computations

  • Research Article
International Journal of Automation and Computing

Abstract

The parallel computation capabilities of modern graphics processing units (GPUs) have attracted increasing attention from researchers and engineers working on high-throughput computation. However, current single-GPU engineering solutions often struggle to meet real-time requirements, so multi-GPU approaches have become a popular and cost-effective alternative. In such systems, balancing the computational load across multiple GPU “nodes” is often the key bottleneck affecting the quality and performance of the real-time system. Existing load balancing approaches mainly assume that all GPU nodes in the same computer framework have equal computational performance, which is often not the case owing to cluster design and other legacy issues. This paper presents a novel dynamic load balancing (DLB) model for rapid data division and allocation on heterogeneous GPU nodes based on an innovative fuzzy neural network (FNN). A 5-state parameter feedback mechanism that defines the overall cluster and node performance is proposed. The corresponding FNN-based DLB model is capable of monitoring and predicting individual node performance under different workload scenarios. A real-time adaptive scheduler has been devised to reorganize the data inputs to each node when necessary to maintain runtime computational performance. The model has been implemented on two-dimensional (2D) discrete wavelet transform (DWT) applications for evaluation. Experimental results show that this DLB model enables high computational throughput while meeting the real-time and precision requirements of complex computational tasks.
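
To make the idea of data division on heterogeneous nodes concrete, the following is a minimal sketch of proportional allocation with a feedback update. It is not the FNN-based model from the paper: the abstract does not specify the network or the five feedback parameters, so this sketch substitutes a single per-node throughput estimate (rows per second) refined by exponential smoothing. The function names `split_rows` and `update_throughput`, the smoothing factor `alpha`, and the row-based unit of work are all assumptions made for illustration.

```python
from typing import List


def split_rows(total_rows: int, throughput: List[float]) -> List[int]:
    """Divide total_rows among nodes in proportion to their throughput estimates."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if not throughput or any(t <= 0 for t in throughput):
        raise ValueError("throughput estimates must be positive")
    total = sum(throughput)
    exact = [total_rows * t / total for t in throughput]
    shares = [int(x) for x in exact]  # floor, since every x is non-negative
    # Hand out the rows lost to rounding, largest fractional part first.
    leftover = total_rows - sum(shares)
    order = sorted(range(len(exact)), key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def update_throughput(old: List[float], rows: List[int], seconds: List[float],
                      alpha: float = 0.5) -> List[float]:
    """Blend each node's previous estimate with the rate observed in the last round.

    This smoothing step is an assumed stand-in for the paper's FNN predictor.
    """
    new = []
    for est, r, s in zip(old, rows, seconds):
        # Keep the old estimate when a node produced no usable measurement.
        observed = r / s if s > 0 and r > 0 else est
        new.append((1 - alpha) * est + alpha * observed)
    return new
```

In use, a scheduler of this kind would call `split_rows` before each batch, time every node, and feed the timings to `update_throughput` so that a node that slows down receives a smaller share of the next batch. A learned predictor such as the FNN described above would replace the smoothing step while the division step could stay the same.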

Author information

Corresponding author

Correspondence to Yuan-Ping Xu.

Additional information

This work was supported by the National Natural Science Foundation of China (No. 61203172), the SSTP of Sichuan (Nos. 2018YYJC0994 and 2017JY0011) and the Shenzhen STPP (No. GJHZ20160301164521358).

Recommended by Associate Editor Hong-Ji Yang

Chao-Long Zhang received the B.Eng. and M.Sc. degrees in software engineering from Chengdu University of Information Technology, China in 2014 and 2017, respectively. He is currently a Ph.D. candidate with the School of Computing and Engineering, University of Huddersfield, UK.

His research interests include high-performance computing (HPC), computer vision, and deep learning network applications.

Yuan-Ping Xu received the B.Eng. degree in computer science and technology from Southwest Jiaotong University, China in 2003, and the M.Sc. and Ph.D. degrees in software engineering from the University of Huddersfield, UK in 2004 and 2009, respectively. From February 2009 to November 2010, he worked as a research fellow in the Centre of Precision Technologies, University of Huddersfield, UK. He is currently a professor with the School of Software Engineering, Chengdu University of Information Technology, China.

His research interests include knowledge-based systems, expert systems, big data analysis and deep learning network applications.

Zhi-Jie Xu received the B.Eng. degree in communication engineering from the Xi’an University of Science and Technology, China in 1991. After graduation, he worked as an electronics engineer before moving to the United Kingdom, where he worked as a research scientist in the Robotics Lab at the University of Derby. He received the Ph.D. degree from the University of Derby in 2000 for his research on virtual reality-based manufacturing simulation and robotics systems. He has been a full-time academic member of staff at the University of Huddersfield, UK since April 1999, serving successively as lecturer, senior lecturer, reader and professor. He has published over one hundred peer-reviewed journal and conference papers and edited 5 books in the relevant fields. He has supervised 8 Ph.D. students to completion while securing substantial research and industrial grants. He is a member of the IEEE, IET and BCS, a fellow of the HEA, and an editor for multiple academic journals and conferences. He is the current President of the Chinese Automation and Computing Society in the United Kingdom.

His research interests include visualization, HCI, vision systems, and machine learning.

Jia He received the B.Eng. and M.Sc. degrees in computer science and technology from Southwest Normal University of China, China in 1989 and 1996, respectively, and the Ph.D. degree in computer science from the University of Electronic Science and Technology of China, China in 2012. She is currently a professor and the Dean of the School of Computer Science, Chengdu University of Information Technology, China.

Her research interests include computer vision, artificial intelligence, and pattern recognition.

Jing Wang received the Ph.D. degree from the University of Huddersfield, UK in 2012. Before 2017, he worked as a research fellow at the University of Huddersfield, UK, carrying out independent research on image processing, analysis and understanding. He is now a lecturer in software engineering and computer science at Sheffield Hallam University.

His research interest is real-world applications of computer vision systems.

Jian-Hua Adu received the B.Sc. degree in applied physics from Minzu University of China, China in 1999, the M.Sc. degree in computer science from Shandong University, China in 2006, and the Ph.D. degree in computer science from Sichuan University, China in 2012. He is currently an associate professor with the School of Software Engineering, Chengdu University of Information Technology, China.

His research interests include image fusion and segmentation, medical image processing and analysis, and pattern recognition.

About this article

Cite this article

Zhang, CL., Xu, YP., Xu, ZJ. et al. A Fuzzy Neural Network Based Dynamic Data Allocation Model on Heterogeneous Multi-GPUs for Large-scale Computations. Int. J. Autom. Comput. 15, 181–193 (2018). https://doi.org/10.1007/s11633-018-1120-4

  • DOI: https://doi.org/10.1007/s11633-018-1120-4

Keywords