
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q

Published: 01 June 2017

Abstract

Deep Neural Networks (DNNs) have recently been shown to significantly outperform existing machine learning techniques in several pattern recognition tasks. DNNs are the state-of-the-art models used in image recognition, object detection, classification and tracking, and speech and language processing applications. The biggest drawback to DNNs has been the enormous cost in computation and time taken to train the parameters of the networks, often a tenfold increase relative to conventional technologies. Such training time costs can be mitigated by the application of parallel computing algorithms and architectures. However, these algorithms often run into difficulties because of inter-processor communication bottlenecks. In this paper, we describe how to enable parallel Deep Neural Network training on the IBM Blue Gene/Q (BG/Q) computer system. Specifically, we explore DNN training using the data-parallel Hessian-free second-order optimization algorithm. Such an algorithm is particularly well suited to parallelization across a large set of loosely coupled processors. BG/Q, with its excellent inter-processor communication characteristics, is an ideal match for this type of algorithm. The paper discusses how issues regarding the programming model and data-dependent imbalances are addressed. Results on large-scale speech tasks show that the performance on BG/Q scales linearly up to 4,096 processes with no loss in accuracy. This allows us to train neural networks using billions of training examples in a few hours.

Information

Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 28, Issue 6
June 2017
276 pages

Publisher

IEEE Press

Publication History

Published: 01 June 2017

Qualifiers

  • Research-article


Cited By

  • (2022) Construction of Flipped Classroom for College English Courses Using Big Data MOOCs and Information System. Mobile Information Systems. 10.1155/2022/9680205. Online publication date: 1-Jan-2022
  • (2022) Teaching Mode in the Management of Higher Vocational Colleges in the Era of Big Data. Mobile Information Systems. 10.1155/2022/8100495. Online publication date: 1-Jan-2022
  • (2022) Three-Dimensional Landscape Rendering and Landscape Spatial Distribution of Traditional Villages Based on Big Data Information System. Mobile Information Systems. 10.1155/2022/4945918. Online publication date: 1-Jan-2022
  • (2021) Accurate mining of location data in the communication field based on big data. Journal of High Speed Networks, 27(3), 251-264. 10.3233/JHS-210665. Online publication date: 1-Jan-2021
  • (2021) Image Recommendation Algorithm Combined with Deep Neural Network Designed for Social Networks. Complexity. 10.1155/2021/5196190. Online publication date: 1-Jan-2021
  • (2021) Effective Scheduler for Distributed DNN Training Based on MapReduce and GPU Cluster. Journal of Grid Computing, 19(1). 10.1007/s10723-021-09550-6. Online publication date: 1-Mar-2021
  • (2020) Deep learning parallel computing and evaluation for embedded system clustering architecture processor. Design Automation for Embedded Systems, 24(3), 145-159. 10.1007/s10617-020-09235-5. Online publication date: 1-Sep-2020
  • (2019) Deep reuse. Proceedings of the ACM International Conference on Supercomputing, 438-448. 10.1145/3330345.3330384. Online publication date: 26-Jun-2019
  • (2019) Demystifying Parallel and Distributed Deep Learning. ACM Computing Surveys, 52(4), 1-43. 10.1145/3320060. Online publication date: 30-Aug-2019
  • (2018) Exploring flexible communications for streamlining DNN ensemble training pipelines. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 1-12. 10.5555/3291656.3291742. Online publication date: 11-Nov-2018
