Abstract
Long short-term memory (LSTM) is an important model for sequential data processing. However, the large amount of matrix computation in the LSTM unit severely slows down training as models grow larger and deeper and more data become available. In this work, we propose an efficient distributed duration-aware LSTM (D-LSTM) for large-scale sequential data analysis. We improve LSTM's training performance from two aspects. First, the duration of each sequence item is exploited to design a computationally efficient cell, called the duration-aware LSTM (D-LSTM) unit. With an additional mask gate, the D-LSTM cell perceives the duration of a sequence item and adopts an adaptive memory update accordingly. Second, on the basis of the D-LSTM unit, a novel distributed training algorithm is proposed, in which the D-LSTM network is divided logically and multiple distributed neurons are introduced to perform the simpler linear calculations concurrently. Unlike the physical division in model parallelism, this logical split based on hidden neurons greatly reduces communication overhead, which is a major bottleneck in distributed training. We evaluate the effectiveness of the proposed method on two video datasets. The experimental results show that our distributed D-LSTM greatly reduces the training time and improves training efficiency for large-scale sequence analysis.
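The D-LSTM equations are not reproduced on this page, so the following is only a minimal NumPy sketch of the idea stated in the abstract: a standard LSTM step augmented with a hypothetical mask gate driven by the duration d_t of the current sequence item, which interpolates between keeping the previous memory and performing the full update. The class and parameter names (DurationAwareLSTMCell, w_d, b_d) are illustrative assumptions, not the authors' implementation.

```python
# Minimal NumPy sketch of a duration-aware LSTM step (illustrative assumption,
# not the authors' exact D-LSTM formulation).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DurationAwareLSTMCell:
    """Standard LSTM cell plus a hypothetical duration-driven mask gate."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        z_dim = input_dim + hidden_dim
        # One weight matrix and bias per gate: input (i), forget (f),
        # output (o) and candidate memory (c).
        self.W = {k: 0.1 * rng.standard_normal((hidden_dim, z_dim)) for k in "ifoc"}
        self.b = {k: np.zeros(hidden_dim) for k in "ifoc"}
        # Parameters of the assumed mask gate, a scalar function of duration.
        self.w_d, self.b_d = 1.0, 0.0

    def step(self, x_t, d_t, h_prev, c_prev):
        z = np.concatenate([x_t, h_prev])
        i = sigmoid(self.W["i"] @ z + self.b["i"])
        f = sigmoid(self.W["f"] @ z + self.b["f"])
        o = sigmoid(self.W["o"] @ z + self.b["o"])
        c_tilde = np.tanh(self.W["c"] @ z + self.b["c"])
        c_full = f * c_prev + i * c_tilde           # ordinary LSTM memory update
        # Assumed mask gate: the longer the item lasts, the more it is allowed
        # to change the memory; short items leave the cell state mostly intact.
        g = sigmoid(self.w_d * d_t + self.b_d)
        c_new = g * c_full + (1.0 - g) * c_prev     # adaptive memory update
        h_new = o * np.tanh(c_new)
        return h_new, c_new

# Example: run five items of differing duration through the cell.
cell = DurationAwareLSTMCell(input_dim=8, hidden_dim=16)
h = c = np.zeros(16)
for x_t, d_t in zip(np.random.default_rng(1).standard_normal((5, 8)),
                    [3.0, 0.5, 1.0, 2.0, 4.0]):
    h, c = cell.step(x_t, d_t, h, c)
```

The distributed training scheme is likewise only described at a high level here. The sketch below assumes that "dividing the network logically" means partitioning the hidden neurons, i.e. the rows of each gate's weight matrix, across workers so that each worker computes its share of the linear map in parallel and only the small activation slices are exchanged; the worker pool is simulated in-process and the helper names are hypothetical.

```python
# Minimal sketch of the assumed logical split: hidden neurons (rows of a gate's
# weight matrix) are partitioned across workers, each worker computes its rows
# of W @ z, and only the resulting activation slices are gathered. The worker
# pool is simulated in-process; this is not the paper's actual system.
import numpy as np

def split_by_hidden_neurons(W, num_workers):
    """Give each worker a contiguous block of output rows (hidden neurons)."""
    return np.array_split(W, num_workers, axis=0)

def distributed_linear(W_parts, z):
    """Each 'worker' computes its rows of W @ z; the slices are concatenated."""
    return np.concatenate([W_k @ z for W_k in W_parts])

# Example: 4 workers share a 128-unit hidden layer over a 96-dim [x_t; h_prev].
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 96))
z = rng.standard_normal(96)
parts = split_by_hidden_neurons(W, num_workers=4)
assert np.allclose(distributed_linear(parts, z), W @ z)
```

Under this assumption the per-step matrix products, which dominate the cost, parallelize without replicating the full weight matrices on every worker, which is consistent with the reduced communication overhead claimed in the abstract.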
Acknowledgment
This work was partly supported by the National Natural Science Foundation of China under Grant No. 61806086 and the China Postdoctoral Science Foundation under Grant No. 2016M601737.