Abstract
Within the family of recurrent neural networks, the long short-term memory (LSTM) network provides promising solutions for many complex applications, such as speech and voice recognition, machine translation, and time series analysis. Building these networks requires setting many tunable hyperparameters in advance. Among these hyperparameters, the activation function greatly influences the learning behavior of the network. The present work proposes a differential evolution algorithm (DEA)-based hierarchical combined activation function to replace the default activation functions of the LSTM cell. A DEA-based neuroevolution method is proposed to discover an optimal combination of activation functions for the LSTM network. To investigate the performance of the proposed neuroevolution method, several experiments were conducted on three human activity recognition datasets using two LSTM networks. The results show that, on each dataset, the activation functions evolved with the DEA outperform the traditional activation functions. The classification accuracy of the proposed LSTM with the DEA-based hierarchical activations is higher than that of other state-of-the-art models in the literature.
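The general idea of searching for a combination of activation functions with differential evolution can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the encoding, the hierarchy, or the fitness measure, so everything below is an assumption made for illustration. The sketch uses a flat weighted sum of three base functions (sigmoid, tanh, ReLU) instead of a hierarchical combination, a standard DE/rand/1/bin scheme, and a toy fitness (squared error against a known target mixture, `TARGET_W`) standing in for the validation performance of an LSTM. The names `combined`, `fitness`, and `differential_evolution` and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


# Assumed pool of base activation functions (not taken from the paper).
BASE = [sigmoid, math.tanh, relu]


def combined(weights, x):
    """Flat weighted sum of the base activations (an illustrative assumption)."""
    return sum(w * g(x) for w, g in zip(weights, BASE))


# Toy stand-in for "train an LSTM and measure validation error":
# squared error against a known target mixture on a fixed input grid.
GRID = [i / 10.0 for i in range(-30, 31)]
TARGET_W = [0.0, 0.5, 0.5]


def fitness(weights):
    errors = [(combined(weights, x) - combined(TARGET_W, x)) ** 2 for x in GRID]
    return sum(errors) / len(GRID)


def differential_evolution(fit, dim, pop_size=20, f=0.6, cr=0.9,
                           generations=200, seed=0):
    """Minimise `fit` with a basic DE/rand/1/bin loop."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fit(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    trial.append(pop[a][j] + f * (pop[b][j] - pop[c][j]))
                else:
                    trial.append(pop[i][j])
            trial_score = fit(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


best_w, best_score = differential_evolution(fitness, dim=len(BASE))
```

In a real neuroevolution setting, `fitness` would presumably train and evaluate a network with the candidate activation, which makes each evaluation far more expensive than in this toy setup and typically forces much smaller populations and generation counts.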
Cite this article
Vijayaprabakaran, K., Sathiyamurthy, K. Neuroevolution based hierarchical activation function for long short-term model network. J Ambient Intell Human Comput 12, 10757–10768 (2021). https://doi.org/10.1007/s12652-020-02889-w