
A Comprehensive Survey and Analysis of CNN-LSTM-Based Approaches for Human Activity Recognition

  • Conference paper
  • First Online:
Proceedings of the 9th Brazilian Technology Symposium (BTSym’23) (BTSym 2023)

Abstract

In recent years, Human Activity Recognition (HAR) has become an increasingly important area of computer vision, with practical applications ranging from surveillance and precise activity tracking to sports analysis and event identification. Machine learning has driven much of this progress, with each family of methods contributing its own strengths to HAR systems: Convolutional Neural Networks (CNNs) provide hierarchical visual feature learning, Graph Neural Networks capture complex relationships within activity data, and Long Short-Term Memory (LSTM) networks model the temporal dependencies that characterize human activities. This article provides an extensive overview of the main categories of HAR and the methodologies employed within each, examining the details of every approach to give researchers a nuanced understanding of the field. Beyond compiling prior work, the survey is intended to help researchers locate relevant references and to inform the direction of future research in Human Activity Recognition.
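
To make the CNN-LSTM pattern discussed above concrete, the following is a minimal PyTorch sketch of a hybrid classifier for sensor-based HAR. It is an illustrative example under stated assumptions, not the specific architectures surveyed in the paper: the input layout (9 inertial channels over 128-sample windows) and the 6 activity classes are hypothetical placeholders. The 1D convolutions extract local motion features, and the LSTM models temporal dependencies across the resulting feature sequence before a linear layer produces class logits.

# Minimal CNN-LSTM sketch for sensor-based HAR (illustrative only).
# Assumed input: windows of shape (batch, channels, timesteps), e.g. 9
# inertial channels over 128 samples, with a hypothetical 6 activity classes.
import torch
import torch.nn as nn


class CNNLSTM(nn.Module):
    def __init__(self, in_channels: int = 9, num_classes: int = 6,
                 cnn_features: int = 64, lstm_hidden: int = 128):
        super().__init__()
        # 1D convolutions learn local (within-window) motion features.
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, cnn_features, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(cnn_features, cnn_features, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # The LSTM models temporal dependencies across the CNN feature sequence.
        self.lstm = nn.LSTM(input_size=cnn_features, hidden_size=lstm_hidden,
                            batch_first=True)
        self.classifier = nn.Linear(lstm_hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, timesteps)
        feats = self.cnn(x)              # (batch, cnn_features, timesteps // 4)
        feats = feats.permute(0, 2, 1)   # (batch, seq_len, cnn_features)
        _, (h_n, _) = self.lstm(feats)   # h_n: (1, batch, lstm_hidden)
        return self.classifier(h_n[-1])  # (batch, num_classes) logits


if __name__ == "__main__":
    model = CNNLSTM()
    dummy = torch.randn(4, 9, 128)       # 4 windows of 9-channel sensor data
    print(model(dummy).shape)            # torch.Size([4, 6])

Vision-based variants follow the same composition, with the 1D convolutions replaced by a 2D CNN applied per frame and the LSTM run over the per-frame feature vectors.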



Author information


Corresponding author

Correspondence to Pablo Minango.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Minango, P., Flores, A., Minango, J., Zambrano, M. (2024). A Comprehensive Survey and Analysis of CNN-LSTM-Based Approaches for Human Activity Recognition. In: Iano, Y., Saotome, O., Kemper Vásquez, G.L., de Moraes Gomes Rosa, M.T., Arthur, R., Gomes de Oliveira, G. (eds) Proceedings of the 9th Brazilian Technology Symposium (BTSym’23). BTSym 2023. Smart Innovation, Systems and Technologies, vol 402. Springer, Cham. https://doi.org/10.1007/978-3-031-66961-3_54


  • DOI: https://doi.org/10.1007/978-3-031-66961-3_54

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-66960-6

  • Online ISBN: 978-3-031-66961-3

  • eBook Packages: Engineering (R0)
