
A Comprehensive Survey and Analysis of CNN-LSTM-Based Approaches for Human Activity Recognition

  • Conference paper
  • First Online:
Proceedings of the 9th Brazilian Technology Symposium (BTSym’23) (BTSym 2023)

Abstract

In recent years, Human Activity Recognition (HAR) has become an increasingly important area of computer vision, with practical applications ranging from surveillance and precise activity tracking to sports analysis and event identification. Machine learning has driven much of this progress, with each family of methods contributing its own strengths to HAR systems: Convolutional Neural Networks (CNNs) provide hierarchical visual feature learning, Graph Neural Networks capture complex relationships within activity data, and Long Short-Term Memory (LSTM) networks model the temporal dependencies that characterize human activities. This article provides an extensive overview of the main categories of HAR and the methodologies employed within each, examining the details of every approach to give researchers a nuanced understanding of the field. Beyond compiling prior work, the survey is intended to help researchers locate relevant references and to inform the direction of future research in Human Activity Recognition.
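
To make the CNN-LSTM pattern discussed above concrete, the following is a minimal PyTorch sketch of a hybrid classifier for sensor-based HAR. It is an illustrative example under stated assumptions, not the specific architectures surveyed in the paper: the input layout (9 inertial channels over 128-sample windows) and the 6 activity classes are hypothetical placeholders. The 1D convolutions extract local motion features, and the LSTM models temporal dependencies across the resulting feature sequence before a linear layer produces class logits.

# Minimal CNN-LSTM sketch for sensor-based HAR (illustrative only).
# Assumed input: windows of shape (batch, channels, timesteps), e.g. 9
# inertial channels over 128 samples, with a hypothetical 6 activity classes.
import torch
import torch.nn as nn


class CNNLSTM(nn.Module):
    def __init__(self, in_channels: int = 9, num_classes: int = 6,
                 cnn_features: int = 64, lstm_hidden: int = 128):
        super().__init__()
        # 1D convolutions learn local (within-window) motion features.
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, cnn_features, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(cnn_features, cnn_features, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # The LSTM models temporal dependencies across the CNN feature sequence.
        self.lstm = nn.LSTM(input_size=cnn_features, hidden_size=lstm_hidden,
                            batch_first=True)
        self.classifier = nn.Linear(lstm_hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, timesteps)
        feats = self.cnn(x)              # (batch, cnn_features, timesteps // 4)
        feats = feats.permute(0, 2, 1)   # (batch, seq_len, cnn_features)
        _, (h_n, _) = self.lstm(feats)   # h_n: (1, batch, lstm_hidden)
        return self.classifier(h_n[-1])  # (batch, num_classes) logits


if __name__ == "__main__":
    model = CNNLSTM()
    dummy = torch.randn(4, 9, 128)       # 4 windows of 9-channel sensor data
    print(model(dummy).shape)            # torch.Size([4, 6])

Vision-based variants follow the same composition, with the 1D convolutions replaced by a 2D CNN applied per frame and the LSTM run over the per-frame feature vectors.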



Author information


Corresponding author

Correspondence to Pablo Minango.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Minango, P., Flores, A., Minango, J., Zambrano, M. (2024). A Comprehensive Survey and Analysis of CNN-LSTM-Based Approaches for Human Activity Recognition. In: Iano, Y., Saotome, O., Kemper Vásquez, G.L., de Moraes Gomes Rosa, M.T., Arthur, R., Gomes de Oliveira, G. (eds) Proceedings of the 9th Brazilian Technology Symposium (BTSym’23). BTSym 2023. Smart Innovation, Systems and Technologies, vol 402. Springer, Cham. https://doi.org/10.1007/978-3-031-66961-3_54


  • DOI: https://doi.org/10.1007/978-3-031-66961-3_54

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-66960-6

  • Online ISBN: 978-3-031-66961-3

  • eBook Packages: Engineering (R0)
