Search Results (170)

Search Parameters:
Keywords = MediaPipe

19 pages, 8196 KiB  
Article
Human–Robot Interaction Using Dynamic Hand Gesture for Teleoperation of Quadruped Robots with a Robotic Arm
by Jianan Xie, Zhen Xu, Jiayu Zeng, Yuyang Gao and Kenji Hashimoto
Electronics 2025, 14(5), 860; https://doi.org/10.3390/electronics14050860 - 21 Feb 2025
Abstract
Human–Robot Interaction (HRI) using hand gesture recognition offers an effective and non-contact approach to enhancing operational intuitiveness and user convenience. However, most existing studies primarily focus on either static sign language recognition or the tracking of hand position and orientation in space. These approaches often prove inadequate for controlling complex robotic systems. This paper proposes an advanced HRI system leveraging dynamic hand gestures for controlling quadruped robots equipped with a robotic arm. The proposed system integrates both semantic and pose information from dynamic gestures to enable comprehensive control over the robot’s diverse functionalities. First, a Depth–MediaPipe framework is introduced to facilitate the precise three-dimensional (3D) coordinate extraction of 21 hand bone keypoints. Subsequently, a Semantic-Pose to Motion (SPM) model is developed to analyze and interpret both the pose and semantic aspects of hand gestures. This model translates the extracted 3D coordinate data into corresponding mechanical actions in real time, encompassing quadruped robot locomotion, robotic arm end-effector tracking, and semantic-based command switching. Extensive real-world experiments demonstrate the proposed system’s effectiveness in achieving real-time interaction and precise control, underscoring its potential for enhancing the usability of complex robotic platforms.
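The Depth–MediaPipe framework above pairs MediaPipe’s 21 hand keypoints with depth sensing to obtain 3D coordinates. A minimal sketch of that idea, assuming an aligned metric depth map and known pinhole intrinsics (fx, fy, cx, cy); the paper’s actual pipeline may differ:

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=1)

def hand_keypoints_3d(frame_bgr, depth_m, fx, fy, cx, cy):
    """Back-project MediaPipe's 21 hand keypoints into camera space.
    depth_m: metric depth map aligned with frame_bgr (assumption)."""
    res = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not res.multi_hand_landmarks:
        return None
    h, w = depth_m.shape
    points = []
    for lm in res.multi_hand_landmarks[0].landmark:
        u = min(max(int(lm.x * w), 0), w - 1)   # pixel column of keypoint
        v = min(max(int(lm.y * h), 0), h - 1)   # pixel row of keypoint
        z = float(depth_m[v, u])                # metric depth at that pixel
        points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points                               # 21 (X, Y, Z) tuples
```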

20 pages, 4882 KiB  
Article
Empowering Recovery: The T-Rehab System’s Semi-Immersive Approach to Emotional and Physical Well-Being in Tele-Rehabilitation
by Hayette Hadjar, Binh Vu and Matthias Hemmje
Electronics 2025, 14(5), 852; https://doi.org/10.3390/electronics14050852 - 21 Feb 2025
Abstract
The T-Rehab System delivers a semi-immersive tele-rehabilitation experience by integrating Affective Computing (AC) through facial expression analysis and contactless heartbeat monitoring. T-Rehab closely monitors patients’ mental health as they engage in a personalized, semi-immersive Virtual Reality (VR) game on a desktop PC, using a webcam with MediaPipe to track their hand movements for interactive exercises, allowing the system to tailor treatment content for increased engagement and comfort. T-Rehab’s evaluation comprises two assessments: system performance and cognitive walkthroughs. The first evaluation focuses on system performance, assessing the tested game, middleware, and facial emotion monitoring to ensure hardware compatibility and effective support for AC, gaming, and tele-rehabilitation. The second evaluation uses cognitive walkthroughs to examine usability, identifying potential issues in emotion detection and tele-rehabilitation. Together, these evaluations provide insights into T-Rehab’s functionality, usability, and impact in supporting both physical rehabilitation and emotional well-being. The thorough integration of technology inside T-Rehab ensures a holistic approach to tele-rehabilitation, allowing patients to participate comfortably and efficiently from anywhere. This technique not only improves physical therapy outcomes but also promotes mental resilience, marking an important step forward in tele-rehabilitation practices.
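The abstract does not detail how contactless heartbeat monitoring works. A common webcam-based technique is remote photoplethysmography (rPPG), which tracks the mean green intensity of a facial region over time; the sketch below is that generic approach under assumed band limits, not T-Rehab’s implementation:

```python
import numpy as np

def estimate_bpm(green_means, fps):
    """Naive rPPG: dominant frequency of the mean-green facial signal.
    green_means: several seconds of per-frame mean green values (assumption)."""
    sig = np.asarray(green_means, dtype=float)
    sig -= sig.mean()                              # remove DC component
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fps)
    power = np.abs(np.fft.rfft(sig)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)         # 42-240 BPM plausibility band
    return 60.0 * freqs[band][np.argmax(power[band])]
```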

24 pages, 7979 KiB  
Article
Vision-Based Hand Gesture Recognition Using a YOLOv8n Model for the Navigation of a Smart Wheelchair
by Thanh-Hai Nguyen, Ba-Viet Ngo and Thanh-Nghia Nguyen
Electronics 2025, 14(4), 734; https://doi.org/10.3390/electronics14040734 - 13 Feb 2025
Abstract
Electric wheelchairs are the primary means of transportation that enable individuals with disabilities to move independently to their desired locations. This paper introduces a novel, low-cost smart wheelchair system designed to enhance the mobility of individuals with severe disabilities through hand gesture recognition. Additionally, the system aims to support low-income individuals who previously lacked access to smart wheelchairs. Unlike existing methods that rely on expensive hardware or complex systems, the proposed system utilizes an affordable webcam and an Nvidia Jetson Nano embedded computer to process and recognize six distinct hand gestures—“Forward 1”, “Forward 2”, “Backward”, “Left”, “Right”, and “Stop”—to assist with wheelchair navigation. The system employs the “You Only Look Once version 8n” (YOLOv8n) model, which is well suited for low-spec embedded computers, trained on a self-collected hand gesture dataset containing 12,000 images. The pre-processing phase utilizes the MediaPipe library to generate landmark hand images, remove the background, and then extract the region of interest (ROI) of the hand gestures, significantly improving gesture recognition accuracy compared to previous methods that relied solely on hand images. Experimental results demonstrate impressive performance, achieving 99.3% gesture recognition accuracy and 93.8% overall movement accuracy in diverse indoor and outdoor environments. Furthermore, this paper presents a control circuit system that can be easily installed on any existing electric wheelchair. This approach offers a cost-effective, real-time solution that enhances the autonomy of individuals with severe disabilities in daily activities, laying the foundation for the development of affordable smart wheelchairs.
(This article belongs to the Special Issue Human-Computer Interactions in E-health)
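The pre-processing described above (landmark images, background removal, hand-ROI extraction) can be approximated with MediaPipe’s drawing utilities; the padding and single-hand settings below are illustrative assumptions, not the authors’ exact configuration:

```python
import cv2
import numpy as np
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils
hands = mp_hands.Hands(static_image_mode=True, max_num_hands=1)

def landmark_roi(frame_bgr, pad=20):
    """Draw the hand skeleton on a black canvas and crop the hand ROI."""
    res = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not res.multi_hand_landmarks:
        return None
    h, w = frame_bgr.shape[:2]
    canvas = np.zeros_like(frame_bgr)               # background removed
    hand = res.multi_hand_landmarks[0]
    mp_draw.draw_landmarks(canvas, hand, mp_hands.HAND_CONNECTIONS)
    xs = [int(p.x * w) for p in hand.landmark]
    ys = [int(p.y * h) for p in hand.landmark]
    x0, x1 = max(min(xs) - pad, 0), min(max(xs) + pad, w)
    y0, y1 = max(min(ys) - pad, 0), min(max(ys) + pad, h)
    return canvas[y0:y1, x0:x1]                     # training input for YOLOv8n
```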

25 pages, 2844 KiB  
Article
Real-Time Gesture-Based Hand Landmark Detection for Optimized Mobile Photo Capture and Synchronization
by Pedro Marques, Paulo Váz, José Silva, Pedro Martins and Maryam Abbasi
Electronics 2025, 14(4), 704; https://doi.org/10.3390/electronics14040704 - 12 Feb 2025
Abstract
Gesture recognition technology has emerged as a transformative solution for natural and intuitive human–computer interaction (HCI), offering touch-free operation across diverse fields such as healthcare, gaming, and smart home systems. In mobile contexts, where hygiene, convenience, and the ability to operate under resource constraints are critical, hand gesture recognition provides a compelling alternative to traditional touch-based interfaces. However, implementing effective gesture recognition in real-world mobile settings involves challenges such as limited computational power, varying environmental conditions, and the requirement for robust offline–online data management. In this study, we introduce ThumbsUp, a gesture-driven system, and employ a partially systematic literature review approach (inspired by core PRISMA guidelines) to identify the key research gaps in mobile gesture recognition. By incorporating insights from deep learning–based methods (e.g., CNNs and Transformers) while focusing on low resource consumption, we leverage Google’s MediaPipe in our framework for real-time detection of 21 hand landmarks and adaptive lighting pre-processing, enabling accurate recognition of a “thumbs-up” gesture. The system features a secure queue-based offline–cloud synchronization model, which ensures that the captured images and metadata (encrypted with AES-GCM) remain consistent and accessible even with intermittent connectivity. Experimental results under dynamic lighting, distance variations, and partially cluttered environments confirm the system’s superior low-light performance and decreased resource consumption compared to baseline camera applications. Additionally, we highlight the feasibility of extending ThumbsUp to incorporate AI-driven enhancements for abrupt lighting changes and, in the future, electromyographic (EMG) signals for users with motor impairments. Our comprehensive evaluation demonstrates that ThumbsUp maintains robust performance on typical mobile hardware, showing resilience to unstable network conditions and minimal reliance on high-end GPUs. These findings offer new perspectives for deploying gesture-based interfaces in the broader IoT ecosystem, thus paving the way toward secure, efficient, and inclusive mobile HCI solutions.
(This article belongs to the Special Issue AI-Driven Digital Image Processing: Latest Advances and Prospects)
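Once MediaPipe’s 21 hand landmarks are available, a “thumbs-up” can be detected with a simple geometric rule; the heuristic below is illustrative, not the ThumbsUp system’s actual detector:

```python
def is_thumbs_up(lm):
    """Heuristic: thumb extended upward, other four fingers folded.
    lm: MediaPipe's 21 hand landmarks (normalized; y grows downward)."""
    thumb_up = lm[4].y < lm[3].y < lm[2].y          # tip above IP above MCP
    folded = all(lm[tip].y > lm[pip].y              # fingertip below PIP joint
                 for tip, pip in ((8, 6), (12, 10), (16, 14), (20, 18)))
    return thumb_up and folded
```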

18 pages, 1037 KiB  
Article
Optimisation and Comparison of Markerless and Marker-Based Motion Capture Methods for Hand and Finger Movement Analysis
by Valentin Maggioni, Christine Azevedo-Coste, Sam Durand and François Bailly
Sensors 2025, 25(4), 1079; https://doi.org/10.3390/s25041079 - 11 Feb 2025
Abstract
Ensuring the accurate tracking of hand and finger movements is an ongoing challenge for upper limb rehabilitation assessment, as the high number of degrees of freedom and segments in the limited volume of the hand makes this a difficult task. The objective of this study is to evaluate the performance of two markerless approaches (the Leap Motion Controller and the Google MediaPipe API) in comparison to a marker-based one, and to improve the precision of the markerless methods by introducing additional data processing algorithms fusing multiple recording devices. Fifteen healthy participants were instructed to perform five distinct hand movements while being recorded by the three motion capture methods simultaneously. The captured movement data from each device was analyzed using a skeletal model of the hand through the inverse kinematics method of the OpenSim software. Finally, the root mean square errors of the angles formed by each finger segment were calculated for the markerless and marker-based motion capture methods to compare their accuracy. Our results indicate that the MediaPipe-based setup is more accurate than the Leap Motion Controller-based one (average root mean square error of 10.9° versus 14.7°), showing promising results for the use of markerless methods in clinical applications.
(This article belongs to the Collection Sensors for Gait, Human Movement Analysis, and Health Monitoring)
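The comparison above reduces each capture method to finger-segment angles and scores them by RMSE. The study derives its angles through OpenSim inverse kinematics; the sketch below uses the standard three-point angle definition instead, so it is a simplification under stated assumptions:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) between segments b->a and b->c."""
    u, v = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def angle_rmse(est_deg, ref_deg):
    """RMSE (degrees) between estimated and marker-based angle series."""
    err = np.asarray(est_deg) - np.asarray(ref_deg)
    return float(np.sqrt(np.mean(err ** 2)))
```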

20 pages, 8021 KiB  
Article
CNN 1D: A Robust Model for Human Pose Estimation
by Mercedes Hernández de la Cruz, Uriel Solache, Antonio Luna-Álvarez, Sergio Ricardo Zagal-Barrera, Daniela Aurora Morales López and Dante Mujica-Vargas
Information 2025, 16(2), 129; https://doi.org/10.3390/info16020129 - 10 Feb 2025
Abstract
The purpose of this research is to develop an efficient model for human pose estimation (HPE). The main limitations of the study include the small size of the dataset and confounds in the classification of certain poses, suggesting the need for more data to improve the robustness of the model in uncontrolled environments. The methodology used combines MediaPipe for the detection of key points in images with a CNN1D model that processes preprocessed feature sequences. The Yoga Poses dataset was used for the training and validation of the model, and resampling techniques, such as bootstrapping, were applied to improve accuracy and avoid overfitting in the training. The results show that the proposed model achieves 96% overall accuracy in the classification of five yoga poses, with accuracy metrics above 90% for all classes. The implementation of the CNN1D model instead of traditional 2D or 3D architectures accomplishes the goal of maintaining a low computational cost and efficient preprocessing of the images, allowing for its use on mobile devices and in real-time environments.
(This article belongs to the Section Artificial Intelligence)
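A CNN1D over keypoint vectors is far cheaper than 2D/3D image CNNs, which is the efficiency argument made above. A sketch of such a classifier over MediaPipe Pose’s 33 landmarks flattened to 99 (x, y, z) features; the layer sizes and feature layout are illustrative assumptions, not the paper’s architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn1d(n_features=99, n_classes=5):
    """1D CNN over a flattened 33x3 keypoint vector (assumed layout)."""
    return tf.keras.Sequential([
        layers.Input(shape=(n_features, 1)),
        layers.Conv1D(32, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(64, 3, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_cnn1d()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```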

27 pages, 5537 KiB  
Article
Real-Time Gaze Estimation Using Webcam-Based CNN Models for Human–Computer Interactions
by Visal Vidhya and Diego Resende Faria
Computers 2025, 14(2), 57; https://doi.org/10.3390/computers14020057 - 10 Feb 2025
Abstract
Gaze tracking and estimation are essential for understanding human behavior and enhancing human–computer interactions. This study introduces an innovative, cost-effective solution for real-time gaze tracking using a standard webcam, providing a practical alternative to conventional methods that rely on expensive infrared (IR) cameras. Traditional approaches, such as Pupil Center Corneal Reflection (PCCR), require IR cameras to capture corneal reflections and iris glints, demanding high-resolution images and controlled environments. In contrast, the proposed method utilizes a convolutional neural network (CNN) trained on webcam-captured images to achieve precise gaze estimation. The developed deep learning model achieves a mean squared error (MSE) of 0.0112 and an accuracy of 90.98% through a novel trajectory-based accuracy evaluation system. This system involves an animation of a ball moving across the screen, with the user’s gaze following the ball’s motion. Accuracy is determined by calculating the proportion of gaze points falling within a predefined threshold based on the ball’s radius, ensuring a comprehensive evaluation of the system’s performance across all screen regions. Data collection is both simplified and effective, capturing images of the user’s right eye while they focus on the screen. Additionally, the system includes advanced gaze analysis tools, such as heat maps, gaze fixation tracking, and blink rate monitoring, which are all integrated into an intuitive user interface. The robustness of this approach is further enhanced by incorporating Google’s MediaPipe model for facial landmark detection, improving accuracy and reliability. The evaluation results demonstrate that the proposed method delivers high-accuracy gaze prediction without the need for expensive equipment, making it a practical and accessible solution for diverse applications in human–computer interactions and behavioral research.
(This article belongs to the Special Issue Machine Learning Applications in Pattern Recognition)
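The trajectory-based accuracy evaluation described above reduces to the share of gaze samples falling within a radius-derived threshold of the moving ball. A minimal sketch; the threshold multiplier k is an assumed parameter:

```python
import numpy as np

def trajectory_accuracy(gaze_xy, ball_xy, radius, k=1.5):
    """Proportion of gaze points within k*radius of the ball's centre.
    gaze_xy, ball_xy: (N, 2) arrays of per-frame screen coordinates."""
    d = np.linalg.norm(np.asarray(gaze_xy) - np.asarray(ball_xy), axis=1)
    return float(np.mean(d <= k * radius))
```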

23 pages, 20134 KiB  
Article
The Development and Validation of an Artificial Intelligence Model for Estimating Thumb Range of Motion Using Angle Sensors and Machine Learning: Targeting Radial Abduction, Palmar Abduction, and Pronation Angles
by Yutaka Ehara, Atsuyuki Inui, Yutaka Mifune, Kohei Yamaura, Tatsuo Kato, Takahiro Furukawa, Shuya Tanaka, Masaya Kusunose, Shunsaku Takigami, Shin Osawa, Daiji Nakabayashi, Shinya Hayashi, Tomoyuki Matsumoto, Takehiko Matsushita and Ryosuke Kuroda
Appl. Sci. 2025, 15(3), 1296; https://doi.org/10.3390/app15031296 - 27 Jan 2025
Abstract
An accurate assessment of thumb range of motion is crucial for diagnosing musculoskeletal conditions, evaluating functional impairments, and planning effective rehabilitation strategies. In this study, we aimed to enhance the accuracy of estimating thumb range of motion using a combination of MediaPipe, which is an AI-based posture estimation library, and machine learning methods, taking the values obtained using angle sensors to be the true values. Radial abduction, palmar abduction, and pronation angles were estimated using MediaPipe based on coordinates detected from videos of 18 healthy participants (nine males and nine females with an age range of 30–49 years) selected to reflect a balanced distribution of height and other physical characteristics. A conical thumb movement model was constructed, and parameters were generated based on the coordinate data. Five machine learning models were evaluated, with LightGBM achieving the highest accuracy across all metrics. Specifically, for radial abduction, palmar abduction, and pronation, the root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R2), and correlation coefficient were 4.67°, 3.41°, 0.94, and 0.97; 4.63°, 3.41°, 0.95, and 0.98; and 5.69°, 4.17°, 0.88, and 0.94, respectively. These results demonstrate that when estimating thumb range of motion, the AI model trained using angle sensor data and LightGBM achieved accuracy that was high and comparable to that of prior methods involving the use of MediaPipe and a protractor.
(This article belongs to the Special Issue Research on Machine Learning in Computer Vision)
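The evaluation pipeline (a LightGBM regressor trained against sensor-measured angles and scored by RMSE, MAE, R², and correlation) can be sketched as below; the synthetic data and hyperparameters are placeholders, not the study’s conical-model features:

```python
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                       # placeholder model features
y = X @ rng.normal(size=8) + rng.normal(scale=2.0, size=500)  # placeholder angles

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)  # assumed settings
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

rmse = float(np.sqrt(mean_squared_error(y_te, pred)))
mae = mean_absolute_error(y_te, pred)
r2 = r2_score(y_te, pred)
corr = float(np.corrcoef(y_te, pred)[0, 1])
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.2f}  r={corr:.2f}")
```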

18 pages, 4133 KiB  
Article
An Investigation of Hand Gestures for Controlling Video Games in a Rehabilitation Exergame System
by Radhiatul Husna, Komang Candra Brata, Irin Tri Anggraini, Nobuo Funabiki, Alfiandi Aulia Rahmadani and Chih-Peng Fan
Computers 2025, 14(1), 25; https://doi.org/10.3390/computers14010025 - 15 Jan 2025
Abstract
Musculoskeletal disorders (MSDs) can significantly impact individuals’ quality of life (QoL), often requiring effective rehabilitation strategies to promote recovery. However, traditional rehabilitation methods can be expensive and may lack engagement, leading to poor adherence to therapy exercise routines. An exergame system can be a solution to this problem. In this paper, we investigate appropriate hand gestures for controlling video games in a rehabilitation exergame system. The MediaPipe Python library is adopted for the real-time recognition of gestures. We choose 10 easy gestures from among 32 possible simple gestures, and then identify and compare the best and second-best gesture groups for controlling the game. Comprehensive experiments are conducted with 16 students at Andalas University, Indonesia, to find appropriate gestures and evaluate user experiences of the system using the System Usability Scale (SUS) and User Experience Questionnaire (UEQ). The results show that the hand gestures in the best group are more accessible than those in the second-best group. The results suggest appropriate hand gestures for game controls and confirm the proposal’s validity. In future work, we plan to enhance the exergame system by integrating a diverse set of video games, while expanding its application to a broader and more diverse sample. We will also study other practical applications of the hand gesture control function.
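The System Usability Scale referenced above has a fixed scoring rule: ten 1–5 Likert items with alternating polarity, scaled to 0–100. A standard implementation:

```python
def sus_score(responses):
    """SUS: responses is a list of ten 1-5 Likert answers (items 1..10).
    Odd items contribute (r - 1), even items (5 - r); the sum is scaled by 2.5."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r      # i = 0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))   # example: 85.0
```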

18 pages, 13488 KiB  
Article
Hydrothermal Coupling Analysis of Frozen Soil Temperature Field in Stage of Pipe Roof Freezing Method
by Xin Feng, Jun Hu, Jie Zhou, Shuai Zhang and Ying Wang
Sustainability 2025, 17(2), 620; https://doi.org/10.3390/su17020620 - 15 Jan 2025
Abstract
Taking the Sanya River Mouth Channel project as a case study, this research explores the minimum brine temperature required for the pipe-jacking freezing method during staged freezing. Based on the heat transfer theory of porous media, a three-dimensional model of the actual working conditions was established using COMSOL 6.1 finite element software. By adjusting the brine cooling scheme, the development and distribution patterns of the freezing curtain under different brine temperatures were analyzed. The results indicate that as the staged freezing brine temperature increases, the thickness of the freezing curtain decreases linearly, and the closure of isotherms is inhibited. When the brine temperature is −8 °C, the thickness of the freezing curtain meets the minimum requirement and effectively achieves the freezing effect under both low and high seepage flow conditions. Additionally, seepage significantly affects the formation of the freezing curtain, causing it to shift towards the direction of seepage, with the degree of shift becoming more pronounced as the seepage velocity increases. When the seepage velocity is so high that the thickness of the freezing curtain on one side is less than 2 m, the impact of seepage on the freezing curtain can be reduced by decreasing the hydraulic head difference in the freezing area or by increasing the arrangement of freezing pipes, thereby enhancing the freezing effect.

19 pages, 20282 KiB  
Article
Design of a System for Driver Drowsiness Detection and Seat Belt Monitoring Using Raspberry Pi 4 and Arduino Nano
by Anthony Alvarez Oviedo, Jhojan Felipe Mamani Villanueva, German Alberto Echaiz Espinoza, Juan Moises Mauricio Villanueva, Andrés Ortiz Salazar and Elmer Rolando Llanos Villarreal
Designs 2025, 9(1), 11; https://doi.org/10.3390/designs9010011 - 13 Jan 2025
Abstract
This research explores the design of a system for monitoring driver drowsiness and supervising seat belt usage in interprovincial buses. In Peru, road accidents involving long-distance bus transportation amounted to 5449 in 2022, and the human factor plays a significant role. It is essential to understand how the use of non-invasive sensors for monitoring and supervising passengers and drivers can enhance safety in interprovincial transportation. The objective of this research is to develop a system using a Raspberry Pi 4 and Arduino Nano that allows for the storage of monitoring data. To achieve this, a conventional camera and MediaPipe were used for driver drowsiness detection, while passenger supervision was carried out using a combination of commercially available sensors as well as custom-built sensors. RS485 communication was utilized to store data related to both the driver and passengers. The simulations conducted demonstrate a high level of reliability in detecting driver drowsiness under specific conditions and the correct operation of the sensors for passenger supervision. Therefore, the proposed system is feasible and can be implemented for real-world testing. The implications of this research suggest that the system’s cost is not a barrier to its implementation, thus contributing to improved safety in interprovincial transportation.
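The abstract does not specify how drowsiness is inferred from the camera and MediaPipe. A widely used indicator with facial landmarks is the eye aspect ratio (EAR), which drops toward zero when the eyes close; the sketch below is that generic approach, offered as an assumption rather than the authors’ method:

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from six eye landmarks ordered p1..p6 (Soukupova & Cech, 2016)."""
    p = [np.asarray(pt, dtype=float) for pt in eye]
    vertical = np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])
    horizontal = np.linalg.norm(p[0] - p[3])
    return vertical / (2.0 * horizontal)

# A drowsiness alarm could fire when EAR stays below ~0.2 for a sustained
# window (e.g., 1-2 s); both thresholds are illustrative assumptions.
```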

19 pages, 599 KiB  
Article
Semaphore Recognition Using Deep Learning
by Yan Huan and Weiqi Yan
Electronics 2025, 14(2), 286; https://doi.org/10.3390/electronics14020286 - 12 Jan 2025
Abstract
This study explored the application of deep learning models for signal flag recognition, comparing YOLO11 with a basic CNN, ResNet18, and DenseNet121. Experimental results demonstrated that YOLO11 outperformed the other models, achieving superior performance across all common evaluation metrics. The confusion matrix further confirmed that YOLO11 exhibited the highest classification accuracy among the tested models. Moreover, by integrating MediaPipe’s human posture data with image data to create multimodal inputs for training, it was observed that the posture data significantly enhanced the model’s performance. Leveraging MediaPipe’s posture data for annotation generation and model training enabled YOLO11 to achieve an impressive 99% accuracy on the test set. This study highlights the effectiveness of YOLO11 for flag signal recognition tasks. Furthermore, it demonstrates that when handling tasks involving human posture, MediaPipe not only enhances model performance through posture feature data but also facilitates data processing and contributes to validating prediction results in subsequent stages.
(This article belongs to the Special Issue Advances in Embedded Deep Learning Systems)
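One way MediaPipe’s posture data can facilitate annotation generation, as claimed above, is to seed YOLO-format boxes from detected wrist keypoints (where semaphore flags are held); the box size and placement below are illustrative assumptions, not the study’s procedure:

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
pose = mp_pose.Pose(static_image_mode=True)

def wrist_boxes_yolo(frame_bgr, box_frac=0.15):
    """Emit normalized YOLO boxes (cx, cy, w, h) centred on both wrists,
    as a starting point for semi-automatic flag annotation."""
    res = pose.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not res.pose_landmarks:
        return []
    lm = res.pose_landmarks.landmark
    wrists = (lm[mp_pose.PoseLandmark.LEFT_WRIST],
              lm[mp_pose.PoseLandmark.RIGHT_WRIST])
    return [(w.x, w.y, box_frac, box_frac) for w in wrists]
```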

22 pages, 9192 KiB  
Article
A Deep-Learning-Driven Aerial Dialing PIN Code Input Authentication System via Personal Hand Features
by Jun Wang, Haojie Wang, Kiminori Sato and Bo Wu
Electronics 2025, 14(1), 119; https://doi.org/10.3390/electronics14010119 - 30 Dec 2024
Abstract
Dialing-type authentication, a common PIN code input scheme, has gained popularity due to its simple and intuitive design. However, this type of system carries the security risk of “shoulder surfing”, whereby attackers can physically view the device screen and keypad to obtain personal information. Therefore, based on the use of the “Leap Motion” device and “Media Pipe” solutions, in this paper we propose a new contactless two-factor dialing-type input authentication system powered by aerial hand motions and features. To be specific, for the aerial dialing part of the system, which serves as the first authentication factor, we constructed two hand motion input subsystems, one using Leap Motion and one using Media Pipe. The results of FRR (False Rejection Rate) and FAR (False Acceptance Rate) experiments on the two subsystems show that Media Pipe is more comprehensive and superior in terms of applicability, accuracy, and speed. Moreover, as the second authentication factor, the user’s hand features (e.g., proportional characteristics associated with fingers and palm) were used to train a specialized CNN-LSTM model, ultimately achieving satisfactory accuracy.
(This article belongs to the Special Issue Biometrics and Pattern Recognition)
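The “proportional characteristics associated with fingers and palm” could, for instance, be scale-invariant ratios of fingertip-to-wrist distances to palm size; the feature set below is an illustrative guess, not the paper’s published features:

```python
import numpy as np

def finger_palm_ratios(lm):
    """Scale-invariant hand features: fingertip-to-wrist distances divided
    by palm size (wrist to middle-finger MCP). lm: 21 MediaPipe landmarks."""
    pts = np.array([(p.x, p.y) for p in lm])
    palm = np.linalg.norm(pts[9] - pts[0])          # wrist -> middle MCP
    tips = (4, 8, 12, 16, 20)                       # thumb..pinky fingertips
    return [float(np.linalg.norm(pts[t] - pts[0]) / palm) for t in tips]
```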

17 pages, 3748 KiB  
Article
Sudden Fall Detection of Human Body Using Transformer Model
by Duncan Kibet, Min Seop So, Hahyeon Kang, Yongsu Han and Jong-Ho Shin
Sensors 2024, 24(24), 8051; https://doi.org/10.3390/s24248051 - 17 Dec 2024
Abstract
In human activity recognition, accurate and timely fall detection is essential in healthcare, particularly for monitoring the elderly, where quick responses can prevent severe consequences. This study presents a new fall detection model built on a transformer architecture, which focuses on the movement speeds of key body points tracked using the MediaPipe library. By continuously monitoring these key points in video data, the model calculates real-time speed changes that signal potential falls. The transformer’s attention mechanism enables it to catch even slight shifts in movement, achieving an accuracy of 97.6% while significantly reducing false alarms compared to traditional methods. This approach has practical applications in settings like elderly care facilities and home monitoring systems, where reliable fall detection can support faster intervention. By homing in on the dynamics of movement, this model improves both accuracy and reliability, making it suitable for various real-world situations. Overall, it offers a promising solution for enhancing safety and care for vulnerable populations in diverse environments.
(This article belongs to the Section Intelligent Sensors)
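The model’s input is built from per-keypoint movement speeds, which follow directly from consecutive MediaPipe pose results; a minimal sketch assuming 2D normalized coordinates:

```python
import numpy as np

def keypoint_speeds(prev_lms, curr_lms, dt):
    """Per-keypoint speed (normalized units/s) between consecutive frames.
    prev_lms, curr_lms: MediaPipe pose landmark lists from two frames."""
    p = np.array([(l.x, l.y) for l in prev_lms])
    c = np.array([(l.x, l.y) for l in curr_lms])
    return np.linalg.norm(c - p, axis=1) / dt       # one speed per keypoint
```

Stacking these speed vectors over a sliding window yields the sequence the transformer’s attention mechanism operates on.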

18 pages, 2211 KiB  
Article
Accuracy Evaluation of 3D Pose Reconstruction Algorithms Through Stereo Camera Information Fusion for Physical Exercises with MediaPipe Pose
by Sebastian Dill, Arjang Ahmadi, Martin Grimmer, Dennis Haufe, Maurice Rohr, Yanhua Zhao, Maziar Sharbafi and Christoph Hoog Antink
Sensors 2024, 24(23), 7772; https://doi.org/10.3390/s24237772 - 4 Dec 2024
Abstract
In recent years, significant research has been conducted on video-based human pose estimation (HPE). While monocular two-dimensional (2D) HPE has been shown to achieve high performance, monocular three-dimensional (3D) HPE poses a more challenging problem. However, since human motion happens in a 3D space, 3D HPE offers a more accurate representation of the human, granting increased usability for complex tasks like analysis of physical exercise. We propose a method based on MediaPipe Pose, 2D HPE on stereo cameras and a fusion algorithm without prior stereo calibration to reconstruct 3D poses, combining the advantages of high accuracy in 2D HPE with the increased usability of 3D coordinates. We evaluate this method on a self-recorded database focused on physical exercise to research what accuracy can be achieved and whether this accuracy is sufficient to recognize errors in exercise performance. We find that our method achieves significantly improved performance compared to monocular 3D HPE (median RMSE of 30.1 compared to 56.3, p-value below 10⁻⁶) and can show that the performance is sufficient for error recognition.
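The reported median RMSE compares reconstructed 3D joint trajectories against a reference; one consistent reading (RMSE per joint over time, then the median across joints) is sketched below, as an assumed formulation rather than the paper’s exact aggregation:

```python
import numpy as np

def median_pose_rmse(est, ref):
    """est, ref: (frames, joints, 3) arrays of 3D joint positions.
    Returns the median across joints of each joint's RMSE over time."""
    err = np.linalg.norm(np.asarray(est) - np.asarray(ref), axis=-1)  # (F, J)
    per_joint_rmse = np.sqrt(np.mean(err ** 2, axis=0))               # (J,)
    return float(np.median(per_joint_rmse))
```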
