Search Results (323)

Search Parameters:
Keywords = RGBD camera

28 pages, 38236 KiB  
Article
Disassembly of Distribution Transformers Based on Multimodal Data Recognition and Collaborative Processing
by Li Wang, Feng Chen, Yujia Hu, Zhiyao Zheng and Kexin Zhang
Algorithms 2024, 17(12), 595; https://doi.org/10.3390/a17120595 - 23 Dec 2024
Abstract
As power system equipment gradually ages, the automated disassembly of transformers has become a critical area of research to enhance both efficiency and safety. This paper presents a transformer disassembly system designed for power systems, leveraging multimodal perception and collaborative processing. By integrating 2D images and 3D point cloud data captured by RGB-D cameras, the system enables the precise recognition and efficient disassembly of transformer covers and internal components through multimodal data fusion, deep learning models, and control technologies. The system employs an enhanced YOLOv8 model for positioning and identifying screw-fastened covers while also utilizing the STDC network for segmentation and cutting path planning of welded covers. In addition, the system captures 3D point cloud data of the transformer’s interior using multi-view RGB-D cameras and performs multimodal semantic segmentation and object detection via the ODIN model, facilitating the high-precision identification and cutting of complex components such as windings, studs, and silicon steel sheets. Experimental results show that the system achieves a recognition accuracy of 99% for both cover and internal component disassembly, with a disassembly success rate of 98%, demonstrating its high adaptability and safety in complex industrial environments.
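The 2D-to-3D fusion this abstract describes rests on a standard step: back-projecting masked depth pixels through the camera intrinsics into camera-frame points. A minimal sketch of that step, with placeholder intrinsics rather than the authors' calibration:

```python
# Hedged sketch: lift a 2D segmentation mask (e.g., a detected cover region)
# into a 3D point set using the pinhole model. Intrinsics are placeholders.
import numpy as np

def mask_to_points(depth, mask, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project masked depth pixels to camera-frame 3D points (N, 3)."""
    v, u = np.nonzero(mask)                            # pixel rows/cols in mask
    z = depth[v, u].astype(np.float32) * depth_scale   # raw units -> meters
    valid = z > 0                                      # drop depth holes
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# Example with synthetic data (placeholder intrinsics):
depth = np.full((480, 640), 1200, dtype=np.uint16)     # 1.2 m everywhere
mask = np.zeros((480, 640), dtype=bool)
mask[200:280, 300:340] = True                          # detected region
pts = mask_to_points(depth, mask, fx=615.0, fy=615.0, cx=320.0, cy=240.0)
print(pts.shape, pts[0])
```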

14 pages, 5077 KiB  
Article
Development of a Collision-Free Path Planning Method for a 6-DoF Orchard Harvesting Manipulator Using RGB-D Camera and Bi-RRT Algorithm
by Zifu Liu, Rizky Mulya Sampurno, R. M. Rasika D. Abeyrathna, Victor Massaki Nakaguchi and Tofael Ahamed
Sensors 2024, 24(24), 8113; https://doi.org/10.3390/s24248113 - 19 Dec 2024
Viewed by 274
Abstract
With the decreasing and aging agricultural workforce, fruit harvesting robots equipped with higher degrees of freedom (DoF) manipulators are seen as a promising solution for performing harvesting operations in unstructured and complex orchard environments. In such a complex environment, guiding the end-effector from its starting position to the target fruit while avoiding obstacles poses a significant challenge for path planning in automatic harvesting. However, existing studies often rely on manually constructed environmental map models and face limitations in planning efficiency and computational cost. Therefore, in this study, we introduced a collision-free path planning method for a 6-DoF orchard harvesting manipulator using an RGB-D camera and the Bi-RRT algorithm. First, by transforming the RGB-D camera’s point cloud data into collision geometries, we achieved 3D obstacle map reconstruction, allowing the harvesting robot to detect obstacles within its workspace. Second, by adopting the URDF format, we built the manipulator’s simulation model and inserted it into the reconstructed 3D obstacle map environment. Third, the Bi-RRT algorithm was introduced for path planning; it performs bidirectional expansion simultaneously from the start and target configurations based on the principles of the RRT algorithm, thereby effectively shortening the time required to reach the target. Subsequently, validation and comparison experiments were conducted in an artificial orchard. The experimental results validated our method, with the Bi-RRT algorithm achieving reliable collision-free path planning across all experimental sets. On average, it required just 0.806 s and generated 12.9 nodes per path, showing greater efficiency in path generation compared to the Sparse Bayesian Learning (SBL) algorithm, which required 0.870 s and generated 15.1 nodes per path. This method proved to be both effective and fast, providing meaningful guidance for implementing path planning for a 6-DoF manipulator in orchard harvesting tasks.
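The bidirectional expansion the abstract describes is the core of Bi-RRT: two trees grow from the start and target configurations and connect in the middle. A toy planar sketch of that logic (the paper plans in the manipulator's joint space; the step size, bounds, and `collision_free` callback here are illustrative):

```python
# Minimal Bi-RRT sketch in the plane. Trees are parent maps; the two trees
# alternate extending toward random samples and try to connect each round.
import math
import random

def bi_rrt(start, goal, collision_free, bounds, step=0.2, iters=5000):
    trees = [{start: None}, {goal: None}]          # start tree, goal tree
    for i in range(iters):
        a, b = trees[i % 2], trees[(i + 1) % 2]
        rnd = (random.uniform(*bounds[0]), random.uniform(*bounds[1]))
        near = min(a, key=lambda q: math.dist(q, rnd))     # nearest node in a
        d = math.dist(near, rnd)
        s = step / max(d, step)                            # clamp step length
        new = (near[0] + (rnd[0] - near[0]) * s, near[1] + (rnd[1] - near[1]) * s)
        if not collision_free(near, new):
            continue
        a[new] = near
        other = min(b, key=lambda q: math.dist(q, new))    # try to connect trees
        if math.dist(other, new) < step and collision_free(other, new):
            def trace(tree, q):                            # walk back to root
                path = []
                while q is not None:
                    path.append(q)
                    q = tree[q]
                return path
            h1, h2 = trace(a, new), trace(b, other)
            return h1[::-1] + h2 if i % 2 == 0 else h2[::-1] + h1
    return None

path = bi_rrt((0.0, 0.0), (5.0, 5.0),
              lambda p, q: True,                   # no obstacles in this toy run
              bounds=((-1, 6), (-1, 6)))
print(len(path), "nodes")
```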
(This article belongs to the Special Issue Intelligent Control and Robotic Technologies in Path Planning)

19 pages, 15889 KiB  
Article
SIGNIFY: Leveraging Machine Learning and Gesture Recognition for Sign Language Teaching Through a Serious Game
by Luca Ulrich, Giulio Carmassi, Paolo Garelli, Gianluca Lo Presti, Gioele Ramondetti, Giorgia Marullo, Chiara Innocente and Enrico Vezzetti
Future Internet 2024, 16(12), 447; https://doi.org/10.3390/fi16120447 - 1 Dec 2024
Viewed by 719
Abstract
Italian Sign Language (LIS) is the primary form of communication for many members of the Italian deaf community. Despite being recognized as a fully fledged language with its own grammar and syntax, LIS still faces challenges in gaining widespread recognition and integration into public services, education, and media. In recent years, advancements in technology, including artificial intelligence and machine learning, have opened up new opportunities to bridge communication gaps between the deaf and hearing communities. This paper presents a novel educational tool designed to teach LIS through SIGNIFY, a Machine Learning-based interactive serious game. The game incorporates a tutorial section, guiding users to learn the sign alphabet, and a classic hangman game that reinforces learning through practice. The developed system employs advanced hand gesture recognition techniques for learning and perfecting sign language gestures. The proposed solution detects and overlays 21 hand landmarks and a bounding box on live camera feeds, making use of an open-source framework to provide real-time visual feedback. Moreover, the study compares the effectiveness of two camera systems: the Azure Kinect, which provides RGB-D information, and a standard RGB laptop camera. Results highlight both systems’ feasibility and educational potential, showcasing their respective advantages and limitations. Evaluations with primary school children demonstrate the tool’s ability to make sign language education more accessible and engaging. This article emphasizes the work’s contribution to inclusive education, highlighting the integration of technology to enhance learning experiences for deaf and hard-of-hearing individuals.
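The described overlay (21 hand landmarks plus a bounding box on a live feed via an open-source framework) matches what MediaPipe Hands provides; a sketch under that assumption, using a standard RGB webcam:

```python
# Hedged sketch: real-time hand landmark + bounding-box overlay with
# MediaPipe Hands; confidence thresholds and single-hand limit are assumptions.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5)

cap = cv2.VideoCapture(0)                      # standard RGB laptop camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        h, w = frame.shape[:2]
        for hand in results.multi_hand_landmarks:
            xs = [lm.x for lm in hand.landmark]        # 21 normalized landmarks
            ys = [lm.y for lm in hand.landmark]
            cv2.rectangle(frame, (int(min(xs) * w), int(min(ys) * h)),
                          (int(max(xs) * w), int(max(ys) * h)), (0, 255, 0), 2)
            mp.solutions.drawing_utils.draw_landmarks(
                frame, hand, mp_hands.HAND_CONNECTIONS)
    cv2.imshow("landmarks", frame)
    if cv2.waitKey(1) & 0xFF == 27:            # Esc to quit
        break
cap.release()
```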
(This article belongs to the Special Issue Advances in Extended Reality for Smart Cities)

20 pages, 2004 KiB  
Communication
Towards Open-Set NLP-Based Multi-Level Planning for Robotic Tasks
by Peteris Racinskis, Oskars Vismanis, Toms Eduards Zinars, Janis Arents and Modris Greitans
Appl. Sci. 2024, 14(22), 10717; https://doi.org/10.3390/app142210717 - 19 Nov 2024
Viewed by 583
Abstract
This paper outlines a conceptual design for a multi-level natural language-based planning system and describes a demonstrator. The main goal of the demonstrator is to serve as a proof-of-concept by accomplishing end-to-end execution in a real-world environment and showing a novel way of interfacing an LLM-based planner with open-set semantic maps. The target use-case is executing sequences of tabletop pick-and-place operations using an industrial robot arm and an RGB-D camera. The demonstrator processes unstructured user prompts, produces high-level action plans, queries a map for object positions and grasp poses using open-set semantics, then uses the resulting outputs to parametrize and execute a sequence of action primitives. In this paper, the overall system structure, high-level planning using language models, low-level planning through action and motion primitives, and the implementation of two different environment modeling schemes (2.5D or fully 3D) are described in detail. The impacts of quantizing image embeddings on object recall are assessed, and high-level planner performance is evaluated using a small reference scene data set. We observe that, for the simple constrained test command data set, the high-level planner achieves a total success rate of 96.40%, while the semantic maps exhibit maximum recall rates of 94.69% and 92.29% for the 2.5D and 3D versions, respectively.
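The glue between an LLM-produced plan and open-set map queries can be illustrated with a stub; every name below (the JSON plan schema, `SEMANTIC_MAP`, the primitive names) is hypothetical and illustrative, not the authors' interface:

```python
# Hedged sketch: expand a high-level plan (JSON an LLM might emit) into
# parametrized primitives by grounding object names in a semantic map stub.
import json

PLAN = '[{"action": "pick", "object": "red cup"}, {"action": "place", "target": "tray"}]'

SEMANTIC_MAP = {  # stand-in for open-set queries over embedding similarity
    "red cup": {"position": [0.42, -0.10, 0.05], "grasp": [0.42, -0.10, 0.12]},
    "tray":    {"position": [0.10,  0.35, 0.02]},
}

def execute(plan_json):
    for step in json.loads(plan_json):
        if step["action"] == "pick":
            entry = SEMANTIC_MAP[step["object"]]       # open-set lookup
            print("move_to", entry["grasp"], "-> close_gripper")
        elif step["action"] == "place":
            entry = SEMANTIC_MAP[step["target"]]
            print("move_to", entry["position"], "-> open_gripper")

execute(PLAN)
```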
(This article belongs to the Special Issue Digital Technologies Enabling Modern Industries)

32 pages, 11087 KiB  
Article
Path Planning and Motion Control of Robot Dog Through Rough Terrain Based on Vision Navigation
by Tianxiang Chen, Yipeng Huangfu, Sutthiphong Srigrarom and Boo Cheong Khoo
Sensors 2024, 24(22), 7306; https://doi.org/10.3390/s24227306 - 15 Nov 2024
Viewed by 1345
Abstract
This article delineates the enhancement of an autonomous navigation and obstacle avoidance system for a quadruped robot dog. Part one of this paper presents the integration of a sophisticated multi-level dynamic control framework, utilizing Model Predictive Control (MPC) and Whole-Body Control (WBC) from MIT Cheetah. The system employs an Intel RealSense D435i depth camera for depth vision-based navigation, which enables high-fidelity 3D environmental mapping and real-time path planning. A significant innovation is the customization of the EGO-Planner to optimize trajectory planning in dynamically changing terrains, coupled with the implementation of a multi-body dynamics model that significantly improves the robot’s stability and maneuverability across various surfaces. The experimental results show that the RGB-D system exhibits superior velocity stability and trajectory accuracy to the SLAM system, with a 20% reduction in the cumulative velocity error and a 10% improvement in path tracking precision. The experimental results also show that the RGB-D system achieves smoother navigation, requiring 15% fewer iterations for path planning, and a 30% faster success-rate recovery in challenging environments. The successful application of these technologies in simulated urban disaster scenarios suggests promising future applications in emergency response and complex urban environments. Part two of this paper presents the development of a robust path planning algorithm for a robot dog on rough terrain based on onboard binocular vision navigation. We use a commercial off-the-shelf (COTS) robot dog. An optical CCD binocular vision dynamic tracking system is used to provide environment information. Likewise, the pose and posture of the robot dog are obtained from the robot’s own sensors, and a kinematics model is established. Then, a binocular vision tracking method is developed to determine the optimal path, propose commands to the actuators for the position and posture of the bionic robot, and achieve stable motion on tough terrains. The terrain is assumed to be gently uneven to begin with and subsequently proceeds to a rougher surface. This work consists of four steps: (1) pose and position data are acquired from the robot dog’s own inertial sensors, (2) terrain and environment information is input from onboard cameras, (3) this information is fused (integrated), and (4) path planning and motion control proposals are made. Ultimately, this work provides a robust framework for future developments in the vision-based navigation and control of quadruped robots, offering potential solutions for navigating complex and dynamic terrains.
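Depth acquisition from the D435i mentioned above is typically done through the official pyrealsense2 bindings; a minimal capture sketch with common default stream settings (not necessarily the authors' configuration):

```python
# Hedged sketch: grab a depth frame aligned to color from a RealSense D435i.
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)              # register depth to color frame
try:
    frames = align.process(pipeline.wait_for_frames())
    depth = frames.get_depth_frame()
    # range (meters) at the image center, as a local planner might sample it
    print("range @ center:", depth.get_distance(320, 240), "m")
finally:
    pipeline.stop()
```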

26 pages, 33294 KiB  
Article
RGB-D Camera and Fractal-Geometry-Based Maximum Diameter Estimation Method of Apples for Robot Intelligent Selective Graded Harvesting
by Bin Yan and Xiameng Li
Fractal Fract. 2024, 8(11), 649; https://doi.org/10.3390/fractalfract8110649 - 7 Nov 2024
Cited by 1 | Viewed by 633
Abstract
Realizing the integration of intelligent fruit picking and grading for apple harvesting robots is an inevitable requirement for the future development of smart agriculture and precision agriculture. Therefore, an apple maximum-diameter estimation model fusing RGB-D camera depth information was proposed in this study. Firstly, the maximum diameter parameters of Red Fuji apples were collected, and the results were statistically analyzed. Then, based on the Intel RealSense D435 RGB-D depth camera and LabelImg software, the depth information of apples and the two-dimensional size information of fruit images were obtained. Furthermore, the relationship between fruit depth information, the two-dimensional size information of fruit images, and the maximum diameter of apples was explored. Using Origin software, multiple regression analysis and nonlinear surface fitting were applied to analyze the correlation between fruit depth, the diagonal length of the fruit bounding rectangle, and the maximum diameter, and a model for estimating the maximum diameter of apples was constructed. Finally, the constructed maximum-diameter estimation model was experimentally validated and evaluated on imitation apples in the laboratory and on fruits on Red Fuji trees in modern apple orchards. The experimental results showed that the average maximum relative error of the constructed model on the laboratory imitation-apple validation set was ±4.1%, the correlation coefficient (R2) of the estimation model was 0.98613, and the root mean square error (RMSE) was 3.21 mm. The average maximum-diameter estimation relative error on the modern-orchard Red Fuji apple validation set was ±3.77%, the correlation coefficient (R2) of the estimation model was 0.84, and the root mean square error (RMSE) was 3.95 mm. The proposed model can provide a theoretical basis and technical support for the selective apple-picking operation of intelligent robots based on apple size grading.
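The regression step, fitting maximum diameter as a surface over depth and bounding-box diagonal, can be sketched with an ordinary least-squares quadratic surface; the polynomial form and the toy data below are illustrative, not the paper's fitted model:

```python
# Hedged sketch: fit D = f(z, d) as a quadratic surface by least squares.
import numpy as np

# toy samples: depth z (mm), bounding-box diagonal d (px) -> max diameter D (mm)
z = np.array([450., 480., 510., 550., 600., 640., 680., 700.])
d = np.array([212., 200., 189., 176., 162., 152., 144., 140.])
D = np.array([68.1, 68.6, 68.9, 69.1, 69.4, 69.5, 69.9, 70.0])

# design matrix for D ~ 1 + z + d + z*d + z^2 + d^2
A = np.column_stack([np.ones_like(z), z, d, z * d, z**2, d**2])
coef, *_ = np.linalg.lstsq(A, D, rcond=None)

pred = A @ coef
rmse = float(np.sqrt(np.mean((pred - D) ** 2)))
print("coefficients:", coef.round(6), "RMSE:", round(rmse, 3), "mm")
```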

14 pages, 5641 KiB  
Article
Estimation of Lower Limb Joint Angles Using sEMG Signals and RGB-D Camera
by Guoming Du, Zhen Ding, Hao Guo, Meichao Song and Feng Jiang
Bioengineering 2024, 11(10), 1026; https://doi.org/10.3390/bioengineering11101026 - 15 Oct 2024
Viewed by 1027
Abstract
Estimating human joint angles is a crucial task in motion analysis, gesture recognition, and motion intention prediction. This paper presents a novel model-based approach for generating reliable and accurate human joint angle estimation using a dual-branch network. The proposed network leverages combined features derived from encoded sEMG signals and RGB-D image data. To ensure the accuracy and reliability of the estimation algorithm, the proposed network employs a convolutional autoencoder to generate a high-level compression of sEMG features aimed at motion prediction. Considering the variability in the distribution of sEMG signals, the proposed network introduces a vision-based joint regression network to maintain the stability of combined features. Taking into account latency, occlusion, and shading issues with vision data acquisition, the feature fusion network utilizes high-frequency sEMG features as weights for specific features extracted from image data. The proposed method achieves effective human body joint angle estimation for motion analysis and motion intention prediction by mitigating the effects of non-stationary sEMG signals.
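A 1-D convolutional autoencoder of the kind described, compressing multi-channel sEMG windows into a latent feature, might look as follows in PyTorch (channel counts and layer sizes are assumptions):

```python
# Hedged sketch: 1-D conv autoencoder for sEMG window compression.
import torch
import torch.nn as nn

class SEMGAutoencoder(nn.Module):
    def __init__(self, channels=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(64, 32, kernel_size=5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(32, channels, kernel_size=5, stride=2,
                               padding=2, output_padding=1))

    def forward(self, x):          # x: (batch, channels, samples)
        z = self.encoder(x)        # compressed sEMG feature for the fusion branch
        return self.decoder(z), z

model = SEMGAutoencoder()
x = torch.randn(4, 8, 256)        # 4 windows of 256 samples, 8 electrodes
recon, latent = model(x)
print(recon.shape, latent.shape)  # (4, 8, 256), (4, 64, 64)
```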
(This article belongs to the Special Issue Bioengineering of the Motor System)

20 pages, 6262 KiB  
Article
YPR-SLAM: A SLAM System Combining Object Detection and Geometric Constraints for Dynamic Scenes
by Xukang Kan, Gefei Shi, Xuerong Yang and Xinwei Hu
Sensors 2024, 24(20), 6576; https://doi.org/10.3390/s24206576 - 12 Oct 2024
Viewed by 748
Abstract
Traditional SLAM systems assume a static environment, but moving objects break this ideal assumption. In the real world, moving objects can greatly influence the precision of image matching and camera pose estimation. To solve these problems, the YPR-SLAM system is proposed. First, the system includes a lightweight YOLOv5 detection network for detecting both dynamic and static objects, which provides prior information about dynamic objects to the SLAM system. Second, utilizing the prior information on dynamic targets together with the depth image, a geometric-constraint method for removing motion feature points from the depth image is proposed. The Depth-PROSAC algorithm is used to differentiate dynamic from static feature points so that the dynamic feature points can be removed. Finally, a dense point cloud map is constructed from the static feature points. The YPR-SLAM system is an efficient combination of object detection and geometric constraints in a tightly coupled way, eliminating motion feature points and minimizing their adverse effects on SLAM systems. The performance of YPR-SLAM was assessed on the public TUM RGB-D dataset, and it was found that YPR-SLAM is well suited to dynamic situations.
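A simplified stand-in for the dynamic-point removal step: discard feature points that fall inside a detected dynamic-object box and agree with that object's depth, so static background seen through the box survives. This illustrates the idea only; it is not the Depth-PROSAC algorithm itself:

```python
# Hedged sketch: depth-gated removal of feature points inside dynamic boxes.
import numpy as np

def filter_dynamic(points_uv, depth_map, boxes, depth_tol=0.25):
    """points_uv: (N, 2) pixel coords; boxes: list of (x1, y1, x2, y2)."""
    keep = np.ones(len(points_uv), dtype=bool)
    for (x1, y1, x2, y2) in boxes:
        roi = depth_map[y1:y2, x1:x2]
        obj_depth = np.median(roi[roi > 0])            # robust object depth
        for i, (u, v) in enumerate(points_uv):
            inside = x1 <= u < x2 and y1 <= v < y2
            if inside and abs(depth_map[int(v), int(u)] - obj_depth) < depth_tol:
                keep[i] = False                        # point moves with object
    return points_uv[keep]

pts = np.array([[100, 120], [160, 150], [400, 300]])
depth = np.full((480, 640), 4.0)
depth[100:200, 80:220] = 1.5                           # person at 1.5 m
static = filter_dynamic(pts, depth, [(80, 100, 220, 200)])
print(static)                                          # only the background point
```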
(This article belongs to the Section Sensing and Imaging)

25 pages, 27763 KiB  
Article
Improved Multi-Size, Multi-Target and 3D Position Detection Network for Flowering Chinese Cabbage Based on YOLOv8
by Yuanqing Shui, Kai Yuan, Mengcheng Wu and Zuoxi Zhao
Plants 2024, 13(19), 2808; https://doi.org/10.3390/plants13192808 - 7 Oct 2024
Viewed by 1103
Abstract
Accurately detecting the maturity and 3D position of flowering Chinese cabbage (Brassica rapa var. chinensis) in natural environments is vital for autonomous robot harvesting in unstructured farms. The challenge lies in dense planting, small flower buds, similar colors and occlusions. This study proposes a YOLOv8-Improved network integrated with the ByteTrack tracking algorithm to achieve multi-object detection and 3D positioning of flowering Chinese cabbage plants in fields. In this study, C2F-MLCA is created by adding a lightweight Mixed Local Channel Attention (MLCA) module with spatial awareness capability to the C2F module of YOLOv8, which improves the extraction of spatial feature information in the backbone network. In addition, a P2 detection layer is added to the neck network, and BiFPN is used instead of PAN to enhance multi-scale feature fusion and small-target detection. Wise-IoU in combination with Inner-IoU is adopted as a new loss function to optimize the network for samples of different quality and bounding boxes of different sizes. Lastly, ByteTrack is integrated for video tracking, and RGB-D camera depth data are used to estimate cabbage positions. The experimental results show that YOLOv8-Improved achieves a precision (P) of 86.5% and a recall (R) of 86.0% in detecting the maturity of flowering Chinese cabbage. Its mAP50 and mAP75 reach 91.8% and 61.6%, respectively, representing improvements of 2.9% and 4.7% over the original network. Additionally, the number of parameters is reduced by 25.43%. In summary, the improved YOLOv8 algorithm demonstrates high robustness and real-time detection performance, thereby providing strong technical support for automated harvesting management.
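The final positioning step, reading a robust depth for a tracked box and back-projecting its centre to camera coordinates, can be sketched as below; the intrinsics are placeholders and ByteTrack is assumed to supply the boxes upstream:

```python
# Hedged sketch: detection box + depth map -> 3D position in the camera frame.
import numpy as np

def box_to_xyz(box, depth_map, fx, fy, cx, cy):
    x1, y1, x2, y2 = box
    u, v = (x1 + x2) / 2, (y1 + y2) / 2
    roi = depth_map[int(y1):int(y2), int(x1):int(x2)]
    z = np.median(roi[roi > 0])            # median rejects leaf/occlusion noise
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)

depth = np.full((720, 1280), 0.85)         # cabbage head at 0.85 m (synthetic)
print(box_to_xyz((600, 300, 700, 420), depth, 910.0, 910.0, 640.0, 360.0))
```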
(This article belongs to the Section Plant Modeling)

21 pages, 5999 KiB  
Article
A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module
by Shuling Wang, Fengze Jiang and Xiaojin Gong
Sensors 2024, 24(19), 6270; https://doi.org/10.3390/s24196270 - 27 Sep 2024
Viewed by 546
Abstract
Depth information is crucial for perceiving three-dimensional scenes. However, depth maps captured directly by depth sensors are often incomplete and noisy. In the depth-completion task, our objective is to generate dense and accurate depth maps from sparse depth inputs by fusing guidance information from corresponding color images obtained from camera sensors. To address these challenges, we introduce transformer models, which have shown great promise in the field of vision, into the task of image-guided depth completion. By leveraging the self-attention mechanism, we propose a novel network architecture that effectively meets the requirements of high accuracy and resolution in depth data. To be more specific, we design a dual-branch model with a transformer-based encoder that serializes image features into tokens step by step and extracts multi-scale pyramid features suitable for pixel-wise dense prediction tasks. Additionally, we incorporate a dual-attention fusion module to enhance the fusion between the two branches. This module combines convolution-based spatial and channel-attention mechanisms, which are adept at capturing local information, with cross-attention mechanisms that excel at capturing long-distance relationships. Our model achieves state-of-the-art performance on both the NYUv2 depth and SUN RGB-D depth datasets. Additionally, our ablation studies confirm the effectiveness of the designed modules.
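A dual-attention fusion block along the lines described, convolutional channel and spatial attention for local cues plus cross-attention between branches for long-range context, could be sketched in PyTorch as follows (dimensions and layer choices are illustrative, not the paper's architecture):

```python
# Hedged sketch: fuse an RGB feature map with a depth feature map using
# channel + spatial attention (local) and multi-head cross-attention (global).
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.channel = nn.Sequential(      # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(dim, dim // 4, 1), nn.ReLU(),
            nn.Conv2d(dim // 4, dim, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(dim, 1, 7, padding=3), nn.Sigmoid())
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_feat, depth_feat):          # both (B, C, H, W)
        local = rgb_feat * self.channel(depth_feat) * self.spatial(depth_feat)
        B, C, H, W = local.shape
        q = local.flatten(2).transpose(1, 2)           # (B, HW, C) tokens
        kv = depth_feat.flatten(2).transpose(1, 2)
        global_, _ = self.cross(q, kv, kv)             # long-range fusion
        out = self.norm(q + global_)
        return out.transpose(1, 2).reshape(B, C, H, W)

fused = DualAttentionFusion()(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
print(fused.shape)                                     # torch.Size([2, 64, 32, 32])
```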

18 pages, 7421 KiB  
Article
Enhanced Visual SLAM for Collision-Free Driving with Lightweight Autonomous Cars
by Zhihao Lin, Zhen Tian, Qi Zhang, Hanyang Zhuang and Jianglin Lan
Sensors 2024, 24(19), 6258; https://doi.org/10.3390/s24196258 - 27 Sep 2024
Viewed by 1052
Abstract
This paper presents a vision-based obstacle avoidance strategy for lightweight self-driving cars that can run on a CPU-only device using a single RGB-D camera. The method consists of two steps: visual perception and path planning. The visual perception part uses ORB-SLAM3 enhanced with optical flow to estimate the car’s poses and extract rich texture information from the scene. In the path planning phase, the proposed method employs a control Lyapunov function and a control barrier function combined in the form of a quadratic program (CLF-CBF-QP), together with an obstacle shape reconstruction process (SRP), to plan safe and stable trajectories. To validate the performance and robustness of the proposed method, simulation experiments were conducted with a car in various complex indoor environments using the Gazebo simulator. The proposed method can effectively avoid obstacles in the scenes, and it outperforms benchmark algorithms in achieving more stable and shorter trajectories across multiple simulated scenes.
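The CLF-CBF-QP idea can be shown on a toy single-integrator robot: minimize control effort subject to a Lyapunov decrease condition (goal tracking, softened by a slack) and a barrier condition (obstacle avoidance). The gains and dynamics below are a minimal stand-in for the paper's formulation:

```python
# Hedged sketch: one step of a CLF-CBF-QP for a planar single integrator.
import cvxpy as cp
import numpy as np

x = np.array([0.0, 0.0])                    # current robot position
goal = np.array([3.0, 0.0])
obs, r = np.array([1.5, 0.1]), 0.5          # circular obstacle centre, radius

u = cp.Variable(2)                          # velocity command to solve for
delta = cp.Variable(nonneg=True)            # CLF slack keeps the QP feasible
V = float(np.sum((x - goal) ** 2))          # CLF: V = |x - g|^2, Vdot = 2(x-g)^T u
h = float(np.sum((x - obs) ** 2) - r ** 2)  # CBF: h > 0 outside the obstacle
constraints = [2 * (x - goal) @ u <= -1.0 * V + delta,   # CLF decrease
               2 * (x - obs) @ u >= -2.0 * h]            # CBF safety
prob = cp.Problem(cp.Minimize(cp.sum_squares(u) + 10 * delta), constraints)
prob.solve()
print("safe velocity command:", np.round(u.value, 3))
```

In a closed loop this QP is re-solved at every control tick with the updated state, which is what lets the CBF constraint continuously trade goal progress for safety.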
(This article belongs to the Special Issue Intelligent Control Systems for Autonomous Vehicles)

15 pages, 6865 KiB  
Article
Method for Bottle Opening with a Dual-Arm Robot
by Francisco J. Naranjo-Campos, Juan G. Victores and Carlos Balaguer
Biomimetics 2024, 9(9), 577; https://doi.org/10.3390/biomimetics9090577 - 23 Sep 2024
Cited by 1 | Viewed by 1080
Abstract
This paper introduces a novel approach to robotic assistance in bottle opening using the dual-arm robot TIAGo++. The solution enhances accessibility by addressing the needs of individuals with injuries or disabilities who may require help with common manipulation tasks. The aim of this paper is to propose a method involving vision, manipulation, and learning techniques to effectively address the task of bottle opening. The process begins with the acquisition of the bottle and cap positions using an RGB-D camera and computer vision. Subsequently, the robot picks up the bottle with one gripper and grips the cap with the other, planning safe trajectories for each arm. Then, the opening procedure is executed via a position and force control scheme that ensures both grippers follow the unscrewing path defined by the cap thread. Within the control loop, force sensor information is employed to control the vertical axis movements, while gripper rotation control is achieved through a Deep Reinforcement Learning (DRL) algorithm trained to determine the optimal angle increments for rotation. The results demonstrate the successful training of the learning agent, and the experiments confirm the effectiveness of the proposed method in bottle opening with the TIAGo++ robot, showcasing the practical viability of the approach.
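The unscrewing loop, force feedback regulating the vertical axis while a learned policy chooses rotation increments, can be sketched schematically; the robot and policy interfaces below are simulation stubs, not the TIAGo++ API:

```python
# Hedged sketch: force-regulated unscrewing loop with a policy choosing
# rotation increments. All interfaces are illustrative stubs.
import numpy as np

class PolicyStub:                          # stands in for the trained DRL agent
    def act(self, obs):
        torque, angle = obs
        return float(np.clip(0.15 - 0.1 * torque, 0.02, 0.15))  # rad increment

class SimRobot:                            # trivial simulation for the sketch
    def __init__(self): self.t = 0.0
    def read_force_z(self): return 0.5 * np.exp(-self.t)   # force decays as cap loosens
    def read_torque_z(self): return max(0.0, 1.0 - self.t)
    def rotate_gripper(self, dtheta): self.t += dtheta
    def move_z(self, dz): pass

def open_cap(robot, policy, pitch=0.0012, steps=60):
    """pitch: vertical travel (m) per radian of cap rotation (assumed thread)."""
    angle = 0.0
    for _ in range(steps):
        fz = robot.read_force_z()                      # axial force feedback
        dtheta = policy.act((robot.read_torque_z(), angle))
        robot.rotate_gripper(dtheta)                   # turn the cap...
        robot.move_z(pitch * dtheta + 0.001 * fz)      # ...and follow the thread
        angle += dtheta
        if angle > 2 * np.pi:                          # one full turn -> done
            return True
    return False

print("opened:", open_cap(SimRobot(), PolicyStub()))
```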
(This article belongs to the Special Issue Computer-Aided Biomimetics: 2nd Edition)

27 pages, 13890 KiB  
Article
A Fast Multi-Scale of Distributed Batch-Learning Growing Neural Gas for Multi-Camera 3D Environmental Map Building
by Chyan Zheng Siow, Azhar Aulia Saputra, Takenori Obo and Naoyuki Kubota
Biomimetics 2024, 9(9), 560; https://doi.org/10.3390/biomimetics9090560 - 16 Sep 2024
Viewed by 980
Abstract
Biologically inspired intelligent methods have been applied to various sensing systems in order to extract features from huge volumes of raw sensing data. For example, point cloud data can be applied to human activity recognition, multi-person tracking, and suspicious person detection, but a single RGB-D camera is not enough to perform these tasks. Therefore, this study proposes a 3D environmental map-building method integrating point cloud data measured via multiple RGB-D cameras. First, a fast multi-scale distributed batch-learning growing neural gas (Fast MS-DBL-GNG) is proposed as a topological feature extraction method in order to reduce computational costs, because a single RGB-D camera may output a million data points. Next, random sample consensus (RANSAC) is applied to integrate two sets of point cloud data using topological features. To show the effectiveness of the proposed method, Fast MS-DBL-GNG is applied to perform topological mapping from several point cloud data sets measured in different directions, with some overlapping areas included in two images. The experimental results show that the proposed method can extract topological features well enough to integrate the point cloud data sets, and it runs 14 times faster than the previous GNG method with a 23% reduction in quantization error. Finally, this paper discusses the advantages and disadvantages of the proposed method through numerical comparison with other methods, and explains future work to improve it.
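The point-cloud integration step can be illustrated with Open3D's stock RANSAC registration over FPFH features; note the paper instead matches its own GNG topological features, so this is a conventional baseline sketch:

```python
# Hedged sketch: RANSAC-based global registration of two RGB-D point clouds
# with Open3D (FPFH features stand in for the paper's GNG topological features).
import open3d as o3d

def register(source, target, voxel=0.05):
    def preprocess(pcd):
        down = pcd.voxel_down_sample(voxel)
        down.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            down, o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 5, max_nn=100))
        return down, fpfh

    src, src_f = preprocess(source)
    tgt, tgt_f = preprocess(target)
    result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src, tgt, src_f, tgt_f, True, voxel * 1.5,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(voxel * 1.5)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
    return result.transformation

# usage: T = register(pcd_view1, pcd_view2); pcd_view1.transform(T)
```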
(This article belongs to the Special Issue Biomimetics in Intelligent Sensor)

23 pages, 9746 KiB  
Article
Research on SLAM Localization Algorithm for Orchard Dynamic Vision Based on YOLOD-SLAM2
by Zhen Ma, Siyuan Yang, Jingbin Li and Jiangtao Qi
Agriculture 2024, 14(9), 1622; https://doi.org/10.3390/agriculture14091622 - 16 Sep 2024
Cited by 1 | Viewed by 914
Abstract
With the development of agriculture, the complexity and dynamism of orchard environments pose challenges to the perception and positioning of inter-row environments for agricultural vehicles. This paper proposes a method for extracting navigation lines and measuring pedestrian obstacles. The improved YOLOv5 algorithm is used to detect tree trunks between left and right rows in orchards. The experimental results show that the average angle deviation of the extracted navigation lines was less than 5 degrees, verifying its accuracy. Because of the variable posture of pedestrians and unreliable camera depth readings, a distance measurement algorithm based on a four-zone depth comparison is proposed for measuring the distance to pedestrian obstacles. Experimental results showed that within a range of 6 m, the average relative error of distance measurement did not exceed 1%, and within a range of 9 m, the maximum relative error was 2.03%. The average distance measurement time was 30 ms, so the method can accurately and quickly measure pedestrian distance in orchard environments. On the publicly available TUM RGB-D dynamic dataset, YOLOD-SLAM2 significantly reduced the RMSE of the absolute trajectory error compared to the ORB-SLAM2 algorithm, to less than 0.05 m. In actual orchard environments, YOLOD-SLAM2 showed closer agreement between the estimated and true trajectories when the vehicle traveled in straight and circular directions. The RMSE of the absolute trajectory error was less than 0.03 m, and the average tracking time was 47 ms, indicating that the YOLOD-SLAM2 algorithm proposed in this paper can meet the accuracy and real-time requirements of agricultural vehicle positioning in orchard environments.
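One plausible reading of the four-zone depth comparison: split the pedestrian box into quadrants, take each quadrant's median depth, and average the nearest mutually consistent zones. The quadrant split and thresholds below are guesses at the logic the abstract names, not the published algorithm:

```python
# Hedged sketch: four-zone depth comparison for pedestrian ranging.
import numpy as np

def pedestrian_distance(depth_map, box):
    x1, y1, x2, y2 = box
    xm, ym = (x1 + x2) // 2, (y1 + y2) // 2
    zones = [depth_map[y1:ym, x1:xm], depth_map[y1:ym, xm:x2],
             depth_map[ym:y2, x1:xm], depth_map[ym:y2, xm:x2]]
    meds = sorted(float(np.median(z[z > 0])) for z in zones if (z > 0).any())
    # average the closest zones that agree with each other (within 10%)
    close = [m for m in meds if m < meds[0] * 1.1]
    return sum(close) / len(close)

depth = np.full((480, 640), 8.0)
depth[120:400, 250:390] = 3.2                  # person at 3.2 m (synthetic)
print(pedestrian_distance(depth, (240, 100, 400, 410)), "m")
```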
(This article belongs to the Section Agricultural Technology)

18 pages, 5473 KiB  
Article
Visual-Inertial RGB-D SLAM with Encoder Integration of ORB Triangulation and Depth Measurement Uncertainties
by Zhan-Wu Ma and Wan-Sheng Cheng
Sensors 2024, 24(18), 5964; https://doi.org/10.3390/s24185964 - 14 Sep 2024
Cited by 1 | Viewed by 1163
Abstract
In recent years, the accuracy of visual SLAM (Simultaneous Localization and Mapping) technology has seen significant improvements, making it a prominent area of research. However, within current RGB-D SLAM systems, the estimation of the 3D positions of feature points primarily relies on direct measurements from RGB-D depth cameras, which inherently contain measurement errors. Moreover, the potential of triangulation-based estimation for ORB (Oriented FAST and Rotated BRIEF) feature points remains underutilized. To address this reliance on a single source of measurement data, this paper proposes integrating triangulation-based estimates and depth measurements, each with its own uncertainty estimate, for the 3D positions of ORB feature points. This integration is achieved using a CI (Covariance Intersection) filter, referred to as the CI-TEDM (Triangulation Estimates and Depth Measurements) method. Vision-based SLAM systems face significant challenges in environments such as long straight corridors and weakly textured scenes, or during rapid motion, where tracking failures are common. To enhance the stability of visual SLAM, this paper introduces an improved CI-TEDM method by incorporating wheel encoder data. The mathematical model of the encoder is proposed, and detailed derivations of the encoder pre-integration model and error model are provided. Building on these improvements, we propose a novel tightly coupled visual-inertial RGB-D SLAM with encoder integration of ORB triangulation and depth measurement uncertainties. Validation on open-source datasets and in real-world environments demonstrates that the proposed improvements significantly enhance the robustness of real-time state estimation and localization accuracy for intelligent vehicles in challenging environments.
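Covariance Intersection itself is compact enough to show directly: two estimates of a feature point's 3D position are fused without knowing their cross-correlation, with the weight chosen to minimize the fused covariance trace. The example estimates below are illustrative:

```python
# Hedged sketch: CI fusion of a triangulated estimate and a depth-sensor
# estimate of a landmark; omega is swept over a grid to minimise trace(P).
import numpy as np

def ci_fuse(x1, P1, x2, P2, grid=101):
    best = None
    for w in np.linspace(0.0, 1.0, grid):
        info = w * np.linalg.inv(P1) + (1 - w) * np.linalg.inv(P2)
        P = np.linalg.inv(info)
        if best is None or np.trace(P) < best[2]:
            x = P @ (w * np.linalg.inv(P1) @ x1 + (1 - w) * np.linalg.inv(P2) @ x2)
            best = (x, P, np.trace(P))
    return best[0], best[1]

# landmark near (1, 2, 5) m: triangulation poor in depth, sensor poor laterally
x_tri, P_tri = np.array([1.02, 2.01, 5.30]), np.diag([0.01, 0.01, 0.25])
x_dep, P_dep = np.array([1.10, 1.95, 5.02]), np.diag([0.04, 0.04, 0.02])
x, P = ci_fuse(x_tri, P_tri, x_dep, P_dep)
print(x.round(3), np.diag(P).round(4))
```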