Search Results (13)

Search Parameters:
Keywords = DSConv

17 pages, 1438 KiB  
Article
SQnet: An Enhanced Multi-Objective Detection Algorithm in Subaquatic Environments
by Yutao Zhu, Bochen Shan, Yinglong Wang and Hua Yin
Electronics 2024, 13(15), 3053; https://doi.org/10.3390/electronics13153053 (registering DOI) - 1 Aug 2024
Abstract
With the development of smart aquaculture, the demand for accurate underwater target detection has increased. However, traditional target detection methods have proven to be inefficient and imprecise due to the complexity of underwater environments and the obfuscation of biological features against the underwater environmental background. To address these issues, we propose a novel algorithm for underwater multi-target detection based on the YOLOv8 architecture, named SQnet. A Dynamic Snake Convolution Network (DSConvNet) module is introduced to tackle the overlap between target organisms and the underwater environmental background. To reduce computational complexity and parameter overhead while maintaining precision, we employ a lightweight context-guided semantic segmentation network (CGNet) model. Furthermore, the information loss and degradation arising from indirect interactions between non-adjacent layers are handled by integrating an Asymptotic Feature Pyramid Network (AFPN) model. Experimental results demonstrate that SQnet achieves an mAP@0.5 of 83.3% and 98.9% on the public datasets URPC2020, Aquarium, and the self-compiled dataset ZytLn, respectively. Additionally, its mAP@0.5–0.95 reaches 49.1%, 85.4%, and 84.6%, respectively, surpassing other classical algorithms such as YOLOv7-tiny, YOLOv5s, and YOLOv3-tiny. Compared to the original YOLOv8 model, SQnet has a parameter count (PARM) of 2.25 M and consistent GFLOPs of 6.4 G. This article presents a novel approach for the real-time monitoring of fish using mobile devices, paving the way for the further development of intelligent aquaculture in the domain of fisheries.
20 pages, 6246 KiB  
Article
YOLOv8n-DDA-SAM: Accurate Cutting-Point Estimation for Robotic Cherry-Tomato Harvesting
by Gengming Zhang, Hao Cao, Yangwen Jin, Yi Zhong, Anbang Zhao, Xiangjun Zou and Hongjun Wang
Agriculture 2024, 14(7), 1011; https://doi.org/10.3390/agriculture14071011 - 26 Jun 2024
Viewed by 894
Abstract
Accurately identifying cherry-tomato picking points and obtaining their coordinate locations is critical to the success of cherry-tomato picking robots. However, previous methods that use semantic segmentation alone, or that combine object detection with traditional image processing, have struggled to determine the cherry-tomato picking point accurately because of challenges such as occluding leaves and targets that are too small. In this study, we propose a YOLOv8n-DDA-SAM model that adds a semantic segmentation branch to target detection to achieve the desired detection and compute the picking point. Specifically, YOLOv8n is used as the initial model, and a dynamic snake convolutional layer (DySnakeConv), which is better suited to detecting cherry-tomato stems, is used in the neck of the model. In addition, the dynamic large convolutional kernel attention mechanism adopted in the backbone and the use of ADown convolution result in a better fusion of the stem features with the neck features and a certain decrease in the number of model parameters without loss of accuracy. Combined with the semantic branch SAM, the mask of picking points is effectively obtained, and the accurate picking point is then obtained by a simple shape-centering calculation. The experimental results suggest that the proposed YOLOv8n-DDA-SAM model improves significantly on previous models, not only in detecting stems but also in obtaining stem masks. YOLOv8n-DDA-SAM achieved an mAP@0.5 of 85.90% and an F1-score of 86.13%. Compared with the original YOLOv8n, YOLOv7, RT-DETR-l, and YOLOv9c, the mAP@0.5 improved by 24.7%, 21.85%, 19.76%, and 15.99%, respectively, and the F1-score increased by 16.34%, 12.11%, 10.09%, and 8.07%, respectively, while the number of parameters is only 6.37 M. The semantic segmentation branch requires no dedicated dataset, and it improves mIoU by 11.43%, 6.94%, 5.53%, and 4.22% and mAP@0.5 by 12.33%, 7.49%, 6.4%, and 5.99% compared to Deeplabv3+, Mask2former, DDRNet, and SAN, respectively. In summary, the model satisfies the requirements of high-precision detection and provides a strategy for cherry-tomato detection systems.
15 pages, 4923 KiB  
Article
Research on Grain Futures Price Prediction Based on a Bi-DSConvLSTM-Attention Model
by Bensheng Yun, Jiannan Lai, Yingfeng Ma and Yanan Zheng
Systems 2024, 12(6), 204; https://doi.org/10.3390/systems12060204 - 11 Jun 2024
Viewed by 639
Abstract
Grain is a commodity related to the livelihood of the nation’s people, and the volatility of its futures price affects risk management, investment decisions, and policy making. It is therefore necessary to establish an accurate and efficient futures price prediction model. To improve the accuracy and efficiency of prediction and so support reasonable decision making, this paper proposes a Bi-DSConvLSTM-Attention model for grain futures price prediction, which combines a bidirectional long short-term memory neural network (BiLSTM), a depthwise separable convolutional long short-term memory neural network (DSConvLSTM), and an attention mechanism. Firstly, mutual information is used to evaluate, sort, and select the features for dimension reduction. Secondly, the lightweight depthwise separable convolution (DSConv) is introduced to replace the standard convolution (SConv) in ConvLSTM without sacrificing its performance. Then, the self-attention mechanism is adopted to improve the accuracy. Finally, taking wheat futures price prediction as an example, the model is trained and its performance is evaluated. Under the Bi-DSConvLSTM-Attention model, experiments that selected the 1, 2, 3, 4, 5, 6, and 7 most relevant features as inputs showed that the optimal number of features was 4. With the four best features as inputs, the RMSE, MAE, MAPE, and R² of the prediction result of the Bi-DSConvLSTM-Attention model were 5.61, 3.63, 0.55, and 0.9984, respectively, which is a great improvement compared with the existing price-prediction models. Other experimental results demonstrated that the model also possesses a certain degree of generalization and is capable of obtaining positive returns.
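The "lightweight" claim for depthwise separable convolution can be made concrete by counting weights. The sketch below is illustrative only: it assumes a k x k depthwise convolution followed by a 1 x 1 pointwise convolution, ignores biases, and uses hypothetical channel sizes that do not come from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsconv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise
    convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    standard = conv_params(c_in, c_out, k)
    separable = dsconv_params(c_in, c_out, k)
    # The ratio equals 1/c_out + 1/k**2 for any layer of this form.
    print(standard, separable, round(separable / standard, 4))
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is where most of the saving comes from.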
17 pages, 10207 KiB  
Article
Improved YOLOv8-Seg Based on Multiscale Feature Fusion and Deformable Convolution for Weed Precision Segmentation
by Zhuxi Lyu, Anjiang Lu and Yinglong Ma
Appl. Sci. 2024, 14(12), 5002; https://doi.org/10.3390/app14125002 - 7 Jun 2024
Viewed by 771
Abstract
Laser-targeted weeding methods further enhance the sustainable development of green agriculture, with one key technology being the improvement of weed localization accuracy. Here, we propose an improved YOLOv8 instance segmentation based on bidirectional feature fusion and deformable convolution (BFFDC-YOLOv8-seg) to address the challenges of insufficient weed localization accuracy in complex environments with resource-limited laser weeding devices. Initially, by training on extensive datasets of plant images, the most appropriate model scale and training weights are determined, facilitating the development of a lightweight network. Subsequently, the introduction of the Bidirectional Feature Pyramid Network (BiFPN) during feature fusion effectively prevents the omission of weeds. Lastly, the use of Dynamic Snake Convolution (DSConv) to replace some convolutional kernels enhances flexibility, benefiting the segmentation of weeds with elongated stems and irregular edges. Experimental results indicate that the BFFDC-YOLOv8-seg model achieves a 4.9% increase in precision, an 8.1% increase in recall rate, and a 2.8% increase in mAP50 value to 98.8% on a vegetable weed dataset compared to the original model. It also shows improved mAP50 over other typical segmentation models such as Mask R-CNN, YOLOv5-seg, and YOLOv7-seg by 10.8%, 13.4%, and 1.8%, respectively. Furthermore, the model achieves a detection speed of 24.8 FPS on the Jetson Orin nano standalone device, with a model size of 6.8 MB that balances between size and accuracy. The model meets the requirements for real-time precise weed segmentation, and is suitable for complex vegetable field environments and resource-limited laser weeding devices. Full article
(This article belongs to the Section Agricultural Science and Technology)
24 pages, 218661 KiB  
Article
An Image Dehazing Algorithm for Underground Coal Mines Based on gUNet
by Feng Tian, Lishuo Gao and Jing Zhang
Sensors 2024, 24(11), 3422; https://doi.org/10.3390/s24113422 - 26 May 2024
Viewed by 733
Abstract
Aiming at the problems of incomplete dehazing, color distortion, and loss of detail and edge information encountered by existing algorithms when processing images of underground coal mines, an image dehazing algorithm for underground coal mines, named CAB CA DSConv Fusion gUNet (CCDF-gUNet), is proposed. First, Dynamic Snake Convolution (DSConv) is introduced to replace traditional convolutions, enhancing the feature extraction capability. Second, residual attention convolution blocks are constructed to simultaneously focus on both local and global information in images. Additionally, the Coordinate Attention (CA) module is utilized to learn the coordinate information of features so that the model can better capture the key information in images. Furthermore, to simultaneously focus on the detail and structural consistency of images, a fusion loss function is introduced. Finally, based on the test verification of the public dataset Haze-4K, the Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Mean Squared Error (MSE) are 30.72 dB, 0.976, and 55.04, respectively, and on a self-made underground coal mine dataset, they are 31.18 dB, 0.971, and 49.66, respectively. The experimental results show that the algorithm performs well in dehazing, effectively avoids color distortion, and retains image details and edge information, providing some theoretical references for image processing in coal mine surveillance videos. Full article
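PSNR and MSE are linked by a standard formula, so the reported pairs can be sanity-checked. The sketch below assumes 8-bit images with a peak value of 255 and an MSE on the same intensity scale; the abstract does not state either, and the function name is hypothetical.

```python
import math


def psnr_from_mse(mse: float, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB for a given mean squared error.

    Assumes `mse` is measured on the same intensity scale as `peak`
    (255 for 8-bit images; an assumption, not stated in the abstract).
    """
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    for mse in (55.04, 49.66):  # MSE values quoted in the abstract
        print(mse, round(psnr_from_mse(mse), 2))
```

Under these assumptions the quoted MSE values map to PSNR values close to the reported ones. A dataset-level PSNR is often an average of per-image values, so it need not equal the PSNR of the average MSE exactly.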
26 pages, 10617 KiB  
Article
Lightweight Super-Resolution Generative Adversarial Network for SAR Images
by Nana Jiang, Wenbo Zhao, Hui Wang, Huiqi Luo, Zezhou Chen and Jubo Zhu
Remote Sens. 2024, 16(10), 1788; https://doi.org/10.3390/rs16101788 - 18 May 2024
Viewed by 611
Abstract
Due to a unique imaging mechanism, Synthetic Aperture Radar (SAR) images typically exhibit degradation phenomena. To enhance image quality and support real-time on-board processing capabilities, we propose a lightweight deep generative network framework, namely, the Lightweight Super-Resolution Generative Adversarial Network (LSRGAN). This method introduces Depthwise Separable Convolution (DSConv) in residual blocks to compress the original Generative Adversarial Network (GAN) and uses the SeLU activation function to construct a lightweight residual module (LRM) suitable for SAR image characteristics. Furthermore, we combine the LRM with an optimized Coordinated Attention (CA) module, enhancing the lightweight network’s capability to learn feature representations. Experimental results on spaceborne SAR images demonstrate that compared to other deep generative networks focused on SAR image super-resolution reconstruction, LSRGAN achieves compression ratios of 74.68% in model storage requirements and 55.93% in computational resource demands. In this work, we significantly reduce the model complexity, improve the quality of spaceborne SAR images, and validate the effectiveness of the SAR image super-resolution algorithm as well as the feasibility of real-time on-board processing technology. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
21 pages, 68245 KiB  
Article
Pine-YOLO: A Method for Detecting Pine Wilt Disease in Unmanned Aerial Vehicle Remote Sensing Images
by Junsheng Yao, Bin Song, Xuanyu Chen, Mengqi Zhang, Xiaotong Dong, Huiwen Liu, Fangchao Liu, Li Zhang, Yingbo Lu, Chang Xu and Ran Kang
Forests 2024, 15(5), 737; https://doi.org/10.3390/f15050737 - 23 Apr 2024
Viewed by 1086
Abstract
Pine wilt disease is a highly contagious forest quarantine ailment that spreads rapidly. In this study, we designed a new Pine-YOLO model for pine wilt disease detection by incorporating Dynamic Snake Convolution (DSConv), the Multidimensional Collaborative Attention Mechanism (MCA), and Wise-IoU v3 (WIoUv3) into a YOLOv8 network. Firstly, we collected UAV images from Beihai Forest and Linhai Park in Weihai City to construct a dataset via a sliding window method. Then, we used this dataset to train and test Pine-YOLO. We found that DSConv adaptively focuses on fragile and curved local features and then enhances the perception of delicate tubular structures in discolored pine branches. MCA strengthens the attention to the specific features of pine trees, helps to enhance the representational capability, and improves the generalization to diseased pine tree recognition in variable natural environments. The bounding box loss function has been optimized to WIoUv3, thereby improving the overall recognition accuracy and robustness of the model. The experimental results reveal that our Pine-YOLO model achieved the following values across various evaluation metrics: mAP@0.5 at 90.69%, mAP@0.5:0.95 at 49.72%, precision at 91.31%, recall at 85.72%, and F1-score at 88.43%. These outcomes underscore the high effectiveness of our model. Therefore, our newly designed Pine-YOLO perfectly addresses the disadvantages of the original YOLO network, which helps to maintain the health and stability of the ecological environment.
(This article belongs to the Topic Individual Tree Detection (ITD) and Its Applications)
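The F1-score is the harmonic mean of precision and recall, so the three figures reported above can be checked against each other. A minimal sketch (the function name is illustrative):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall quoted in the Pine-YOLO abstract.
    print(round(f1_score(0.9131, 0.8572), 4))
```

With the quoted precision (91.31%) and recall (85.72%), this yields about 88.43%, which agrees with the reported F1-score.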
19 pages, 9284 KiB  
Article
Research on Short-Term Prediction Methods for Small-Scale Three-Dimensional Wind Fields
by Yuzhao Ma, Haoran Han, Xu Tang and Pak-Wai Chan
Appl. Sci. 2024, 14(5), 1871; https://doi.org/10.3390/app14051871 - 24 Feb 2024
Viewed by 656
Abstract
The accurate prediction of small-scale three-dimensional wind fields is of great practical significance for aviation safety, wind power generation, and related fields. This study proposes a novel method for predicting small-scale three-dimensional wind fields by combining the mesoscale Weather Research and Forecasting (WRF) model with Computational Fluid Dynamics (CFD). The method consists of three components: the WRF module, the hybrid neural network prediction module, and the CFD module. First, mesoscale meteorological fields are simulated using the WRF module to establish a historical inflow boundary dataset for the CFD domain. Next, deep separable convolutions are incorporated, and convolutional long short-term memory (ConvLSTM) is combined with a deep separable convolution-gated recurrent unit (DSConvGRU) to construct a hybrid neural network prediction module named ConvLSTM-DSConvGRU. This module is employed for predicting inflow boundary data. Finally, the predicted inflow boundary conditions drive the CFD module to predict small-scale three-dimensional wind fields. The effectiveness of the WRF and CFD downscaling coupling method was validated using observed data from meteorological stations within the simulated domain, along with statistical indicators of errors. Additionally, a comparative evaluation was conducted between the proposed hybrid network model and the four commonly used spatiotemporal prediction models to assess its prediction performance. The results demonstrate that our proposed wind field prediction method achieves accurate simulation and short-term prediction of small-scale three-dimensional wind fields, and the hybrid network model exhibits comprehensive advantages in terms of model complexity and prediction accuracy. Full article
15 pages, 7267 KiB  
Article
An Unstructured Orchard Grape Detection Method Utilizing YOLOv5s
by Wenhao Wang, Yun Shi, Wanfu Liu and Zijin Che
Agriculture 2024, 14(2), 262; https://doi.org/10.3390/agriculture14020262 - 6 Feb 2024
Cited by 1 | Viewed by 1238
Abstract
Rising labor costs and a workforce shortage have impeded the development and economic benefits of the global grape industry. Research and development of intelligent grape harvesting technologies is urgently needed, and rapid and accurate identification of grapes is crucial for intelligent grape harvesting. However, object detection algorithms encounter multiple challenges in unstructured vineyards, such as similar background colors, light obstruction from greenhouses and leaves, and fruit occlusion. All of these factors contribute to the difficulty of correctly identifying grapes. GrapeDetectNet (GDN), based on YOLO (You Only Look Once) v5s, is proposed to improve grape detection accuracy and recall in unstructured vineyards. Dual-channel feature extraction attention (DCFE) is a new attention structure introduced in GDN. We also use dynamic snake convolution (DS-Conv) in the backbone network. To evaluate GDN’s performance, we collected an independent dataset of 1280 images after a strict selection process. The dataset encompasses examples of Shine Muscat and unripe Kyoho grapes, covering a range of complex outdoor situations. The experimental results demonstrate that GDN performed outstandingly on this dataset. Compared to YOLOv5s, the model improved mAP0.5:0.95 by 2.02%, mAP0.5 by 2.5%, precision by 1.4%, recall by 1.6%, and F1 score by 1.5%. Finally, we tested the method on a grape-picking robot, and the results show that our algorithm works remarkably well in harvesting experiments. The results indicate that the GDN grape detection model in this study exhibits high detection accuracy. It is proficient in identifying grapes and demonstrates good robustness in unstructured vineyards, providing a valuable empirical reference for the practical application of intelligent grape harvesting technology.
(This article belongs to the Section Digital Agriculture)
14 pages, 4900 KiB  
Article
A Lightweight YOLOv8 Tomato Detection Algorithm Combining Feature Enhancement and Attention
by Guoliang Yang, Jixiang Wang, Ziling Nie, Hao Yang and Shuaiying Yu
Agronomy 2023, 13(7), 1824; https://doi.org/10.3390/agronomy13071824 - 9 Jul 2023
Cited by 54 | Viewed by 12816
Abstract
An automatic tomato detection method based on an improved YOLOv8s model is proposed to address the low automation level of tomato harvesting in agriculture. The proposed method provides technical support for the automatic harvesting and classification of tomatoes in agricultural production activities. The proposed method has three key components. Firstly, the depthwise separable convolution (DSConv) technique replaces the ordinary convolution, which reduces the computational complexity by generating a large number of feature maps with a small amount of calculation. Secondly, the dual-path attention gate module (DPAG) is designed to improve the model’s detection precision in complex environments by enhancing the network’s ability to distinguish between tomatoes and the background. Thirdly, the feature enhancement module (FEM) is added to highlight the target details, prevent the loss of effective features, and improve detection precision. We built a tomato dataset of 3098 images in 3 classes and used it for training and testing. The proposed algorithm’s performance was evaluated by comparison with the SSD, Faster R-CNN, YOLOv4, YOLOv5, and YOLOv7 algorithms. Precision, recall rate, and mAP (mean average precision) were used for evaluation. The test results show that the improved YOLOv8s network has a lower loss and 93.4% mAP on this dataset, a 1.5% increase compared to before the improvement. The precision increased by 2%, and the recall rate increased by 0.8%. Moreover, the proposed algorithm significantly reduced the model size from 22 M to 16 M, while achieving a detection speed of 138.8 FPS, which satisfies the real-time detection requirement. The proposed method strikes a balance between model size and detection precision, enabling it to meet agriculture’s tomato detection requirements. The research model in this paper will provide technical support for a tomato picking robot to ensure the fast and accurate operation of the picking robot.
(This article belongs to the Section Precision and Digital Agriculture)
21 pages, 4341 KiB  
Article
STC-NLSTMNet: An Improved Human Activity Recognition Method Using Convolutional Neural Network with NLSTM from WiFi CSI
by Md Shafiqul Islam, Mir Kanon Ara Jannat, Mohammad Nahid Hossain, Woo-Su Kim, Soo-Wook Lee and Sung-Hyun Yang
Sensors 2023, 23(1), 356; https://doi.org/10.3390/s23010356 - 29 Dec 2022
Cited by 12 | Viewed by 3046
Abstract
Human activity recognition (HAR) has emerged as a significant area of research due to its numerous possible applications, including ambient assisted living, healthcare, abnormal behaviour detection, etc. Recently, HAR using WiFi channel state information (CSI) has become a predominant and unique approach in indoor environments compared to others (i.e., sensor and vision) due to its privacy-preserving qualities, thereby eliminating the need to carry additional devices and providing the flexibility to capture motions in both line-of-sight (LOS) and non-line-of-sight (NLOS) settings. Existing deep learning (DL)-based HAR approaches usually extract either temporal or spatial features and lack adequate means to integrate and utilize the two simultaneously, making it challenging to recognize different activities accurately. Motivated by this, we propose a novel DL-based model named spatio-temporal convolution with nested long short-term memory (STC-NLSTMNet), with the ability to extract spatial and temporal features concurrently and automatically recognize human activity with very high accuracy. The proposed STC-NLSTMNet model is mainly comprised of depthwise separable convolution (DS-Conv) blocks, feature attention modules (FAM), and NLSTM. The DS-Conv blocks extract the spatial features from the CSI signal, and the FAMs draw attention to the most essential features. These robust features are fed into NLSTM as inputs to explore the hidden intrinsic temporal features in CSI signals. The proposed STC-NLSTMNet model is evaluated using two publicly available datasets: Multi-environment and StanWiFi. The experimental results revealed that the STC-NLSTMNet model achieved activity recognition accuracies of 98.20% and 99.88% on the Multi-environment and StanWiFi datasets, respectively. Its activity recognition performance is also compared with other existing approaches, and our proposed STC-NLSTMNet model significantly improves the activity recognition accuracies by 4% and 1.88%, respectively, compared to the best existing method.
21 pages, 6401 KiB  
Article
Real-Time Ground-Level Building Damage Detection Based on Lightweight and Accurate YOLOv5 Using Terrestrial Images
by Chaoxian Liu, Haigang Sui, Jianxun Wang, Zixuan Ni and Liang Ge
Remote Sens. 2022, 14(12), 2763; https://doi.org/10.3390/rs14122763 - 8 Jun 2022
Cited by 22 | Viewed by 3835
Abstract
Real-time building damage detection effectively improves the timeliness of post-earthquake assessments. In recent years, terrestrial images from smartphones or cameras have become a rich source of disaster information that may be useful in assessing building damage at a lower cost. In this study, we present an efficient method of building damage detection based on terrestrial images in combination with an improved YOLOv5. We compiled a Ground-level Detection in Building Damage Assessment (GDBDA) dataset consisting of terrestrial images with annotations of damage types, including debris, collapse, spalling, and cracks. A lightweight and accurate YOLOv5 (LA-YOLOv5) model was used to optimize the detection efficiency and accuracy. In particular, a lightweight Ghost bottleneck was added to the backbone and neck modules of the YOLOv5 model, with the aim to reduce the model size. A Convolutional Block Attention Module (CBAM) was added to the backbone module to enhance the damage recognition effect. In addition, regarding the scale difference of building damage, the Bi-Directional Feature Pyramid Network (Bi-FPN) for multi-scale feature fusion was used in the neck module to aggregate features with different damage types. Moreover, depthwise separable convolution (DSCONV) was used in the neck module to further compress the parameters. Based on our GDBDA dataset, the proposed method not only achieved detection accuracy above 90% for different damage targets, but also had the smallest weight size and fastest detection speed, which improved by about 64% and 24%, respectively. The model performed well on datasets from different regions. The overall results indicate that the proposed model realizes rapid and accurate damage detection, and meets the requirement of lightweight embedding in the future. Full article
14 pages, 11920 KiB  
Article
EAR-Net: Efficient Atrous Residual Network for Semantic Segmentation of Street Scenes Based on Deep Learning
by Seokyong Shin, Sanghun Lee and Hyunho Han
Appl. Sci. 2021, 11(19), 9119; https://doi.org/10.3390/app11199119 - 30 Sep 2021
Cited by 8 | Viewed by 2416
Abstract
Segmentation of street scenes is a key technology in the field of autonomous vehicles. However, conventional segmentation methods achieve low accuracy because of the complexity of street landscapes. Therefore, we propose an efficient atrous residual network (EAR-Net) to improve accuracy while maintaining computation costs. First, we performed feature extraction and restoration, utilizing depthwise separable convolution (DSConv) and interpolation. Compared with conventional methods, DSConv and interpolation significantly reduce computation costs while minimizing performance degradation. Second, we utilized residual learning and atrous spatial pyramid pooling (ASPP) to achieve high accuracy. Residual learning increases the ability to extract context information by preventing the problem of feature and gradient losses. In addition, ASPP extracts additional context information while maintaining the resolution of the feature map. Finally, to alleviate the class imbalance between the image background and objects and to improve learning efficiency, we utilized focal loss. We evaluated EAR-Net on the Cityscapes dataset, which is commonly used for street scene segmentation studies. Experimental results showed that the EAR-Net had better segmentation results and similar computation costs as the conventional methods. We also conducted an ablation study to analyze the contributions of the ASPP and DSConv in the EAR-Net. Full article
(This article belongs to the Special Issue Computing and Artificial Intelligence for Visual Data Analysis)
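The abstract names focal loss as the remedy for class imbalance. The sketch below shows the general idea for a single binary prediction; it is not the authors' implementation. EAR-Net is a multi-class segmentation model, and the `gamma` and `alpha` defaults here are commonly used values assumed for illustration, not taken from the abstract.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class.
    y: ground-truth label (1 or 0).
    gamma, alpha: assumed defaults; the abstract gives no values.
    """
    p_t = p if y == 1 else 1.0 - p                # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha    # class-balancing weight
    p_t = min(max(p_t, 1e-12), 1.0)               # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = focal_loss(0.9, 1)   # confident and correct
    hard = focal_loss(0.1, 1)   # confident and wrong
    print(easy < hard)
```

With `gamma = 0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which makes the effect of `gamma` easy to isolate.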