Abstract
This paper proposes DYOLO, a network model aimed at improving both the accuracy and the real-time performance of object detection. The model combines the dynamic feature fusion capability of D-Net with the high-speed detection of YOLOv8n. We design a new convolutional module, DConv, by incorporating the dynamic large convolution kernel (DLK) and dynamic feature fusion (DFF) modules from the D-Net architecture, and use it to improve the C2f modules in the backbone and neck of YOLOv8n. These changes enable the model to capture and fuse multi-scale features more effectively and to make better use of global contextual information. Experiments on the public NWPU VHR-10 and RSOD datasets and on our self-built Car dataset show that the improved model achieves higher accuracy and better adaptability across different scenarios. While maintaining the real-time performance of the original YOLOv8n, the model also improves the capture of both local details and overall scene information. This study thus provides a new solution for object detection in a variety of scenarios.
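As a rough illustration of the idea behind dynamic fusion, the sketch below blends a small-receptive-field branch and a large-receptive-field branch with weights computed from the input itself. It is a toy 1-D example in plain Python, not the DConv, DLK, or DFF modules of the paper: the function names `conv1d_same` and `dynamic_fuse`, the box-filter kernels, and the mean-absolute-response score (a stand-in for global pooling followed by learned layers) are all assumptions made for illustration.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1-D correlation that keeps the input length (odd kernel sizes)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def softmax(values):
    """Numerically stable softmax over a short list of scores."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_fuse(signal, small_kernel, large_kernel):
    """Blend two receptive-field branches with input-dependent weights.

    Hypothetical helper: the per-branch score is a hand-written global
    descriptor, where a trained network would use learned layers instead.
    """
    small = conv1d_same(signal, small_kernel)
    large = conv1d_same(signal, large_kernel)
    # One global score per branch: its mean absolute response.
    scores = [sum(abs(v) for v in branch) / len(branch) for branch in (small, large)]
    w_small, w_large = softmax(scores)
    fused = [w_small * s + w_large * l for s, l in zip(small, large)]
    return fused, (w_small, w_large)


# Usage: an impulse processed by a 3-tap and a 7-tap averaging filter.
signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
fused, weights = dynamic_fuse(signal, [1 / 3] * 3, [1 / 7] * 7)
print(weights, fused)
```

Because the weights depend on the branch responses, a different input yields a different blend of local and wide-context features, which is the property the word "dynamic" refers to here.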
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ma, H. et al. (2024). DYOLO: A Novel Object Detection Model for Multi-scene and Multi-object Based on an Improved D-Net Split Task Model is Proposed. In: Huang, DS., Zhang, X., Guo, J. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14866. Springer, Singapore. https://doi.org/10.1007/978-981-97-5594-3_38
DOI: https://doi.org/10.1007/978-981-97-5594-3_38
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5593-6
Online ISBN: 978-981-97-5594-3
eBook Packages: Computer Science, Computer Science (R0)