A New Workflow for Instance Segmentation of Fish with YOLO
Abstract
1. Introduction
1.1. Deep Learning for Fish Counting and Segmentation
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
1.2. Application Scenarios of Segmentation Technology in Ecology
- Ecological monitoring and protection: By segmenting fish images, we can better understand the spatiotemporal distribution, quantity, and behavior of various fish species, which helps to protect endangered fish and maintain ecological balance.
- Fisheries management: Fisheries are an important component of the global economy, and the accurate identification and measurement of caught fish contribute to sustainable fisheries management to ensure the sustainable development of fisheries resources.
- Aquatic biology research: Segmented fish images provide rich reference data for aquatic biology research, which can help scientists and researchers gain a deeper understanding of fish population distribution, behavioral habits, habitat selection, and interactions with other organisms.
1.3. Datasets
2. Materials and Methods
2.1. Comparison between YOLOv5 and YOLOv8
YOLOv5 and YOLOv8 share the following design choices:
- (1) For the backbone, both adopt the CSP design and integrate an SPPF module.
- (2) Both include the PAN (path aggregation network) design.
- (3) For classification, both apply BCE loss in the loss function.
They differ in the following respects:
- (1) For the backbone, YOLOv8 integrates a C2f module, whereas YOLOv5 uses a C3 module.
- (2) For the detection head, YOLOv5 uses a coupled, anchor-based head, whereas YOLOv8 uses a decoupled, anchor-free head.
- (3) For positive and negative sample assignment, YOLOv5 adopts a static assignment strategy, whereas YOLOv8 adopts the dynamic TAL (task alignment learning) strategy.
- (4) The CBS module in the PAN-FPN up-sampling stage of YOLOv5 is removed in YOLOv8.
- (5) The objectness loss is removed in YOLOv8, which instead uses CIoU loss and DFL (distribution focal loss).
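To make the CIoU term in the comparison above concrete, the following is a minimal sketch of the commonly published CIoU formula for two axis-aligned boxes. It is an illustration only, not the YOLOv8 source code; the function names `ciou` and `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for this example.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2).

    Illustrative sketch: IoU minus a center-distance penalty and an
    aspect-ratio penalty. The box layout and eps are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred_box, target_box):
    """Loss form: 0 for a perfect match, larger for worse predictions."""
    return 1.0 - ciou(pred_box, target_box)


if __name__ == "__main__":
    print(ciou_loss((0, 0, 4, 4), (0, 0, 4, 4)))  # identical boxes
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes
```

Unlike plain IoU, this value keeps changing when the boxes do not overlap, because the center-distance term still varies with how far apart the boxes are.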
2.2. Preparation for Transfer Learning and Retraining
2.3. New Workflow Design
2.4. Metrics for Model Performance
- (1) Pixel Accuracy (PA): The simplest measure: the proportion of correctly labeled pixels among all pixels.
- (2) Mean Pixel Accuracy (MPA): A simple refinement of PA that computes the proportion of correctly classified pixels within each class and then averages over all classes.
- (3) Mean Intersection over Union (MIoU): The standard measure for semantic segmentation. It computes the ratio of the intersection to the union of two sets, here the ground-truth and predicted segmentations. Equivalently, the ratio is the number of true positives (intersection) divided by the sum of true positives, false negatives, and false positives (union). IoU is computed for each class and then averaged.
- (4) Frequency Weighted Intersection over Union (FWIoU): A refinement of MIoU that weights each class's IoU by the class's frequency of occurrence.
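The four metrics above can all be derived from a single confusion matrix. The sketch below shows one way to do so in plain Python; the function names, the flat per-pixel label lists, and the choice to skip classes that never occur in the ground truth are assumptions for illustration and are not taken from the paper's evaluation code.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """cm[i][j] counts pixels whose true class is i and predicted class is j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute PA, MPA, MIoU and FWIoU from a confusion matrix."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(k)]
    gt = [sum(cm[i]) for i in range(k)]                            # TP + FN
    pred = [sum(cm[r][i] for r in range(k)) for i in range(k)]     # TP + FP
    union = [gt[i] + pred[i] - tp[i] for i in range(k)]            # TP + FN + FP

    # Assumption: classes absent from the ground truth are skipped.
    present = [i for i in range(k) if gt[i] > 0]
    iou = {i: tp[i] / union[i] for i in present}

    return {
        "PA": sum(tp) / total,
        "MPA": sum(tp[i] / gt[i] for i in present) / len(present),
        "MIoU": sum(iou.values()) / len(iou),
        "FWIoU": sum(gt[i] / total * iou[i] for i in present),
    }


if __name__ == "__main__":
    # Toy example: 8 pixels, class 0 = background, class 1 = fish.
    y_true = [0, 0, 0, 0, 0, 0, 1, 1]
    y_pred = [0, 0, 0, 0, 0, 1, 1, 0]
    for name, value in segmentation_metrics(confusion_matrix(y_true, y_pred, 2)).items():
        print(f"{name}: {value:.4f}")
```

In the toy example the background class dominates, so PA and FWIoU come out higher than MPA and MIoU, which weight the rarer fish class equally with the background.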
3. Results
4. Discussion
4.1. Limitation
4.2. Real-World Generalization
4.3. Comparison with Other Research
4.4. Workflow Optimization
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
Model | Name | Value |
---|---|---|
YOLOv5 | pre-trained model | YOLOv5s |
 | epochs | 600 |
 | early stop epochs | 200 |
 | batch | 64 |
 | image size | 640 × 640 |
 | workers | 32 |
 | optimizer | SGD |
YOLOv8 | pre-trained model | YOLOv8s |
 | epochs | 600 |
 | early stop epochs | 200 |
 | batch | 64 |
 | image size | 640 × 640 |
 | workers | 32 |
 | optimizer | SGD |
 | lr0 | 0.01 |
 | lrf | 0.01 |
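As a rough illustration of how the YOLOv8 settings in the table might be passed to the Ultralytics Python API, consider the sketch below. It is not the authors' training script. The mapping of "early stop epochs" to the `patience` argument, the segmentation checkpoint name `yolov8s-seg.pt`, and the dataset file `fish-seg.yaml` are assumptions made for this example.

```python
from typing import Any, Dict

# Hyperparameters copied from the table; the keys follow Ultralytics
# training-argument names. "patience" standing in for "early stop epochs"
# is an assumption about how the table maps onto the API.
TRAIN_ARGS: Dict[str, Any] = {
    "epochs": 600,
    "patience": 200,
    "batch": 64,
    "imgsz": 640,
    "workers": 32,
    "optimizer": "SGD",
    "lr0": 0.01,   # initial learning rate
    "lrf": 0.01,   # final learning rate as a fraction of lr0
}


def train_yolov8(data_yaml: str, weights: str = "yolov8s-seg.pt"):
    """Fine-tune a pre-trained YOLOv8s segmentation model.

    `data_yaml` is a hypothetical dataset description file, and the
    default `weights` name is an assumed choice of checkpoint.
    """
    # Imported lazily so the configuration can be inspected without
    # the ultralytics package installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)


if __name__ == "__main__":
    # Hypothetical usage; requires the ultralytics package and a dataset file.
    # train_yolov8("fish-seg.yaml")
    print(TRAIN_ARGS)
```

With `lr0` and `lrf` both set to 0.01, the schedule would end at a learning rate of 0.01 × 0.01 = 0.0001 under this interpretation.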
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, J.; Wang, Y. A New Workflow for Instance Segmentation of Fish with YOLO. J. Mar. Sci. Eng. 2024, 12, 1010. https://doi.org/10.3390/jmse12061010