Abstract
Fish detection is the upstream module of the whole Fish4Knowledge system and, as such, needs to be as accurate and as fast as possible. Driven by these needs, several state-of-the-art and new approaches for object segmentation in videos were developed and tested. We opted for approaches based on background modeling, as they are better suited to the peculiarities of the underwater domain. In particular, kernel density estimation methods that model the color, texture and spatial information of both the background and the foreground proved to be the best performing, not only in underwater video sequences but also in other complex scenarios. To make fish detection more robust, we also developed a post-processing layer (added on top of the background modeling one) that effectively filters out false detections by using “real-world” object properties. Despite the low quality (low frame rate and spatial resolution) of the processed underwater videos, the achieved results can be considered satisfactory, especially considering that most of the state-of-the-art approaches failed. This chapter therefore provides an overview of the development and deployment of the fish detection module for the Fish4Knowledge system. It includes a detailed analysis of the challenges of underwater video analysis, the limitations of the existing approaches, the devised solutions and the experimental results.
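As a rough illustration of the kernel density estimation idea mentioned in the abstract, the sketch below scores each pixel of a new frame against a short history of past grayscale frames and flags low-density pixels as foreground. It is a minimal, intensity-only example under stated assumptions, not the chapter's method: the chapter's models also use color, texture and spatial information, and the function name, the Gaussian kernel, the bandwidth and the threshold are all hypothetical choices made for this example.

```python
import numpy as np


def kde_foreground_mask(history, frame, bandwidth=10.0, threshold=1e-3):
    """Flag pixels whose intensity is unlikely under a per-pixel KDE background model.

    history: array of shape (N, H, W) holding N past grayscale frames.
    frame:   array of shape (H, W) holding the current grayscale frame.
    bandwidth and threshold are illustrative values, not taken from the chapter.
    """
    history = np.asarray(history, dtype=np.float64)
    frame = np.asarray(frame, dtype=np.float64)

    # Difference between the current pixel value and each stored sample.
    diff = frame[None, :, :] - history

    # Gaussian kernel evaluated at every sample, for every pixel.
    kernel = np.exp(-0.5 * (diff / bandwidth) ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))

    # The KDE is the average kernel response over the N samples.
    density = kernel.mean(axis=0)

    # Low background density means the pixel is probably foreground.
    return density < threshold


# Synthetic example: a noisy, static background and a bright 2x2 "object".
rng = np.random.default_rng(0)
history = 100.0 + rng.normal(0.0, 2.0, size=(20, 8, 8))
frame = np.full((8, 8), 100.0)
frame[2:4, 2:4] = 200.0

mask = kde_foreground_mask(history, frame)
print(int(mask.sum()))  # number of pixels flagged as foreground
```

In this toy setting only the four bright pixels fall below the density threshold. A real detector would also have to update the sample history over time and, as the abstract notes, filter the resulting detections in a post-processing step.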
Notes
- 1.
- 2. Most of the methods are available at https://code.google.com/p/bgslibrary/. The code of the remaining methods was made available by the authors, and references to the code can be found in the corresponding papers.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Giordano, D., Palazzo, S., Spampinato, C. (2016). Fish Detection. In: Fisher, R., Chen-Burger, YH., Giordano, D., Hardman, L., Lin, FP. (eds) Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data. Intelligent Systems Reference Library, vol 104. Springer, Cham. https://doi.org/10.1007/978-3-319-30208-9_9
DOI: https://doi.org/10.1007/978-3-319-30208-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30206-5
Online ISBN: 978-3-319-30208-9
eBook Packages: Engineering, Engineering (R0)