Abstract
Visual grouping is a key mechanism in human scene perception. There, it belongs to the subconscious, early processing and is a key prerequisite for other high-level tasks such as recognition. In this paper, we introduce an efficient, realtime-capable algorithm that likewise agglomerates a useful hierarchical clustering of a scene while using purely local appearance statistics.
To speed up processing, we first subdivide the image into meaningful, atomic segments using a fast watershed transform. Starting from there, our rapid agglomerative clustering algorithm prunes and maintains the connectivity graph between clusters so that it contains only those pairs that directly touch in the image domain and are reciprocal nearest neighbors (RNN) with respect to a distance metric. The core of this approach is our novel cluster distance: it combines boundary and surface statistics, in terms of both appearance and spatial linkage. This yields state-of-the-art performance, as we demonstrate in experiments conducted on the BSDS500 and Pascal-Context datasets.
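The merging step described above can be illustrated with a minimal sketch. It assumes the atomic segments are already given (the watershed step is omitted) and uses a Ward-style placeholder distance on mean intensity, which is not the boundary and surface distance proposed in the paper. The names `Region`, `distance`, `rnn_agglomerate`, and `build_chain` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    size: int                      # number of pixels in the segment
    mean: float                    # mean intensity (stand-in appearance statistic)
    neighbors: set = field(default_factory=set)  # ids of directly touching regions


def distance(a, b):
    # Placeholder: Ward-style increase in squared error, NOT the paper's distance.
    return a.size * b.size / (a.size + b.size) * (a.mean - b.mean) ** 2


def rnn_agglomerate(regions):
    """Merge touching reciprocal nearest neighbors until no adjacent pair is left.

    Returns the merge history as (id_a, id_b, new_id, dist) tuples.
    """
    # Work on a copy so the caller's regions are not modified.
    regions = {i: Region(r.size, r.mean, set(r.neighbors)) for i, r in regions.items()}
    next_id = max(regions) + 1
    merges = []
    while True:
        # Nearest touching neighbor of every region (ties broken by id).
        nn = {}
        for i, r in regions.items():
            if r.neighbors:
                nn[i] = min(r.neighbors, key=lambda j: (distance(r, regions[j]), j))
        # Keep only reciprocal pairs; each region appears in at most one pair.
        pairs = sorted((i, j) for i, j in nn.items() if i < j and nn.get(j) == i)
        if not pairs:
            break
        for a, b in pairs:
            ra, rb = regions.pop(a), regions.pop(b)
            size = ra.size + rb.size
            merged = Region(size,
                            (ra.size * ra.mean + rb.size * rb.mean) / size,
                            (ra.neighbors | rb.neighbors) - {a, b})
            # Rewire the adjacency graph to point at the merged region.
            for k in merged.neighbors:
                regions[k].neighbors -= {a, b}
                regions[k].neighbors.add(next_id)
            regions[next_id] = merged
            merges.append((a, b, next_id, distance(ra, rb)))
            next_id += 1
    return merges


def build_chain(means):
    """Toy input: unit-size regions in a row, each touching its direct neighbors."""
    regions = {i: Region(1, m) for i, m in enumerate(means)}
    for i in range(len(means) - 1):
        regions[i].neighbors.add(i + 1)
        regions[i + 1].neighbors.add(i)
    return regions
```

Calling `rnn_agglomerate(build_chain([0.0, 0.1, 5.0, 5.2]))` first merges the two similar pairs at either end of the chain and then joins the two resulting clusters; the returned merge history encodes the hierarchy. Merging all reciprocal pairs of one pass at once is a simplification that is safe for reducible distances such as the Ward-style placeholder used here.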
Notes
- 1.
- 2. This is similar to the well-known earth mover’s distance (EMD) on histograms, which is in fact the discretized \(\mathcal{W}_1\) distance.
- 3.
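Note 2 relates the earth mover’s distance on histograms to the discretized \(\mathcal{W}_1\) distance. For one-dimensional histograms of equal total mass, \(\mathcal{W}_1\) reduces to the summed absolute difference of the cumulative sums. The sketch below shows this special case under the assumptions of unit bin spacing and histograms normalized to the same mass; the function name `emd_1d` is hypothetical and not taken from the paper.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Earth mover's distance between two 1-D histograms with equal mass.

    Assumes unit spacing between adjacent bins, so the result is measured
    in units of "mass times bins moved".
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # W1 in one dimension: L1 distance between the cumulative distributions.
    return sum(abs(cp - cq) for cp, cq in zip(accumulate(p), accumulate(q)))
```

For example, `emd_1d([1, 0, 0], [0, 0, 1])` moves one unit of mass across two bins, while a plain bin-wise difference would not reflect how far the mass has to travel.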
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Klein, D.A., Schulz, D., Cremers, A.B. (2017). Realtime Hierarchical Clustering Based on Boundary and Surface Statistics. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10111. Springer, Cham. https://doi.org/10.1007/978-3-319-54181-5_1
DOI: https://doi.org/10.1007/978-3-319-54181-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54180-8
Online ISBN: 978-3-319-54181-5
eBook Packages: Computer Science (R0)