Abstract
In this paper we introduce a novel model that combines deep Convolutional Neural Networks with a global inference model. Our model is derived from a convex variational relaxation of the minimum s-t cut problem on graphs, which is frequently used for image segmentation. We treat the outputs of the Convolutional Neural Networks as the unary and pairwise potentials of a graph and derive a smooth approximation to the minimum s-t cut problem. During training, this approximation allows the Convolutional Neural Network to adapt to the smoothing induced by the global model. The training algorithm can be understood as a modified backpropagation algorithm that explicitly takes the global inference layer into account.
We illustrate our approach on the task of supervised figure-ground segmentation. In contrast to competing approaches, we train directly on the raw pixels of the input images and do not rely on hand-crafted features. Despite its generality, simplicity and complete lack of hand-crafted features, our approach yields competitive performance on the Graz02 and Weizmann Horses datasets.
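To make the idea of a smooth, relaxed minimum s-t cut more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 4-connected pixel grid, a single constant pairwise weight in place of per-edge potentials predicted by a network, the smoothing sqrt(d^2 + eps^2) of |d|, and plain projected gradient descent; none of these specifics are stated in the abstract, and all function and parameter names (`relaxed_cut`, `energy`, `eps`, and so on) are hypothetical. The `unary` grid stands in for the network output, with negative values favouring the figure label.

```python
import math


def smooth_abs(d, eps):
    """Smooth stand-in for |d| (an assumed choice, not taken from the paper)."""
    return math.sqrt(d * d + eps * eps)


def grid_edges(rows, cols):
    """Yield the neighbouring pixel pairs of a 4-connected grid."""
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                yield (r, c), (r, c + 1)
            if r + 1 < rows:
                yield (r, c), (r + 1, c)


def energy(u, unary, pairwise, eps):
    """Smoothed relaxed cut energy: unary term plus weighted smooth differences."""
    rows, cols = len(u), len(u[0])
    total = sum(unary[r][c] * u[r][c] for r in range(rows) for c in range(cols))
    for (r0, c0), (r1, c1) in grid_edges(rows, cols):
        total += pairwise * smooth_abs(u[r1][c1] - u[r0][c0], eps)
    return total


def relaxed_cut(unary, pairwise, eps=0.05, iters=500):
    """Minimise the smoothed energy over u in [0, 1] by projected gradient descent."""
    rows, cols = len(unary), len(unary[0])
    u = [[0.5] * cols for _ in range(rows)]
    # Each pixel has at most 4 edges, so the gradient is Lipschitz with
    # constant at most 8 * pairwise / eps; its reciprocal is a safe step size.
    step = eps / (8.0 * pairwise)
    for _ in range(iters):
        grad = [row[:] for row in unary]  # gradient of the linear unary term
        for (r0, c0), (r1, c1) in grid_edges(rows, cols):
            d = u[r1][c1] - u[r0][c0]
            g = pairwise * d / smooth_abs(d, eps)
            grad[r1][c1] += g
            grad[r0][c0] -= g
        # Gradient step followed by projection onto the box [0, 1].
        u = [[min(1.0, max(0.0, u[r][c] - step * grad[r][c]))
              for c in range(cols)] for r in range(rows)]
    return u
```

Calling `relaxed_cut(unary, pairwise=0.5)` on a small grid and thresholding the result at 0.5 gives a binary figure-ground labelling. Because every step of this sketch is differentiable almost everywhere, gradients could in principle be passed back to whatever produces `unary`, which is the role the abstract assigns to the modified backpropagation algorithm.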
The authors acknowledge support from the Austrian Science Fund (FWF) under projects No. I1148 and No. Y729.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ranftl, R., Pock, T. (2014). A Deep Variational Model for Image Segmentation. In: Jiang, X., Hornegger, J., Koch, R. (eds) Pattern Recognition. GCPR 2014. Lecture Notes in Computer Science, vol 8753. Springer, Cham. https://doi.org/10.1007/978-3-319-11752-2_9
DOI: https://doi.org/10.1007/978-3-319-11752-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11751-5
Online ISBN: 978-3-319-11752-2
eBook Packages: Computer Science (R0)