Abstract
We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach provides a practical method for learning high-order Markov random field (MRF) models with potential functions that extend over large pixel neighborhoods. These clique potentials are modeled using the Product-of-Experts framework, which uses non-linear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field-of-Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with specialized techniques.
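To make the structure of such a prior concrete, the following minimal sketch evaluates the energy (the negative log of the unnormalized density) of an image under a model whose clique potentials are products of experts applied to linear filter responses. Several things here are assumptions for illustration and are not stated in the abstract: the experts are taken to be Student-t functions of the form (1 + y²/2)^(−α), the two filters are hand-picked derivative filters rather than learned ones, and the names `filter_responses` and `foe_energy` are hypothetical.

```python
import numpy as np


def filter_responses(image, filt):
    """Correlate `filt` with every patch that lies fully inside `image`."""
    fh, fw = filt.shape
    h, w = image.shape
    out = np.zeros((h - fh + 1, w - fw + 1))
    # Accumulate one shifted, weighted copy of the image per filter tap.
    for dy in range(fh):
        for dx in range(fw):
            out += filt[dy, dx] * image[dy:dy + h - fh + 1, dx:dx + w - fw + 1]
    return out


def foe_energy(image, filters, alphas):
    """Energy of `image`: sum over filters and cliques of alpha * log(1 + r^2 / 2).

    Assumes Student-t experts; this is an illustrative choice.
    """
    energy = 0.0
    for filt, alpha in zip(filters, alphas):
        r = filter_responses(image, filt)
        energy += alpha * np.sum(np.log1p(0.5 * r ** 2))
    return energy


# Hypothetical hand-picked filters (horizontal and vertical differences).
# In the model described above, the filters would be learned from data.
filters = [np.array([[1.0, -1.0]]), np.array([[1.0], [-1.0]])]
alphas = [1.0, 1.0]

rng = np.random.default_rng(0)
smooth = np.full((16, 16), 0.5)
noisy = smooth + 0.1 * rng.standard_normal((16, 16))

print("energy of constant image:", foe_energy(smooth, filters, alphas))
print("energy of noisy image:   ", foe_energy(noisy, filters, alphas))
```

Under these assumptions, a constant image has zero energy and a noisy version of it has higher energy, which is the property that applications such as denoising exploit when they search for a low-energy image that stays close to the observation.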
Additional information
The work for this paper was performed while S.R. was at Brown University.
About this article
Cite this article
Roth, S., Black, M.J. Fields of Experts. Int J Comput Vis 82, 205–229 (2009). https://doi.org/10.1007/s11263-008-0197-6
DOI: https://doi.org/10.1007/s11263-008-0197-6