
A framework for evaluating approximation methods for Gaussian process regression

Published: 01 February 2013

Abstract

Gaussian process (GP) predictors are an important component of many Bayesian approaches to machine learning. However, even a straightforward implementation of Gaussian process regression (GPR) requires O(n²) space and O(n³) time for a data set of n examples. Several approximation methods have been proposed, but there is a lack of understanding of the relative merits of the different approximations, and in what situations they are most useful. We recommend assessing the quality of the predictions obtained as a function of the compute time taken, and comparing to standard baselines (e.g., Subset of Data and FITC). We empirically investigate four different approximation algorithms on four different prediction problems, and make our code available to encourage future comparisons.
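
To make the recommended protocol concrete, the following is a minimal Python sketch (an illustration under assumed choices, not the authors' released code) of evaluating the Subset of Data (SoD) baseline by predictive quality as a function of compute time. It assumes a squared-exponential kernel with fixed hyperparameters and a synthetic 1-D data set; the function names, error metrics (standardized MSE and negative log predictive density), and subset sizes are choices made here purely for illustration.

    # Sketch of the evaluation the abstract recommends: run a GP
    # approximation at several compute budgets and record predictive
    # quality against the wall-clock time taken. Kernel and noise
    # hyperparameters are held fixed for brevity (an assumption; in
    # practice they would be learned).
    import time
    import numpy as np

    def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
        """Squared-exponential kernel matrix between row-vector sets A and B."""
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return variance * np.exp(-0.5 * d2 / lengthscale**2)

    def gp_predict(Xtr, ytr, Xte, noise=0.1):
        """Exact GPR: O(n^2) space and O(n^3) time in n = len(Xtr)."""
        K = rbf_kernel(Xtr, Xtr) + noise**2 * np.eye(len(Xtr))
        L = np.linalg.cholesky(K)                     # the O(n^3) step
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, ytr))
        Ks = rbf_kernel(Xte, Xtr)
        mean = Ks @ alpha
        v = np.linalg.solve(L, Ks.T)
        # Predictive variance, including observation noise.
        var = rbf_kernel(Xte, Xte).diagonal() - (v**2).sum(0) + noise**2
        return mean, var

    def sod_curve(Xtr, ytr, Xte, yte, subset_sizes, rng):
        """Subset of Data baseline: accuracy vs. compute time as m grows."""
        for m in subset_sizes:
            idx = rng.choice(len(Xtr), size=m, replace=False)
            t0 = time.perf_counter()
            mean, var = gp_predict(Xtr[idx], ytr[idx], Xte)
            elapsed = time.perf_counter() - t0
            smse = np.mean((mean - yte) ** 2) / np.var(yte)  # standardized MSE
            nlpd = np.mean(0.5 * np.log(2 * np.pi * var)
                           + 0.5 * (yte - mean) ** 2 / var)  # neg. log density
            print(f"m={m:5d}  time={elapsed:7.3f}s  "
                  f"SMSE={smse:.4f}  NLPD={nlpd:.4f}")

    # Hypothetical synthetic data, only to make the sketch runnable.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(2200, 1))
    y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(len(X))
    sod_curve(X[:2000], y[:2000], X[2000:], y[2000:], [100, 400, 1600], rng)

Plotting the resulting (time, SMSE) and (time, NLPD) pairs for each approximation method on shared axes gives the quality-versus-compute-time comparison the paper advocates.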

Published In

The Journal of Machine Learning Research, Volume 14, Issue 1
January 2013
3717 pages
ISSN:1532-4435
EISSN:1533-7928

Publisher

JMLR.org

Publication History

Published: 01 February 2013
Published in JMLR Volume 14, Issue 1

Author Tags

  1. FITC
  2. Gaussian process regression
  3. local GP
  4. subset of data

Cited By

  • (2024) Optimal Composite Likelihood Estimation and Prediction for Distributed Gaussian Process Modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(2):1134-1147. DOI: 10.1109/TPAMI.2023.3328378. Online publication date: 1-Feb-2024.
  • (2023) Leveraging locality and robustness to achieve massively scalable Gaussian process regression. Proceedings of the 37th International Conference on Neural Information Processing Systems, pages 18906-18931. DOI: 10.5555/3666122.3666951. Online publication date: 10-Dec-2023.
  • (2023) QR decomposition based low rank approximation for Gaussian process regression. Applied Intelligence 53(23):28924-28936. DOI: 10.1007/s10489-023-05064-8. Online publication date: 1-Dec-2023.
  • (2022) Hyperparameters Adaptive Sharing Based on Transfer Learning for Scalable GPs. 2022 IEEE Congress on Evolutionary Computation (CEC), pages 01-07. DOI: 10.1109/CEC55065.2022.9870288. Online publication date: 18-Jul-2022.
  • (2022) A sparse multi-fidelity surrogate-based optimization method with computational awareness. Engineering with Computers 39(5):3473-3489. DOI: 10.1007/s00366-022-01766-8. Online publication date: 1-Dec-2022.
  • (2022) aphBO-2GP-3B: a budgeted asynchronous parallel multi-acquisition functions for constrained Bayesian optimization on high-performing computing architecture. Structural and Multidisciplinary Optimization 65(4). DOI: 10.1007/s00158-021-03102-y. Online publication date: 1-Apr-2022.
  • (2021) Gaussian processes with skewed Laplace spectral mixture kernels for long-term forecasting. Machine Learning 110(8):2213-2238. DOI: 10.1007/s10994-021-06031-5. Online publication date: 1-Aug-2021.
  • (2021) Online voltage prediction using Gaussian process regression for fault-tolerant photovoltaic standalone applications. Neural Computing and Applications 33(23):16577-16590. DOI: 10.1007/s00521-021-06254-6. Online publication date: 1-Dec-2021.
  • (2020) Examining the Role of Mood Patterns in Predicting Self-Reported Depressive Symptoms. Proceedings of the 12th ACM Conference on Web Science, pages 164-173. DOI: 10.1145/3394231.3397906. Online publication date: 6-Jul-2020.
  • (2020) Fusing Online Gaussian Process-Based Learning and Control for Scanning Quantum Dot Microscopy. 2020 59th IEEE Conference on Decision and Control (CDC), pages 5525-5531. DOI: 10.1109/CDC42340.2020.9304053. Online publication date: 14-Dec-2020.
