Cluster-specific predictions with multi-task Gaussian processes

Published: 01 January 2023

Abstract

A model involving Gaussian processes (GPs) is introduced to simultaneously handle multi-task learning, clustering, and prediction for multiple functional data. This procedure acts both as a model-based clustering method for functional data and as a learning step for subsequent predictions on new tasks. The model is instantiated as a mixture of multi-task GPs with common mean processes. A variational EM algorithm is derived to optimise the hyper-parameters jointly with the hyper-posteriors over the latent variables and processes. We establish explicit formulas for integrating the mean processes and the latent clustering variables into a predictive distribution, accounting for uncertainty in both aspects. This distribution is defined as a mixture of cluster-specific GP predictions, which enhances performance on group-structured data. The model handles irregular grids of observations and offers different hypotheses on the covariance structure for sharing additional information across tasks. Performance on both clustering and prediction tasks is assessed through various simulated scenarios and real data sets. The overall algorithm, called MagmaClust, is publicly available as an R package.
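
To make the last point concrete: the predictive distribution described above is a Gaussian mixture, in which each cluster contributes its own GP prediction, weighted by the probability that the new task belongs to that cluster. The R sketch below shows how such a mixture combines at a single new input; the weights, means, and variances are made-up placeholders, not outputs of the actual MagmaClust package:

    # Hypothetical outputs of K = 3 cluster-specific GP predictions at one new input:
    # tau[k] = probability that the new task belongs to cluster k (sums to 1),
    # mu[k] and v[k] = predictive mean and variance of the GP for cluster k.
    tau <- c(0.7, 0.2, 0.1)
    mu  <- c(1.5, -0.3, 0.8)
    v   <- c(0.04, 0.10, 0.25)

    # Density of the (possibly multimodal) predictive distribution:
    # p(y) = sum_k tau_k * N(y; mu_k, v_k)
    mixture_pdf <- function(y) sum(tau * dnorm(y, mean = mu, sd = sqrt(v)))

    # Moments of the mixture, via the laws of total expectation and variance.
    mix_mean <- sum(tau * mu)
    mix_var  <- sum(tau * (v + mu^2)) - mix_mean^2
    cat("predictive mean:", mix_mean, " variance:", mix_var, "\n")
    cat("density at y = 1:", mixture_pdf(1), "\n")

In the model itself, the weights come from the hyper-posterior over the latent clustering variables estimated by the variational EM step, and each cluster's mean and variance come from a GP prediction conditioned on that cluster's common mean process.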

Published In

The Journal of Machine Learning Research, Volume 24, Issue 1
January 2023
18881 pages
ISSN: 1532-4435
EISSN: 1533-7928
License: CC BY 4.0

Publisher

JMLR.org

Publication History

Published: 01 January 2023
Accepted: 01 January 2023
Revised: 01 July 2022
Received: 01 November 2020

Author Tags

  1. Gaussian processes mixture
  2. curve clustering
  3. multi-task learning
  4. variational EM
  5. cluster-specific predictions
