
A case study on meta-generalising: a Gaussian processes approach

Published: 01 March 2012

Abstract

We propose a novel model for meta-generalisation, that is, performing prediction on novel tasks based on information from multiple different but related tasks. The model is based on two coupled Gaussian processes with a structured covariance function: the first performs prediction by learning a constrained covariance function encapsulating the relations between the various training tasks, while the second determines the similarity of new tasks to previously seen tasks. We demonstrate empirically, on several real and synthetic data sets, both the strengths of the approach and its limitations due to the distributional assumptions underpinning it.
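The structured-covariance idea behind the first model can be illustrated with a minimal sketch. This is not the paper's coupled-GP construction; it is the standard multi-task building block from the Gaussian process literature, where a joint covariance over tasks and inputs is formed as the Kronecker product of a task-similarity matrix and an input kernel. The kernel choice, lengthscale, and the task matrix `Kf` below are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def multitask_covariance(Kf, Kx):
    """Structured multi-task covariance: Kronecker product of a
    T x T task-similarity matrix Kf and an N x N input kernel Kx."""
    return np.kron(Kf, Kx)

# Two related tasks observed on a shared set of five inputs.
X = np.linspace(0.0, 1.0, 5)
Kx = rbf_kernel(X, X, lengthscale=0.3)
Kf = np.array([[1.0, 0.8],
               [0.8, 1.0]])  # assumed inter-task correlation of 0.8

K = multitask_covariance(Kf, Kx)  # (2*5) x (2*5) joint covariance
# A Cholesky factorisation confirms the joint covariance is valid
# (positive definite), up to a small jitter term.
L = np.linalg.cholesky(K + 1e-8 * np.eye(K.shape[0]))
```

The off-diagonal blocks of `K` scale the input kernel by the inter-task correlation, which is how observations from one task inform predictions on another; learning `Kf` under constraints is what the abstract refers to as a constrained covariance function.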



Published In

The Journal of Machine Learning Research, Volume 13 (March 2012)
2065 pages
ISSN: 1532-4435
EISSN: 1533-7928

Publisher

JMLR.org


Author Tags

  1. Gaussian processes
  2. meta-generalising
  3. mixture of experts
  4. multi-task learning
  5. transfer learning

