Gaussian Process in Machine Learning
Subject: Machine Learning
Dr. Varun Kumar
Subject: Machine Learning Dr. Varun Kumar (IIIT Surat) Lecture 15 1 / 16
Outline
1 Introduction to Gaussian Distributed Random Variable
2 Central Limit Theorem
3 MLE Vs MAP
4 Gaussian Process for Linear Regression
5 References
Introduction to Gaussian Distributed Random Variable (rv)
Gaussian distribution
1 The general expression for the PDF of a univariate Gaussian
distributed random variable is
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where σ → standard deviation, µ → mean, σ² → variance
2 The general expression for the PDF of a multivariate Gaussian
distributed random vector is
$$P(X, \mu_x, \Sigma) = \frac{1}{(2\pi)^{d/2}\sqrt{\det \Sigma}}\, e^{-\frac{1}{2}(X-\mu_x)^T \Sigma^{-1} (X-\mu_x)}$$
X → d-dimensional input random vector, i.e. $X = [x_1, x_2, \ldots, x_d]^T$
µx → d-dimensional mean vector, i.e. $\mu_x = [\mu_{x_1}, \mu_{x_2}, \ldots, \mu_{x_d}]^T$
Σ → covariance matrix of size d × d
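The multivariate density above can be evaluated numerically. The following is a minimal sketch, assuming NumPy and a positive definite covariance matrix; the function names `gaussian_logpdf` and `gaussian_pdf` are illustrative and not part of the lecture. It works in log space and uses a Cholesky factor instead of an explicit inverse, which is a common numerical choice rather than something the slides prescribe.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) at x (cov assumed positive definite)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = x.shape[0]
    L = np.linalg.cholesky(cov)                 # cov = L L^T
    z = np.linalg.solve(L, x - mean)            # z @ z = (x-mean)^T cov^{-1} (x-mean)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))  # log det(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)

def gaussian_pdf(x, mean, cov):
    """Density of N(mean, cov) at x."""
    return np.exp(gaussian_logpdf(x, mean, cov))
```

For d = 1 the result reduces to the univariate formula, which gives a quick sanity check.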
Properties of Gaussian distributed random variable
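1. The sum of two independent Gaussian distributed rvs is also Gaussian. Let
$X_1 \sim N(\mu_{X_1}, \Sigma_{X_1X_1})$ and $X_2 \sim N(\mu_{X_2}, \Sigma_{X_2X_2})$ be two independent Gaussian distributed
rvs. Then
$$Z = X_1 + X_2 \sim N(\mu_{X_1} + \mu_{X_2},\ \Sigma_{X_1X_1} + \Sigma_{X_2X_2})$$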
2. Normalization: a Gaussian density integrates to one.
$$\int_y p(y, \mu, \Sigma)\,dy = 1$$
3. Marginalization: the marginal of a joint Gaussian is also a Gaussian distribution.
$$p(X_1) = \int_{-\infty}^{\infty} p(X_1, X_2, \mu, \Sigma)\,dX_2 \;\rightarrow\; \text{Gaussian distribution}$$
4. Conditioning: the conditional distribution of X1 given X2
$$p(X_1 | X_2) = \frac{p(X_1, X_2, \mu, \Sigma)}{\int_{X_1} p(X_1, X_2, \mu, \Sigma)\,dX_1} \;\rightarrow\; \text{Gaussian distribution}$$
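The conditioning property has a well-known closed form: for a joint Gaussian partitioned into blocks, the conditional mean is $\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)$ and the conditional covariance is $\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$. These formulas are not written on the slide; the sketch below states them as standard results and assumes NumPy. The function name `condition_gaussian` and its index arguments are illustrative.

```python
import numpy as np

def condition_gaussian(mu, Sigma, idx1, idx2, x2):
    """Mean and covariance of X[idx1] given X[idx2] = x2 for X ~ N(mu, Sigma)."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    mu1, mu2 = mu[idx1], mu[idx2]
    S11 = Sigma[np.ix_(idx1, idx1)]
    S12 = Sigma[np.ix_(idx1, idx2)]
    S22 = Sigma[np.ix_(idx2, idx2)]
    # A = S12 S22^{-1}, computed with a linear solve (S22 is symmetric)
    A = np.linalg.solve(S22, S12.T).T
    cond_mean = mu1 + A @ (np.asarray(x2, dtype=float) - mu2)
    cond_cov = S11 - A @ S12.T
    return cond_mean, cond_cov
```

With two unit-variance variables of correlation 0.8, observing the second variable at 1 pulls the mean of the first towards 0.8 and reduces its variance to 1 − 0.8² = 0.36.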
Central limit theorem
⇒ Let {X1, . . . , Xn} be a random sample of size n.
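⇒ All samples are independent and identically distributed (i.i.d.) with mean µ and finite variance σ².
⇒ The sample average
$$\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
is approximately Gaussian distributed as n → ∞.
⇒ By the law of large numbers, the sample average converges almost
surely to the expected value µ; its variance σ²/n shrinks to zero.
⇒ Let Z be the standardized sample average,
$$Z = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma}$$
As n → ∞, Z converges in distribution to N(0, 1).
⇒ Resultant PDFs: for large n,
$$f_{\bar{X}_n}(\bar{x}) \approx \frac{1}{\sqrt{2\pi\sigma^2/n}}\, e^{-\frac{n(\bar{x}-\mu)^2}{2\sigma^2}}, \qquad f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$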
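The central limit theorem can be checked empirically. The sketch below is illustrative only: it assumes NumPy and draws from an exponential distribution with mean 1 and variance 1, which is a choice made here and not in the lecture. The function name `standardized_means` and its parameters are hypothetical. It forms many sample averages and standardizes each one as Z = √n (X̄n − µ)/σ.

```python
import numpy as np

def standardized_means(n=400, trials=5000, seed=0):
    """Return `trials` draws of Z = sqrt(n) * (mean - mu) / sigma.

    Samples come from Exponential(1), chosen for illustration because it is
    clearly non-Gaussian and has mu = 1 and sigma = 1.
    """
    rng = np.random.default_rng(seed)
    samples = rng.exponential(scale=1.0, size=(trials, n))
    xbar = samples.mean(axis=1)            # one sample average per trial
    mu, sigma = 1.0, 1.0
    return np.sqrt(n) * (xbar - mu) / sigma
```

If the theorem applies, the returned values should have mean close to 0, standard deviation close to 1, and roughly 95% of them should fall within ±1.96.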
MLE vs MAP
Maximum likelihood estimator (MLE)
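Let y = ax + n, where n ∼ N(0, σ²). The likelihood of x given the observation y is
$$f_Y(y|x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(y-ax)^2}{2\sigma^2}}$$
and the MLE is
$$\hat{x}_{MLE}(y) = \arg\max_x f_Y(y|x)$$
The likelihood is largest when the measured y equals ax, i.e. $y = a\hat{x}_{MLE}$, so $\hat{x}_{MLE} = y/a$.
Note: The MLE requires no assumption on the distribution of x.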
Maximum a posteriori probability (MAP)
1 Maximum a priori
$$x_{apriori} = \arg\max_x f_X(x)$$
2 Maximum a posteriori probability (MAP)
$$\hat{x}_{MAP} = \arg\max_x f_X(x|y), \qquad f_X(x|y) = \frac{f_Y(y|x)\,f_X(x)}{f_Y(y)} = \frac{f_Y(y|x)\,f_X(x)}{\int_X f_Y(y|x)\,f_X(x)\,dx}$$
⇒ If the prior f_X(x) is uniform, then
$$\hat{x}_{MLE} = \hat{x}_{MAP}$$
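To compare the two estimators on the scalar model y = ax + n, the sketch below assumes a zero-mean Gaussian prior x ∼ N(0, σx²). That prior is an assumption made here for illustration; the lecture leaves the prior unspecified. Under it, maximizing the posterior gives x̂MAP = a σx² y / (a² σx² + σ²), which approaches x̂MLE = y/a as the prior becomes flat (σx → ∞). The function names are hypothetical.

```python
def x_mle(y, a):
    """MLE for y = a*x + n with Gaussian noise: maximizes f_Y(y|x)."""
    return y / a

def x_map(y, a, noise_std, prior_std):
    """MAP estimate under the assumed prior x ~ N(0, prior_std**2)."""
    prior_var = prior_std ** 2
    noise_var = noise_std ** 2
    return a * prior_var * y / (a ** 2 * prior_var + noise_var)
```

A narrow prior shrinks the MAP estimate towards the prior mean of zero, while a very wide prior reproduces the MLE, in line with the uniform-prior statement above.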
Linear regression
Suppose we have data D = {(x1, y1), ....., (xn, yn)}
⇒ MLE: $p(D|w) = \prod_{i=1}^{n} p(y_i|x_i; w)$, where $y_i \,|\, x_i; w \sim N(w^T x_i, \sigma^2)$
⇒ MAP:
$$p(w|D) = \frac{p(D|w)\,p(w)}{p(D)} = \frac{p(D|w)\,p(w)}{\int_w p(D|w)\,p(w)\,dw} \propto p(D|w)\,p(w)$$
⇒ Predictive distribution:
$$p(y|x; D) = \int_w p(y|x; w)\,p(w|D)\,dw$$
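For linear regression with Gaussian noise, the MLE of w is the ordinary least-squares solution. If one additionally assumes a Gaussian prior w ∼ N(0, τ²I) (an assumption made here, since the slide does not fix a prior), the MAP estimate becomes ridge regression with regularizer σ²/τ². The sketch below assumes NumPy, a design matrix `X` with one sample per row, and illustrative function names.

```python
import numpy as np

def fit_mle(X, y):
    """Least-squares weights: the MLE under i.i.d. Gaussian noise."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fit_map(X, y, noise_var=1.0, prior_var=1.0):
    """MAP weights under the assumed prior w ~ N(0, prior_var * I)."""
    d = X.shape[1]
    lam = noise_var / prior_var            # ridge strength sigma^2 / tau^2
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```

A tight prior (small `prior_var`) shrinks the weights towards zero; a very wide prior makes the MAP solution coincide with the MLE.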
Continued–
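In general, the posterior predictive distribution is
$$P(Y|D, X) = \int_w P(Y, w|D, X)\,dw = \int_w P(Y|w, D, X)\,P(w|D)\,dw$$
The above is often intractable in closed form. For a Gaussian likelihood and a
Gaussian prior, however, the predictive distribution is itself Gaussian, and its
mean and covariance can be written as
$$P(y_*|D, x_*) \sim N(\mu_{y_*|D}, \Sigma_{y_*|D})$$
where
$$\mu_{y_*|D} = K_*^T (K + \sigma^2 I)^{-1} y$$
and
$$\Sigma_{y_*|D} = K_{**} - K_*^T (K + \sigma^2 I)^{-1} K_*$$
Here K, K∗ and K∗∗ are the training, training-testing and testing kernel matrices.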
Gaussian process
⇒ Problem:
f is a function, i.e. an infinite-dimensional object, but the multivariate Gaussian
distribution is defined only for finite-dimensional random vectors.
⇒ Definition: A GP is a collection of random variables (RVs) such that
the joint distribution of every finite subset of RVs is multivariate
Gaussian:
$$f \sim GP(\mu, k)$$
where µ(x) and k(x, x′) are the mean and covariance functions.
⇒ We need to model the predictive distribution P(f∗|x∗, D).
⇒ We can use a Bayesian approach by using a GP prior:
P(f |x) ∼ N(µ, Σ) and condition it on the training data D to model
the joint distribution of f = f (X) (vector of training observations)
and f∗ = f (x∗) (prediction at test input).
Gaussian Process Regression (GPR)
We assume the training and test labels are drawn from a zero-mean Gaussian prior:
$$y = [y_1, y_2, \ldots, y_n, y_t]^T \sim N(0, \Sigma)$$
where y_t denotes the test label(s).
⇒ All training and test labels are drawn from an (n+m)-dimensional Gaussian
distribution.
⇒ n is the number of training points.
⇒ m is the number of testing points.
We consider the following properties of Σ :
1 Σij = E((Yi − µi )(Yj − µj ))
2 Σ is always positive semi-definite.
3 Σii = Var(Yi ), thus Σii ≥ 0
4 If Yi and Yj are nearly independent, i.e. xi is very different from xj , then
Σij = Σji = 0. If xi is similar to xj , then Σij = Σji > 0
Continued–
We can observe that this is very similar to the kernel matrix in SVMs.
Therefore, we can simply let Σij = K(xi , xj ). For example,
(a) If we use the RBF kernel, then
$$\Sigma_{ij} = \tau\, e^{-\frac{\|x_i - x_j\|^2}{2\sigma^2}}$$
(b) If we use the polynomial kernel, then $\Sigma_{ij} = \tau (1 + x_i^T x_j)^d$.
We can decompose Σ as
$$\Sigma = \begin{bmatrix} K & K_* \\ K_*^T & K_{**} \end{bmatrix}$$
where
K is the training kernel matrix.
K∗ is the training-testing kernel matrix.
$K_*^T$ is the testing-training kernel matrix.
K∗∗ is the testing kernel matrix.
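The block decomposition can be made concrete with the RBF kernel. The following is a minimal sketch, assuming NumPy, inputs stored as rows of 2-D arrays, and illustrative function names `rbf_kernel` and `joint_covariance`; `tau` and `sigma` mirror the kernel parameters τ and σ above. It builds K, K∗ and K∗∗ and stacks them into Σ.

```python
import numpy as np

def rbf_kernel(A, B, tau=1.0, sigma=1.0):
    """RBF kernel matrix with entries tau * exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    sq = np.maximum(sq, 0.0)               # guard against tiny negative round-off
    return tau * np.exp(-sq / (2.0 * sigma ** 2))

def joint_covariance(X_train, X_test, tau=1.0, sigma=1.0):
    """Assemble Sigma = [[K, K_*], [K_*^T, K_**]] for training and test inputs."""
    K = rbf_kernel(X_train, X_train, tau, sigma)
    K_s = rbf_kernel(X_train, X_test, tau, sigma)
    K_ss = rbf_kernel(X_test, X_test, tau, sigma)
    return np.block([[K, K_s], [K_s.T, K_ss]])
```

The resulting matrix has size (n+m) × (n+m), is symmetric, has τ on its diagonal, and is positive semi-definite up to round-off, matching the listed properties of Σ.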
Continued–
The conditional distribution of the (noise-free) values of the latent function f
can be written as:
$$f_* \,|\, (Y_1 = y_1, \ldots, Y_n = y_n, x_1, \ldots, x_n, x_t) \sim N(K_*^T K^{-1} y,\ K_{**} - K_*^T K^{-1} K_*)$$
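The conditional distribution above translates into a short GP regression routine. This is a sketch under stated assumptions, not the lecture's implementation: it assumes NumPy and an RBF kernel, replaces the explicit inverse K⁻¹ with a Cholesky factorization, and adds a small `jitter` term to the diagonal for numerical stability. The names `gp_posterior`, `noise_var` and `jitter` are illustrative; `noise_var = 0` corresponds to the noise-free case, and `sigma` is the kernel width, not the noise level.

```python
import numpy as np

def rbf_kernel(A, B, tau=1.0, sigma=1.0):
    """RBF kernel matrix with entries tau * exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return tau * np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def gp_posterior(X_train, y_train, X_test, noise_var=0.0, jitter=1e-10,
                 tau=1.0, sigma=1.0):
    """Posterior mean and covariance of f_* at X_test given the training data."""
    K = rbf_kernel(X_train, X_train, tau, sigma)
    K_s = rbf_kernel(X_train, X_test, tau, sigma)
    K_ss = rbf_kernel(X_test, X_test, tau, sigma)
    n = K.shape[0]
    L = np.linalg.cholesky(K + (noise_var + jitter) * np.eye(n))
    # alpha = (K + noise I)^{-1} y via two triangular systems
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    V = np.linalg.solve(L, K_s)            # V^T V = K_*^T (K + noise I)^{-1} K_*
    mean = K_s.T @ alpha
    cov = K_ss - V.T @ V
    return mean, cov
```

In the noise-free case the posterior mean interpolates the training labels with near-zero variance at the training inputs, while far from the data the prediction reverts to the zero-mean prior with variance τ.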
Conclusion
Gaussian Process Regression has the following properties:
1 GPs are an elegant and powerful ML method.
2 We get a measure of uncertainty for the predictions for free.
3 GPs work very well for regression problems with small training data
set sizes.
4 Running time O(n³) ← matrix inversion (gets slow when n ≫ 0) ⇒
use sparse GPs for large n.
5 GPs are a little bit more involved for classification (non-Gaussian
likelihood).
6 We can also model non-Gaussian likelihoods in regression and do
approximate inference, e.g. for count data (Poisson distribution).
References
T. M. Mitchell, The discipline of machine learning. Carnegie Mellon University,
School of Computer Science, Machine Learning , 2006, vol. 9.
E. Alpaydin, Introduction to machine learning. MIT press, 2020.
K. Weinberger,
https://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote15.html,
May 2018.
L-3536-Cost Benifit Analysis in ESIA.pptx
 
Net Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK EmpireNet Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK Empire
 
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model SafeBangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
 
FD FAN.pdf forced draft fan for boiler operation and run its very important f...
FD FAN.pdf forced draft fan for boiler operation and run its very important f...FD FAN.pdf forced draft fan for boiler operation and run its very important f...
FD FAN.pdf forced draft fan for boiler operation and run its very important f...
 
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
 
Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.
 
一比一原版(UQ毕业证书)昆士兰大学毕业证如何办理
一比一原版(UQ毕业证书)昆士兰大学毕业证如何办理一比一原版(UQ毕业证书)昆士兰大学毕业证如何办理
一比一原版(UQ毕业证书)昆士兰大学毕业证如何办理
 
CONVEGNO DA IRETI 18 giugno 2024 | PASQUALE Donato
CONVEGNO DA IRETI 18 giugno 2024 | PASQUALE DonatoCONVEGNO DA IRETI 18 giugno 2024 | PASQUALE Donato
CONVEGNO DA IRETI 18 giugno 2024 | PASQUALE Donato
 
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
 
Best Practices for Password Rotation and Tools to Streamline the Process
Best Practices for Password Rotation and Tools to Streamline the ProcessBest Practices for Password Rotation and Tools to Streamline the Process
Best Practices for Password Rotation and Tools to Streamline the Process
 
GUIA_LEGAL_CHAPTER_4_FOREIGN TRADE CUSTOMS.pdf
GUIA_LEGAL_CHAPTER_4_FOREIGN TRADE CUSTOMS.pdfGUIA_LEGAL_CHAPTER_4_FOREIGN TRADE CUSTOMS.pdf
GUIA_LEGAL_CHAPTER_4_FOREIGN TRADE CUSTOMS.pdf
 
Enhancing Security with Multi-Factor Authentication in Privileged Access Mana...
Enhancing Security with Multi-Factor Authentication in Privileged Access Mana...Enhancing Security with Multi-Factor Authentication in Privileged Access Mana...
Enhancing Security with Multi-Factor Authentication in Privileged Access Mana...
 
Germany Offshore Wind 010724 RE (1) 2 test.pptx
Germany Offshore Wind 010724 RE (1) 2 test.pptxGermany Offshore Wind 010724 RE (1) 2 test.pptx
Germany Offshore Wind 010724 RE (1) 2 test.pptx
 

Gaussian process in machine learning

  • 1. Gaussian Process in Machine Learning. Subject: Machine Learning. Dr. Varun Kumar (IIIT Surat), Lecture 15.
  • 2. Outline: (1) Introduction to Gaussian Distributed Random Variable; (2) Central Limit Theorem; (3) MLE vs MAP; (4) Gaussian Process for Linear Regression; (5) References.
  • 3. Introduction to Gaussian Distributed Random Variable (rv): Gaussian distribution
    1. The general expression for the PDF of a univariate Gaussian distributed random variable is
       $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$,
       where $\sigma$ is the standard deviation, $\mu$ the mean, and $\sigma^2$ the variance.
    2. The general expression for the PDF of a multivariate Gaussian distributed random variable is
       $P(X; \mu_x, \Sigma) = \frac{1}{(2\pi)^{d/2} \sqrt{\det \Sigma}} \exp\left(-\frac{1}{2}(X-\mu_x)^T \Sigma^{-1} (X-\mu_x)\right)$,
       where $X = [x_1, x_2, \ldots, x_d]^T$ is the d-dimensional input random vector, $\mu_x = [\mu_{x_1}, \mu_{x_2}, \ldots, \mu_{x_d}]^T$ is the d-dimensional mean vector, and $\Sigma$ is the covariance matrix of size $d \times d$.
  • 4. Properties of Gaussian distributed random variables
    1. Addition: the sum of two independent Gaussian distributed rvs is also Gaussian. Let $X_1 \sim N(\mu_{X_1}, \Sigma_{X_1 X_1})$ and $X_2 \sim N(\mu_{X_2}, \Sigma_{X_2 X_2})$ be two independent Gaussian distributed rvs. Then
       $Z = X_1 + X_2 \sim N(\mu_{X_1} + \mu_{X_2},\ \Sigma_{X_1 X_1} + \Sigma_{X_2 X_2})$.
    2. Normalization: the Gaussian density integrates to one,
       $\int_y p(y; \mu, \Sigma)\, dy = 1$.
    3. Marginalization: the marginal of a jointly Gaussian distribution is also Gaussian,
       $p(X_1) = \int_{X_2} p(X_1, X_2; \mu, \Sigma)\, dX_2$,
       where the integral runs over the whole range of $X_2$.
    4. Conditioning: the conditional distribution of $X_1$ given $X_2$ is also Gaussian,
       $p(X_1 \mid X_2) = \frac{p(X_1, X_2; \mu, \Sigma)}{\int_{X_1} p(X_1, X_2; \mu, \Sigma)\, dX_1}$.
  • 5. Central limit theorem
    ⇒ Let $\{X_1, \ldots, X_n\}$ be a random sample of size n.
    ⇒ All samples are independent and identically distributed (i.i.d.), each with expected value $\mu$ and finite variance $\sigma^2$.
    ⇒ The sample average is $\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$; as $n \to \infty$ its distribution approaches a Gaussian.
    ⇒ By the law of large numbers, the sample average $\bar{X}_n$ converges almost surely to the expected value $\mu$.
    ⇒ Let $Z_n$ be the standardized sample average, $Z_n = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma}$.
    ⇒ As $n \to \infty$, $Z_n$ converges in distribution to a standard Gaussian with PDF $f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$. Equivalently, for large n, $\bar{X}_n$ is approximately $N(\mu, \sigma^2/n)$.
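The central limit theorem can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using Uniform(0, 1) samples as an arbitrary non-Gaussian example; the function name `standardized_sample_means` and the sample sizes are illustrative choices, not part of the lecture.

```python
import numpy as np

def standardized_sample_means(n, trials, rng):
    """Draw `trials` samples of size n from Uniform(0, 1) and return Z_n for each."""
    mu = 0.5                   # mean of Uniform(0, 1)
    sigma = np.sqrt(1.0 / 12)  # standard deviation of Uniform(0, 1)
    samples = rng.uniform(0.0, 1.0, size=(trials, n))
    sample_means = samples.mean(axis=1)
    # Z_n = sqrt(n) * (sample mean - mu) / sigma
    return np.sqrt(n) * (sample_means - mu) / sigma

rng = np.random.default_rng(0)
z = standardized_sample_means(n=200, trials=20_000, rng=rng)

# For a standard Gaussian we expect mean 0, standard deviation 1,
# and roughly 68% of the mass inside [-1, 1].
frac_within_one = np.mean(np.abs(z) <= 1.0)
print(z.mean(), z.std(), frac_within_one)
```

Although each individual draw is uniform, the standardized averages should behave like draws from N(0, 1).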
  • 6. Continued
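  • 7. MLE vs MAP: Maximum likelihood estimator (MLE)
    Let $y = ax + n$, where $n \sim N(0, \sigma^2)$. The MLE is
    $\hat{x}_{MLE}(y) = \arg\max_x f_Y(y \mid x)$, with $f_Y(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y-ax)^2}{2\sigma^2}\right)$.
    The likelihood is maximized when the measured value satisfies $y = \bar{y} = a\,\hat{x}_{MLE}$, i.e. $\hat{x}_{MLE} = \bar{y}/a$.
    Note: the MLE requires no assumption on the distribution of x.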
  • 8. Maximum a posteriori probability (MAP)
    1. Maximum a priori: $x_{apriori} = \arg\max_x f_X(x)$
    2. Maximum a posteriori probability (MAP): $\hat{x}_{MAP} = \arg\max_x f_X(x \mid y)$, where
       $f_X(x \mid y) = \frac{f_Y(y \mid x)\, f_X(x)}{f_Y(y)} = \frac{f_Y(y \mid x)\, f_X(x)}{\int_X f_Y(y \mid x)\, f_X(x)\, dx}$.
    ⇒ If the prior $f_X(x)$ is uniform, then $\hat{x}_{MLE} = \hat{x}_{MAP}$.
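To make the difference between the two estimators concrete, here is a small sketch for the scalar model y = ax + n. It assumes a Gaussian prior x ~ N(prior_mean, prior_var), which is an illustrative choice not made in the slides; under that assumption the MAP estimate has a closed form obtained by setting the derivative of the log posterior to zero. The function names are hypothetical.

```python
def mle_estimate(y, a):
    """MLE for y = a*x + n with Gaussian noise: maximizes f_Y(y|x)."""
    return y / a

def map_estimate(y, a, noise_var, prior_mean, prior_var):
    """MAP for y = a*x + n, assuming a Gaussian prior on x (an assumption of this sketch)."""
    # Posterior precision is the sum of likelihood precision and prior precision.
    precision = a**2 / noise_var + 1.0 / prior_var
    return (a * y / noise_var + prior_mean / prior_var) / precision

y, a, noise_var = 3.0, 2.0, 0.5
x_mle = mle_estimate(y, a)

# A tight prior around 0 pulls the estimate towards the prior mean.
x_map_tight = map_estimate(y, a, noise_var, prior_mean=0.0, prior_var=0.1)

# A very broad prior is nearly flat, so MAP approaches the MLE.
x_map_flat = map_estimate(y, a, noise_var, prior_mean=0.0, prior_var=1e9)

print(x_mle, x_map_tight, x_map_flat)
```

The nearly flat prior plays the role of the uniform prior in the statement that the MLE and MAP estimates coincide.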
  • 9. Linear regression
    Suppose we have data $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$.
    ⇒ MLE: $p(D \mid w) = \prod_{i=1}^{n} p(y_i \mid x_i; w)$, where $y_i \mid x_i; w \sim N(w^T x_i, \sigma^2)$ for all i.
    ⇒ MAP: $p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)} = \frac{p(D \mid w)\, p(w)}{\int_w p(D \mid w)\, p(w)\, dw} \propto p(D \mid w)\, p(w)$
    ⇒ Predictive distribution: $p(y \mid x; D) = \int_w p(y \mid x; w)\, p(w \mid D)\, dw$
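For linear regression with Gaussian noise, the integral over w can be carried out in closed form if the prior on w is also Gaussian. The sketch below assumes a zero-mean isotropic prior w ~ N(0, prior_var · I), which is an assumption of this example rather than something stated in the slides; the function names and the synthetic data are likewise illustrative.

```python
import numpy as np

def mle_weights(X, y):
    """Least-squares solution, which is the MLE under Gaussian noise."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def posterior_weights(X, y, noise_var, prior_var):
    """Gaussian posterior p(w|D) assuming the prior w ~ N(0, prior_var * I)."""
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def predictive(x, mean, cov, noise_var):
    """Mean and variance of p(y|x; D) after integrating out w."""
    return x @ mean, x @ cov @ x + noise_var

# Synthetic data from a known weight vector (illustrative values).
rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(50, 2))
noise_var = 0.1
y = X @ w_true + rng.normal(scale=np.sqrt(noise_var), size=50)

w_mle = mle_weights(X, y)
w_mean, w_cov = posterior_weights(X, y, noise_var, prior_var=1e6)
pred_mean, pred_var = predictive(np.array([1.0, 1.0]), w_mean, w_cov, noise_var)
print(w_mle, w_mean, pred_mean, pred_var)
```

With a very broad prior the posterior mean is close to the MLE, while the predictive variance stays above the noise variance because it also accounts for uncertainty in w.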
  • 10. Continued
    In general, the posterior predictive distribution is
    $P(Y \mid D, X) = \int_w P(Y, w \mid D, X)\, dw = \int_w P(Y \mid w, D, X)\, P(w \mid D)\, dw$.
    The above is often intractable in closed form. For a Gaussian likelihood and a Gaussian prior, however, the predictive distribution is itself Gaussian, and its mean and covariance can be written as
    $P(y_* \mid D, x) \sim N(\mu_{y_*|D}, \Sigma_{y_*|D})$,
    where $\mu_{y_*|D} = K_*^T (K + \sigma^2 I)^{-1} y$ and $\Sigma_{y_*|D} = K_{**} - K_*^T (K + \sigma^2 I)^{-1} K_*$.
  • 11. Gaussian process
    ⇒ Problem: f is an infinite-dimensional function, but the multivariate Gaussian distribution is defined for finite-dimensional random vectors.
    ⇒ Definition: A GP is a collection of random variables (RVs) such that the joint distribution of every finite subset of RVs is multivariate Gaussian: $f \sim GP(\mu, k)$, where $\mu(x)$ and $k(x, x')$ are the mean and covariance functions.
    ⇒ We need to model the predictive distribution $P(f_* \mid x_*, D)$.
    ⇒ We can use a Bayesian approach with a GP prior, $P(f \mid x) \sim N(\mu, \Sigma)$, and condition it on the training data D to model the joint distribution of $f = f(X)$ (vector of training observations) and $f_* = f(x_*)$ (prediction at the test input).
  • 12. Gaussian Process Regression (GPR)
    We assume that the training and test labels are drawn from a zero-mean Gaussian prior: $y = [y_1, y_2, \ldots, y_n, y_t]^T \sim N(0, \Sigma)$.
    ⇒ All training and test labels are drawn from an (n+m)-dimensional Gaussian distribution.
    ⇒ n is the number of training points.
    ⇒ m is the number of testing points.
    We consider the following properties of $\Sigma$:
    1. $\Sigma_{ij} = E[(Y_i - \mu_i)(Y_j - \mu_j)]$.
    2. $\Sigma$ is always positive semi-definite.
    3. $\Sigma_{ii} = Var(Y_i)$, thus $\Sigma_{ii} \geq 0$.
    4. If $Y_i$ and $Y_j$ are nearly independent, i.e. $x_i$ is very different from $x_j$, then $\Sigma_{ij} = \Sigma_{ji} = 0$. If $x_i$ is similar to $x_j$, then $\Sigma_{ij} = \Sigma_{ji} > 0$.
  • 13. Continued
    These properties are very similar to those of the kernel matrix in SVMs. Therefore, we can simply let $\Sigma_{ij} = K(x_i, x_j)$. For example:
    (a) With the RBF kernel, $\Sigma_{ij} = \tau \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$.
    (b) With the polynomial kernel, $\Sigma_{ij} = \tau (1 + x_i^T x_j)^d$.
    We can decompose $\Sigma$ as
    $\Sigma = \begin{pmatrix} K & K_* \\ K_*^T & K_{**} \end{pmatrix}$,
    where $K$ is the training kernel matrix, $K_*$ is the training-testing kernel matrix, $K_*^T$ is the testing-training kernel matrix, and $K_{**}$ is the testing kernel matrix.
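The block decomposition of Σ can be sketched in a few lines of NumPy. The helper name `rbf_kernel`, the parameter defaults, and the toy inputs are assumptions made for illustration; the kernel follows the RBF form with 2σ² in the denominator.

```python
import numpy as np

def rbf_kernel(A, B, tau=1.0, sigma=1.0):
    """RBF kernel matrix with entries tau * exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return tau * np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

X_train = np.array([[0.0], [1.0], [2.0]])   # n = 3 training inputs
X_test = np.array([[0.5], [10.0]])          # m = 2 test inputs

K = rbf_kernel(X_train, X_train)            # n x n training kernel matrix
K_star = rbf_kernel(X_train, X_test)        # n x m training-testing kernel matrix
K_star_star = rbf_kernel(X_test, X_test)    # m x m testing kernel matrix

# Full (n + m) x (n + m) covariance of the joint Gaussian.
Sigma = np.block([[K, K_star], [K_star.T, K_star_star]])
print(Sigma.round(3))
```

The test input at 10.0 is far from every training input, so its entries in K_star are essentially zero, matching the property that dissimilar inputs have near-zero covariance.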
  • 14. Continued
    The conditional distribution of the (noise-free) values of the latent function f can be written as:
    $f_* \mid (Y_1 = y_1, \ldots, Y_n = y_n, x_1, \ldots, x_n, x_t) \sim N(K_*^T K^{-1} y,\ K_{**} - K_*^T K^{-1} K_*)$.
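The conditional mean and covariance can be computed as follows. This is a minimal sketch, not a reference implementation: it assumes an RBF kernel, uses a Cholesky factorization instead of an explicit inverse, and adds a small `jitter` term to the diagonal for numerical stability (a common practical choice that is an assumption here). Setting `noise_var` above zero gives the noisy form with (K + σ²I) in place of K. All function and parameter names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, tau=1.0, sigma=1.0):
    """RBF kernel matrix with entries tau * exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return tau * np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def gp_posterior(X_train, y_train, X_test, noise_var=0.0, jitter=1e-10):
    """Posterior mean and covariance of f_* given the training data."""
    K = rbf_kernel(X_train, X_train)
    K_star = rbf_kernel(X_train, X_test)
    K_star_star = rbf_kernel(X_test, X_test)

    # Factor (K + noise) once; this is the O(n^3) step.
    A = K + (noise_var + jitter) * np.eye(len(X_train))
    L = np.linalg.cholesky(A)

    # alpha = A^{-1} y via two triangular systems.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_star)

    mean = K_star.T @ alpha            # K_*^T A^{-1} y
    cov = K_star_star - v.T @ v        # K_** - K_*^T A^{-1} K_*
    return mean, cov

X_train = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
y_train = np.sin(X_train).ravel()
X_test = np.array([[0.0], [0.5], [6.0]])   # a training point, a nearby point, a far point

mean, cov = gp_posterior(X_train, y_train, X_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
print(mean, std)
```

In the noise-free case the posterior interpolates the training data, so the predictive standard deviation is close to zero at a training input and grows back towards the prior value far away from the data.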
  • 15. Conclusion
    Gaussian Process Regression has the following properties:
    1. GPs are an elegant and powerful ML method.
    2. We get a measure of uncertainty for the predictions for free.
    3. GPs work very well for regression problems with small training data set sizes.
    4. Running time is $O(n^3)$ because of the matrix inversion, which gets slow when $n \gg 0$ ⇒ use sparse GPs for large n.
    5. GPs are a little more involved for classification (non-Gaussian likelihood).
    6. We can model non-Gaussian likelihoods in regression and do approximate inference, e.g. for count data (Poisson distribution).
  • 16. References
    T. M. Mitchell, The Discipline of Machine Learning. Carnegie Mellon University, School of Computer Science, Machine Learning, 2006, vol. 9.
    E. Alpaydin, Introduction to Machine Learning. MIT Press, 2020.
    K. Weinberger, https://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote15.html, May 2018.