In recent years, Gaussian Processes (GPs) have become increasingly popular in machine learning. A Gaussian Process can be seen as an infinite-dimensional Gaussian distribution defined by a mean function and a covariance function. Using Gaussian Processes for non-linear regression only requires the choice of such a covariance function (the mean function is often set to zero due to missing prior knowledge). Predicting at new points then involves inverting the covariance matrix defined by this kernel. You can use the following Java applet to generate some training points and evaluate the influence of different covariance functions and hyperparameters on the predicted curve:
A Gaussian Process can be seen as a generalization of a Gaussian distribution. Instead of being specified
by a mean and a variance like a Gaussian distribution, it is fully specified by a mean function and a covariance
function. This is illustrated in Figure 1. Evaluated at a query point x and a 'time step' t, the
GP delivers the value of the probability density function. The covariance function allows a-priori
knowledge to be specified, such as the training data for solving the regression problem.
Figure 1: Gaussian Distribution (left) vs. Gaussian Process (right)
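A GP prior can be sketched in a few lines of numpy. The snippet below (a minimal sketch, not the applet's actual code; the lengthscale value is an illustrative assumption) builds the covariance matrix of an RBF kernel over a grid of query points and draws one random function from the resulting multivariate Gaussian:

```python
import numpy as np

# A GP is fully specified by a mean function m(x) and a covariance
# function k(x, x'). Here: zero mean and an RBF covariance.
def rbf(x1, x2, lengthscale=1.0):
    return np.exp(-0.5 * (x1 - x2) ** 2 / lengthscale ** 2)

x = np.linspace(-5, 5, 50)               # query points
K = rbf(x[:, None], x[None, :])          # 50x50 covariance matrix
mean = np.zeros(len(x))                  # the (neglected) mean function

# Each draw from this finite-dimensional Gaussian is one random function
# evaluated at the query points; jitter keeps K numerically positive definite.
rng = np.random.default_rng(0)
sample = rng.multivariate_normal(mean, K + 1e-8 * np.eye(len(x)))
print(sample.shape)  # (50,)
```

Drawing several such samples and plotting them over x gives the familiar picture of random smooth functions from the GP prior.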
It is well known that both the conditional and the marginal distribution of a normally distributed
multidimensional random variable are again normally distributed. This can be used to infer the conditional
probability density function (specified by its mean and covariance) of a given joint Gaussian distribution
(with known covariance), as described in Figure 2:
Figure 2: Inferring the mean and variance of a conditional Gaussian Distribution
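The conditioning step from Figure 2 can be written out directly. This sketch (block sizes and numbers are arbitrary assumptions, chosen only for illustration) applies the standard Gaussian conditioning formulas to a 3-d joint distribution:

```python
import numpy as np

# Conditioning a joint Gaussian: for [y1, y2] ~ N(mu, Sigma) with blocks
#   Sigma = [[S11, S12], [S21, S22]],
# the conditional p(y2 | y1) is again Gaussian with
#   mean = mu2 + S21 @ inv(S11) @ (y1 - mu1)
#   cov  = S22 - S21 @ inv(S11) @ S12
mu1, mu2 = np.zeros(2), np.zeros(1)
S11 = np.array([[1.0, 0.5], [0.5, 1.0]])
S12 = np.array([[0.3], [0.6]])
S21 = S12.T
S22 = np.array([[1.0]])

y1 = np.array([0.8, -0.2])               # the observed part
inv_S11 = np.linalg.inv(S11)
cond_mean = mu2 + S21 @ inv_S11 @ (y1 - mu1)
cond_cov = S22 - S21 @ inv_S11 @ S12
print(cond_mean, cond_cov)
```

Note that the conditional covariance does not depend on the observed values y1, only on the blocks of the joint covariance.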
Introducing a new representation, in which the mean values and error bars for the variances are drawn over
the dimensions y, reveals how GPs can be used to solve the regression problem. In Figure 3, for example,
a 5x5 covariance matrix and a 3-d input vector were used to calculate the 2-d output mean vector and the
corresponding variances, which are depicted as error bars.
Figure 3: A new representation depicts the mean of 3 input and 2 output points
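The setup of Figure 3 can be reproduced numerically: 3 training points and 2 query points give a 5x5 joint covariance, and conditioning yields a 2-d predictive mean and per-point variances (the error bars). The input locations and data values below are illustrative assumptions, not the figure's actual numbers:

```python
import numpy as np

# GP regression in the spirit of Figure 3: condition the 5-d joint
# Gaussian over (3 training + 2 query) points on the training values.
def rbf(a, b, lengthscale=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

x_train = np.array([0.0, 1.0, 2.0])
y_train = np.array([0.5, 1.2, 0.9])      # illustrative data values
x_test = np.array([0.5, 1.5])

K = rbf(x_train, x_train) + 1e-6 * np.eye(3)     # jitter for stability
K_s = rbf(x_test, x_train)                       # 2x3 cross-covariance
K_ss = rbf(x_test, x_test)

K_inv = np.linalg.inv(K)
pred_mean = K_s @ K_inv @ y_train                # 2-d output mean vector
pred_var = np.diag(K_ss - K_s @ K_inv @ K_s.T)   # the error bars
print(pred_mean.shape, pred_var.shape)  # (2,) (2,)
```

The predictive variance shrinks near the training inputs and grows with distance from them, which is exactly what the error bars in the figure express.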
The correlation between the data values can be expressed by any valid covariance function. In the applet above,
for example, you can select between a radial basis function (also called the squared exponential
kernel), a periodic RBF and a polynomial function. Figure 4 shows an RBF. It
can clearly be seen that points lying close together are strongly correlated (i.e. have a covariance value
close to 1). The shape of the RBF is governed by hyperparameters, which can be learned using gradient descent methods.
For example, the horizontal lengthscale controls how strongly data points influence each other depending on
their distance: with a large horizontal lengthscale, even points lying far from each other remain correlated,
whereas with a small horizontal lengthscale, points far from each other are almost uncorrelated.
Figure 4: A Radial Basis Function (RBF)
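The lengthscale effect is easy to verify directly. For two points at a fixed distance (the distance and lengthscale values below are arbitrary illustrative choices), the RBF covariance stays near 1 for a large lengthscale and collapses to almost 0 for a small one:

```python
import numpy as np

# RBF covariance as a function of the distance d between two points:
#   k(d) = exp(-d^2 / (2 * lengthscale^2))
def rbf(d, lengthscale):
    return np.exp(-0.5 * d ** 2 / lengthscale ** 2)

d = 3.0
print(rbf(d, lengthscale=10.0))  # near 1: distant points still correlated
print(rbf(d, lengthscale=0.5))   # near 0: distant points almost uncorrelated
```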
In order to perform non-linear dimensionality reduction, the Gaussian Process Latent Variable Model
(GPLVM) was developed by Neil Lawrence. To apply it to human pose tracking and style-based
inverse kinematics, an enhanced model, the so-called Gaussian Process Dynamical Model (GPDM), was proposed
by Wang et al. In addition to the latent-to-pose mapping of the GPLVM (blue), it includes
a dynamics mapping (green) in latent space. Thus this model is described by 2 GPs. This is
illustrated for 3 frames (or time-steps) in Figure 5:
Figure 5: GPLVMs (left) vs. GPDMs (right)
Learning a GPLVM or a GPDM involves maximizing the log-posterior probability with respect
to the model, which consists of the latent variables and the hyperparameters of the GPs. By doing
so, it was experimentally shown by Hertzmann et al. and Urtasun et al. that a 2-d or 3-d
latent space is sufficient to capture an entire pose in a realistic way. GP models proved to
exhibit good generalization properties: articulated human body tracking with a 30-d model can
be done using a GPDM trained on only one motion sequence. Figure 6 shows an example
of different poses and the corresponding latent space. The red dots indicate training points and
correspond to the poses of the articulated body model.
Figure 6: A learned latent space and the corresponding poses (Hertzmann et al.)
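The structure of the GPDM objective can be sketched with toy data. The snippet below (a simplified sketch, not Wang et al.'s implementation; it uses random data, a fixed kernel, and omits the hyperparameter priors) evaluates the two GP marginal log-likelihoods that make up the core of the log-posterior: one for the latent-to-pose mapping and one for the dynamics mapping in latent space. A real GPDM would maximize this quantity over the latent coordinates X:

```python
import numpy as np

# RBF kernel between two sets of multi-dimensional points.
def rbf(a, b, lengthscale=1.0):
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

# log N(Y | 0, K), summed over the D output columns of Y.
def gp_log_lik(K, Y):
    K = K + 1e-6 * np.eye(len(K))        # jitter for stability
    _, logdet = np.linalg.slogdet(K)
    quad = np.trace(Y.T @ np.linalg.solve(K, Y))
    D = Y.shape[1]
    return -0.5 * (D * logdet + quad + D * len(K) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))     # 10 frames in a 2-d latent space
Y = rng.normal(size=(10, 30))    # corresponding 30-d poses (toy data)

pose_term = gp_log_lik(rbf(X, X), Y)                # latent-to-pose GP
dyn_term = gp_log_lik(rbf(X[:-1], X[:-1]), X[1:])   # dynamics GP: x_t from x_{t-1}
print(pose_term + dyn_term)      # core of the (unnormalized) log-posterior
```

In the dynamics term, each latent point x_t is treated as a GP output conditioned on its predecessor x_{t-1}, which is what couples the frames over time.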