Abstract
It is common in functional data analysis to look at a set of related functions: a set of learning curves, a set of brain signals, a set of spatial maps, etc. One way to express relatedness is through an additive model, whereby each individual function \(g_{i}\left( x\right) \) is assumed to be a variation around some shared mean \(f(x)\). Gaussian processes provide an elegant way of constructing such additive models, but suffer from computational difficulties arising from the matrix operations that need to be performed. Recently Heersink and Furrer showed that functional additive models give rise to covariance matrices with a specific structure they called quasi-Kronecker (QK), whose inverses are relatively tractable. We show that under additional assumptions the two-level additive model leads to a class of matrices we call restricted quasi-Kronecker (rQK), which enjoy many interesting properties. In particular, we formulate matrix factorisations whose complexity scales only linearly in the number of functions in the latent field, an enormous improvement over the cubic scaling of naïve approaches. We describe how to leverage the properties of rQK matrices for inference in latent Gaussian models.
References
Barthelmé, S., Trukenbrod, H., Engbert, R., Wichmann, F.: Modeling fixation locations using spatial point processes. J. Vis. 13(12) (2013). doi:10.1167/13.12.1
Behseta, S., Kass, R.E., Wallstrom, G.L.: Hierarchical models for assessing variability among functions. Biometrika 92(2), 419–434 (2005)
Bolin, D., Lindgren, F.: Excursion and contour uncertainty regions for latent Gaussian models. J. R. Stat. Soc. B (2014)
Bornn, L., Shaddick, G., Zidek, J.V.: Modeling nonstationary processes through dimension expansion. J. Am. Stat. Assoc. 107(497), 281–289 (2012)
Cheng, W., Dryden, I.L., Huang, X.: Bayesian registration of functions and curves (2013). arXiv:1311.2105
Cressie, N., Johannesson, G.: Fixed rank kriging for very large spatial data sets. J. R. Stat. Soc.: Ser. B (Statistical Methodology) 70(1), 209–226 (2008)
Filippone, M., Girolami, M.: Pseudo-marginal Bayesian inference for Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. (2014). doi:10.1109/TPAMI.2014.2316530
Furrer, R., Genton, M.G., Nychka, D.: Covariance tapering for interpolation of large spatial datasets. J. Comput. Graph. Stat. 15(3), 502–523 (2006)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
Granger, C.W.J., Morris, M.J.: Time series modelling and interpretation. J. R. Stat. Soc. Ser. A (General) 139, 246–257 (1976)
Heersink, D.K., Furrer, R.: On Moore-Penrose inverses of quasi-Kronecker structured matrices. Linear Algebra Appl. 436, 561–570 (2011)
Illian, J., Penttinen, A., Stoyan, H., Stoyan, D.: Statistical Analysis and Modelling of Spatial Point Patterns (Statistics in Practice), 1st edn. Wiley-Interscience, Hoboken (2008)
Kaufman, C., Sain, S.: Bayesian functional ANOVA modeling using Gaussian process prior distributions. Bayesian Anal. 5, 123–150 (2009)
Kneip, A., Ramsay, J.O.: Combining registration and fitting for functional models. J. Am. Stat. Assoc. 103(483), 1155–1165 (2008)
Lindgren, F., Rue, H., Lindström, J.: An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J. R. Stat. Soc.: Ser. B (Statistical Methodology) 73(4), 423–498 (2011)
Liu, D., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1–3), 503–528 (1989)
Minka, T.P.: Expectation propagation for approximate Bayesian inference. In: UAI ’01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, pp. 362–369. Morgan Kaufmann Publishers Inc, San Francisco (2001)
Møller, J., Syversveen, A.R., Waagepetersen, R.P.: Log Gaussian Cox processes. Scand. J. Stat. 25(3), 451–482 (1998)
Murray, I., Adams, R.P., MacKay, D.J.: Elliptical slice sampling. JMLR: W&CP 9, 541–548 (2010)
Neal, R. M.: Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification (1997). arXiv:physics/9701026
Nickisch, H., Rasmussen, C.E.: Approximations for binary Gaussian process classification. J. Mach. Learn. Res. 9, 2035–2078 (2008)
Nocedal, J., Wright, S.: Numerical Optimization (Springer Series in Operations Research and Financial Engineering), 2nd edn. Springer, Berlin (2006)
Opper, M., Archambeau, C.: The variational Gaussian approximation revisited. Neural Comput. 21(3), 786–792 (2008)
Paciorek, C.J.: Bayesian smoothing with Gaussian processes using Fourier basis functions in the spectralGP package. J. Stat. Softw. 19(2) (2007)
Paninski, L., Pillow, J., Lewi, J.: Statistical models for neural encoding, decoding, and optimal stimulus design. Prog. Brain Res. 165, 493–507 (2007)
Petersen, K.B., Pedersen, M.S.: The matrix cookbook. Version 20121115 (2012)
Pouzat, C., Chaffiol, A.: Automatic spike train analysis and report generation. An implementation with R, R2HTML and STAR. J. Neurosci. Methods 181(1), 119–144 (2009)
Ramsay, J., Silverman, B.W.: Functional Data Analysis (Springer Series in Statistics), 2nd edn. Springer, Berlin (2005)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning series). The MIT Press, Cambridge (2005)
Rue, H., Held, L.: Gaussian Markov Random Fields: Theory and Applications (Chapman & Hall/CRC Monographs on Statistics & Applied Probability), 1st edn. Chapman and Hall/CRC, London (2005)
Rue, H., Martino, S., Chopin, N.: Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc.: Ser. B (Statistical Methodology) 71(2), 319–392 (2009)
Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia (2003)
Sain, S.R., Nychka, D., Mearns, L.: Functional ANOVA and regional climate experiments: a statistical analysis of dynamic downscaling. Environmetrics 22(6), 700–711 (2011)
Telesca, D., Inoue, L.Y.T.: Bayesian hierarchical curve registration. J. Am. Stat. Assoc. 103(481), 328–339 (2008)
Tierney, L., Kadane, J.B.: Accurate approximations for posterior moments and marginal densities. J. Am. Stat. Assoc. 81(393), 82–86 (1986)
Acknowledgments
The author wishes to thank Reinhard Furrer for pointing him to Heersink and Furrer (2011), Alexandre Pouget for support, and two anonymous referees for comments.
Appendices
Appendix 1: An orthogonal basis for zero-mean vectors
We prove below proposition 1, which we restate:
The matrix \(\mathbf {L}=\left[ \begin{array}{l} a\mathbf {1}_{m-1}^{t}\\ b+\mathbf {I}_{m-1} \end{array}\right] \), with \(a=\frac{1}{\sqrt{m}}\) and \(b=-\left( \frac{1}{\left( m-1\right) }\left( 1+\frac{1}{\sqrt{m}}\right) \right) \) is an orthonormal basis for the set of zero-mean vectors \(\mathcal {A}_{m}=\left\{ \mathbf {u}\in \mathbb {R}^{m}|\sum u_{i}=0\right\} \).
Proof
For \(\mathbf {L}_{m\times \left( m-1\right) }\) to be an orthonormal basis for \(\mathcal {A}\), all that is required is that \(\sum _{i=1}^{m}\mathbf {L}_{ij}=0\) for all \(j\) and that \(\mathbf {L}^{t}\mathbf {L}=\mathbf {I}\). To lighten the notation, define \(r=m-1\). The first condition requires
\(a+rb+1=0 \qquad (25)\)
and the second condition (orthogonality of distinct columns)
\(a^{2}+rb^{2}+2b=0 \qquad (26)\)
implying, after substituting \(a=-\left( 1+rb\right) \) from (25) into (26), that
\(rmb^{2}+2mb+1=0.\)
This is a second-order polynomial in \(b\), and it has real roots if its discriminant is non-negative:
\(\left( 2m\right) ^{2}-4rm=4m\left( m-r\right) =4m\ge 0,\)
which is true since \(r=m-1\) is non-negative, so that \(m=r+1\ge 1\).
Solving the quadratic equation yields \(b=-\left( \frac{1}{r}+\frac{1}{r\sqrt{m}}\right) \), and substituting into (25) yields \(a=\frac{1}{\sqrt{m}}\).
We can therefore always find values \(a,b\) such that \(\mathbf {L}=\left[ \begin{array}{l} a\mathbf {1}_{m-1}^{t}\\ b+\mathbf {I}_{m-1} \end{array}\right] \) is an orthogonal basis for the linear subspace of zero-mean vectors. Moreover, we can show that the columns of \(\mathbf {L}\) have unit norm:
\(\left\Vert \mathbf {L}_{\cdot j}\right\Vert ^{2}=a^{2}+rb^{2}+2b+1=1,\)
where the last equality follows from (26). The result implies that we can compute an orthonormal basis \(\mathbf {L}\) for \(\mathcal {A}\) such that matrix-vector products with \(\mathbf {L}\) are \(\mathcal {O}\left( m\right) \).
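As an illustration of the proposition (not part of the original text), the construction can be sketched in NumPy; the helper names `zero_mean_basis` and `L_matvec` are hypothetical:

```python
import numpy as np

def zero_mean_basis(m):
    """Orthonormal basis L (m x (m-1)) of the zero-mean subspace,
    built as in the proposition: top row a*1^t, below b + I."""
    a = 1.0 / np.sqrt(m)
    b = -(1.0 / (m - 1)) * (1.0 + 1.0 / np.sqrt(m))
    return np.vstack([a * np.ones((1, m - 1)), b + np.eye(m - 1)])

def L_matvec(u):
    """O(m) product L @ u without forming L: the structure of L means
    (L u)_1 = a * sum(u), and the remaining entries are b * sum(u) + u."""
    m = u.size + 1
    a = 1.0 / np.sqrt(m)
    b = -(1.0 / (m - 1)) * (1.0 + 1.0 / np.sqrt(m))
    s = u.sum()
    return np.concatenate(([a * s], b * s + u))

m = 7
L = zero_mean_basis(m)
assert np.allclose(L.sum(axis=0), 0.0)      # columns lie in the zero-mean subspace
assert np.allclose(L.T @ L, np.eye(m - 1))  # columns are orthonormal
u = np.linspace(-1.0, 1.0, m - 1)
assert np.allclose(L @ u, L_matvec(u))      # O(m) matvec agrees with the dense product
```

The last function makes the \(\mathcal {O}\left( m\right) \) claim concrete: only a running sum of \(\mathbf {u}\) is needed, never the dense matrix.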
Appendix 2: Derivative of the Laplace approximation
Using the implicit function theorem, an analytical expression for the gradient of the Laplace approximation can be found (see Rasmussen and Williams 2005 for a similar derivation). The Laplace approximation is given by:
\(\mathcal {L}\left( \varvec{\theta }\right) =f\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) -\frac{1}{2}\log \det \mathbf {H}\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) +\frac{n}{2}\log 2\pi ,\)
where
\(f\left( \mathbf {x},\varvec{\theta }\right) =\log p\left( \mathbf {y}|\mathbf {x}\right) +\log p\left( \mathbf {x}|\varvec{\theta }\right) \)
is the unnormalised log-posterior evaluated at its conditional mode \(\mathbf {x}{}^{\star }\), and \(\mathbf {H}\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) \) is the negative Hessian of \(f(\mathbf {x},\varvec{\theta })\) with respect to \(\mathbf {x}\) evaluated at \(\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) \). Since \(\mathbf {x}^{\star }\) is a maximum it satisfies the gradient equation:
\(F\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) =\left. \nabla _{\mathbf {x}}f\left( \mathbf {x},\varvec{\theta }\right) \right| _{\mathbf {x}=\mathbf {x}^{\star }}=\mathbf {0}.\)
Assuming that \(f(\mathbf {x},\varvec{\theta })\) is twice-differentiable and concave in \(\mathbf {x}\), we can define an implicit function \(\mathbf {x}{}^{\star }\left( \varvec{\theta }\right) \) such that \(F(\mathbf {x}{}^{\star }\left( \varvec{\theta }\right) ,\varvec{\theta })=\mathbf {0}\) for all \(\varvec{\theta }\), with derivative
\(\frac{\partial \mathbf {x}{}^{\star }}{\partial \theta _{j}}=\mathbf {H}\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) ^{-1}\frac{\partial F}{\partial \theta _{j}}\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) .\)
To simplify the notation we will write \(\mathbf {G}\) for the matrix \(\varvec{\varSigma }^{-1}\left( \frac{\partial }{\partial \theta _{j}}\varvec{\varSigma }\right) \varvec{\varSigma }^{-1}\). Note that the same matrix \(\mathbf {G}\) appears in the gradient of the Gaussian likelihood with respect to the hyperparameters (Eq. 20). To compute the gradient of \(\mathcal {L}\left( \varvec{\theta }\right) \), we need the derivatives of \(f\left( \mathbf {x}{}^{\star }\left( \varvec{\theta }\right) ,\varvec{\theta }\right) \) with respect to \(\varvec{\theta }\):
\(\frac{\mathrm {d}}{\mathrm {d}\theta _{j}}f\left( \mathbf {x}{}^{\star }\left( \varvec{\theta }\right) ,\varvec{\theta }\right) =\left( \nabla _{\mathbf {x}}f\right) ^{t}\frac{\partial \mathbf {x}{}^{\star }}{\partial \theta _{j}}+\frac{\partial f}{\partial \theta _{j}},\)
where the first part is 0 since the gradient of \(f\) is \(\mathbf {0}\) at \(\mathbf {x}{}^{\star }\left( \varvec{\theta }\right) \), and
\(\frac{\partial f}{\partial \theta _{j}}=\frac{\partial }{\partial \theta _{j}}\log p\left( \mathbf {x}{}^{\star }|\varvec{\theta }\right) .\)
This is the gradient of a log-Gaussian density, and we can therefore reuse formula 20 for that part.
The gradient of \(\log \det \mathbf {H}\left( \mathbf {x}{}^{\star },\varvec{\theta }\right) \) is also needed and is sadly more troublesome. A formula for the derivative of the log-det function is given in Petersen and Pedersen (2012):
\(\frac{\partial }{\partial \theta _{j}}\log \det \mathbf {H}=\text{ tr }\left( \mathbf {H}^{-1}\frac{\partial \mathbf {H}}{\partial \theta _{j}}\right) . \qquad (30)\)
The negative Hessian at the mode is the sum of the negative Hessian of \(\log p\left( \mathbf {y}\vert \mathbf {x}{}^{\star }\right) \) (which depends on \(\varvec{\theta }\) through the implicit dependency of \(\mathbf {x}{}^{\star }\) on \(\varvec{\theta }\)), and the negative Hessian of \(\log p\left( \mathbf {x}|\varvec{\theta }\right) \), which equals the inverse covariance matrix \(\varvec{\varSigma }^{-1}\). Some additional algebra yields:
\(\frac{\partial }{\partial \theta _{j}}\mathbf {H}=-\mathbf {G}+\mathbf {D},\)
where \(\mathbf {D}\) is the diagonal matrix collecting the derivatives of the likelihood Hessian through the dependency of \(\mathbf {x}{}^{\star }\) on \(\varvec{\theta }\), and where we have assumed that the Hessian of the log-likelihood is diagonal. Accordingly \(\mathbf {H}^{-1}\) is QK, and so is \(\frac{\partial }{\partial \theta _{j}}\mathbf {H}\) (since it equals the sum of a rQK matrix \(-\mathbf {G}\) and a diagonal perturbation \(\mathbf {D}\)). Computing the trace in Eq. (30) is therefore tractable if slightly painful, and can be done by plugging in Eq. (11) and summing the diagonal elements.
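As a sanity check on the implicit-function-theorem derivation above, the following toy one-dimensional example (a hypothetical model of my own choosing, not one from the paper: scalar latent \(x\sim N(0,\theta )\) with a Poisson likelihood) computes the Laplace approximation and its analytic gradient, and compares the gradient against central finite differences:

```python
import numpy as np

def laplace_and_grad(y, theta):
    """Laplace approximation L(theta) and its analytic gradient for the
    hypothetical toy model x ~ N(0, theta), y | x ~ Poisson(exp(x)), where
    f(x, theta) = y*x - exp(x) - x^2/(2 theta) - 0.5*log(2 pi theta)
    (dropping the log(y!) constant, which does not depend on theta)."""
    x = 0.0
    for _ in range(50):                      # Newton iterations for the mode x*
        F = y - np.exp(x) - x / theta        # gradient of f in x
        fxx = -np.exp(x) - 1.0 / theta       # Hessian of f in x
        x -= F / fxx
    H = np.exp(x) + 1.0 / theta              # negative Hessian at the mode
    f = y * x - np.exp(x) - x**2 / (2 * theta) - 0.5 * np.log(2 * np.pi * theta)
    L = f - 0.5 * np.log(H) + 0.5 * np.log(2 * np.pi)
    # implicit function theorem: dx*/dtheta = H^{-1} dF/dtheta
    dF_dtheta = x / theta**2
    dx_dtheta = dF_dtheta / H
    # total derivative of f at the mode: only the direct part survives (F(x*) = 0)
    df_dtheta = x**2 / (2 * theta**2) - 0.5 / theta
    # total derivative of H, through both theta and x*(theta)
    dH_dtheta = -1.0 / theta**2 + np.exp(x) * dx_dtheta
    return L, df_dtheta - 0.5 * dH_dtheta / H

y, theta, eps = 3, 1.3, 1e-6
L0, g = laplace_and_grad(y, theta)
Lp, _ = laplace_and_grad(y, theta + eps)
Lm, _ = laplace_and_grad(y, theta - eps)
assert abs(g - (Lp - Lm) / (2 * eps)) < 1e-5   # analytic gradient matches finite differences
```

The same three ingredients appear in the multivariate case: the direct \(\partial f/\partial \theta _{j}\) term, the implicit \(\partial \mathbf {x}{}^{\star }/\partial \theta _{j}\) term, and the trace term from the log-determinant.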
Appendix 3: Computing pointwise confidence bands
In this section we outline a simple method for obtaining a Rao-Blackwellised confidence band from posterior samples of \(p(\varvec{\theta }|\mathbf {y})\). A pointwise confidence band around a latent function \(f(x)\) is a vector function \(c_{\alpha }\left( x\right) =\left( \begin{array}{c} h_{\alpha }\left( x\right) \\ l_{\alpha }\left( x\right) \end{array}\right) \), such that, for all \(x\):
\(p\left( l_{\alpha }\left( x\right) \le f\left( x\right) \le h_{\alpha }\left( x\right) \,|\,\mathbf {y}\right) =1-\alpha .\)
In latent Gaussian models the posterior over latent functions is a mixture over conditional posteriors:
\(p\left( f\left( t\right) |\mathbf {y}\right) =\int p\left( f\left( t\right) |\mathbf {y},\varvec{\theta }\right) p\left( \varvec{\theta }|\mathbf {y}\right) \mathrm {d}\varvec{\theta }. \qquad (31)\)
The conditional posteriors \(p\left( f\left( t\right) |\mathbf {y},\varvec{\theta }\right) \) are either Gaussian or approximated by a Gaussian. If we have obtained samples from \(p(\varvec{\theta }|\mathbf {y})\) (as in Sect. 3.1), we can approximate (31) using a mixture of Gaussians, which has an analytic cumulative distribution function. The c.d.f. can be inverted numerically to obtain values \(h_{\alpha }\left( t\right) \) and \(l_{\alpha }\left( t\right) \). For simultaneous rather than pointwise confidence bands, see Bolin and Lindgren (2014).
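The mixture-c.d.f. inversion can be sketched as follows. The per-sample conditional means and standard deviations at a fixed location \(t\) are simulated here purely for illustration, and the helper names are hypothetical:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# Hypothetical stand-ins for the conditional moments of p(f(t) | y, theta_s),
# one pair per posterior sample theta_s of the hyperparameters.
rng = np.random.default_rng(0)
mus = rng.normal(0.5, 0.2, size=200)     # E[f(t) | y, theta_s]
sds = rng.uniform(0.3, 0.5, size=200)    # sd[f(t) | y, theta_s]

def mixture_cdf(q):
    # c.d.f. of the equal-weight Gaussian mixture approximating p(f(t) | y)
    return norm.cdf(q, loc=mus, scale=sds).mean()

def mixture_quantile(p):
    # invert the c.d.f. numerically on a bracket that surely contains the root
    lo = (mus - 10 * sds).min()
    hi = (mus + 10 * sds).max()
    return brentq(lambda q: mixture_cdf(q) - p, lo, hi)

alpha = 0.05
l, h = mixture_quantile(alpha / 2), mixture_quantile(1 - alpha / 2)
assert l < h
assert abs(mixture_cdf(h) - mixture_cdf(l) - (1 - alpha)) < 1e-8
```

Repeating this at each location \(t\) yields the pointwise band \(\left( l_{\alpha }\left( t\right) ,h_{\alpha }\left( t\right) \right) \).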
Appendix 4: Preconditioning for quasi-Newton optimisation
In the whitened parametrisation the Hessian matrix has the following form (Eq. 24)
Although the whitened parametrisation gives ideal conditioning for the prior term, the conditioning with respect to the likelihood term (\(\mathbf {H}_{l}\)) may be degraded. Preconditioning the problem can help tremendously: given a guess \(\mathbf {z}_{0}\), one precomputes the diagonal values of \(\mathbf {d}_{0}=\text{ diag }\left( \mathbf {I}+\mathbf {G}^{-t} \mathbf {H}_{l}\left( \mathbf {G}^{t}\mathbf {z}_{0}\right) \mathbf {G}^{-t}\right) \) and uses \(\text{ diag }\left( \mathbf {d}_{0}\right) \) as a preconditioner. To compute \(\mathbf {d}_{0}\) the following identity is useful:
where \(\odot \) is the element-wise product. In LGMs \(\mathbf {H}_{l}\) is diagonal and so:
We inject Eq. (16) into the above, and some rather tedious algebra shows that \(\mathbf {d}_{0}\) can be computed in the following way. Form the matrix \(\mathbf {X}_{0}\) by stacking successive subsets of length \(n\) from \(\mathbf {G}^{t}\mathbf {z}_{0}\). Rotate using the matrix \(\mathbf {B}\) (Sect. 2.5.2) to form another matrix:
then compute \(\mathbf {d}_{0}\) from:
where we have used Matlab notation for subsets of columns of \(\mathbf {Y}\), and \(\mathbf {U}\) and \(\mathbf {V}\) are as in Corollary 3.
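The element-wise identity invoked above for computing the diagonal cheaply is, I assume (the original equation is truncated), the standard fact that for a diagonal \(\mathbf {H}_{l}\) with diagonal vector \(\mathbf {h}\), \(\text{ diag }\left( \mathbf {A}\mathbf {H}_{l}\mathbf {A}^{t}\right) =\left( \mathbf {A}\odot \mathbf {A}\right) \mathbf {h}\). It can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(n, n))
h = rng.uniform(1.0, 2.0, size=n)        # diagonal of H_l

# Naive: form A H_l A^t and read off its diagonal, O(n^3).
naive = np.diag(A @ np.diag(h) @ A.T)
# Element-wise identity: diag(A H_l A^t) = (A ⊙ A) h, only O(n^2).
fast = (A * A) @ h
assert np.allclose(naive, fast)
```

This is what makes the preconditioner affordable: the diagonal of the whitened Hessian is obtained without ever forming the full matrix product.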
Cite this article
Barthelmé, S. Fast matrix computations for functional additive models. Stat Comput 25, 47–63 (2015). https://doi.org/10.1007/s11222-014-9490-0