Appendix A: Linear system with approximate radial basis function
The variational functional is given as
$$\begin{aligned} H \left[ \varvec{\varPsi } \right] = \sum _{i=1}^N \left\| \varvec{x}_i - \varvec{\varPsi }\left( \varvec{X}_i \right) \right\| ^2 + \lambda \phi \left( \varvec{\varPsi } \right) \end{aligned}$$
(20)
and the solution of Eq. (20) with the approximate radial basis function is
$$\begin{aligned} \varvec{\varPsi }^{*} \left( \varvec{X} \right) = \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{X}-\varvec{t}_{\alpha } \right) \end{aligned}$$
(21)
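As a concrete illustration of Eq. (21), the sketch below evaluates the expansion \(\varvec{\varPsi }^{*}\left( \varvec{X} \right) \) at a single query point. A Gaussian kernel \(G\left( \varvec{r} \right) = \exp \left( -\left\| \varvec{r} \right\| ^2/2\beta ^2 \right) \) and random centres and coefficients are assumed purely for illustration; the derivation itself does not fix these choices.

```python
import numpy as np

def gaussian_kernel(r, beta=1.0):
    """Gaussian RBF G(r) = exp(-||r||^2 / (2 beta^2)), applied row-wise (assumed kernel)."""
    return np.exp(-np.sum(np.atleast_2d(r) ** 2, axis=-1) / (2.0 * beta ** 2))

def psi_star(X_query, centers, C, beta=1.0):
    """Evaluate Eq. (21): Psi*(X) = sum_alpha c_alpha G(X - t_alpha).

    X_query: (d,) query point; centers: (n, d) centres t_alpha;
    C: (n, d) coefficient vectors c_alpha stored as rows.
    """
    weights = gaussian_kernel(X_query - centers, beta)   # (n,) kernel values G(X - t_alpha)
    return weights @ C                                   # (d,) mapped point

# toy example in d = 2 with n = 5 centres (illustrative values only)
rng = np.random.default_rng(0)
centers, C = rng.normal(size=(5, 2)), rng.normal(size=(5, 2))
print(psi_star(np.array([0.3, -0.1]), centers, C))
```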
Substituting Eq. (21) back into Eq. (20), one obtains
$$\begin{aligned} H\left[ \varvec{\varPsi }^{*} \right] = \sum _{i=1}^N \left\| \varvec{x}_i - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \right\| ^2 + \lambda \phi \left( \varvec{\varPsi }^{*} \left( \varvec{X} \right) \right) \end{aligned}$$
(22)
The smoothness constraint (the second term) takes the form
$$\begin{aligned} \phi \left( \varvec{\varPsi }^{*} \left( \varvec{X} \right) \right) = \int _{{\mathbb {R}}^d} \frac{ \left\| \tilde{\varvec{\varPsi }}^{*}\left( \varvec{s} \right) \right\| ^2}{{\tilde{G}}\left( \varvec{s} \right) } d\varvec{s} = \int _{{\mathbb {R}}^d} \frac{ \tilde{\varvec{\varPsi }}^{*}\left( \varvec{s} \right) \cdot \tilde{\varvec{\varPsi }}^{*}\left( -\varvec{s} \right) }{{\tilde{G}}\left( \varvec{s} \right) } d\varvec{s} \end{aligned}$$
since \(\tilde{\varvec{\varPsi }}^{*}\left( -\varvec{s} \right) = \overline{\tilde{\varvec{\varPsi }}^{*}\left( \varvec{s} \right) }\) (the complex conjugate) because \(\varvec{\varPsi }^{*}\) is real.
in which,
$$\begin{aligned} \tilde{\varvec{\varPsi }}^{*} \left( \varvec{s} \right)= & {} \int _{{\mathbb {R}}^d} \varvec{\varPsi }^{*} \left( \varvec{t} \right) e^{-i\varvec{t} \cdot \varvec{s}} d\varvec{t} \\= & {} \sum _{\alpha =1}^n \varvec{c}_{\alpha } \frac{\int _{{\mathbb {R}}^d} G \left( \varvec{t}-\varvec{t}_{\alpha } \right) e^{-i \varvec{t}\cdot \varvec{s}} d\varvec{t} \cdot e^{i\varvec{t}_{\alpha }\cdot \varvec{s}} }{e^{i\varvec{t}_{\alpha }\cdot \varvec{s}}} \\= & {} \sum _{\alpha =1}^n \varvec{c}_{\alpha } \int _{{\mathbb {R}}^d} G \left( \varvec{t}-\varvec{t}_{\alpha } \right) e^{-i \left( \varvec{t}-\varvec{t}_{\alpha } \right) \cdot \varvec{s}} d \left( \varvec{t}-\varvec{t}_{\alpha } \right) \cdot e^{-i \varvec{t}_{\alpha }\cdot \varvec{s}} \\= & {} \sum _{\alpha =1}^n \varvec{c}_{\alpha } {\tilde{G}} \left( \varvec{s} \right) e^{-i\varvec{t}_{\alpha }\cdot \varvec{s}}~. \end{aligned}$$
and
$$\begin{aligned} \tilde{\varvec{\varPsi }}^{*} \left( -\varvec{s} \right)= & {} \int _{{\mathbb {R}}^d} \varvec{\varPsi }^{*} \left( \varvec{t} \right) e^{-i\varvec{t} \cdot \left( -\varvec{s} \right) } d\varvec{t} \\= & {} \sum _{\beta =1}^n \varvec{c}_{\beta } \frac{\int _{{\mathbb {R}}^d} G \left( \varvec{t}-\varvec{t}_{\beta } \right) e^{-i\varvec{t} \cdot \left( -\varvec{s} \right) } d\varvec{t} \cdot e^{i\varvec{t}_{\beta }\cdot \left( -\varvec{s} \right) } }{e^{i\varvec{t}_{\beta }\cdot \left( -\varvec{s} \right) }} \\= & {} \sum _{\beta =1}^n \varvec{c}_{\beta } \int _{{\mathbb {R}}^d} G \left( \varvec{t}-\varvec{t}_{\beta } \right) e^{-i\left( \varvec{t}-\varvec{t}_{\beta } \right) \cdot \left( -\varvec{s} \right) } d\left( \varvec{t}-\varvec{t}_{\beta } \right) \cdot e^{i\varvec{t}_{\beta }\cdot \varvec{s}} \\= & {} \sum _{\beta =1}^n \varvec{c}_{\beta } {\tilde{G}} \left( -\varvec{s} \right) e^{i\varvec{t}_{\beta }\cdot \varvec{s}} ~. \end{aligned}$$
Therefore, the regularization term can be written as
$$\begin{aligned}&\int _{{\mathbb {R}}^d} \frac{\left\| \tilde{\varvec{\varPsi }}^{*} \left( \varvec{s} \right) \right\| ^2}{{\tilde{G}}\left( \varvec{s} \right) } d\varvec{s} \nonumber \\&\quad = \int _{{\mathbb {R}}^d} \frac{ \left[ \sum _{\alpha =1}^n \varvec{c}_{\alpha } {\tilde{G}}\left( \varvec{s} \right) e^{-i\varvec{t}_{\alpha }\cdot \varvec{s}} \right] \cdot \left[ \sum _{\beta =1}^n \varvec{c}_{\beta } {\tilde{G}}\left( -\varvec{s} \right) e^{i\varvec{t}_{\beta }\cdot \varvec{s}} \right] }{{\tilde{G}}\left( \varvec{s} \right) } d\varvec{s} \nonumber \\&\quad = \int _{{\mathbb {R}}^d} \frac{\sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } {\tilde{G}}\left( \varvec{s} \right) {\tilde{G}}\left( -\varvec{s} \right) \cdot e^{i\varvec{s}\left( \varvec{t}_{\beta }-\varvec{t}_{\alpha } \right) } }{{\tilde{G}}\left( \varvec{s} \right) } d\varvec{s} \nonumber \\&\quad = \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } \int _{{\mathbb {R}}^d} d\varvec{s} \cdot e^{i\varvec{s}\cdot \left( \varvec{t}_{\beta }-\varvec{t}_{\alpha } \right) } {\tilde{G}}\left( -\varvec{s} \right) \nonumber \\&\quad = \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } G \left( \varvec{t}_{\alpha } - \varvec{t}_{\beta } \right) ~. \end{aligned}$$
(23)
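In other words, once Eq. (23) is established, the smoothness penalty is a quadratic form in the coefficients: stacking the \(\varvec{c}_{\alpha }\) as rows of a matrix \(\varvec{C}\), it equals \(\mathbf{Tr} \left( \varvec{C}^T \varvec{g} \varvec{C} \right) \) with \(g_{\alpha \beta } = G\left( \varvec{t}_{\alpha }-\varvec{t}_{\beta } \right) \), the form used later in Appendix C. A minimal numerical check, again assuming a Gaussian kernel for illustration, is:

```python
import numpy as np

def gaussian_kernel_matrix(A, B, beta=1.0):
    """K[i, j] = G(A[i] - B[j]) for a Gaussian RBF (illustrative kernel choice)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * beta ** 2))

rng = np.random.default_rng(1)
centers = rng.normal(size=(6, 3))      # centres t_alpha, n = 6 in d = 3
C = rng.normal(size=(6, 3))            # coefficient vectors c_alpha stored as rows

g = gaussian_kernel_matrix(centers, centers)               # g_{alpha,beta} = G(t_alpha - t_beta)
phi_double_sum = sum(C[a] @ C[b] * g[a, b]                 # sum_{alpha,beta} c_alpha . c_beta G(t_alpha - t_beta)
                     for a in range(len(C)) for b in range(len(C)))
phi_trace = np.trace(C.T @ g @ C)                          # equivalent trace form
print(np.isclose(phi_double_sum, phi_trace))               # True
```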
Then Eq. (22) can be rewritten as
$$\begin{aligned} H\left[ \varvec{\varPsi }^{*} \right]&= \sum _{i=1}^N \left\| \varvec{x}_i - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \right\| ^2 \nonumber \\&\quad + \lambda \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha }\cdot \varvec{c}_{\beta } G \left( \varvec{t}_{\alpha }-\varvec{t}_{\beta } \right) \end{aligned}$$
(24)
Then taking the derivative of Eq. (24) with respect to \(\varvec{c}_{\gamma }\) yields,
$$\begin{aligned} \frac{\partial H\left[ \varvec{\varPsi }^{*} \right] }{\partial \varvec{c}_{\gamma }}= & {} \sum _{i=1}^N 2 \left[ \varvec{x}_i - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \right] \\&\quad \cdot \bigg [-G \left( \varvec{X}_i -\varvec{t}_{\gamma } \right) \bigg ] + 2 \lambda \sum _{\alpha =1}^n \varvec{c}_{\alpha } \cdot G \left( \varvec{t}_{\alpha }-\varvec{t}_{\gamma } \right) \\= & {} -\sum _{i=1}^N \varvec{x}_i \cdot G \left( \varvec{X}_i-\varvec{t}_{\gamma } \right) + \sum _{i=1}^N \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \\&\quad \cdot G \left( \varvec{X}_i-\varvec{t}_{\gamma } \right) + \lambda \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{t}_{\alpha }-\varvec{t}_{\gamma } \right) = \varvec{0} \end{aligned}$$
This can be written in matrix form as follows,
$$\begin{aligned} \varvec{G}^T \varvec{G} \varvec{c} + \lambda \varvec{g} \varvec{c} = \varvec{G}^T \varvec{x} \end{aligned}$$
(25)
in which,
$$\begin{aligned} \varvec{c}= & {} \left[ \varvec{c}_1, \varvec{c}_2, \ldots , \varvec{c}_n \right] ^T, \quad \varvec{x} =\left[ \varvec{x}_1, \varvec{x}_2, \ldots , \varvec{x}_N \right] ^T\\ \varvec{G}_{N\times n}= & {} \begin{bmatrix} G\left( \varvec{X}_1-\varvec{t}_1 \right) &{} G\left( \varvec{X}_1-\varvec{t}_2 \right) &{} \cdots &{} G\left( \varvec{X}_1-\varvec{t}_n \right) \\ G\left( \varvec{X}_2-\varvec{t}_1 \right) &{} G\left( \varvec{X}_2-\varvec{t}_2 \right) &{} \cdots &{} G\left( \varvec{X}_2-\varvec{t}_n \right) \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ G\left( \varvec{X}_N-\varvec{t}_1 \right) &{} G\left( \varvec{X}_N-\varvec{t}_2 \right) &{} \cdots &{} G\left( \varvec{X}_N-\varvec{t}_n \right) \end{bmatrix}\\ \varvec{g}_{n \times n}= & {} \begin{bmatrix} G\left( \varvec{t}_1-\varvec{t}_1 \right) &{} G\left( \varvec{t}_1-\varvec{t}_2 \right) &{} \cdots &{} G\left( \varvec{t}_1-\varvec{t}_n \right) \\ G\left( \varvec{t}_2-\varvec{t}_1 \right) &{} G\left( \varvec{t}_2-\varvec{t}_2 \right) &{} \cdots &{} G\left( \varvec{t}_2-\varvec{t}_n \right) \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ G\left( \varvec{t}_n-\varvec{t}_1 \right) &{} G\left( \varvec{t}_n-\varvec{t}_2 \right) &{} \cdots &{} G\left( \varvec{t}_n-\varvec{t}_n \right) \end{bmatrix}~. \end{aligned}$$
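For reference, a minimal sketch of assembling \(\varvec{G}\) and \(\varvec{g}\) and solving the regularized linear system of Eq. (25) column-wise is given below. The Gaussian kernel, the synthetic data, and the choice of centres are illustrative assumptions only, not part of the derivation.

```python
import numpy as np

def gaussian_kernel_matrix(A, B, beta=1.0):
    """K[i, j] = G(A[i] - B[j]) for a Gaussian RBF (illustrative kernel choice)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * beta ** 2))

rng = np.random.default_rng(2)
N, n, d = 50, 8, 2
X = rng.normal(size=(N, d))            # reference points X_i
x = X + 0.05 * rng.normal(size=(N, d)) # target points x_i (synthetic)
t = X[rng.choice(N, n, replace=False)] # kernel centres t_alpha (a common heuristic, assumed here)
lam = 1e-2                             # regularization weight lambda

G = gaussian_kernel_matrix(X, t)       # N x n matrix, G[i, a] = G(X_i - t_a)
g = gaussian_kernel_matrix(t, t)       # n x n matrix, g[a, b] = G(t_a - t_b)

# Eq. (25): (G^T G + lambda g) c = G^T x, solved for the n x d coefficient matrix
c = np.linalg.solve(G.T @ G + lam * g, G.T @ x)
print(c.shape)                         # (n, d)
```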
Appendix B. The derivative of \(H_{{\mathbf {w}}}\left[ \varvec{\varPsi }^{*} \right] \) with respect to \(\varvec{W}\)
The weighted variational functional is given as follows,
$$\begin{aligned} H_{{\mathbf {w}}} \left[ \varvec{\varPsi }^{*} \right] = \sum _{i=1}^N \left\| \sum _{\alpha =1}^n \varvec{c}_{\alpha } G\left( \varvec{W}\varvec{X}_i-\varvec{W}\varvec{t}_{\alpha } \right) - \varvec{x}_i \right\| ^2 + \lambda \phi \left[ \varvec{\varPsi }^{*} \left( \varvec{W}\varvec{X} \right) \right] \end{aligned}$$
(26)
The regularization term can be written as
$$\begin{aligned}&\phi \left[ \varvec{\varPsi }^{*} \left( \varvec{W}\varvec{X} \right) \right] = \int _{{\mathbb {R}}^d} \frac{\left\| \tilde{\varvec{\varPsi }}^{*} \left( \varvec{W}\varvec{s} \right) \right\| ^2}{{\tilde{G}}\left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{s} \\&\quad = \int _{{\mathbb {R}}^d} \frac{ \tilde{\varvec{\varPsi }}^{*} \left( \varvec{W}\varvec{s} \right) \cdot \tilde{\varvec{\varPsi }}^{*} \left( -\varvec{W}\varvec{s} \right) }{{\tilde{G}}\left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{s} \end{aligned}$$
in which,
$$\begin{aligned}&\tilde{\varvec{\varPsi }}^{*} \left( \varvec{W}\varvec{s} \right) = \int _{{\mathbb {R}}^d} \varvec{\varPsi }^{*} \left( \varvec{W}\varvec{t} \right) e^{-i\left( \varvec{W}\varvec{t} \right) \cdot \left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{t} \\&\quad = \sum _{\alpha =1}^n \varvec{c}_{\alpha } \frac{ \int _{{\mathbb {R}}^d} G \left( \varvec{W}\varvec{t}-\varvec{W}\varvec{t}_{\alpha } \right) e^{-i\left( \varvec{W}\varvec{t} \right) \cdot \left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{t} \cdot e^{i\left( \varvec{W}\varvec{t}_{\alpha } \right) \cdot \left( \varvec{W}\varvec{s} \right) } }{e^{i\left( \varvec{W}\varvec{t}_{\alpha } \right) \cdot \left( \varvec{W}\varvec{s} \right) }} \\&\quad = \sum \nolimits _{\alpha =1}^n \varvec{c}_{\alpha } \int _{{\mathbb {R}}^d} G\left( \varvec{W}\varvec{t}-\varvec{W}\varvec{t}_{\alpha } \right) e^{-i \left( \varvec{W}\varvec{t}-\varvec{W}\varvec{t}_{\alpha } \right) \cdot \varvec{W}\varvec{s}} d\left( \varvec{W}\varvec{t}-\varvec{W}\varvec{t}_{\alpha } \right) \\&\quad \cdot e^{-i\varvec{W}\varvec{t}_{\alpha }\cdot \varvec{W}\varvec{s}} \\&\quad = \sum _{\alpha =1}^n \varvec{c}_{\alpha } {\tilde{G}} \left( \varvec{W}\varvec{s} \right) e^{-i\varvec{W}\varvec{t}_{\alpha } \cdot \varvec{W}\varvec{s}} \end{aligned}$$
and
$$\begin{aligned}&\tilde{\varvec{\varPsi }}^{*} \left( -\varvec{W}\varvec{s} \right) = \int _{{\mathbb {R}}^d} \varvec{\varPsi }^{*} \left( \varvec{W}\varvec{t} \right) e^{-i\left( \varvec{W}\varvec{t} \right) \cdot \left( -\varvec{W}\varvec{s} \right) } d\left( \varvec{W}\varvec{t} \right) \\&\quad = \sum _{\beta =1}^n \varvec{c}_{\beta } \frac{ \int _{{\mathbb {R}}^d} G\left( \varvec{W}\varvec{t}-\varvec{W}\varvec{t}_{\beta } \right) e^{-i \varvec{W}\varvec{t}\cdot \left( -\varvec{W}\varvec{s} \right) }d\varvec{W}\varvec{t} \cdot e^{i\varvec{W}\varvec{t}_{\beta }\cdot \left( -\varvec{W}\varvec{s} \right) } }{e^{i\varvec{W}\varvec{t}_{\beta }\cdot \left( -\varvec{W}\varvec{s} \right) }} \\&\quad = \sum \nolimits _{\beta =1}^n \varvec{c}_{\beta } \int _{{\mathbb {R}}^d} G \left( \varvec{W}\varvec{t}-\varvec{W}\varvec{t}_{\beta } \right) e^{-i\varvec{W}\left( \varvec{t}-\varvec{t}_{\beta } \right) \cdot \left( -\varvec{W}\varvec{s} \right) } d\varvec{W}\left( \varvec{t}-\varvec{t}_{\beta } \right) \\&\qquad \cdot e^{i\left( \varvec{W}\varvec{t}_{\beta } \right) \cdot \left( \varvec{W}\varvec{s} \right) } \\&\quad = \sum \nolimits _{\beta =1}^n \varvec{c}_{\beta } {\tilde{G}}\left( -\varvec{W}\varvec{s} \right) e^{i\varvec{W}\varvec{t}_{\beta }\cdot \varvec{W}\varvec{s}} \end{aligned}$$
Therefore, the weighted smoothness constraint term can be written as
$$\begin{aligned}&\int _{{\mathbb {R}}^d} \frac{\left\| \tilde{\varvec{\varPsi }}^{*} \left( \varvec{W}\varvec{s} \right) \right\| ^2}{{\tilde{G}}\left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{s} \nonumber \\&\quad = \int _{{\mathbb {R}}^d} \frac{ \left[ \sum _{\alpha =1}^n \varvec{c}_{\alpha } {\tilde{G}}\left( \varvec{W}\varvec{s} \right) e^{-i\varvec{W}\varvec{t}_{\alpha }\cdot \varvec{W}\varvec{s}} \right] \cdot \left[ \sum _{\beta =1}^n \varvec{c}_{\beta } {\tilde{G}}\left( -\varvec{W}\varvec{s} \right) e^{i\varvec{W}\varvec{t}_{\beta }\cdot \varvec{W}\varvec{s}} \right] }{{\tilde{G}}\left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{s} \nonumber \\&\quad = \int _{{\mathbb {R}}^d} \frac{\sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } {\tilde{G}} \left( \varvec{W}\varvec{s} \right) {\tilde{G}}\left( -\varvec{W}\varvec{s} \right) e^{i\varvec{W}\varvec{s}\cdot \varvec{W}\left( \varvec{t}_{\beta }-\varvec{t}_{\alpha } \right) } }{ {\tilde{G}}\left( \varvec{W}\varvec{s} \right) } d\varvec{W}\varvec{s} \nonumber \\&\quad = \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha }\cdot \varvec{c}_{\beta } \int _{{\mathbb {R}}^d} d\varvec{W}\varvec{s} \cdot e^{i\varvec{W}\varvec{s}\cdot \varvec{W}\left( \varvec{t}_{\beta }-\varvec{t}_{\alpha } \right) } {\tilde{G}}\left( -\varvec{W}\varvec{s} \right) \nonumber \\&\quad = \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } G\left( \varvec{W}\varvec{t}_{\alpha }-\varvec{W}\varvec{t}_{\beta } \right) \end{aligned}$$
(27)
Subsequently, Eq. (26) can be rewritten as
$$\begin{aligned} H_{{\mathbf {w}}} \left[ \varvec{\varPsi }^{*} \right]= & {} \sum _{i=1}^N \left\| \varvec{x}_i - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G\left( \varvec{W}\varvec{X}_i-\varvec{W}\varvec{t}_{\alpha } \right) \right\| ^2 \nonumber \\&\quad + \lambda \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } G\left( \varvec{W}\varvec{t}_{\alpha }-\varvec{W}\varvec{t}_{\beta } \right) \end{aligned}$$
(28)
Then taking the derivative of Eq. (28) with respect to \(\varvec{W}\) yields,
$$\begin{aligned} \frac{\partial H_{{\mathbf {w}}}\left[ \varvec{\varPsi }^{*} \right] }{\partial \varvec{W}}= & {} \sum _{i=1}^N 2 \left[ \varvec{x}_i - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G\left( \varvec{W}\varvec{X}_i-\varvec{W}\varvec{t}_{\alpha } \right) \right] \nonumber \\&\cdot \left[ -\sum _{\alpha =1}^n \varvec{c}_{\alpha } G^{\prime }\left( \varvec{W}\varvec{X}_i-\varvec{W}\varvec{t}_{\alpha } \right) \right. \nonumber \\&\quad \left. \cdot \left( \varvec{W}\varvec{X}_i-\varvec{W}\varvec{t}_{\alpha } \right) \cdot \left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \right] \nonumber \\&+ \lambda \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } G^{\prime } \left( \varvec{W}\varvec{t}_{\alpha }-\varvec{W}\varvec{t}_{\beta } \right) \nonumber \\&\quad \cdot \varvec{W}\left( \varvec{t}_{\alpha }-\varvec{t}_{\beta } \right) \cdot \left( \varvec{t}_{\alpha }-\varvec{t}_{\beta } \right) = \varvec{0} \end{aligned}$$
(29)
which can be rearranged as
$$\begin{aligned} \frac{\partial H_{{\mathbf {w}}} \left[ \varvec{\varPsi }^{*} \right] }{\partial \varvec{W}} =&\sum _{i=1}^N 2 \left[ \varvec{x}_i - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G \left( \varvec{W}\varvec{X}_i - \varvec{W}\varvec{t}_{\alpha } \right) \right] \nonumber \\&\cdot \left[ - \sum _{\alpha =1}^n \varvec{c}_{\alpha } G^{\prime }\left( \varvec{W}\varvec{X}_i-\varvec{W}\varvec{t}_{\alpha } \right) \right. \nonumber \\&\quad \left. \cdot \varvec{W}:\left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \otimes \left( \varvec{X}_i-\varvec{t}_{\alpha } \right) \right] \nonumber \\&+ \lambda \sum _{\alpha ,\beta =1}^n \varvec{c}_{\alpha } \cdot \varvec{c}_{\beta } G^{\prime } \left( \varvec{W}\varvec{t}_{\alpha }-\varvec{W}\varvec{t}_{\beta } \right) \nonumber \\&\quad \cdot \varvec{W}:\left( \varvec{t}_{\alpha }-\varvec{t}_{\beta } \right) \otimes \left( \varvec{t}_{\alpha }-\varvec{t}_{\beta } \right) = \varvec{0} \end{aligned}$$
(30)
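Eq. (30) can be verified numerically once a specific kernel and a reading of the notation are fixed. The sketch below assumes a Gaussian kernel \(G\left( \varvec{u} \right) = \exp \left( -\left\| \varvec{u} \right\| ^2/2\beta ^2 \right) \), for which \(\partial G\left( \varvec{W}\varvec{v} \right) /\partial \varvec{W} = -\left( G/\beta ^2 \right) \varvec{W}\varvec{v}\varvec{v}^T\), i.e. \(G^{\prime }\) is taken as the derivative with respect to the squared norm and \(\varvec{W}:\varvec{v}\otimes \varvec{v}\) is read as the matrix \(\varvec{W}\varvec{v}\varvec{v}^T\); both readings are assumptions made for illustration. The analytic gradient of the objective in Eq. (28) is checked against a finite-difference approximation.

```python
import numpy as np

beta, lam = 1.0, 1e-2
rng = np.random.default_rng(3)
N, n, d = 20, 5, 2
X = rng.normal(size=(N, d))            # reference points X_i
x = rng.normal(size=(N, d))            # target points x_i
t = rng.normal(size=(n, d))            # kernel centres t_alpha
C = rng.normal(size=(n, d))            # coefficient vectors c_alpha stored as rows
W = np.eye(d) + 0.1 * rng.normal(size=(d, d))

def G(U):
    """Gaussian kernel applied along the last axis (assumed form of G)."""
    return np.exp(-np.sum(U ** 2, axis=-1) / (2.0 * beta ** 2))

def H_w(W):
    """Weighted objective of Eq. (28) for the assumed Gaussian kernel."""
    GXt = G((X[:, None, :] - t[None, :, :]) @ W.T)        # N x n, G(W X_i - W t_alpha)
    residual = x - GXt @ C                                 # N x d residuals
    gtt = G((t[:, None, :] - t[None, :, :]) @ W.T)         # n x n, G(W t_alpha - W t_beta)
    return np.sum(residual ** 2) + lam * np.trace(C.T @ gtt @ C)

def grad_H_w(W):
    """Analytic gradient of Eq. (28) w.r.t. W, i.e. the expression set to zero in Eq. (30)."""
    grad = np.zeros((d, d))
    GXt = G((X[:, None, :] - t[None, :, :]) @ W.T)
    residual = x - GXt @ C
    for i in range(N):                                     # data-fidelity term
        for a in range(n):
            v = X[i] - t[a]
            grad += (2.0 / beta ** 2) * (residual[i] @ C[a]) * GXt[i, a] * (W @ np.outer(v, v))
    gtt = G((t[:, None, :] - t[None, :, :]) @ W.T)
    for a in range(n):                                     # smoothness term
        for b in range(n):
            u = t[a] - t[b]
            grad -= (lam / beta ** 2) * (C[a] @ C[b]) * gtt[a, b] * (W @ np.outer(u, u))
    return grad

# central finite-difference check of the analytic gradient
eps, fd = 1e-6, np.zeros((d, d))
for j in range(d):
    for k in range(d):
        Wp, Wm = W.copy(), W.copy()
        Wp[j, k] += eps
        Wm[j, k] -= eps
        fd[j, k] = (H_w(Wp) - H_w(Wm)) / (2 * eps)
print(np.allclose(grad_H_w(W), fd, atol=1e-4))             # True
```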
Appendix C. Derivative of \({\tilde{Q}}\) with respect to \(\varvec{C}\)
First, the derivative of \({\tilde{Q}}\) with respect to \(\varvec{C}\) can be written as,
$$\begin{aligned} \frac{\partial {\tilde{Q}}}{\partial \varvec{C}}= & {} \frac{\partial }{\partial \varvec{C}} \sum _{i=1}^N \sum _{j=1}^M \phi _j \left( \varvec{y}_{old} | \varvec{X}_i \right) \\&\quad \frac{\left\| \varvec{X}_i - \sum _{\alpha =1}^q \varvec{c}_{\alpha } G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) \right\| ^2}{2\sigma _j^2} \\&\quad + \lambda \frac{\partial }{\partial \varvec{C}} \mathbf{Tr} \left( \varvec{C}^T \varvec{g} \varvec{C} \right) \end{aligned}$$
For the first term, we shall consider the derivative with respect to one row of the matrix \(\varvec{C}\), i.e., \(\varvec{c}_r\),
$$\begin{aligned}&\frac{\partial }{\partial \varvec{c}_r} \sum _{i=1}^N \sum _{j=1}^M \phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) \frac{\left\| \varvec{X}_i-\sum _{\alpha =1}^q \varvec{c}_{\alpha } G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) \right\| ^2}{2\sigma _j^2} \\= & {} \sum _{i=1}^N \sum _{j=1}^M \phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) \frac{2\left( \varvec{X}_i-\sum _{\alpha =1}^q \varvec{c}_{\alpha } G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) \right) }{2\sigma _j^2} \\&\quad \cdot \left( -G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0r} \right) \right) \\= & {} \sum _{i=1}^N \sum _{j=1}^M \phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) \frac{-\varvec{X}_i+\sum _{\alpha =1}^q \varvec{c}_{\alpha }G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) }{\sigma _j^2}\\&\quad \cdot G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0r} \right) \\= & {} \sum _{i=1}^N \sum _{j=1}^M \frac{\phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) \sum _{\alpha =1}^q \varvec{c}_{\alpha } G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) }{\sigma _j^2} \\&\quad G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0r} \right) - \sum _{i=1}^N \sum _{j=1}^M \frac{\phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) }{\sigma _j^2} \\&\quad \varvec{X}_i G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0r} \right) \\= & {} \frac{1}{\sigma ^2} \sum _{j=1}^M \left( \sum _{i=1}^N \frac{\phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) }{\left( \sigma _j/\sigma \right) ^2} \right) \sum _{\alpha =1}^q \varvec{c}_{\alpha } G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) \\&\quad \cdot G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0r} \right) \\&-\frac{1}{\sigma ^2} \sum _{j=1}^M \frac{\sum _{i=1}^N \phi _j\left( \varvec{y}_{old}|\varvec{X}_i \right) \varvec{X}_i}{\left( \sigma _j/\sigma \right) ^2} G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0r} \right) \\= & {} \frac{1}{\sigma ^2} \sum _{j=1}^M G\left( \hat{\varvec{y}}_{0r}-\varvec{y}_{0j} \right) \left[ diag \left( \tilde{\varvec{\varPhi }}\varvec{1} \right) \right] _{jj} \varvec{G}\left( j,: \right) \varvec{C}\\&\quad - \frac{1}{\sigma ^2} \sum _{j=1}^M G\left( \hat{\varvec{y}}_{0r}-\varvec{y}_{0j} \right) \left( \tilde{\varvec{\varPhi }} \varvec{X} \right) \left( j,: \right) \\= & {} \frac{1}{\sigma ^2} \varvec{G}^T \left( r,: \right) diag\left( \tilde{\varvec{\varPhi }}\varvec{1} \right) \varvec{G}\varvec{C}\\&\quad - \frac{1}{\sigma ^2} \varvec{G}^T\left( r,: \right) \left( \tilde{\varvec{\varPhi }}\varvec{X} \right) \end{aligned}$$
Therefore, the derivative of the first term with respect to the matrix \(\varvec{C}\) is,
$$\begin{aligned}&\frac{\partial }{\partial \varvec{C}} \sum _{i=1}^N \sum _{j=1}^M \phi _j \left( \varvec{y}_{old}|\varvec{X}_i \right) \frac{\left\| \varvec{X}_i -\sum _{\alpha =1}^q \varvec{c}_{\alpha } G\left( \varvec{y}_{0j}-\hat{\varvec{y}}_{0\alpha } \right) \right\| ^2}{2\sigma _j^2} \nonumber \\&\quad = \frac{1}{\sigma ^2} \left[ \varvec{G}^T diag \left( \tilde{\varvec{\varPhi }}\varvec{1} \right) \varvec{G}\varvec{C} - \varvec{G}^T \left( \tilde{\varvec{\varPhi }}\varvec{X} \right) \right] \end{aligned}$$
(31)
The derivative of the second term with respect to \(\varvec{C}\) can be written in index form,
$$\begin{aligned}&\frac{\partial }{\partial \varvec{C}} \mathbf{Tr} \left( \varvec{C}^T \varvec{g} \varvec{C} \right) \\&\quad = \frac{\partial }{\partial C_{\gamma k}} \left( C_{\alpha j} g_{\alpha \beta } C_{\beta j} \right) \\&\quad = \delta _{\alpha \gamma } \delta _{jk} g_{\alpha \beta } C_{\beta j} + C_{\alpha j} g_{\alpha \beta } \delta _{\gamma \beta } \delta _{jk}\\&\quad = g_{\gamma \beta } C_{\beta k} + g_{\alpha \gamma } C_{\alpha k} = \varvec{g} \varvec{C} + \varvec{g}^T \varvec{C} = 2\varvec{g} \varvec{C} \end{aligned}$$
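This index calculation (which uses the symmetry of \(\varvec{g}\) in the last step) can be confirmed with a quick finite-difference check:

```python
import numpy as np

rng = np.random.default_rng(7)
q, d = 5, 3
A = rng.normal(size=(q, q))
g = 0.5 * (A + A.T)                    # symmetric q x q matrix, as g is in Appendix A
C = rng.normal(size=(q, d))

f = lambda C: np.trace(C.T @ g @ C)    # Tr(C^T g C)
analytic = 2.0 * g @ C                 # claimed derivative 2 g C

eps, fd = 1e-6, np.zeros_like(C)
for i in range(q):
    for j in range(d):
        Cp, Cm = C.copy(), C.copy()
        Cp[i, j] += eps
        Cm[i, j] -= eps
        fd[i, j] = (f(Cp) - f(Cm)) / (2 * eps)
print(np.allclose(analytic, fd))       # True
```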
Summing the derivatives of the two terms gives,
$$\begin{aligned} \sigma ^2 \frac{\partial {\tilde{Q}}}{\partial \varvec{C}}&= \varvec{G}^T diag \left( \tilde{\varvec{\varPhi }}\varvec{1} \right) \varvec{G}\varvec{C} \\&\quad - \varvec{G}^T \left( \tilde{\varvec{\varPhi }}\varvec{X} \right) + 2\sigma ^2 \lambda \varvec{g} \varvec{C} = \varvec{0} \end{aligned}$$
so that the matrix \(\varvec{C}\) can be obtained as
$$\begin{aligned} \varvec{C} = \left[ \varvec{G}^T diag\left( \tilde{\varvec{\varPhi }}\varvec{1} \right) \varvec{G} + 2\sigma ^2 \lambda \varvec{g} \right] ^{-1} \varvec{G}^T \left( \tilde{\varvec{\varPhi }} \varvec{X} \right) \end{aligned}$$
(32)
in which \(\varvec{G}\) is an \(M\times q\) matrix, \(\tilde{\varvec{\varPhi }}\) is an \(M \times N\) matrix, \(\varvec{1}\) is an \(N \times 1\) vector of ones, \(\varvec{C}\) is a \(q \times d\) matrix, \(\varvec{X}\) is an \(N \times d\) matrix, and \(\varvec{g}\) is a \(q \times q\) matrix.
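A minimal sketch of the closed-form update in Eq. (32), using the dimensions stated above, follows. The Gaussian kernel, the control points, and the posterior-weight matrix \(\tilde{\varvec{\varPhi }}\) are placeholders for illustration; in practice \(\tilde{\varvec{\varPhi }}\) would come from the E-step.

```python
import numpy as np

def gaussian_kernel_matrix(A, B, beta=1.0):
    """K[j, a] = G(A[j] - B[a]) for a Gaussian RBF (illustrative kernel choice)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * beta ** 2))

rng = np.random.default_rng(4)
M, N, q, d = 30, 40, 6, 2
sigma2, lam = 0.5, 1e-2

y0 = rng.normal(size=(M, d))                    # control points y_0j (placeholder values)
yhat0 = y0[rng.choice(M, q, replace=False)]     # centres yhat_0alpha (placeholder choice)
G = gaussian_kernel_matrix(y0, yhat0)           # M x q matrix G(y_0j - yhat_0alpha)
g = gaussian_kernel_matrix(yhat0, yhat0)        # q x q matrix g
Phi = rng.random(size=(M, N))                   # M x N posterior weights (placeholder; from the E-step)
X = rng.normal(size=(N, d))                     # N x d data matrix

# Eq. (32): C = [G^T diag(Phi 1) G + 2 sigma^2 lambda g]^{-1} G^T (Phi X)
D1 = np.diag(Phi @ np.ones(N))                  # diag(Phi 1), M x M
C = np.linalg.solve(G.T @ D1 @ G + 2.0 * sigma2 * lam * g, G.T @ (Phi @ X))
print(C.shape)                                  # (q, d)
```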
Appendix D. A brief outline of the Coherent Point Drift algorithm
First, for convenience of presentation, we introduce the following notation:
- N: the number of points in the target point set
- M: the number of points in the source point set
- D: the dimension of the space in which the points lie
- \({\mathbf {X}} = \left( x_1^1, x_1^2, \cdots , x_1^D, \cdots , x_N^1, x_N^2, \cdots , x_N^D \right) ^T\): the target point set
- \({\mathbf {Y}} = \left( y_1^1, y_1^2, \cdots , y_1^D, \cdots , y_M^1, y_M^2, \cdots , y_M^D \right) ^T\): the source point set
- \({\mathcal {T}}\): the mapping function transforming the shape represented by \({\mathbf {Y}}\) to the shape represented by \({\mathbf {X}}\)
- \(\sigma ^2 I_D\): the D-dimensional isotropic covariance matrix
Then a component of the Gaussian mixture model, which forms the basis of the CPD method, is defined as follows,
$$\begin{aligned} p\left( {\mathbf {x}}_n|{\mathbf {y}}_m, \sigma ^2 \right) = |2\pi \sigma ^2 I_D|^{-1/2} \exp \left\{ -\frac{1}{2\sigma ^2} \left\| {\mathbf {x}}_n - {\mathcal {T}}\left( {\mathbf {y}}_m \right) \right\| ^2 \right\} \end{aligned}$$
in which \({\mathcal {T}}\left( {\mathbf {Y}} \right) \) is taken as the set of means (expectations) of the Gaussian mixture model, and a target point \({\mathbf {x}}_n\) is treated as a data point generated by the Gaussian mixture model.
Therefore, after accounting for an outlier distribution and all mixture components, the complete Gaussian mixture model is defined as follows,
$$\begin{aligned} p\left( {\mathbf {x}}_n; \sigma ^2 \right) = wp_{out}\left( {\mathbf {x}}_n \right) + \left( 1-w \right) \sum _{m=1}^M \frac{1}{M} p\left( {\mathbf {x}}_n | {\mathbf {y}}_m, \sigma ^2 \right) \end{aligned}$$
(33)
in which w is the probability of observing an outlier.
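A small sketch of evaluating the mixture of Eq. (33) at one target point follows. Since \(p_{out}\) is not specified above, a uniform outlier density \(1/N\) is assumed here purely for illustration, and \({\mathcal {T}}\) is taken as the identity.

```python
import numpy as np

def gmm_component(x_n, Ty_m, sigma2, D):
    """p(x_n | y_m, sigma^2): a single isotropic Gaussian component."""
    norm = (2.0 * np.pi * sigma2) ** (-D / 2.0)
    return norm * np.exp(-np.sum((x_n - Ty_m) ** 2) / (2.0 * sigma2))

def mixture_density(x_n, TY, sigma2, w, N):
    """Eq. (33): w * p_out(x_n) + (1 - w) * (1/M) * sum_m p(x_n | y_m, sigma^2).

    p_out is assumed uniform (1/N) for illustration only.
    """
    M, D = TY.shape
    components = sum(gmm_component(x_n, TY[m], sigma2, D) for m in range(M))
    return w * (1.0 / N) + (1.0 - w) * components / M

rng = np.random.default_rng(5)
Y = rng.normal(size=(10, 2))                     # source points (T taken as identity here)
print(mixture_density(np.array([0.2, 0.4]), Y, sigma2=0.3, w=0.1, N=25))
```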
Then the EM algorithm is applied to find a local minimum of the negative log-likelihood of the Gaussian mixture model. Two steps are iterated: the E-step evaluates the posterior probabilities \(p_{mn}\) under the current parameters, and the M-step updates the transformation and \(\sigma ^2\) by minimizing the Q-function.
The Q-function is defined as
$$\begin{aligned} Q = \frac{{\hat{N}}D}{2} \ln \sigma ^2 + \frac{1}{2\sigma ^2} \sum _{n=1}^N \sum _{m=1}^M p_{mn} \left\| {\mathbf {x}}_n - {\mathcal {T}}\left( {\mathbf {y}}_m \right) \right\| ^2 \end{aligned}$$
(34)
in which \({\hat{N}} = \sum _{n=1}^N \sum _{m=1}^M p_{mn}\) and
$$\begin{aligned} p_{mn} = \frac{\exp \left( -\frac{1}{2\sigma ^2}\left\| {\mathbf {x}}_n - {\mathcal {T}}\left( {\mathbf {y}}_m \right) \right\| ^2 \right) }{\frac{w}{1-w} \frac{M}{N} |2\pi \sigma ^2 I_D|^{1/2} + \sum _{k=1}^M \exp \left( -\frac{1}{2\sigma ^2} \left\| {\mathbf {x}}_n - {\mathcal {T}}\left( {\mathbf {y}}_k \right) \right\| ^2 \right) } \end{aligned}$$
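For illustration, the posterior weights \(p_{mn}\) and the Q-function of Eq. (34) can be computed directly as below; the transformed source points \({\mathcal {T}}\left( {\mathbf {y}}_m \right) \) are assumed to be available (here \({\mathcal {T}}\) is the identity on synthetic data).

```python
import numpy as np

def e_step(X, TY, sigma2, w):
    """Compute the M x N matrix of posterior weights p_mn from the formula above."""
    M, D = TY.shape
    N = X.shape[0]
    sqdist = np.sum((TY[:, None, :] - X[None, :, :]) ** 2, axis=-1)   # M x N, ||x_n - T(y_m)||^2
    num = np.exp(-sqdist / (2.0 * sigma2))
    c = (w / (1.0 - w)) * (M / N) * (2.0 * np.pi * sigma2) ** (D / 2.0)
    return num / (num.sum(axis=0, keepdims=True) + c)                 # p_mn

def q_function(P, X, TY, sigma2):
    """Eq. (34): Q = (N_hat D / 2) ln(sigma^2) + (1 / (2 sigma^2)) sum_mn p_mn ||x_n - T(y_m)||^2."""
    D = X.shape[1]
    sqdist = np.sum((TY[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    N_hat = P.sum()
    return 0.5 * N_hat * D * np.log(sigma2) + np.sum(P * sqdist) / (2.0 * sigma2)

rng = np.random.default_rng(6)
X = rng.normal(size=(40, 2))               # target points
TY = rng.normal(size=(25, 2))              # transformed source points T(y_m) (identity T assumed)
P = e_step(X, TY, sigma2=0.5, w=0.1)
print(P.shape, q_function(P, X, TY, sigma2=0.5))
```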
The complete derivation and some intermediate formulas can be found in Ref. [32].