Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Paper The following article is Open access

Quantitative passive imaging by iterative holography: the example of helioseismic holography

, , and

Published 1 March 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
, , Special Issue on Big Data Inverse Problems Citation Björn Müller et al 2024 Inverse Problems 40 045016 DOI 10.1088/1361-6420/ad2b9a

0266-5611/40/4/045016

Abstract

In passive imaging, one attempts to reconstruct some coefficients in a wave equation from correlations of observed randomly excited solutions to this wave equation. Many methods proposed for this class of inverse problem so far are only qualitative, e.g. trying to identify the support of a perturbation. Major challenges are the increase in dimensionality when computing correlations from primary data in a preprocessing step, and often very poor pointwise signal-to-noise ratios. In this paper, we propose an approach that addresses both of these challenges: it works only on the primary data while implicitly using the full information contained in the correlation data, and it provides quantitative estimates and convergence by iteration. Our work is motivated by helioseismic holography, a well-established imaging method to map heterogenities and flows in the solar interior. We show that the back-propagation used in classical helioseismic holography can be interpreted as the adjoint of the Fréchet derivative of the operator which maps the properties of the solar interior to the correlation data on the solar surface. The theoretical and numerical framework for passive imaging problems developed in this paper extends helioseismic holography to nonlinear problems and allows for quantitative reconstructions. We present a proof of concept in uniform media.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

In this paper, we consider passive imaging problems described by a linear time-harmonic wave equation

with a random source s and some unknown coefficient q, which is the quantity of interest. We assume that $\mathbb{E} \left[ s \right] = 0$ such that $\mathbb{E} \left[ \psi \right] = 0$ by linearity of $L[q]$. Solutions ψ to this wave equation are observed on part of the boundary $\Gamma = \partial \Omega$ of a domain Ω for many independent realizations of s. Thus we can approximately compute the cross-covariance

Equation (1)

Our aim is to determine the unknown parameter q given noisy observations of C or the corresponding integral operator $(\mathcal{C}f\,)(\boldsymbol{x}_1): = \int_{\Gamma} C(\boldsymbol{x}_1, \boldsymbol{x}_2)f(\boldsymbol{x}_2)\,\mathrm{d}\boldsymbol{x}_2$. If $\mathrm{Tr}_{\Gamma} $ is the trace operator onto Γ, then straightforward calculations show that the forward operator mapping q to $\mathcal{C} = \mathbf{Cov}[\mathrm{Tr}_{\Gamma} \psi]$ is given by

(Recall that the covariance operator $\mathbf{Cov}[v]\in L(\mathbb{X})$ of a random variable v with values in a Hilbert-space $\mathbb{X}$ is defined implicitely by $\text{Cov}(\langle v,\psi\rangle_{\mathbb{X}}, \langle v,\varphi\rangle_{\mathbb{X}}) = \langle \mathbf{Cov}[v]\psi,\varphi\rangle_{\mathbb{X}}$ for all $\varphi,\psi\in\mathbb{X}$.) An early and influential reference on passive imaging is the work of Duvall et al [1] on time-distance helioseismology. Later, passive imaging has also been used in many other fields such as seismology [2], ocean acoustics [3], and ultrasonics [4]. We refer to the monograph [5] by Garnier & Papanicolaou for many further references. Concerning the uniqueness of passive imaging problems, we refer to [6] for results in the time domain and to [710] for results in the frequency domain. For the unique recovery of the source and the potential from passive far-field data, we refer to [11].

Local helioseismology analyzes acoustic oscillations at the solar surface in order to reconstruct physical quantities (subsurface flows, sound speed, density) in the solar interior (e.g. [12] and references therein). Since solar oscillations are excited by near-surface turbulent convection, it is reasonable to assume random, non-deterministic noise terms. In this paper, we will describe sound propagation in the solar interior by a scalar time-harmonic wave equation and study the passive imaging problem of parameter reconstruction from correlation measurements.

Very large data sets of high-resolution solar Doppler images have been recorded from the ground and from space over the last 25 years. This leads to a five-dimensional (22 spatial dimensions and 1 temporal dimension) cross-correlation data set on the solar surface, which cannot be stored and analyzed all at once. In traditional approaches, like time-distance helioseismology, the cross-correlations are reduced to a smaller number of observable quantities, such as travel times [1] or cross-correlation amplitudes (e.g. [1315]). Since the reduction to these quantities leads to a loss in information, we are interested in using the whole cross-correlation data throughout the inversion procedure, stepping forward to full waveform inversions.

Helioseismic holography, a technique within the field of local helioseismology, has proven to be a powerful tool for studying various aspects of the Sun's interior. It operates by propagating the solar wavefield backward from the surface to specific target locations within the Sun [16]. A notable success of helioseismic holography is the detection of active regions on the Sun's far side (e.g. [1719]). Furthermore, helioseismic holography is used in many other applications, e.g. to study the subsurface structure of sunspots [2022], wave absorption in magnetic regions [2325], and seismic emission from solar granules [26]. The main idea of helioseismic holography is the back-propagation ('egression') of the wavefield at the solar surface [27]. Improvements have been proposed in the choice of backward propagators (e.g. using Porter–Bojarski holograms [28, 29]. Helioseismic holography has a strong connection to conventional beam forming, where imaging functionals similar to the holographic back-propagation occur (e.g. [30]). In contrast to these approaches, we will achieve improvements by iterations.

In the present paper, we connect holographic imaging methods to iterative regularization methods. This way, holography can be extended to a full converging regularization method. This approach was successfully applied to inverse source problems in aeroacoustics [10] and is extended in this work to parameter identification problems.

The organization of the paper is as follows. In section 2 we introduce a generic model for the forward problem. In section 3 we establish foundations of our functional analytic setting by establishing sufficient conditions under which the diagonal of an integral operator is well defined, using Schatten class properties of embedding operators. With this we compute the Fréchet derivative of the forward operator and its adjoint in section 4. Next, we discuss the algorithm of iterative holography in section 5. Based on the analysis of sections 2–appendix B we then introduce forward operators in local helioseismology, their derivatives and adjoints in section 6. Then we discuss iterative helioseismic holography as an extension of conventional helioseismic holography in section 7, and demonstrate its performance in numerical examples with simulated data in section 8 before we end the paper with conclusions in section 9. Some technical issues are discussed in three short appendices.

2. A model problem

We first present the main ideas of this paper for a generic scalar time-harmonic wave equation. Let $\Omega_0\subset \Omega$ be a smooth, bounded domain in $\mathbb{R}^d$ and let $\Gamma \subset \overline{\Omega}\setminus \Omega_0$ the hypersurface on which measurements are performed. Γ may be part of the boundary $\partial\Omega$ or it may be contained in the interior of Ω. Moreover, consider the parameters

Here $W\,^{\infty}(\text{div}, \Omega): = \{\boldsymbol{A}\in L^{\infty}(\Omega, \mathbb{R}^d):\text{div} \boldsymbol{A}\in L^{\infty}(\Omega)\}$ with norm $\|\boldsymbol{A}\|_{W\,^{\infty}(\text{div},\Omega)}: = \|\boldsymbol{A}\|_{L^\infty}+\|\text{div}\boldsymbol{A}\|_{L^\infty}$.

Assume that the excitation of wavefields ψ in $\mathbb{R}^d$ by random sources s, which are supported in Ω0, is described by the model

Equation (2a)

Equation (2b)

for the outward pointing normal vector n on $\partial\Omega$ and some operator $B\in L\left(H^{1/2}(\partial\Omega)\to H^{-1/2}(\partial\Omega)\right)$. (Here and in the following $L\left(\mathbb{X},\mathbb{Y}\right)$ denotes the space of bounded linear operators between Banach spaces $\mathbb{X}$ and $\mathbb{Y}$.) Typically, B is some transparent boundary condition, e.g. $B\psi = ik\psi$ for $\partial\Omega = S^{d-1}$. We may also choose $B\psi: = \text{DtN}\psi$ with an exterior Dirichlet-to-Neumann map for the Helmholtz equation with the Sommerfeld radiation condition. In this case, equation (2) is equivalent to a problem posed on $\mathbb{R}^d$ with the Sommerfeld radiation condition.

Assumption 1. Suppose that for some $B_0\in L\left(H^{1/2}(\partial\Omega),H^{-1/2}(\partial\Omega)\right)$, $k\in\mathbb{C}$ and some set $\mathfrak{B}_k\subset L^{\infty} (\Omega, \mathbb{C}) \times W\,^{\infty}(\text{div}, \Omega)$ of admissible parameters $v,\boldsymbol{A}$ the following holds true:

Equation (3a)

Equation (3b)

Equation (3c)

Equation (3d)

Equation (3e)

The conditions (3c )–(3e ) are obviously satisfied for $B\zeta: = ik\zeta$, and they also hold true if B is the exterior Dirichlet-to-Neumann map on a sphere or a circle (see [31, 32]). Throughout this paper we denote by $H^s_0(\Omega)$ the closure of the space of distributions on Ω in $H^s(\mathbb{R}^d)$. For a Lipschitz domain, we have the duality $H^{s}(\Omega)^* = H^{-s}_0(\Omega)$ [33, theorem 3.30].

Proposition 1. Under Assumption 1 the problem (2) is well posed in the sense that for all $s\in H^{-1}_0(\Omega)$ there exists a unique $\psi\in H^1(\Omega)$ satisfying (2) in the weak sense, and ψ depends continuously on s with respect to these norms.

Proof. We only sketch the proof, which is a straightforward modification of similar proofs in [31, 32]. The weak formulation of Problem (2) is given by

Equation (4)

To show that for s = 0 this variational problem only has the trivial solution, we choose $\phi = \psi$ and take the imaginary part. Noting that ${\text{Im}(-2i\boldsymbol{A} \cdot (\nabla\psi)\overline{\psi}) = -\boldsymbol{A} \cdot \,2\text{Re}((\nabla\psi)\overline{\psi}) = -\boldsymbol{A} \cdot \nabla|\psi|^2}$ and using a partial integration and (3b ), we obtain

It follows from (3a ) and (3c ) that both sides must vanish. Hence, $\mathrm{Tr}_{\partial\Omega}\psi = 0$. By elliptic regularity, $\psi\in H^2(\Omega)$ is also a strong solution to (2) with $\frac{\partial\psi}{\partial\boldsymbol{n}} = 0$ on $\partial\Omega$. Due to vanishing Cauchy data on $\partial \Omega$, ψ may be extended by 0 as a strong solution of the wave equation to the exterior of Ω. Now it follows from unique continuation results (see [34, theorem 4.2]) that ψ vanishes identically.

Using assumptions (3d ) and (3e ), it can be shown that the sesquilinear form of the variational formulation is coercive up to a compact perturbation. Therefore, the operator representing this sesquilinear form is Fredholm of index 0. By uniqueness, it is boundedly invertible. □

If we write the solution operator

as an integral operator

the kernel $G_{v,\boldsymbol{A}}$ of $\mathcal{G}_{v,\boldsymbol{A}}$ is the Green's function, which may also be characterized by $(-\Delta-2i\boldsymbol{A} \cdot \nabla+v -k^2) G_{v,\boldsymbol{A}}(\cdot, \boldsymbol{x^{^{\prime}}}) = \delta_{\boldsymbol{x^{^{\prime}}}}$, $\partial_{\boldsymbol{n}} G_{v,\boldsymbol{A}}(\cdot, \boldsymbol{x^{^{\prime}}})-B \mathrm{Tr}_{\partial \Omega} G_{v,\boldsymbol{A}}(\cdot, \boldsymbol{x^{^{\prime}}}) = 0$ on $\partial \Omega$.

For certain random processes of interest, s does not belong to $H^{-1}_0(\Omega)$ almost surely. E.g., white noise is in $H^{-s}_0(\Omega)$ almost surely if and only if $s\gt d/2$. Nevertheless, the solution formula

Equation (5)

may still make sense if $G_{v,\boldsymbol{A}}(\boldsymbol{x},\cdot)$ is sufficiently smooth on Ω0. This is always the case if the support of s and $v,\boldsymbol{A}$ are disjoint or if $s\in H^{-1}_0(\Omega)$ almost surely, which is typically true if s is spatially correlated. Otherwise, we have to impose smoothness conditions on v and A such that $G_{v,\boldsymbol{A}}$ is sufficiently smooth and $\mathcal{G}$ has suitable mapping properties.

Assumption 2. The solution to (2) on Γ is given by (5).

Assume we have observations $\mathrm{Tr}_{\Gamma}\psi_1,\dots,\mathrm{Tr}_{\Gamma}\psi_N$ where ψj solves (2) for independent samples $s_1,\dots,s_N$ of s. As $\mathbb{E} \left[ s \right] = 0$, we have $\mathbb{E} \left[ \mathrm{Tr}_{\Gamma}\psi_j \right] = 0$, and we can compute the correlations by

Equation (6)

This is an unbiased estimator of the covariance

Equation (7)

converging in the limit $N\to\infty$. The integral operator $(\mathcal{C}[v,\boldsymbol{A}]\,f\,)(\boldsymbol{x}_1): = \int_{\Gamma} C_{v,\boldsymbol{A}}(\boldsymbol{x}_1,\boldsymbol{x}_2)f(\boldsymbol{x}_2)\,\mathrm{d}\boldsymbol{x}_2$ is the covariance operator

Equation (8)

$\mathcal{C}$ will be the forward operator of our inverse problem. Recall that if $C_{v,\boldsymbol{A}}\in L^2(\Gamma\times \Gamma)$, then $\mathcal{C}[v,\boldsymbol{A}]$ belongs to the space of Hilbert–Schmidt operators $\text{HS}\left(L^2(\Gamma)\right)$ on $L^2(\Gamma)$, and

Therefore, $\text{HS}\left(L^2(\Gamma)\right)$ is the natural image space of the forward operator. It is a Hilbert space with an inner product $\langle T,S\rangle_{\mathrm{HS}} = \mathrm{tr}(S^*T)$. Here $\mathrm{tr}(K)$ denotes the trace of a linear operator $K: \mathbb{H} \to \mathbb{H}$ in a separable Hilbert space $\mathbb{H}$ defined by $\mathrm{tr}(K): = \sum_{j = 1}^{\infty} \langle K e_j, e_j \rangle_{\mathbb{H}}$ for any is an orthonormal basis $\{e_j:j\in\mathbb{N}\}$ of $\mathbb{H}$.

Let us also consider the case that in addition to sources s in the interior of Ω there are sources $s_{\partial\Omega}$ on the boundary $\partial\Omega$. Such sources generate a field $\psi(\boldsymbol{x}) = \int_{\partial \Omega} G_{v,\boldsymbol{A}}(\boldsymbol{x},\boldsymbol{y}) s_{\partial\Omega}(\boldsymbol{y})\,\mathrm{d}\boldsymbol{y}$. Its restriction to Ω0 is given by

Equation (9)

which is the single layer potential operator for $\Gamma = \partial\Omega$. It is easy to see that K admits a factorization $K = \mathrm{Tr}_{\Gamma} G_{v, \boldsymbol{A}} \mathrm{Tr}_{\partial\Omega}^*$ in the spaces

which implies that $K\in L\left(H^{-1/2}(\partial \Omega), H^{1/2}(\Gamma)\right)$ (see [35, theorem 1(iii)]). Therefore, in the presence of boundary sources the measured covariance operator is given by

Often one assumes that the source process s is spatially uncorrelated and

for some source strength $S\in L^{\infty}(\Omega_0)$, where MS denotes the multiplication operator $M_{S}f: = S\cdot f$. If S is treated as an additional unknown, the forward operator becomes

Equation (10)

Of course, we could also assume that $s_{\partial\Omega}$ is spatially uncorrelated and treat its source strength as a further unknown, but for the sake of notational simplicity, we assume that $\mathbf{Cov}[s_{\partial\Omega}]\in L\left(L^2(\partial\Omega)\right)$ is known.

We first study the continuity and Fréchet differentiability of $\mathcal{G}_{v,\boldsymbol{A}}$ with respect to the parameters $(v,\boldsymbol{A})$. We will assume that v and A are known in $\Omega\setminus \Omega_0$. Let $(v_{\mathrm{ref}},A_{\mathrm{ref}})\in \mathfrak{B}_k$ be some reference solution. Then the set $\mathfrak{B}_k$ of admissible parameters in assumption 1 satisfies

Equation (11)

where $W\,^{\infty}_0(\text{div},\Omega_0): = \{\boldsymbol{A}\in W\,^{\infty}_0(\text{div},\Omega_0): \boldsymbol{A}\cdot\boldsymbol{n} = 0\text{on }\partial\Omega_0\}$.

Lemma 2. Under assumption 1, the mapping $\mathfrak{B}_k\to L\left(H^{-1}_0(\Omega), H^1(\Omega)\right)$, $(v,\boldsymbol{A})\mapsto \mathcal{G}_{v,\boldsymbol{A}}$ is well-defined and continuous, and Fréchet differentiable in the interior of $\mathfrak{B}_k$ w.r.t. the $\mathbb{X}_{\mathcal{G}}$-topology. The Fréchet derivative $\mathcal{G}^{^{\prime}}_{v,\boldsymbol{A}}:\mathbb{X}_{\mathcal{G}}\to L\left(H^{-1}_0(\Omega), H^1(\Omega)\right)$ at $(v,\boldsymbol{A})\in \text{int}(\mathfrak{B}_k)$ is given by

Proof. Again, we only sketch the proof and refer to [31, section 5.3] for a more detailed proof of a similar result. Let $L_{v,\boldsymbol{A}}: H^1(\Omega)\to H^{-1}_0(\Omega)$ denote the operator associated to the sesquilinear form in the weak formulation (4) such that $G_{v,\boldsymbol{A}} = L_{v,\boldsymbol{A}}^{-1}$. $L_{v,\boldsymbol{A}}$ is continuous and affine linear in the parameters. As $L^{^{\prime}}_{v,\boldsymbol{A}}(\partial v,\partial\boldsymbol{A}) = -2iM_{\partial\boldsymbol{A}}\cdot\nabla+M_{\partial v}$, the result follows from the continuity of operator inversion and the formula for its derivative, $\mathcal{G}^{^{\prime}}_{v,\boldsymbol{A}}(\partial v,\partial A) = -\mathcal{G}_{v,\boldsymbol{A}}L^{^{\prime}}_{v,\boldsymbol{A}}(\partial v,\partial\boldsymbol{A})\mathcal{G}_{v,\boldsymbol{A}}$. □

3. Diagonals of operator kernels

The present section serves as a preparation for computing adjoints of the Fréchet derivative of the forward operator defined by (10). A crucial step will be the characterization of adjoints of the mapping

(in a sense to be specified later).

In the discrete setting, MS corresponds to diagonal matrices $\text{diag}(S)\in \mathbb{C}^{d\times d}$ with diagonal S. The adjoint of the mapping

with respect to the Frobenius norm is given by

where $\text{Diag}(A)\in \mathbb{C}^d$ denotes the diagonal of the matrix $A\in \mathbb{C}^{d\times d}$.

We wish to generalize this to an infinite dimensional setting, with the Frobenius norm replaced by the Hilbert–Schmidt norm. Recall that any operator $\mathcal{A}\in \text{HS}\left(L^2(\Omega)\right)$ has a Schwartz kernel $A\in L^2(\Omega\times \Omega)$ such that $(\mathcal{A}\varphi)(\boldsymbol{x}) = \int_{\Omega}A(\boldsymbol{x},\boldsymbol{y})\varphi(\boldsymbol{y})\,\mathrm{d}y$ and $\|\mathcal{A}\|_{\text{HS}} = \|A\|_{L^2}$. It is tempting to define $(\text{Diag}\mathcal{A})(\boldsymbol{x}): = A(\boldsymbol{x},\boldsymbol{x})$. However, as A is only a L2-function and the diagonal $\{(\boldsymbol{x},\boldsymbol{x}):\boldsymbol{x}\in\Omega\}\subset \Omega\times \Omega$ has measure zero, the restriction of A to the diagonal is not well-defined.

To address this problem, we first recall that for Hilbert spaces $\mathbb{X}$, $\mathbb{Y}$ and $p\in [1,\infty)$ the p-Schatten class $S_p\left(\mathbb{X}, \mathbb{Y}\right)$ consists of all compact operator $\mathcal{A}\in L\left(\mathbb{X},\mathbb{Y}\right)$ for which the singular values $\sigma_j(A)$ (counted with multiplicity) form a $\ell^p$ sequence. $S_p\left(\mathbb{X}, \mathbb{Y}\right)$ is a Banach space equipped with the norm $\|\mathcal{A}\|_{S_p}: = (\sum_j \sigma_j(\mathcal{A})^p)^{1/p}$. $S_2(\mathbb{X},\mathbb{Y})$ coincides with $\text{HS}\left(\mathbb{X},\mathbb{Y}\right)$. We write $S_p\left(\mathbb{X}\right): = S_p\left(\mathbb{X},\mathbb{X}\right)$. The elements of $S_1(\mathbb{X})$ are called trace class operators. For such operators, the trace $\mathrm{tr}(\mathcal{A}): = \sum_k \langle \mathcal{A} e_j,e_j\rangle$ is well-defined for any orthonormal basis $\{e_k\}$ of $\mathbb{X}$, and $|\mathrm{tr}(\mathcal{A})|\unicode{x2A7D} \|\mathcal{A}\|_{S_1}$.

Let us first recall Mercer's theorem: It states that for a positive definite operator $\mathcal{A}$ with continuous kernel A, we have

and $A(\boldsymbol{x},\boldsymbol{x})\unicode{x2A7E} 0$ for all x . Since not all (positive semidefinite) Hilbert–Schmidt operators are trace class, we cannot expect that $\boldsymbol{x}\mapsto A(\boldsymbol{x},\boldsymbol{x})$ belongs to $L^1(\Omega)$ for general Hilbert–Schmidt operators. However, with the help of Mercer's theorem, we can show the following result.

Proposition 3. Let $\Omega\subset\mathbb{R}^d$ be open and non-empty. Then there exists a unique bounded linear operator

such that

for all operators $\mathcal{A}\in S_1(L^2(\Omega))$ with continuous kernel A. Moreover,

Equation (12)

Equation (12) is shown in [36, theorem 3.5] where it is also shown that $\text{Diag}(\mathcal{A})$ can be constructed by local averaging, but the first part is not explicitly stated. We sketch an alternative, more elementary proof:

Proof of proposition 3. If $\mathcal{A}$ is positive semidefinite, it may be factorized as $\mathcal{A} = \mathcal{B}^*\mathcal{B}$ with $\mathcal{B}\in \text{HS}\left(L^2(\Omega)\right)$ and $\|\mathcal{A}\|_{S_1} = \|\mathcal{B}\|_{S_2}^2$, e.g. by choosing $\mathcal{B} = \mathcal{A}^{1/2}$. By density of $C(\overline{\Omega}\times \overline{\Omega})$ in $L^2(\Omega\times \Omega)$, there exists a sequence $(B_n)$ converging to B in $L^2(\Omega\times\Omega)$. For the corresponding operators $\mathcal{B}_n$ it follows that $\lim_{n\to\infty}\|\mathcal{B}_n-\mathcal{B}\|_{\mathrm{HS}} = 0$ and $\lim_{n\to\infty}\|\mathcal{A}_n-\mathcal{A}\|_{S_1} = 0$ for $\mathcal{A}_n: = \mathcal{B}_n^*\mathcal{B}_n$ (see proposition 4, part ii below). Thus, we have constructed a sequence of positive semidefinite operators with continuous kernels converging to $\mathcal{A}$ in $S_1\left(L^2(\Omega)\right)$, and the statement follows from the classical Mercer theorem.

We decompose a general $\mathcal{A} \in S_1\left(L^2(\Omega)\right)$ as linear combination of trace class operators: We start with $\mathcal{A} = \text{Re}(\mathcal{A})+i \text{Im}(\mathcal{A})$ where $\text{Re}(\mathcal{A}): = \frac{1}{2}(\mathcal{A}+\mathcal{A}^*)$ and $\text{Im}(\mathcal{A}): = \frac{1}{2i}(\mathcal{A}-\mathcal{A}^*)$. There exists an expansion $\text{Re}(\mathcal{A}) = \sum_{k = 1}^{\infty} \lambda_k \psi_k \otimes \psi_k$. We define $P_1: = \sum_{k = 1}^{\infty} \text{max}(\lambda_k, 0) \psi_k \otimes \psi_k, P_2: = \sum_{k = 1}^{\infty} \text{max}(-\lambda_k, 0) \psi_k \otimes \psi_k$ such that $\text{Re}(\mathcal{A}) = P_1-P_2$ with positive semidefinite $P_1, P_2 \in S_1\left(L^2(\Omega_0)\right)$. Therefore, a general $\mathcal{A}\in S_1\left(L^2(\Omega)\right)$ can be written as a linear combination of positive semi-definite trace class operators: $\mathcal{A} = \mathcal{P}_1- \mathcal{P}_2 + i\mathcal{P}_3 - i\mathcal{P}_4$ where $\Vert P_1 \Vert_{S_1},\Vert P_2 \Vert_{S_1} \unicode{x2A7D} \Vert \text{Re} (\mathcal{A}) \Vert_{S_1}, \Vert P_3 \Vert_{S_1},\Vert P_4 \Vert_{S_1} \unicode{x2A7D} \Vert \text{Im} (\mathcal{A}) \Vert_{S_1}$. By the Courant-Fischer characterization $\sigma_n(\mathcal{A}) = \text{inf}\{\|\mathcal{A}-\mathcal{F}\|:\text{rank}(\mathcal{F})\unicode{x2A7D} n\}$, we get $\sigma_{2n}(\text{Re}(\mathcal{A})), \sigma_{2n}(\text{Im}(\mathcal{A}))\unicode{x2A7D} \sigma_n(\mathcal{A})$ and hence $\|\text{Re}(\mathcal{A})\|_{S_1}, \|\text{Im}(\mathcal{A})\|_{S_1}\unicode{x2A7D} 2\|\mathcal{A}\|_{S_1}$. It follows that $\|\mathcal{P}_j\|_{S_1}\unicode{x2A7D} 2\|\mathcal{A}\|_{S_1}$. Now we can apply the first proven special case to all $\mathcal{P}_j$ to obtain the result. □

To speak of an adjoint of the operator $\mathcal{M}:S \mapsto M_S$, we have to treat MS in some space with a dual pairing. We will use Hilbert–Schmidt spaces between suitable Sobolev spaces. (Recall that $M_S:L^2(\Omega)\to L^2(\Omega)$ is not compact in general.)

Note that a Gelfand triple $\mathbb{V}^{^{\prime}}\hookrightarrow \mathbb{H}\hookrightarrow \mathbb{V}$ of Hilbert spaces induces Gelfand triple

of Hilbert-Schmidt spaces with dual pairing, given by $\langle A,B\rangle_{\text{HS}} : = \mathrm{tr}(B^*A)$ for $A\in \text{HS}\left(\mathbb{V},\mathbb{V}^{^{\prime}}\right)$ and $B\in\text{HS}\left(\mathbb{V}^{^{\prime}},\mathbb{V}\right)$.

We give some preliminary results on p-Schatten class embeddings.

Proposition 4. Let $\Omega \subset \mathbb{R}^d$ be a bounded Lipschitz domain and let $\mathbb{X}, \mathbb{Y}, \mathbb{Z}$ be Hilbert spaces. Then the following holds true:

  • (i)  
    The Sobolev embedding: $j: H\,^m(\Omega) \hookrightarrow H^l(\Omega)$ is an element in the Schatten class $S_p\left(H^m(\Omega), H^l(\Omega)\right)$ if and only if $p \gt \frac{d}{m-l}$.
  • (ii)  
    Let $p, q, r\gt0$ satisfy $\frac{1}{p}+\frac{1}{q} = \frac{1}{r}$ and let $A \in S_p\left(\mathbb{X}, \mathbb{Y}\right), B \in S_q\left(\mathbb{X}, \mathbb{Y}\right)$. Then, $BA \in S_r\left(\mathbb{X}, \mathbb{Z}\right)$ and we have the bound:
  • (iii)  
    Let $A \in S_p\left(\mathbb{X}, \mathbb{Y}\right), B \in L\left(\mathbb{Y}, \mathbb{Z}\right), C \in L\left(\mathbb{Z}, \mathbb{X}\right)$. Then, $BA \in S_p\left(\mathbb{X}, \mathbb{Z}\right), AC \in S_p\left(\mathbb{Z}, \mathbb{Y}\right)$ and we have the bounds:
  • (iv)  
    Let $p, q\gt1$ with $\frac{1}{p}+\frac{1}{q} = 1$. Then, $S_p\left(\mathbb{X}, \mathbb{Y}\right)^{^{\prime}} = S_q\left(\mathbb{Y}^{^{\prime}}, \mathbb{X}^{^{\prime}}\right)$ with the dual pairing $\langle A, B \rangle = \mathrm{tr}(B^* A)$ for $A \in S_p\left(\mathbb{X}, \mathbb{Y}\right)$ and $B \in S_q\left(\mathbb{X}^{^{\prime}}, \mathbb{Y}^{^{\prime}}\right)$.

Proof. Part (i) follows from theorem 1 of [37]. Part (ii) and (iii) follow from lemma 16.7 of [38].

Let $A \in S_p\left(\mathbb{X}, \mathbb{Y}\right)$ and $B \in S_q\left(\mathbb{X}^{^{\prime}}, \mathbb{Y}^{^{\prime}}\right)$. By part (ii) and the boundedness of the trace in S1 [38, lemma 16.23], we get: $|\langle A, B \rangle| = |\mathrm{tr}(B^* A)| \unicode{x2A7D} \Vert B^* A \Vert_{S_1\left(\mathbb{Y}\right)} \unicode{x2A7D} \Vert B^* \Vert_{S_q\left(\mathbb{Y}, \mathbb{X} \right)} \Vert A \Vert_{S_p\left(\mathbb{X}, \mathbb{Y}\right)} = \Vert B \Vert_{S_q\left(\mathbb{X}^{^{\prime}}, \mathbb{Y}^{^{\prime}} \right)} \Vert A \Vert_{S_p\left(\mathbb{X}, \mathbb{Y}\right)}$. Hence, $S_q\left(\mathbb{Y}^{^{\prime}}, \mathbb{X}^{^{\prime}}\right) \subseteq S_p\left(\mathbb{X}, \mathbb{Y}\right)^{^{\prime}}$ and $S_p\left(\mathbb{X}, \mathbb{Y}\right) \subseteq S_q\left(\mathbb{Y}^{^{\prime}}, \mathbb{X}^{^{\prime}}\right)^{^{\prime}}$. By the Hahn-Banach theorem, we have the sequence

$S_p\left(\mathbb{X}, \mathbb{Y}\right)$ is a uniformly convex Banach space [39, chapter 5] and therefore reflexive by Milman–Pettis theorem [40]. Hence, $S_p\left(\mathbb{X}, \mathbb{Y}\right)^{{\prime} ^{\prime}} = S_p\left(\mathbb{X}, \mathbb{Y}\right)$ and the assertion follows. □

Using this proposition, we can prove that multiplication operators are Hilbert–Schmidt in suitable Sobolev spaces:

Lemma 5. Let $\Omega\subset\mathbb{R}^d$ be a bounded Lipschitz domain, $S\in L^{\infty}(\Omega)$, and $s\gt d/4$, $s-1/2\notin \mathbb{N}_0$. (In particular, for $d\in\{2,3\}$ we may choose s = 1.) Then $M_S\in \text{HS}\left(H^s(\Omega),H^{-s}_0(\Omega)\right)$, and the following mapping is continuous

Equation (13)

Proof. The condition $s-1/2\notin \mathbb{N}_0$ ensures that $H^s(\Omega)^{^{\prime}} = H^{-s}_0(\Omega)$ (see, e.g. [41, chapter 4]). Let $\tilde{M_S}: L^2(\Omega) \to L^2(\Omega), \tilde{M_S} \psi: = S \psi$. Then, we consider MS in the function spaces:

By proposition 4, part i, the embedding j is an element of the Schatten class $S_p\left(H^s(\Omega), L^2(\Omega)\right)$ if $p\gt d/s$. Consequently, $j^*\in S_p\left(L^2(\Omega), H_0^{-s}(\Omega)\right)$. It follows from proposition 4, parts ii and iii that $M_S\in S_r(H^s(\Omega),H^{-s}_0(\Omega))$ if $\frac{1}{r} = \frac{1}{p}+\frac{1}{p}\lt\frac{2s}{d}$. As $\frac{2s}{d}\gt1/2$, r = 2 is admissible. The continuity of $\mathcal{M}$ follows from the continuity of the mapping: $S \rightarrow \tilde{M_S}$. □

Lemma 6. Under the assumptions of lemma 5, the adjoint operator $\mathcal{M}^*:\text{HS}\left(H^{-s}_0(\Omega),H^s(\Omega)\right) \to L^{\infty}(\Omega)^{^{\prime}}$ takes values in the pre-dual $L^1(\Omega)\subset L^{\infty}(\Omega)^{^{\prime}}$ of $L^{\infty}(\Omega)$ and

Equation (14)

Proof. Let $f \in L^{\infty}(\Omega)$, $j: H^s(\Omega) \hookrightarrow L^2(\Omega)$, and $\mathcal{A}\in \text{HS}\left(H^{-s}_0(\Omega),H^s(\Omega)\right)$. It follows from proposition 4 that $\tilde{\mathcal{A}}: = j\mathcal{A}j^*\in S_1\left(L^2(\Omega)\right)$. We identify $\tilde{\mathcal{A}}$ and $\mathcal{A}$, i.e. a more precise formulation of (14) is $\mathcal{M}^*(\mathcal{A}) = \text{Diag}(\tilde{\mathcal{A}})$ for all $\mathcal{A}$. By the density result established at the beginning of the proof of proposition 3, it suffices to establish the relation for operators $\mathcal{A}$ with continuous kernel A. Choosing an orthonormal basis $\{e_k:k\in \mathbb{N}\}$ of $L^2(\Omega)$, we obtain

Since Ω is bounded and hence $\overline{A(\boldsymbol{y},\cdot)}\in C(\overline{\Omega}) \subset L^2(\Omega)$, the completeness of $\{e_k\}$ implies that $\sum_{k = 1}^{\infty}e_k(\boldsymbol{y}) \int_{\Omega} \overline{A(\boldsymbol{y},\boldsymbol{x})e_k(\boldsymbol{x})}\,\mathrm{d}\boldsymbol{x} = \overline{A(\boldsymbol{y},\boldsymbol{y})}$. This shows that $\langle \mathcal{M}(f),\mathcal{A}\rangle = \langle\, f,\text{Diag}(\tilde{\mathcal{A}})\rangle$, completing the proof. □

In lemma 2 we consider $M_{\partial v}$ and $M_{\partial \boldsymbol{A}}$ in the following function spaces:

The multiplication operator $M_{\partial v}$ has been discussed in lemmas 5 and 6. Although for $M_{\partial \boldsymbol{A}}$ we have less regularity, the following analogs still hold true:

Lemma 7. Let $d \in \{2, 3\}$ and $\partial \boldsymbol{A} \in W\,^{\infty}(\text{div}, \Omega_0)$. Then,

  • (i)  
    $M_{\partial \boldsymbol{A}} \in S_4\left(L^{2}(\Omega_0)^d, H_0^{-1}(\Omega_0)\right)$ and the following map is continuous:
  • (ii)  
    $\tilde{\mathcal{M}}^*:S_{4/3} (L^{2}(\Omega_0)^d, H^1(\Omega_0)) = S_4\left(L^{2}(\Omega_0)^d, H_0^{-1}(\Omega_0)\right)^{^{\prime}}\to L^{\infty}(\Omega_0)^{^{\prime}} $ takes values in the pre-dual $L^1(\Omega_0) \subset L^{\infty}(\Omega_0)^{^{\prime}}$. For an operator $\mathcal{B} = (\mathcal{B}_1,\dots,\mathcal{B}_d)\in S_{4/3} (L^{2}(\Omega_0)^d, H_1(\Omega_0))$ with continuous kernel $\boldsymbol{B} = (B_1,\dots, B_d):\Omega\times\Omega\to \mathbb{C}^d$ we have

Proof. In this proof, j will denote the embedding $H^1(\Omega_0) \hookrightarrow L^2(\Omega_0)$ and recall from proposition 4, part i that $j\in S_4\left(H^1(\Omega_0), L^2(\Omega_0)\right)$ and hence $j^*\in S_4\left(L^2(\Omega_0),H^{-1}_0(\Omega_0)\right)$.

Part (i): We consider $M_{\partial \boldsymbol{A}} = j^*\circ \tilde{M}_{\partial \boldsymbol{A}}$ in the function spaces

where $\tilde{M}_{\partial \boldsymbol{A}} \boldsymbol{\psi} = \partial \boldsymbol{A} \cdot \boldsymbol{\psi}$ for $\boldsymbol{\psi} \in L^{2}(\Omega_0)^d$. The claim follows by proposition 4.

Part (ii): Let $\tilde{\mathcal{B}}: = j\circ \mathcal{B}: L^{2}(\Omega_0)^d \to L^2(\Omega_0)$. Part (ii) of this proposition yields $\tilde{\mathcal{B}} \in S_1\left(L^{2}(\Omega_0)^d, L^2(\Omega_0)\right)$. As in lemma 6, the assertion now follows. □

4. Fréchet derivative and adjoint of the forward operator

A characterization of the adjoint of $S\mapsto\mathrm{Tr}_{\Gamma} \mathcal{G} M_S \mathcal{G}^*\mathrm{Tr}_{\Gamma}^*$ was given in [10]. There, a characterization of the adjoint of $S \mapsto M_S$ in a functional analytic framework was circumvented, resulting in a rather technical formulation of the result.

With the results of the previous section, the proof of the following central results is now mostly straightforward.

Theorem 8. Assumptions 1 and 2 hold true for some wave number $k\in\mathbb{C}$ and $d\in\{2,3\}$. Let $\mathbb{X}: = \mathbb{X}_{\mathcal{G}} \times L^{\infty}(\Omega_0,\mathbb{R})$ with $\mathbb{X}_{\mathcal{G}}$ defined in (11) and let $\mathfrak{B}: = \mathfrak{B}_k\times L^{\infty}(\Omega_0;[0,\infty))$. Then the following holds true:

  • (i)  
    The forward operator (10) is well-defined and continuous as a mapping
    and it is Fréchet differentiable on the interior of $\mathfrak{B}$. The derivative $\mathcal{C}^{^{\prime}}[v,\boldsymbol{A},S]:\mathbb{X}\to \text{HS}\left(L^2(\Gamma)\right)$ is given by
    where $\text{Re}(\mathcal{A}): = \frac{1}{2}(\mathcal{A}+\mathcal{A}^*)$.
  • (ii)  
    The adjoint $\mathcal{C}^{^{\prime}}[v,\boldsymbol{A},S]^*:\text{HS}\left(L^2(\Gamma)\right) \to \mathbb{X}^{^{\prime}}$ of $\mathcal{C}^{^{\prime}}[v,\boldsymbol{A},S]$ takes values in the pre-dual $L^{1}(\Omega_0;\mathbb{C})\times L^{1}(\Omega_0;\mathbb{R}^d) \times L^{1}(\Omega_0;\mathbb{R})\subset \mathbb{X}^{^{\prime}}$ of $\mathbb{X}$ and is given by

Proof.  Part (i): let $\mathcal{C}_1[v,\boldsymbol{A},S]: = \mathrm{Tr}_{\Gamma} \mathcal{G}_{v,\boldsymbol{A}} M_{S} \mathcal{G}_{v,\boldsymbol{A}}^*\mathrm{Tr}_{\Gamma}^*.$ and $\mathcal{C}_2[v,\boldsymbol{A},S]: = \mathrm{Tr}_{\Gamma} \mathcal{G}_{v,\boldsymbol{A}} \mathrm{Tr}^*_{\partial \Omega} \mathbf{Cov}[s_{\partial\Omega}]$ $\mathrm{Tr}_{\partial \Omega} \mathcal{G}_{v,\boldsymbol{A}}^*\mathrm{Tr}_{\Gamma}^*.$ We consider the factors defining $\mathcal{C}_1[v,\boldsymbol{A},S]$ in the following function spaces:

Here $M_S: H^1(\Omega) \rightarrow H_0^{-1}(\Omega)$ is Hilbert–Schmidt by lemma 5, and all other operators are bounded. By part (iii) of proposition 4, it follows that $\mathcal{C}_1[v,\boldsymbol{A},S]$ is Hilbert–Schmidt.

Similarly, we consider the factors defining $\mathcal{C}_2[v,\boldsymbol{A},S]$ in the following function spaces:

By part (i) of proposition 4, every embedding is an element of S8. By parts ii and iii of proposition 4, it follows that $\mathcal{C}_2[v,\boldsymbol{A},S]$ is Hilbert–Schmidt. Hence, $\mathcal{C}[v,\boldsymbol{A},S]$ is Hilbert–Schmidt. Together with lemma 2, it follows that $\mathcal{C}$ is continuous. Fréchet differentiability and the formula for the derivative follow from lemma 2 and the chain rule.

Part (ii): if $\mathbb{X}_1,\dots,\mathbb{X}_4$ are Hilbert spaces, $\mathcal{A}\in L\left(\mathbb{X}_1,\mathbb{X}_2\right)$ and $\mathcal{B}\in L\left(\mathbb{X}_3,\mathbb{X}_4\right)$, then a straightforward computation shows that the adjoint of the linear mapping $\text{HS}\left(\mathbb{X}_2,\mathbb{X}_3\right)\to \text{HS}\left(\mathbb{X}_1,\mathbb{X}_4\right)$, $\mathcal{T}\mapsto \mathcal{BTA}$ is given by the mapping $\text{HS}\left(\mathbb{X}_1,\mathbb{X}_4\right)\to \text{HS}\left(\mathbb{X}_2,\mathbb{X}_3\right)$, $\mathcal{S}\mapsto \mathcal{B}^* \mathcal{S}\mathcal{A}^*$ and that $\text{Re} \in L\left(\text{HS}\left(\mathbb{X}_j\right)\right)$ is a self-adjoint projection operator. Note that $\mathbf{Cov}[s_{\partial\Omega}] \in S_4\left(H^{1/2}(\partial \Omega), H^{-1/2}(\partial \Omega)\right)$ from proposition 4. Furthermore, by proposition 4 and lemma 5, $\mathcal{E} \in HS\left(H_0^{-1}(\Omega_0), H^1(\Omega_0)\right), M_S \in HS\left(H^1(\Omega_0), H_0^{-1}(\Omega_0)\right)$. Hence, $\mathcal{E} (M_S+\mathrm{Tr}_{\partial \Omega}^* \mathbf{Cov}[s_{\partial\Omega}] \mathrm{Tr}_{\partial \Omega}) \in S_{4/3}\left(H_0^{-1}(\Omega_0)\right)$. Now, the assertion follows from lemma 6 and part (ii) of lemma 7. □

Introducing so-called forward propagators Hα and backward propagators Hβ by

Equation (15)

the Fréchet derivative and its adjoint in theorem 8 can be reformulated as

Equation (16)

These propagators $\mathcal{H}_{\alpha}, \mathcal{H}_{\beta}$ have a physical interpretation in helioseismology that will be discussed in section 7.1.

5. On the algorithmic realization of iterative regularization methods

For notational simplicity, we will use $\boldsymbol{q} = (v, \boldsymbol{A}, S)$ throughout this section. Then we formally have to solve the operator equation

5.1. Avoiding the computation of Corr

Computing Corr from the primary data $\mathrm{Tr}_{\Gamma} \psi_n$ in a preprocessing step drastically increases the dimensionality of the data. In helioseismology, the data set with the best resolution consists of Doppler images of size $4096 \times 4096$. This leads to approximately 1014 independent two-point correlations, at each frequency. Hence, the surface cross-correlation is a noisy five-dimensional data set of immense size, which is infeasible to use in inversions directly. Moreover, these two-point correlations are extremely noisy. In traditional approaches such as time-distance helioseismology, one usually reduces the cross-correlation in an a priori step to a smaller number of physically interpretable quantities with an acceptable signal-to-noise ratio. However, this leads to a significant loss of information, see [42].

To use the full information content of Corr without the need to ever compute these correlations, we exploit the fact that the adjoint of the Fréchet derivative of the forward operator is of the form $\mathcal{C}^{^{\prime}}[\boldsymbol{q}]^*\mathcal{D} = \text{Diag}({\mathcal{H}_{\alpha}^q}^* \text{Re}(\mathcal{D})\mathcal{H}_{\beta}^q)$, see equation (16). As $\text{Re}(\text{Corr}) = \text{Corr}$, pulling the sum outside yields

Equation (17)

We will show in section 7.1 that traditional helioseismic holography can be interpreted as the application of $\mathcal{C}^{^{\prime}}[\boldsymbol{q}]^*$ to Corr. Since $ \frac{1}{N}\sum_{n = 1}^N\text{Diag}(\dots)$ can be interpreted as computing diagonal correlations of the back-propagated signals, equation (17) may be paraphrased as first back-propagating signals and then correlating them, rather than vice versa.

5.2. Iterative regularization methods without image space vectors

For ill-posed inverse problems, the adjoint of the linearized forward operator is typically a bad approximation of the inverse. To obtain a quantitative imaging method, we can improve the approximation in (17) by implementing an iterative regularization method. We will focus on the iteratively regularized Gauss–Newton method (IRGNM) with inner Conjugate Gradient iterations, but the discussion below also applies to other commonly used methods such as Landweber iteration or the Newton-CG method. IRGNM is defined by

Equation (18)

Here q 0 defines the initial guess. Since the image space $\mathbb{Y}$ of the forward operator is high dimensional, direct evaluations of $\mathcal{C}[\boldsymbol{q}]$ and $\mathcal{C}^{^{\prime}}[\boldsymbol{q}]$ must be avoided. However, IRGNM with inner CG iterations as well as other iterative regularization methods only require the operations $\boldsymbol{q}\mapsto \mathcal{C}^{^{\prime}}[\boldsymbol{q}]^* \mathcal{C}[\boldsymbol{q}] $ and

Equation (19)

We will refer to the integral kernels K of $\mathcal{C}^{^{\prime}} [\boldsymbol{q}]^*\ \mathcal{C}^{^{\prime}}[\boldsymbol{q}]$ as sensitivity kernels for the normal equation. In section 7.2 they will be described for various settings of interest in terms of forward–backward operators

Equation (20)

In our numerical tests in helioseismology reported in section 8, the bottleneck concerning computation time is the evaluation of the Green function involved in the definitions of the propagators $\mathcal{H}_{\alpha}$ and $\mathcal{H}_{\beta}$. To accelerate these computations, we use separable reference Green's functions $G_0: = G_{\boldsymbol{q}_0}$ discussed in appendices A and B and the corresponding integral operator $\mathcal{G}_0$ as well as the algebraic identity

Equation (21a)

This identity, with $L_{\boldsymbol{q}}: = \mathcal{G}_{\boldsymbol{q}}^{-1}$ and $L_0: = \mathcal{G}_0^{-1}$, is equivalent to $ \mathcal{G}_0-\mathcal{G}_{\boldsymbol{q}} = \mathcal{G}_0 (L_{\boldsymbol{q}}-L_0) \mathcal{G}_{\boldsymbol{q}}.$ The operators $L_{\boldsymbol{q}}, L_0:H^1(\Omega)\to H^{-1}_0(\Omega)$ represent the corresponding sesquilinear forms in equation (4) in the proof of proposition 1 and involve both the differential operator and the boundary condition. As both the boundary operator B and the leading order differential operator are independent of the parameters q , they cancel out, and

Equation (21b)

This approach is efficient since the operator $\mathcal{G}_0$ can be solved with one-dimensional code and the operator the calculation of $\left[ \text{Id}+\mathcal{G}_0 (L_{\boldsymbol{q}}-L_0) \right]$ can typically be restricted to a supported area of $L_{\boldsymbol{q}}-L_0$. Usually, we compute a pivoted LU-decomposition of $\text{Id}+\mathcal{G}_0 (L_{\boldsymbol{q}}-L_0) $ and solve for a list of input sources $G(\cdot, \boldsymbol{x}), \boldsymbol{x} \in \Gamma$. Furthermore, we can use low-rank approximations for $\mathcal{G}_0$ based on the expansions in appendix A for solar-like medium and appendix B for uniform medium.

5.3. Noise and likelihood modeling

In this section, we study the noise model in order to step forward to the full likelihood modeling and to create realistic noise. The main noise term is realization noise. Recall that the wavefield ψ is a realization of a Gaussian random process with covariance operator $\mathcal{C}[\boldsymbol{q}]$.

The covariance matrix of Gaussian correlation data can be computed by Isserlis theorem [43] and is given by fourth-order correlations (e.g. [29, 44, 45]):

The third term vanishes as $\mathbb{E} \left[ \psi(\boldsymbol{r}_1) \psi(\boldsymbol{r}_2) \right] = \int_{\Omega} \int_{\Omega} G(\pmb{r}_1, \pmb{z}_1) G(\pmb{r}_2, \pmb{z}_2)\mathbb{E} \left[ s(\boldsymbol{z}_1) s(\boldsymbol{z}_2) \right] \,\mathrm{d} \boldsymbol{z}_1 \mathrm{d} \boldsymbol{z}_2$ and $\mathbb{E} \left[ s(\boldsymbol{z}_1) s(\boldsymbol{z}_2) \right] = 0$. Hence, we observe

Therefore, we can define the data covariance operator by

Equation (22)

Hence, if we choose a quadratic log-likelihood approximation, we are formally lead to replace $\|\cdot\|_{\mathbb{Y}}^2$ in (18) by $\|\mathcal{C}_4[\boldsymbol{q}_n]^{-1/2}\cdot\|_{\mathbb{Y}}^2$. However, with this replacement, the iteration (18) would in general not be well defined and numerically unstable since $\mathcal{C}_4[\boldsymbol{q}_n]$ is not boundedly invertible. Note that the operator $\mathcal{C}_4[\boldsymbol{q}_n]$ is not boundedly invertible. Even if $\mathcal{C}_4[\boldsymbol{q}_n]$ is injective, the inverse is given by $\mathcal{C}_4[\boldsymbol{q}_n]^{-1} = \mathcal{C}[\boldsymbol{q}_n]^{-1} \otimes \mathcal{C}[\boldsymbol{q}_n]^{-1}$, and this cannot be bounded in infinite dimensions since $\mathcal{C}[\boldsymbol{q}_n]$ is Hilbert–Schmidt. Hence, the mentioned replacement is not well defined and numerically unstable. Therefore, we choose a bounded, self-adjoint, positive semi-definite approximation

Equation (23)

e.g. by a truncated eigenvalue decomposition or by Lavrentiev regularization $\Gamma_n = (\beta \text{Id}+\mathcal{C}[\boldsymbol{q}_n])^{-1}$. Then $\mathcal{C}_4[\boldsymbol{q}_n]^{-1}\approx \Gamma_n\otimes \Gamma_n$ and $\mathcal{C}_4[\boldsymbol{q}_n]^{-1/2}\approx \Gamma_n^{1/2}\otimes \Gamma_n^{1/2}$. The parameter β may model the presence of measurement and modeling errors in addition to the realization noise modeled by $\mathcal{C}_4[\boldsymbol{q}_n]^{-1/2}$. Then the iteration (18) is replaced by

Note that for numerical efficiency it is very fortunate that the covariance operator $\mathcal{C}_4$ of the correlation data has the separable structure (22). Further note from the second line of the last equation that we only need $\Gamma_n$, not $\Gamma_n^{1/2}$.

6. Forward problems in local helioseismology

In this section, we discuss applications of the model problem considered in the previous sections to helioseismology.

6.1. Acoustic oscillations in the Sun

Ω0 will denote the interior of the Sun (typically $\Omega_0 = B(0, R_{\odot})$ with $R_{\odot} = 696\,$Mm), whereas Ω may also include parts of the solar atmosphere. The measurement region we consider is an open subset Γ of the visible surface $\partial \Omega_0$, accounting for the fact that in typical helioseismic applications, measurements are only available on the near side of the solar surface. Given that solar oscillations near the solar surface are primarily oriented in the radial direction [46], there is also a lack of Doppler information near the poles. This phenomenon results in leakage, causing challenges such as incomplete decoupling of normal modes of oscillation (e.g. [47, 48]). In the subsequent analysis, we will exclusively work in the frequency domain.

The propagation of acoustic waves in a heterogeneous medium like the Sun can be described by the differential equation

Equation (24)

where we have ignored gravitational effects and have assumed an adiabatic approximation [49]. The random source term F describes the stochastic excitation of waves by turbulent motions and ζ is the Lagrangian wave displacement vector. As usual, we denote with ρ the density, c the sound speed, γ the damping, and u the flow field. If we furthermore neglect second order terms in $\gamma, \boldsymbol{u}$, equation (24) can be converted into a Helmholtz-like equation ([29], inspired by [50])

Equation (25)

where $\psi = \rho^{1/2} c^2 \nabla \cdot \boldsymbol{\zeta}$ is the scaled wavefield and $s = \rho^{1/2} c^2 \nabla \cdot \boldsymbol{F}$ a stochastic source term. The potential V is defined by

Equation (26)

The frequency ωc is recognized as the acoustic cutoff frequency. This cutoff frequency arises due to the abrupt decline in density near the solar surface and results in the trapping of acoustic modes with frequencies below the acoustic cutoff frequency. Modes with frequencies surpassing the acoustic cutoff frequency can propagate through the solar atmosphere.

The conditions on $c, \rho, \gamma, \boldsymbol{u}$ are summarized in assumption 3.

Assumption 3. Suppose that for some $B_0\in L\left(H^{1/2}(\partial\Omega),H^{-1/2}(\partial\Omega)\right)$, $k\in\mathbb{C}$ and some set $\mathcal{A}_{c_{\text{min}}, \rho_{\text{min}}, k} \subset W^{1, \infty}(\Omega, \mathbb{R}) \times W^{2, \infty}(\Omega, \mathbb{R}) \times L^{\infty}(\Omega, [0, \infty)) \times W\,^{\infty}(\text{div},\Omega) \times L^{\infty}(\Omega, [0, \infty))$ of admissible parameters $c, \rho, \gamma, \boldsymbol{u}, S$ containing some reference parameters $c_{\mathrm{ref}}, \rho_{\mathrm{ref}}, \gamma_{\mathrm{ref}}, \boldsymbol{u}_{\mathrm{ref}}, S_{\mathrm{ref}}$ such that the following holds true:

Equation (27a)

Equation (27b)

Equation (27c)

Equation (27d)

Equation (27e)

For the flow field, we incorporate a mass conservation constraint (equation (27d )). Additionally, we assume that the flow field does not intersect the computational boundary (equation (27c )). Various boundary conditions, in particular radiation boundary conditions and learned infinite elements, and their efficacy are extensively discussed in [5153]. It is notable that the most popular choices of boundary conditions in helioseismology, such as radiation boundary conditions, Sommerfeld boundary conditions, or free boundary conditions, are incorporated in assumption (27e ).

We define the operator $\mathcal{P}$ that transforms the parameters in the wave equation (25) into the form of equation (2) by

Equation (28)

where

Equation (29a)

Equation (29b)

In figure 1, we present the acoustic sound speed, the density, and the scalar potential v as obtained from the Solar Model S and smoothly extended to the atmosphere. Modeling the forward problem for the Sun remains challenging due to the substantial density gradients near the surface, leading to strong variations of the scalar potential v near the solar surface.

Lemma 9. The operator $\mathcal{P}$, defined in equation (28), is well-defined in the sense that for all parameters $(c, \rho, \gamma, \boldsymbol{u}, S) \in \mathcal{A}_{c_{\mathrm{min}}, \rho_{\mathrm{min}}, k}$ we have $\mathcal{P}(c, \rho, \gamma, \boldsymbol{u}, S) \in \mathcal{B}_k$, and this map is continuous.

Proof. By equations (29a ) and (29b ), we have $v \in L^{\infty} (\Omega, \mathbb{C})$, $\boldsymbol{A} \in W\,^{\infty}(\text{div}, \Omega_0)$, and the mapping is continuous. The conditions (3c )–(3e ) are obviously satisfied, and (3b ) is satisfied by assumption (27d ). For condition (3a ), we note that $\nabla \cdot \boldsymbol{A} = \omega \nabla \cdot \frac{\rho \boldsymbol{u}}{(\rho^{1/2} c)^2} = \frac{2 \omega}{\rho^{1/2} c} \rho \boldsymbol{u} \cdot \nabla \frac{1}{\rho^{1/2} c}$, where we have used (27d ). Therefore,

 □

Figure 1. Refer to the following caption and surrounding text.

Figure 1. The left panel shows the sound speed and density obtained from the Solar Model S [54] in the solar core, the convection zone (CZ), and the radiation zone (RZ). The right panel shows the potential close to the surface for $\omega/2 \pi = 3\,$mHz.

Standard image High-resolution image

Because of assumption (27b ), $c, \rho, \nabla \rho, \gamma, \boldsymbol{u}, \nabla \boldsymbol{u}$ are fixed at $\partial \Omega_0$ and in the exterior. Therefore, the space of parameter perturbations is

where $W_0^{\infty}(\text{div}, \Omega_0) = \{\boldsymbol{u} \in W\,^{\infty}(\text{div}, \Omega_0): \boldsymbol{u} \cdot \boldsymbol{n} = 0\ \text{on } \partial \Omega\}$.

Lemma 10. For $c_{\mathrm{min}}, \rho_{\mathrm{min}}\gt0$ and $k \in \mathbb{C}$, the operator $\mathcal{P}$ is Fréchet differentiable in the interior of $\mathcal{A}_{c_{\mathrm{min}}, \rho_{\mathrm{min}}, k}$ with Fréchet derivative $\mathcal{P}^{^{\prime}}:\mathbb{X}_{\mathcal{P}}\to \mathbb{X}_{\mathcal{G}}$ given by

Equation (30)

For arguments $(\tilde{v}, \tilde{\boldsymbol{A}}, \tilde{S}) \in L^{1}(\Omega_0;\mathbb{C})\times L^{1}(\Omega_0;\mathbb{R}^d) \times L^{1}(\Omega_0;\mathbb{R})\subset \mathbb{X}^{^{\prime}}$, the values of the adjoint

belong to $W^{-2, 1}(\Omega_0, \mathbb{R}) \times W^{-1, 1}(\Omega_0, \mathbb{R}) \times L^{1}(\Omega_0, \mathbb{R}) \times L^1(\Omega_0, \mathbb{R}^d) \times L^{1}(\Omega_0, \mathbb{R}) \subset \mathbb{X}_{\mathcal{P}}^{^{\prime}}$.

Proof. We rephrase the potential v in the form:

It follows that

Equation (31)

where

Furthermore, we have

Equation (32)

The operator $\mathcal{P}$ is Fréchet differentiable with Fréchet derivative (30) since the terms $\partial_q v, \partial_q \boldsymbol{A}$ are well-defined for $q \in \{c, \rho, \gamma, \boldsymbol{u} \}$. The claim follows with the mapping properties of $(\partial_q v)^*, (\partial_q \boldsymbol{A})^*$. □

In analogy to equation (16), we can write the Fréchet derivative in the form:

Equation (33)

The operators $\mathcal{L}_q$ play the role of local correlation operators. The propagators and local correlation operators in the flow-free case can be read in table 1.

Table 1. Distributional kernel of back-propagator and local correlation operator for the different parameters. The functions $g^0_\rho, g^1_\rho, g^2_\rho$ are defined in equation (31). The coordinates are chosen such that $\boldsymbol{x} \in \Gamma$ and $\boldsymbol{y} \in \Omega$. Here, we assume that $\mathbf{Cov}[s_{\partial\Omega}] = M_{B}$ with $B\in L^{\infty}(\partial\Omega)$ and use the notation $S_{\overline{\Omega}}: = S + B \delta_{\partial \Omega}$.

Quantity q Propagator $\mathcal{H}_{\alpha_q}$ Propagator $\mathcal{H}_{\beta_q}$ Local correlation $\mathcal{L}_q^*$
Source strength S $G(\boldsymbol{x}, \boldsymbol{y})$ $G(\boldsymbol{x}, \boldsymbol{y})$ Diag
Sound speed c $G(\boldsymbol{x}, \boldsymbol{y})$ $\int_{\Omega} G(\boldsymbol{x}, \boldsymbol{z}) S_{\overline{\Omega}}(\boldsymbol{z})\overline{G(\boldsymbol{y}, \boldsymbol{z})}\,\mathrm{d} \boldsymbol{z}$ $-2 \frac{\omega^2+2i\omega \gamma}{c^3} \cdot \text{Diag}$
Density ρ $G(\boldsymbol{x}, \boldsymbol{y})$ $\int_{\Omega} G(\boldsymbol{x}, \boldsymbol{z}) S_{\overline{\Omega}}(\boldsymbol{z})\overline{G(\boldsymbol{y}, \boldsymbol{z})}\,\mathrm{d} \boldsymbol{z}$ $\left( g^0_\rho-\boldsymbol{g}_{\rho}^1 \nabla+g^2_{\rho} \Delta \right) \cdot \text{Diag}$
Wave damping γ $G(\boldsymbol{x}, \boldsymbol{y})$ $\int_{\Omega} G(\boldsymbol{x}, \boldsymbol{z}) S_{\overline{\Omega}}(\boldsymbol{z})\overline{G(\boldsymbol{y}, \boldsymbol{z})} \,\mathrm{d} \boldsymbol{z}$ $\frac{2 i \omega}{c^2} \cdot \text{Diag}$
Flow component Ai $G(\boldsymbol{x}, \boldsymbol{y})$ $\hat{\mathbf{e}}_i \cdot \nabla_{\boldsymbol{y}} \left( \frac{\int_{\Omega} G(\boldsymbol{x}, \boldsymbol{z}) S_{\overline{\Omega}}(\boldsymbol{z}) \overline{G(\boldsymbol{y}, \boldsymbol{z})} \,\mathrm{d} \boldsymbol{z}}{\rho^{1/2}(\boldsymbol{y}) c(\boldsymbol{y})}\right)$ $2 i \omega \frac{\rho^{1/2}}{c} \cdot \text{Diag}$

Note that despite the fact that the adjoint with respect to the standard L2 dual pairings takes values in negative Sobolev spaces, it is usually not necessary to deal with such functions (or distributions) numerically in iterative regularization methods. For instance, in Landweber iteration in Banach spaces, the application of the adjoint is followed by the application of a duality mapping which takes values in positive Banach spaces. For Hilbert space methods, one would choose a L2-based Sobolev space $W^{s,2}$ with sufficiently large s and compute the adjoint with respect to the $W^{s,2}$ inner product, which amounts to an evaluation of the adjoint of the embedding $W^{s,2}\hookrightarrow L^2$.

6.2. Source model in helioseismology

It remains to discuss the seismic source model in helioseismology. It has been shown in several settings that the cross-correlation is roughly linked to the imaginary component of the outgoing Green's function [30]. In helioseismology, this relation takes the form [49]

Equation (34)

where $\Pi(\omega)$ is the source power spectrum. This relation leads to a power spectrum in good agreement with the observations [49]. As outlined in [49], equation (34) holds true for an outgoing radiation condition and random sources that are appropriately excited across the volume in proportion to the damping rate

Equation (35)

Moreover, there are surface integrals that persist for frequencies above the acoustic cutoff frequency, and these are dependent on the chosen boundary condition.

The relationship between source power and damping rate emerges from the idea of equipartition among distinct acoustic modes [55]. This choice of covariance couples the source strength with wave attenuation and sound speed. Nevertheless, we consider the source strength as an additional individual parameter. This source model is included in the discussion of the previous sections. In helioseismology, the relation (34) is the standard choice to reduce the computational costs of the operator evaluation. Furthermore, it allows us to evaluate the back-propagator in table 1 efficiently.

7. Iterative helioseismic holography

In this section, we discuss the application of the approach outlined in section 5 to local helioseismology. We first show that it can be interpreted as an extension of conventional helioseismic holography. For this reason, we will refer to this approach as iterative helioseismic holography. We also discuss relations to other methods in local helioseismology.

In a second subsection, we will describe sensitivity kernels for the normal equation as introduced in (19) for the following three scenarios:

  • (i)  
    inversion for the source strength,
  • (ii)  
    inversion for scalar parameter $q \in \{\rho, c, \gamma\}$,
  • (iii)  
    inversion for mass-conserved flow field u .

7.1. Relations to conventional helioseismic holography and other methods

Conventional helioseismic holography is based on the Huygens principle in the sense that the observed wavefield is described as a superposition of seismic point sources on the wavefront. This principle allows holography to propagate the correlations of acoustic waves at the solar surface forward in time ('ingression' using $\mathcal{H}_{\beta}$) or backward in time ('egression' using $\mathcal{H}_{\alpha}$) to a pre-defined target location in the interior in order to image anomalies in the background medium (e.g. [56]). There exists a close connection to seismic migration in terrestrial seismology, which re-locates seismic events on the earth's surface in time and space, based on the wave equation (e.g. [57, 58]). Furthermore, similar back-propagators are used in conventional beamforming in aeroacoustics [10, 30].

The Lindsey-Braun holographic image (see [27]) is constructed by the wave propagators $\mathcal{H}_{\alpha} \in L\left(H_0^{-1}(\Omega), L^2(\Gamma_1)\right)$ and $\mathcal{H}_{\beta}\in L\left(H_0^{-1}(\Omega), L^2(\Gamma_2)\right)$ such that

where $\Gamma_1, \Gamma_2 \subset \Gamma$ are called pupils. In Lindsey–Braun holography the information is extracted from the so-called egression-ingression correlation for parameters $q \in \{c, \rho, \boldsymbol{u}, \gamma\}$ and the egression power for seismic sources

Equation (36)

Equation (37)

where ⊗ the standard tensor product, and $C_{v,\boldsymbol{A}, S}$ is from equation (7).

The comparison of equations (16) and (36) shows that the adjoint of the Fréchet derivative of the covariance operator is linked to traditional helioseismic holography. Denoting potential additional dependence of $I_{\alpha,\beta}$ on the unknown parameters q through $\mathcal{H}_{\alpha}$ and $\mathcal{H}_{\beta}$ by superscripts, in terms of conventional holography the Newton step (18) is a regularized solution to

where $\boldsymbol{K}\,_{\alpha, \beta}^{\boldsymbol{q}}$ are the sensitivity kernel of traditional holography, see (20). We will discuss the sensitivity kernels in more detail in section 7.2.

In traditional helioseismic holography, one has freedom in the choice of the pupils and back-propagators. For example, the pupils can be chosen such that the hologram intensity becomes sensitive to specific flow components [59]. As a further example, Porter–Bojarski holograms, introduced to the field of helioseismology in [60, 61], make use of the normal derivative at the surface in addition to the Dirichlet data. In contrast, the backward propagators in iterative helioseismic holography are determined by the wave equation, and the image is improved by iteration.

While many techniques in helioseismology including traditional helioseismic holography are limited to linear scenarios, iterative holography naturally allows to tackle nonlinear problems.

Among the commonly used imaging techniques in local helioseismology, holography is the only method that uses the complete cross-correlation data. As already discussed in section 5.1, these data are used only in an implicit manner without the (usually infeasible) requirement of computing or storing the cross-correlation data explicitly. Whereas traditional helioseismic holography only provides feature maps [16], iterative helioseismic holography additionally allows to retrieve quantitative information.

7.2. Kernels and resolution

In the following, we compute the sensitivity kernels for the normal equation as defined in equation (19) which are set up explicitly in our current implementation. It is important to note that sensitivity kernels are typically 3D × 3D operators and should be avoided in computations. Nevertheless, in the spherically symmetric case or two-dimensional medium, computation becomes feasible. Therefore, for the purpose of this paper, we can compute the sensitivity kernels in each iteration and do not study more sophisticated approaches. These kernels are infinitely smooth for smooth coefficients, but they are well localized. It turns out that the width of these kernels is of the order of the classical resolution limit of half a wavelength. This provides an upper bound on the achievable resolution. For simplicity, we will assume a spherically symmetric background without a flow field.

  • (i)  
    Inversion for source strength: it follows from theorem 8 that
    where the multiplication operator is defined in lemma 5, so
    with the sensitivity kernel
    The real part comes from the fact that the source strength has to be a real parameter, it is the adjoint of the embedding of a vector space of real-valued functions into the corresponding vector space of complex-valued functions. The source forward–backward kernel takes the form:
    where Dn is the integral kernel of $\Gamma_n$ in (23). The sensitivity kernel becomes $|K_{\alpha_S, \alpha_S}|^2$ and is therefore non-negative. Furthermore, there are almost no sidelobes after averaging over frequency.
  • (ii)  
    Inversion for scalar parameters $q\in\{\rho,c,\gamma\}$: the operators $\partial_q v$ and $\partial_q \boldsymbol{A}$ for $q \in \{\rho, c, \gamma\}$ are computed in equations (31) and (32). For a flow-free background medium, we have $\partial_q \boldsymbol{A} = 0$ for all scalar parameters q. It follows from theorem 8 that
    and hence
    where $F_{\alpha_{v}, \alpha_{v}} = \int_{\Gamma} \int_{\Gamma} \overline{H_{\alpha_{v}}(\boldsymbol{z}, \boldsymbol{x})} D_n(\boldsymbol{z}, \boldsymbol{z}^{^{\prime}}) H_{\alpha_{v}}(\boldsymbol{z}^{^{\prime}}, \boldsymbol{y}) \,\mathrm{d} \boldsymbol{z} \,\mathrm{d} \boldsymbol{z}^{^{\prime}}$ and analogue for $F_{\alpha_v, \alpha_v}, F_{\beta_v, \beta_v}$. In particular, for $q \in \{c, \gamma\}$, we have $\partial_q v = \mathcal{M}(g_q^0)$, and the kernel takes the form
    Equation (38)
  • (iii)  
    Inversion for mass-conserved flow field u: the flow field sensitivity kernel takes the form
    for $i, j \in \{r, \theta, \phi\}$ where $F_{\alpha_{\boldsymbol{u}}, \alpha_{\boldsymbol{u}}} = \int_{\Gamma} \int_{\Gamma} \overline{H_{\alpha_{\boldsymbol{u}}}(\boldsymbol{z}, \boldsymbol{x})} D_n(\boldsymbol{z}, \boldsymbol{z}^{^{\prime}}) H_{\alpha_{\boldsymbol{u}}}(\boldsymbol{z}^{^{\prime}}, \boldsymbol{y}) \,\mathrm{d} \boldsymbol{z} \,\mathrm{d} \boldsymbol{z}^{^{\prime}}$ and analogue the further kernels.

It is remarkable that we are encountering a gradient in the local correlation $(\partial_{\boldsymbol{u}} L_{\boldsymbol{u}})^{*}$. In chapter 4 of [59], enhancements were accomplished by calculating the difference between two holograms which are spaced apart by half the local wavelength. This can be understood as an approximation to the gradient in the target direction and is therefore naturally incorporated in our framework.

In figures 2 and 3, we present the sound speed kernel for a uniform and a solar-like radially stratified medium for four different target positions. The kernels are computed for spherical harmonic degrees $0 \unicode{x2A7D} l\lt100$ and averaged over 100 evenly spaced frequencies between 2.75–3.25 mHz. Since there are no strong ghost images on the backside, we show only half of the geometry. The sound speed kernels are very sharp near the target location. Therefore, we can expect the holograms to catch the main features of the image. It is important to highlight that the kernels maintain their sharpness even in deep regions within the interior. In addition, this result holds true for a radial stratification similar to that of the Sun. We observe similar behavior for the sensitivity kernels for wave damping, density, source strength, and the components of the flow field. Similar to the sensitivity kernels for the source strength, it is important to note that there are only small visible sidelobes in the sensitivity kernels for sound speed perturbations. This is an additional advantage compared to traditional techniques used in helioseismology.

Figure 2. Refer to the following caption and surrounding text.

Figure 2. The sound speed sensitivity kernel $K(\boldsymbol{x},\cdot)$ in the $r-\theta$-plane as defined in (38) for a three-dimensional uniform medium with c0 = 200 km s−1 and the l-range is $0 \unicode{x2A7D} l\lt100$ for four different target positions x . We have averaged the sensitivity kernels over 100 frequencies in the frequency regime 2.75–3.25 mHz and normalized with $K(\boldsymbol{x}, \boldsymbol{x})$ at the target location x .

Standard image High-resolution image
Figure 3. Refer to the following caption and surrounding text.

Figure 3. The sound speed sensitivity kernel $K(\boldsymbol{x},\cdot)$ in the $r-\theta$-plane as defined in (38) in a spherically stratified solar-like background medium and spherical harmonics degrees $0 \unicode{x2A7D} l\lt100$ for four different target positions. We have averaged the sensitivity kernels over 100 frequencies in the frequency regime 2.75–3.25 mHz and normalized with $K(\boldsymbol{x}, \boldsymbol{x})$ at the target location x . For better comparisons, we have multiplied the sensitivity kernels with the sound speed.

Standard image High-resolution image

Figure 4 provides a comparison of the width of the sensitivity kernels for the normal equation and the local half wavelength $\lambda/2$. Note that both are of similar size in all cases, and similar results hold true in angular direction. Therefore, we can expect a resolution of (at least) $\lambda/2$. However, in the case of a solar-like stratification, the sensitivity kernels are increasing close to the solar surface.

Figure 4. Refer to the following caption and surrounding text.

Figure 4. The sound speed sensitivity kernels in a spherically stratified background medium and the l range is $0 \unicode{x2A7D} l\lt100$. In the top panels we present the kernels for a uniform medium with c0 = 200 km s−1 (as in figure 2), and in the second line the kernels for a solar-like medium (as in figure 3). In the first column, we show the kernels in the radial direction, and in the second column the kernels in the angular direction. We have averaged the sensitivity kernels over 100 frequencies in the frequency regime 2.75–3.25 mHz. Furthermore, we compare the width of the sensitivity kernels to the classical resolution limit of $\lambda/2$. For better comparisons, we have multiplied the sensitivity kernels with the sound speed.

Standard image High-resolution image

In helioseismology, and particularly in helioseismic holography, a common issue is the indistinguishability of various sources of perturbations, which complicates the interpretation of seismic data. The design of a holographic back-propagator holds the promise of separating different perturbations. In figure 5, we present the sensitivity kernels for a perturbation in sound speed, a perturbation in damping, and the cross-kernel in a uniform two-dimensional medium. We show the kernels in a region around the target location. Note that the sound speed kernel is on one scale bigger than the damping kernel and the cross-kernel. Furthermore, the cross-kernel exhibits a different shape with positive and negative maxima around the target location. Therefore, we expect that iterative holography can separate different perturbations in the background medium.

Figure 5. Refer to the following caption and surrounding text.

Figure 5. Matrix-valued sensitivity kernel for joint inversion for sound speed c and damping γ. The left two panels exhibit the diagonal entries, and the right panel the cross-kernel in a uniform two-dimensional medium in a rectangular box of $[0.2\,R_{\odot}, 0.4\,R_{\odot}]^2$. The target location is indexed by a red cross. The kernels are normalized by the maximal value of the sound-speed kernel.

Standard image High-resolution image

8. Inversions

In this section, we analyze the performance of iterative holography. The geometry is meshed with a resolution of ten internal points per local wavelength. Furthermore, we impose a Sommerfeld boundary condition throughout the inversions.

Throughout the following inversions, we employ a L2-term as the penalty term and introduce a non-negativity constraint for both sound speed and source strength. The regularization parameter is determined by a power law: $\alpha_n = \alpha_0 \cdot 0.9^n$, where α0 represents the maximal eigenvalue of the first iteration. The stopping criterion for the inversions is a version of the discrepancy principle for the normal equation, with the noise level determined by the trace of the covariance operator $\mathcal{C}_4$. In more advanced inversions, stopping rules may be investigated in the hologram space. We set a limit of at most 50 inner conjugate gradient steps per Newton step. Furthermore, we opt for a spatial resolution of seven grid points per local wavelength.

8.1. Holographic image for source perturbation

We have performed a numerical test for a uniform, flow-free two-dimensional medium with source region $[0.5, 0.7]^2$ and 100 uniformly sampled receivers located on $\partial B(0,1)$ (see figure 6). We choose a constant sound speed c = 350 km s−1, which corresponds to the solar sound speed at $\approx 0.38 R_{\odot}$. The frequency is fixed to be $\omega/2 \pi = 3\,$mHz, which corresponds to the solar 5 min oscillations. In the case of uniform medium, the differential equation simplifies to a Helmholtz equation, such that the Green's function is analytically known (see appendix B). Note that Lindsey–Braun holography ($\mathcal{H}_{\alpha} = \mathcal{H}_{\beta} = \mathcal{G}$) provides sharp feature maps in the case of small wave damping. For stronger wave damping, the quality of these feature maps deteriorates rapidly. Even without damping, the feature maps are not quantitative at all.

Figure 6. Refer to the following caption and surrounding text.

Figure 6. Lindsey–Braun holographic image intensities in uniform two-dimensional medium (Helmholtz equation) for different degrees of damping with 100 equidistant receivers on $\partial B(0, 1)$. The wave number is such that $k^2 = \omega^2(1+i \gamma)/c^2$ with constant sound speed c = 350 km s−1 and $\omega/2 \pi = 3$ mHz corresponding to a wavelength of ${\approx}0.17\,R_{\odot}$. Note the different scalings of the color maps illustrating the non-quantitative nature of Lindsey–Braun holography.

Standard image High-resolution image

8.2. Source strength inversion

Due to its linear nature, inversion for source strength is the simplest case. Therefore, it is in general possible to work with a much finer grid than in the case of parameter identification problems. We add a strong perturbation in the source region $[0, 0.5\,R_{\odot}]^2$. The inversion results at 3 mHz are shown in the first row of figure 7 for 10 000 realizations. Note that even very deep source terms can be inverted using only one frequency. The reconstructions exhibit a remarkable quality, strongly improving the results by traditional Lindsey-Braun holography (see figure 6).

Figure 7. Refer to the following caption and surrounding text.

Figure 7. Inversions in two-dimensional uniform background with sound speed c = 350 km s−1. In the first row, we present inversions for the source strength at 3 mHz and 10 000 realizations. In the second row, we present inversions for the sound speed for 100 frequencies evenly spaced in the band 2.75–3.25 mHz and 1000 realizations. The black cross indicates $\lambda/2$.

Standard image High-resolution image

8.3. Parameter identification

We add a perturbation in the quadratic region $[0.5R_{\odot}, 0.7 R_{\odot}]^2$. Furthermore, we choose 100 evenly spaced frequencies in the frequency range of $2.75-3.25$ mHz and assume 1000 realizations for each frequency. Note that in helioseismology we have many more frequencies available.

The inversions are shown in the second row of figure 7. The Newton iteration was stopped after 15 iterations. The resolution of the reconstruction is again below the classical limit of half a wavelength. We observed qualitatively similar results in the inversions for wave damping and density.

The total number of Dopplergrams is given by $N_{\omega} \times N_{\text{obs}}$, where Nω is the number of frequencies and Nobs the number of realizations for each frequency. Note that the total size of Doppler data is fixed by the observation time. We observe that a larger number of frequencies leads to better reconstructions. On the other hand, the computational costs scale roughly linearly with the number of frequencies. This becomes particularly important for large-scale forward problems like for the Sun. Therefore, the choice of Nω often is a trade-off between quality of reconstructions and computation time.

8.4. Flow fields

The inversion is performed in a solar-like three-dimensional medium. The example flow field is computed by $\boldsymbol{u} = \text{curl} \psi$, where ψ is a stream function. This guarantees conservation of mass and axisymmetry of the flow field. The stream function is chosen similar to models of meridional circulation profiles in the Sun [62]. In the inversion process, we guarantee conservation of mass through Lagrange multipliers, as discussed in appendix C. The inversion for a symmetric flow field is presented in figure 8. We inverted with 100 evenly spaced frequencies between 2.75–3.25 mHz and assumed 1500 realizations for each frequency. Since meridional flows are a small perturbation, the iteration is stopped after one iteration. Because of the symmetry, we show only one half-space of the flow field. Besides the strength of the flow field at larger depths, there is no difference visible in the eye-norm.

Figure 8. Refer to the following caption and surrounding text.

Figure 8. Inversion for the flow field in solar-like three-dimensional background medium in the $r-\theta$-plane. The inversion is performed with 100 evenly spaced frequencies between 2.75–3.25 mHz and 1500 realizations. Due to the symmetry, we show only one half-space of the flow field.

Standard image High-resolution image

9. Conclusions

We have developed a theoretical framework for quantitative passive imaging problems in helioseismology. It shows that traditional holography can be interpreted as an adjoint imaging method. Holographic back-propagation can be seen as part of the adjoint of the Fréchet derivative of the forward operator mapping physical parameters to the covariance operator of the observations. In contrast to traditional holography, the backward propagators are uniquely determined by the wave equation, and the holograms can be improved by iteration rather than clever choices of back-propagators. Iterative helioseismic holography surpasses traditional helioseismic techniques by the quantitative nature of its imaging capabilities and its ability to solve nonlinear problems.

We have demonstrated the performance of iterated holography in inversions for the right hand side of wave equation (source strength), parameters of the zeroth order term (sound speed, absorption) and of the first order term (flows). In all three cases, we have achieved reconstructions with a resolution of slightly less than half of the local wave-length by the iteratively regularized Gauss-Newton method, even for strong realization noise. This is well below the spatial resolution of traditional time-distance helioseismology (see [15]).

Inversions in other more challenging solar setups and for real solar oscillation data are planned as future work and will be presented elsewhere.

In view of the huge size of solar oscillation data, the main bottleneck that prevents the immediate application of iterative holography to interesting large-scale problems in helioseismology is computational complexity. The results of this paper encourage further algorithmic research on iterative regularization methods tailored to passive imaging problems, e.g. by more efficient treatments of sensitivity kernels and Green's functions.

An interesting feature of correlations of Gaussian fields is the structure of the realization noise as described in section 5.3. A thorough mathematical treatment will require further investigation concerning appropriate stopping rules, consistency, and convergence rates as the sample size tends to infinity.

Acknowledgments

This work was supported by the International Max Planck Research School (IMPRS) for Solar System Science at the University of Göttingen. The authors acknowledge partial support from Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through SFB 1456/432680300 Mathematics of Experiment, Project C04.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Appendix A: Reference green's function for the Sun

The Green's function is usually computed in the frequency domain with background parameters specified by the spherically symmetric standard Model S [54]. Additionally, we have to fix the wave attenuation. We choose a frequency-dependent wave attenuation model, motivated by the Full Width at Half Maximum (FWHM) of wave modes [49, 63, 64]:

where ${\gamma_0}/{2 \pi} = 4.29\,\mu$Hz and ${\omega_0}/{2 \pi} = 3$ mHz. We extend the computational boundary by 500 km above the solar surface (compare with the density scale height of H = 105 km) and apply the radiation boundary condition 'Atmo Non Local' (see [52]), assuming an exponential decay of density and constant sound speed in the solar atmosphere.

In a spherical symmetric background, we can decompose the Green's function into spherical harmonics:

Equation (A.1)

where the Ylm are spherical harmonics. The functions $G_l(r_1, r_2)$ satisfy a one-dimensional differential equation and are computed with NGsolve [65, 66]. The computation of the Green's function is usually expensive, as the stiffness matrix has to be inverted. The two-step algorithm of [67] allows us to obtain the full modal Green's function from two computations only. Furthermore, this expansion allows us to use a low-rank approximation for the Green's function.

Appendix B: Green's function in uniform medium

We perform numerical toy examples in uniform flow-free two-dimensional and three-dimensional background mediums and consider a Sommerfeld boundary condition. The differential equation (25) reduces to the Helmholtz equation

Equation (B.1)

where k is constant. In this setting, the Green's function is well known (e.g. [31]):

Equation (B.2)

Equation (B.3)

where $H^1_0$ is the Hankel function of first kind.

The Green's functions are weakly singular at $\boldsymbol{x} = \boldsymbol{y}$. We will approximate the Green's functions around the singularity using asymptotics:

Equation (B.4)

Equation (B.5)

where the constant C denotes the Euler–Mascheroni constant. In inversions of extended properties like large-scale flows, it is more feasible to work in an angular basis (spherical harmonics in three dimensions and trigonometric functions in two dimensions). The Green's functions for the uniform medium can be described by

Equation (B.6)

Equation (B.7)

for $\vert \boldsymbol{x} \vert \unicode{x2A7E} \vert \boldsymbol{y} \vert$. Here, $J_n, h_n^1, j_n$ denote the Bessel function, spherical Hankel function, and spherical Bessel function. Moreover, $\theta_{x,y}$ denotes the angular distance between x and y . Furthermore, this basis transformation allows a natural implementation of the singularity. We use this expansion in order to use low-rank approximations for the Green's function.

Appendix C: Conservation of mass

For flow inversions, considerable improvements are achieved by incorporating mass conservation in the inversion process [68].

An equality constraint $R \delta \boldsymbol{u}_k = 0$, where $R: \mathbb{X} \to \mathbb{Z}$ is a bounded linear operator, can be incorporated by employing the method of Lagrange multiplier. For the iterative Gauss–Newton method, we solve the normal equation:

Equation (C.1)

with Hilbert spaces $\mathbb{X}$, $\mathbb{Y}$, noise covariance operator Γ defined in (23). The Lagrange function takes the form $\mathcal{L} (\delta \boldsymbol{u}_k, \mu): = \Vert (\Gamma_n^{1/2} \times \Gamma_n^{1/2}) \left[\mathcal{C}^{^{\prime}}[\boldsymbol{u}_k](\delta \boldsymbol{u}_k)- (\text{Corr}-\mathcal{C}[\boldsymbol{u}_k]) \right] \Vert_{\mathbb{Y}}+\alpha_k \Vert \delta \boldsymbol{u}_k \Vert_{\mathbb{X}}+\langle \mu, \alpha R \delta \boldsymbol{u}_k \rangle_{\mathbb{Z}}$ with the Lagrange multiplier $\mu \in \mathbb{Z}$. The saddle point can be found by

Please wait… references are loading.