Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 GEOPHYSICS, VOL. 88, NO. 5 (SEPTEMBER-OCTOBER 2023); P. V371–V385, 17 FIGS. 10.1190/GEO2022-0779.1 Seismic time-frequency masking for suppression of seismic speckle noise Andrey Bakulin1, Dmitry Neklyudov2, and Ilya Silvestrov1 window-based approaches borrowed from local stacking techniques. Separate manipulation of phase and amplitude spectra is achieved through modified time-frequency masking inspired by speech processing. A beamformed data set from local stacking is used as a guide for designing phase and amplitude masks that are directly applied to raw data in the time-frequency domain. The phase mask allows the restoration of coherency by repairing phase distortions caused by near-surface scattering. The amplitude mask corrects power spectrum (PS) distortions caused by multiplicative and additive noises. An amplitude mask is implemented using a data-driven minimum statistic approach that estimates the noise PS in each local window. The minimum statistics approach is adapted from speech processing using beamformed data as an initial signal estimate and as a guide to designing an improved amplitude mask. Synthetic and field data examples suggest significant improvements in coherency and signal-to-noise ratio after the suppression of multiplicative noise, which makes the data processable by conventional techniques subsequently. ABSTRACT Seismic speckle noise is the primary factor causing severe reflection distortions caused by small-scale near-surface scattering. As in the case of speckle noise in optics and acoustics, deterministic velocity model-building techniques cannot recover these heterogeneities which are much smaller than a wavelength. Conventional processing techniques struggle to perform when multiplicative noise remains unsuppressed. Although local and global stacking mitigates the effects of speckle noise, it leads to a severe loss of higher frequencies reducing the vertical resolution of the seismic data. The foundation for attacking speckle noise is a recently proposed mathematical model that includes two concurrent random multiplicative noise types: type 1 defining residual statics and type 2 describing random frequency-dependent phase perturbations that mimic small-scale near-surface scattering. Using this model, we have developed seismic time-frequency masking to suppress speckle noise on prestack data. The time-dependent and non-surface-consistent nature of scattering noise dictates reveals significant frequency-dependent distortions initially attributed to variable geophone coupling (Bagaini and Barajas-Olalde, 2004). Subsequently, Reilly et al. (2010) speculate that subseismic heterogeneity (at a scale not captured in current velocity models) could be a more plausible explanation for shallow marine seismic data in desert environments. They show that time-variant statics could effectively compensate reflected events for some of these distortions and deliver improved time and depth imaging. Khalil and Gulunay (2011) demonstrate the severe complications and need to correct intraarray statics in single-sensor land data even when analyzing direct arrivals. Bakulin et al. (2021, 2022a) thoroughly review desert land data with single sensors and small arrays. They introduce a multiplicative random speckle noise model to describe the effects of small-scale near-surface scattering. The multiplicative INTRODUCTION Land and shallow marine environments with complex near surfaces are associated with challenging seismic data quality (Reilly et al., 2010; Meunier, 2011; Bakulin et al., 2020a; Stork, 2020). In particular, desert environments exhibit some of the most difficult data, land and marine (Bagaini et al., 2010; Reilly et al., 2010). The vast majority of the previous land studies have focused on improved sampling and removal of the elastic noise (surface waves, guided waves, etc.). Such studies implicitly assume that underlying reflections maintain coherency, whereas the signal wavelet experiences relatively smooth (and surface-consistent) changes (Ji et al., 2010; Meunier, 2011; Regone et al., 2015; Bakulin et al., 2018b). The introduction of single-sensor data in desert environments Manuscript received by the Editor 9 January 2023; revised manuscript received 21 March 2023; published ahead of production 24 May 2023; published online 17 July 2023. 1 Saudi Aramco, EXPEC Advanced Research Center, Dhahran, Saudi Arabia. E-mail: a_bakulin@yahoo.com; ilya.silvestrov@aramco.com. 2 Institute of Petroleum Geology and Geophysics SB RAS, Novosibirsk, Russia. E-mail: dmitn@mail.ru (corresponding author). © 2023 Society of Exploration Geophysicists. All rights reserved. V371 Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 V372 Bakulin et al. random noise model formalized not only time-variant non-surfaceconsistent statics from Reilly et al. (2010) but also introduced random frequency-dependent phase perturbations typical for speckle noise due to volumetric scattering well known in optics, acoustics, and ultrasound (Abbot and Thurstone, 1979; Damerjian et al., 2014; Goodman, 2020). Recognizing speckle noise in seismic data was a significant milestone because it enables transferring the vast knowledge of speckle noise mitigation techniques from outside geophysics. We stress that multiplicative noise implies that our signals (direct arrivals and reflections) become distorted. Caused by smallscale heterogeneity in the near-surface layer, such distortions lead to a fast wavelet variation over short distances resulting in a loss of local coherency. Distortions are time-variant and non-surface-consistent (Bakulin et al., 2022a). As a result, standard approaches of surface-consistent deconvolution (Cary and Lorentz, 1993) and coupling corrections (Bagaini and Barajas-Olalde, 2004) cannot address seismic speckle scattering noise. This paper introduces a new seismic time-frequency masking (STFM) technique (Bakulin et al., 2021) that could effectively mitigate speckle noise on prestack data. STFM is inspired by time-frequency masking from speech processing (Yilmaz and Rickard, 2004; Wang, 2008a). However, it has several critical distinctions, such as using locally stacked data as a “pilot” to design all masks and addressing multiplicative noise that is not present in speech. The paper follows a simple structure. First, we summarize the new multiplicative noise model that describes the effects of near-surface scattering on prestack seismic data. Then, because local stacking is an essential element of the workflow, we summarize the transformation of speckle noise during stacking presented in detail by Bakulin et al. (2022a, 2022b). Then, we set a goal to design a signal estimation scheme that eliminates a significant amount of multiplicative noise by introducing STFM inspired by speech processing. The phase masking component of STFM was raised in an ad hoc manner by Bakulin et al. (2020b). However, its full justification requires invoking the speckle noise model that only appeared later (Bakulin et al., 2022a). In this study, we reintroduce phase masking on a more solid mathematical footing and demonstrate how it addresses the loss of coherency and fast wavelet variation caused by speckle noise. Then, we describe a new version of amplitude masking designed to suppress amplitude noise. The idea of amplitude masking guided by local stacking was introduced by Bakulin et al. (2020c, 2021) but was not fully developed. This study presents a complete algorithm with all computational details. Then, we evaluate the proposed STFM procedure on a controlled synthetic example to validate the algorithm. Finally, we apply the developed algorithm to field seismic data from desert environments. SEISMIC SPECKLE NOISE CAUSED BY SMALL-SCALE SCATTERING Near-surface noise is usually invoked as a qualitative explanation of land data complexity. Without understanding a specific mechanism, such an explanation is not particularly useful. Inspired by studies of optical and ultrasonic speckle noise, Bakulin et al. (2022a, 2022b) suggest that seismic data experienced similar distortions and referred to them as “seismic speckle noise.” Figure 1 summarizes the speckle mechanism as near-ballistic forward scattering caused by closely packed heterogeneities in the near-surface layer. We stress that it is not an interference with near-surface arrivals or diffractions as in the case of additive noise. In contrast, multiplicative noise is the distortion of the signal itself scattering on small-scale heterogeneities. Many forward-scattered waves with close paths generate a complex interference pattern in the vicinity of the ballistic arrival (Figure 1a). Speckle noise is controlled by the local distribution of tightly packed scatterers around the wavepath of interest. The noise rapidly varies when the wavepath shifts (Fink and Derode, 1998). Because the wavepath changes every time we perturb either the offset or azimuth of sources and receivers, speckle noise characteristics vary spatially, often unpredictably. Likewise, even with a fixed source and receiver position, the wavepath through the near-surface scattering layer could differ for each reflector (Figure 1a). As a result, speckle noise characteristics also vary with time. Each 3D (x-y-t) window of prestack data (x-y-t or two spatial and one time directions) is likely to contain a unique field of speckle noise not observed in any other window. In summary, speckle noise cannot be described by surface-consistent and time-invariant operators currently used in conventional processing. Optics and acoustics established a multiplicative model for monochromatic speckle noise (Goodman, 2020). Bakulin et al. (2022a, 2022b) introduce a mathematical model for a broadband seismic trace with a general multiplicative noise. Let us denote the recorded signal in the kth channel as xk ðtÞ ¼ rk ðtÞ  sðtÞ þ nk ðtÞ; (1) where sðtÞ is the clean signal, rk ðtÞ is the multiplicative noise defined as a random linear system, nk ðtÞ is the additive random noise, “*” means the convolution, and k = 1, : : : , K (K is the number of channels). In the Fourier domain, equation 1 could be rewritten as X k ðωÞ ¼ Rk ðωÞSðωÞ þ N k ðωÞ; (2) where X k ; Rk ; S; and N k are Fourier transforms of the time-domain representations from equation 1. We further assume that additive noise N k ðωÞ is uncorrelated between channels and has a zero mean ðE½N k ðωފ ¼ 0Þ, where E designates the math expectation operator. In addition, additive noise and signal are assumed to be uncorrelated. Signal transformation during stacking in the presence of multiplicative noise Figure 1. (a) Origin of seismic multiplicative noise caused by seismic speckle born in the near-surface scattering layer and (b) conceptual depiction of local ensemble used for prestack data enhancement. Speckle studies have proven that phase fluctuations play a critical role. For example, Bakulin et al. (2022a, 2022b) suggest a simple model of multiplicative noise as random frequency-dependent phase fluctuations occurring in each channel: STFM for speckle noise suppression Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 Rk ðωÞ ¼ eiφk ðωÞ : (3) For each frequency ω, random variables φk ðωÞ are assumed to be independent between all channels. Local stacking (averaging) of the K traces in the frequency domain can be written as K 1X ^ fSðωÞeiφk ðωÞ þ N k ðωÞg: SðωÞ ¼ K k¼1 (4) ^ For a limited number of traces used in the stacking, SðωÞ would be a random variable for different trace ensembles. Its math expectation could be written as ^ E½Sðωފ ¼ SðωÞ   K 1 X E½eiφk ðωÞ Š : K k¼1 (5) Because each term acts as a filter on the signal, let us denote its mathematical expectation as Φk ðωÞ ¼ E½eiφk ðωÞ Š: V373 Type 1 multiplicative noise — Phase perturbations with normal distribution If phase perturbations φk ðωÞ have the same normal Gaussian distribution with zero mean and standard deviation σ φ ðωÞ, then the mathematical expectation can be expressed as ^ E½Sðωފ ¼ jSðωÞjeiφS ðωÞ e− σ2 φ ðωÞ 2 : (10) The stacked phase spectrum in equation 10 coincides with the clean signal phase. Therefore, if standard deviations remain independent of frequency, amplitude reduction is the same across the entire frequency band. In contrast, the loss of signal amplitude during stacking could increase with frequency if we assume that the standard deviation increases with frequency. Type 2 multiplicative noise — Residual statics with normal distribution Residual statics (random time shifts τk ) was recast as the second type of multiplicative noise (Bakulin et al., 2022a), with (6) Rk ðωÞ ¼ e−iωτk : (11) Using the notation x ¼ φk ðωÞ, Φk ðωÞ ¼ Z ∞ Pk ðxÞeix dx; (7) If static corrections are normally distributed with zero mean and standard deviation σ T , then −∞ where Pk is a probability density function of a random variable φk at a given frequency ω in the kth channel. Equation 7 suggests that if the probability density function of the phase fluctuations of the multiplicative noise is even, i.e., PðxÞ ¼ Pð−xÞ (i.e., the expected value is zero), then Φk ðωÞ becomes a real-valued filter. Using notation 6, we recast equation 5 as X  K iφs ðωÞ 1 ^ E½Sðωފ ¼ jSðωÞje Φ ðωÞ ; K k¼1 k (8) ω2 σ 2 τ 2 : (12) Likewise, the phase spectrum of the stack mathematical expectation (equation 12) is identical to the clean signal. In contrast, amplitude experiences an exponential decay with frequency. Berni and Roever (1989) present an alternative derivation of exponential amplitude loss without using a multiplicative model but make similar assumptions of normal Gaussian distribution of intra-array residual statics. Combination of types 1 and 2 noise where jSðωÞj and φS are the amplitude and phase spectra of the clean signal, respectively. Bakulin et al. (2022a) suggest that if the statistical properties of the near-surface layer do not vary rapidly, then all channels involved in local stacking (Figure 1b) could be assumed to have identical distributions. Hence, each channel represents a random realization of the same variable φk ðωÞ describing phase perturbations leading to ^ E½Sðωފ ¼ jSðωÞjeiφs ðωÞ ΦðωÞ; ^ E½Sðωފ ¼ jSðωÞjeiφs ðωÞ e− (9) where jSðωÞj and φS are the amplitude and phase spectra of the clean signal, respectively. Important conclusions could be drawn without specifying any other characteristics of the near-surface scattering layer. First, stacking performs phase “cleanup” or averaging, thus leading to either a clean signal phase (if ΦðωÞ > 0Þ or a signal phase rotated by π (if ΦðωÞ < 0Þ. Second, real-valued ΦðωÞ could be interpreted as an amplitude filtering describing the effect of stacking. Bakulin et al. (2022a) introduce a specific form of ΦðωÞ for two practically significant types of noise: (1) random phase perturbations and (2) random time shifts (residual statics). Bakulin et al. (2022a) demonstrate that the joint action of types 1 and 2 noises consistently explains three main field observations typical for any challenging prestack land data from areas with complex scattering near the surface: 1) Reflections and direct arrivals on prestack data are often cluttered and not trackable throughout the entire gathers. 2) After local stacking, coherent reflections appear; however, the absolute level of amplitude spectra is significantly reduced (−15 to 20 dB or more) across the entire frequency range. 3) There is always an exponential loss of higher frequencies after local stacking. Taking into account a combination of both types of multiplicative noise (equations 3 and 11) acting simultaneously, and assuming that τk and φk ðωÞ are independent and randomly distributed with a normal distribution and standard deviations σ φ and σ τ , we can derive the stack mathematical expectations as ^ E½Sðωފ ¼ jSðωÞjeiφs ðωÞ e− ω2 σ 2 τ 2 σ2 φ e− 2 : (13) Bakulin et al. Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 V374 The phase of the clean signal is recovered after stacking, whereas the amplitude loss factor becomes a product of two terms — one caused by phase perturbations and another by residual statics. Figure 2 visualizes amplitude decay induced by each type of multiplicative noise resulting from local stacking. Expressed in logarithmic scale (decibels), residual statics with a standard deviation σ τ produces amplitude loss increasing as a square of frequency f: Aτ ¼ −171.45 · ðfσ τ Þ2 dB: (14) This equation agrees with the Baeten and van der Heijden (2008) formula. Amplitude reduction is proportional to a square of the standard deviation (Figure 2a). In contrast, type 1 noise (phase perturbations with a normal distribution) leads to a frequency-independent amplitude loss that is also proportional to a square of the standard deviation: Aφ ¼ −4.343 · ðσ 2φ Þ; (15) under the assumption that σ φ is independent of frequency. In summary, phase perturbations scramble the signal (observation 1) and reduce the overall amplitude level (observation 2). In contrast, residual statics explain the accelerating loss of higher frequencies (observation 3), as Bakulin et al. (2022a, 2022b) explained. How can we correct prestack seismic data for these complex nonsurface-consistent and time-dependent multiplicative noises superimposed with the regular additive noise? First, let us show in the following section that the window-based infrastructure of local stacking algorithms is the proper foundation for such corrections. ensembles has already been set up in local stacking approaches (Buzlukov and Landa, 2013; Bakulin et al., 2018b). More specifically, Bakulin et al. (2020a) describe how the ensembles carved around the main reflection events are designed and handled during nonlinear beamforming (NLBF), one of the advanced local stacking approaches frequently used in a desert environment with the complex near-surface (Bakulin et al., 2018a). We would use this existing infrastructure without any modification. We remind you that NLBF provides an output prestack volume of the same size as a raw input volume. Figure 3 summarizes at the highest level that each time window on beamformed data is obtained as a local stacking conducted on a 3D x-y-t ensemble of prestack data approximating equation 13. Such a beamformed sum is individually manufactured at every time window, and every x and y coordinate inside the 3D x-y-t ensemble. In the past, beamformed or locally stacked data were the final output of the data enhancement process. However, although summation may be an optimal signal estimator (for phase and amplitude) in the presence of additive noise, equation 13 suggests it is no longer the case for multiplicative noise. Instead, it implies that summation is only the optimal estimator for the signal phase, not the amplitude. As a result, we now want to recover the signal phase from beamformed data while designing a different estimation procedure for signal amplitude spectra that may correct for exponential frequency filtering caused by two types of multiplicative noise. Such targeted manipulation of local amplitude and phase can be conveniently achieved using the STFM technique described in the next section. SEISMIC TIME-FREQUENCY MASKING The loss of coherency of reflected events caused by seismic speckle noise due to scattering creates a significant challenge for seismic processing (Bakulin et al., 2022a). Although arrival-time information of seismic events is encoded in the phase spectrum (Lichman, The preceding considerations apply to a limited extent of a 3D 1999), their extraction becomes nearly impossible in the presence of x-y-t window containing a piece of a single-reflection event contamisevere phase fluctuations caused by speckle noise. Likewise, nated by a multiplicative noise as illustrated by the synthetic and realsmoothly varying local phase along the reflected event is an implicit data examples in Bakulin et al. (2022a). However, speckle noise underlying requirement for most seismic processing and interpretavaries with space and time. Therefore, if we set a task to denoise tion steps. A similar observation about the primacy of the phase was the phase using equation 13, we must do this in the window-by-winmade for the perception of photographic images in a seminal paper by dow fashion, scanning all three directions of the prestack seismic volOppenheim and Lim (1981). ume. Fortunately, such an infrastructure with 3D window-based x-y-t Equation 13 provides a recipe for restoring the clean signal phase. Bakulin et al. (2022a) present numerical experiments confirming reliable phase estimation with local ensembles containing only tens and hundreds of traces. Such ensembles are typical for supergrouping (Bakulin et al., 2018b) and NLBF (Bakulin et al., 2020a). Therefore, let us set the first goal to restore the clean signal phase while leaving the amplitude alone. Bakulin et al. (2020b, 2020c, 2021) dub such a method “phase substitution.” However, they do not justify why local stacking may lead to a signal phase restoration. They introduce time-frequency masking from speech processing as an additional instrument that allows the concise and convenient formulation of such time-dependent trace manipulation. In speech Figure 2. (a) Frequency-dependent amplitude decay during stacking caused by residual processing, the typical assumption is that a regisstatics with the normal distribution of various standard deviations (σ T ¼ 2, 4, and 8 ms). tered single-channel signal xðtÞ may be described (b) Frequency-independent amplitude decay during stacking caused by phase perturbaas a superposition of the useful signal sðtÞ and adtions with normal distribution. The amplitude decay in decibels versus standard ditive noise: xðtÞ ¼ sðtÞ þ nðtÞ: For example, s(t) deviation σ φ (in radians) is shown. FROM LOCAL ENSEMBLE WINDOW TO ENTIRE VOLUME Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 STFM for speckle noise suppression could be a primary speech of interest, whereas nðtÞ denotes interfering speech or external noises. Frequency components of the signal (primary speech) and noise properties rapidly vary with time and are thus nonstationary. However, in short windows, signal and noise spectra may possess a certain separation of sorts allowing more efficient signal estimation or speech extraction (Yilmaz and Rickard, 2004). Because the mechanics of a windowed Fourier transform is suitable for manipulating localized phases and amplitudes, let us describe it in more detail. Applying the discrete short-time Fourier transform (STFT) to xðtÞ, we obtain a complex-valued 2D time-frequency (TF) spectrum Xðk; lÞ ¼ jXðk; lÞj exp½iφx ðk; lފ. Here, jXðk; lÞj and φx ðk; lÞ are amplitude and phase TF spectra, with k and l denoting the discrete indices for the frequency bin and time frame, respectively. Time-frequency mask (TFM) is defined as a filter Mðk; lÞ applied to noisy speech to obtain the best possible signal estimate of the TF spec^ lÞ of the clean speech: trum Sðk; ^ lÞ ¼ Mðk; lÞ · Xðk; lÞ: Sðk; (16) Most speech studies consider the spectral phase unimportant for noise reduction (Wang and Lim, 1982). As a consequence, TFM is typically a real-valued function, 0 ≤ Mðk; lÞ ≤ 1 (Yilmaz and Rickard, 2004; Wang, 2008b), designed based on signal and noise considerations. Here, M is close to one in the TF spectrum part where the signal is dominant. In contrast, it is close to zero when noise is prevalent. For additive noise, “amplitude TFM” (that does not modify the phase) could be justified as an optimal filter under some conditions when using the standard mean-square error (MSE) metric between the desired and estimated signal time series (Yilmaz and Rickard, 2004). As a result, the estimated signal inherits the phase of a noisy recording. Recent studies suggest that MSE may not fully describe a human voice quality assessment, and better signal phase estimation could improve speech perception (Koutsogiannaki et al., 2014). However, amplitude TFMs remain dominant in speech processing. TFM “mechanics” is suitable for estimating seismic signals contaminated by multiplicative and additive noise. We introduce STFM as a specific TFM variant novel in three aspects: 1) STFM is designed to attack multiplicative noise in addition to the additive noise handled by normal TFM. 2) STFM is usually a complex-valued mask that could be divided into a phase mask (needed to address phase distortions caused by multiplicative noise) and an amplitude mask. 3) The design of the phase and amplitude parts of STFM is guided by a twin prestack seismic volume (the same size and number of traces) representing beamformed or locally stacked data. V375 Xðk; lÞ ¼ jXk; lÞj expðiφX ðk; lÞÞ; Sðk; lÞ ¼ jSðk; lÞj expðiφS ðk; lÞÞ; (18) where j · j denotes amplitude TF spectra, whereas φX and φS are phase TF spectra of raw and beamformed traces, respectively. Phase masks addressing multiplicative noise As mentioned previously, phase masking aims to reduce multiplicative noise distortions in the phase TF spectrum only without modifying the amplitude TF spectra. Let us introduce the phase substitution mask (PSM): PSMðk; lÞ ¼ exp½ifφS ðk; lÞ − φX ðk; lÞgŠ; (19) where φX and φS are TF phase spectra of the original and beamformed traces, respectively. Equation 16 suggests that, after applying PSM to data contaminated by seismic speckle noise, we obtain new seismic volume where the (local) phase was replaced by the beamformed phase from time-dependent local pilots (Figure 3). However, the amplitude spectra of the raw data are preserved. Bakulin et al. (2020b, 2020c, 2021, 2022b) demonstrate that phase substitution restores the prestack reflections’ coherency and makes them visible and trackable. Such demonstrations were presented for synthetic data generated by the finite-difference (FD) modeling and real field data. Furthermore, data after phase substitution become processable by conventional time and depth processing algorithms. Local summation leads to estimating the clean signal phase, assuming that multiplicative noise with supposed properties is the only complicating factor (equation 13). On real data, such an assumption may not be perfectly fulfilled because of the following: • • The stationarity of the speckle noise within an ensemble could not be guaranteed or verified. Small-scale geologic heterogeneity may or may not result in distributions assumed (with fixed standard deviations σ φ for phase perturbation and σ τ for residual statics). Optimal ensemble size and optimal sampling of traces within an ensemble needed for estimating the signal are also not granted. As a result, equation 16 may provide guidance and insight but may not be an absolute truth in all cases. Fortunately, TFM mechanics and beamformed volume allow straightforward extensions. For Before introducing various masks, let us define the TF spectra for input seismic trace xðtÞ and corresponding beamformed trace sðtÞ obtained via local stacking as depicted in Figure 3. After applying STFT to xðtÞ and sðtÞ, we obtain complex-valued TF spectra: Xðk;lÞ ¼ F hxðtÞi ðoriginal trace TF spectrumÞ; Sðk;lÞ ¼ F hsðtÞi ðbeamformed trace TF spectrumÞ: (17) Complex TF spectra of the traces are represented as Figure 3. Schematic explaining the structure of two volumes with raw and beamformed data after local stacking: (a) locally stacked trace (at a select fixed location) is generated from 3D ensembles of neighboring traces in a window-by-window fashion; (b) beamforming of local ensembles is done along the signal trajectories estimated directly from the data as explained by Bakulin et al. (2020a). Bakulin et al. V376 Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 example, a phase-correction mask (PCM) (Bakulin et al., 2020b, 2020c), PSMðk; lÞ ¼ sign½cosfφS ðk; lÞ − φX ðk; lÞgŠ; (20) performs a subset of most necessary modifications already done by PSM but overall does a much less severe transformation of the input phase. If original and beamformed data are in phase (phase difference is less than π=2) at a fixed frequency, no correction is made (PCM = 1). If they are out of phase (difference more than π=2), then the phase at this frequency is flipped (PCM = −1). In contrast to conventional phase-sensitive masks from speech processing of the form cosfφS − φX g (Erdogan et al., 2015), the proposed phase (sign) correction mask is amplitude-preserving because amplitude spectra are not modified. It was shown on synthetic and real data that PSM and PCM significantly increase the coherency of the challenging prestack data with often similar results by visual inspection (Bakulin et al., 2020b, 2020c). The events become difficult to track when multiplicative noise leads to a phase flipping on the original data. PCM only interferes with correcting those most drastic areas while leaving the remaining raw phase untouched. The same job is also done by PSM, which always returns the beamformed phase. In simple terms, a PCM can be considered a “minimal” phase change (also included in PSM) required to restore the coherency scrambled by speckle noise. Transformed data after phase masking Let us examine intermediate prestack volume after applying only phase masking to establish a more apparent conceptual connection between speech and seismic processing. It has a corrected local phase φcor ðk; lÞ that has been “cleaned” to approximate the signal phase. Yet, amplitude spectra were untouched, meaning that multiplicative and additive noise are fully embedded. Because all manipulations are done with the local phase using STFT, we can describe traces s^k ðtÞ after phase masking as s^k ðtÞ ≈ ðjsj þ jΔk jÞ  exp½ifφcor ðk; lÞgŠ; (21) jΔk j ¼ jrk s þ nk j − jsj; (22) where Δk becomes a cumulative noise that remains in the amplitude spectra. Phase distortions caused by multiplicative noises were mitigated by phase masking. However, the amplitude of the phasecorrected traces remains contaminated by additive and multiplicative noises. Equation 21 resembles a typical speech processing problem, where contaminated amplitude spectra require cleaning. In contrast, the phase is already an acceptable representation of the signal phase and hence does not require further estimation. A typical speech processing task is cleaning the amplitude spectrum through a TFM amplitude mask to separate clean speech from noise or concurrent speech (Wang, 2008b; Liang et al., 2013). We emphasize that the amplitude mask design only involves the power spectra of data as well as estimated noise. Therefore, the final result after applying the amplitude and phase masks remains the same irrespective of the order of the mask application (i.e., first the phase mask then amplitude, first the amplitude mask then phase, or both masks applied together). There is a considerable variety of amplitude TFMs in speech processing (Kim, 2022). This is an active area of ongoing research, with machine learning-based TFMs also entering the competition (Erdogan et al., 2015). This study aims to introduce TFMbased approaches to seismic processing. As a result, we only describe an initial amplitude TFM that we found helpful for processing the field seismic data from the desert environment. However, before discussing the amplitude mask applied to raw data, let us address why we do not attempt to directly recover the signal amplitude using equation 13. Why not deconvolve stacked amplitude via deterministic “speckle amplitude compensation”? Equation 13 suggests an alternative route to reconstructing the signal amplitude via deterministic deconvolution similar to inverse Q filtering (Wang, 2008b). Provided standard deviations σ φ and σ τ are obtained or estimated, theoretically, the signal amplitude could be reconstructed using a deterministic compensation scheme: ^ Scorr ðωÞ ¼ SðωÞe ω2 σ 2 τ 2 σ2 φ e2: (23) Although such a scheme restores higher frequencies, prestack data themselves maintain an overly smoothed appearance similar to beamformed data after local stacking. An oversmoothed appearance is not liked by data processors. Such behavior is readily understood because the input is stacked amplitude, not the raw amplitude. Please recall that local stacking approaches often require apertures of 100–400 m with hundreds or even thousands of traces in an ensemble (Bakulin et al., 2018b, 2020a). Although such apertures appear necessary for processors to restore coherency (phase cleanup), they seem excessive for the amplitudes that become oversmoothed. Another reason to remain cautious about deterministic compensation is to recall that equation 13 relies on several idealistic assumptions that may be difficult to verify for real data. Inverse Q filtering applies corrections to the original input traces, not the locally stacked pilots. They perform smooth energy adjustments within the original amplitude spectra. They do not eliminate any local details from the input data themselves. In the case of speckle noise, we have already introduced phase masks to correct for local phase perturbations. PCM preserves many local details of the phase and only interferes where noise leads to a polarity change. Could there be a similar approach to locally “clean” the original amplitude spectra from the damaging effects of the noise instead of substituting the averaged stack amplitude? Such an approach is perfectly feasible with the amplitude time-frequency masking that follows in the footstep of speech processing. Next, we describe one new method that designs a local amplitude STFM mask applied to the raw amplitude TF spectra of the original data. Similar to PCMs, STFM amplitude masks only use beamformed data as a guide for designing the local filter. Likewise, deterministic compensation can be additionally used for improved signal estimation during mask construction. However, in the end, the final STFM is applied to the original raw amplitude before beamforming. Amplitude TFM and minimum statistics approach to suppress seismic noise Some early amplitude masks were so-called ideal binary masks where TFM = 1 in speech-dominated areas and TFM = 0 in noisedominated areas (Yilmaz and Rickard, 2004; Kim, 2022). Such Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 STFM for speckle noise suppression binary or “hard” masks are not helpful for seismic processing where signal and noise overlap in most areas. Later studies proposed a variety of “soft” masks where 0 < TFM < 1 (Liang et al., 2013; Kim, 2022). One of the most widespread soft amplitude TFM is the ideal rationale mask (IRM). One possible realization of IRM is defined as sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi jSest ðk; lÞj2 IRMðk; lÞ ¼ ; jSest ðk; lÞj2 þ jN est ðk; lÞj2 (24) where jSest ðk; lÞj2 and jN est ðk; lÞj2 are the power spectra of signal and noise, respectively. One can easily observe that IRM could be written in terms of a localized signal-to-noise ratio (S/N). In other words, the IRM can be considered the instantaneous version of the Wiener filter. IRM design requires an evaluation of the signal and noise power spectrum (PS). The speech-processing community has developed multiple noise estimation algorithms. One popular algorithm, called the minimum statistics (MS) approach, was pioneered by Martin (1994) and enhanced by Doblinger (1995) and Martin (2001), among others. When the noise PS is estimated, the clean speech PS is calculated by spectral subtraction (Boll, 1979) as jSest ðk; lÞj2 ¼ jXðk; lÞj2 − jN est ðk; lÞj2 . Using these localized estimates of signal and noise, IRM (equation 24) can be computed, and the signal estimation performed using equation 16 with the IRM mask as an amplitude TFM. This represents a standard scheme used in speech signal processing (Kim, 2022). Such an approach is completely data-driven because no prior knowledge about the speech is required. We stress that phases are not corrected as being perceived as unimportant for noise reduction in speech (Wang and Lim, 1982). The MS method is grounded on two experimental observations (Martin, 1994, 2001): • • Noise and clean speech (signal) are commonly statistically independent. As a result, the PS of the noisy signal can be represented as a sum of the power spectra of clean speech and noise: jXðk; lÞj2 ¼ jSðk; lÞj2 þ jNðk; lÞj2 . The noise PS could often become equal to that of noisy speech. For example, this occurs during speech pauses. Likewise, this also happens between words and syllables. As a result, the PS of the noise can be approximated by evaluating the minimum energy of the original noisy speech inside each frequency bin. V377 significant bias downward and accelerating loss of higher frequencies, as schematically shown in Figure 5a. Such a behavior is a manifestation of the multiplicative speckle noise described by equation 13. We postulate that the beamformed spectrum in red is certainly a signal. In contrast to speech processing, we apply the MS approach to the difference between the original and beamformed power spectra: jRðk; lÞj2 ¼ jXðk; lÞj2 − jSðk; lÞj2 : (25) Any negative values occurring after spectral subtraction are zeroed out to ensure jRj2 ≥ 0. We may refer to jRðk; lÞj as a residual PS because it contains a significant amount of attenuated signal mixed with noise. Therefore, we shall try to recover the residual signal from this mixture and then recombine it with the beamforming amplitude. In other words, we shall try to carve out a portion of that shaded difference (Figure 5a) due to the signal. A more advanced version of this step is to replace S with Scorr from equation 23: Figure 4. Seismic trace with (in red) and without (in blue) added white Gaussian noise: (a) large window 1–4 s; (b) magnified into smaller window highlighting intervals where the signal amplitude is close to zero, whereas the noise amplitude dominates. Such intervals or “breaks” can be analogous to pauses in speech, providing a local estimate of the noise. One can make an analogy between a speech record and a seismic trace. As in speech, many target arrivals are separated by calm periods (Figure 4). Therefore, we can empirically justify the second assumption from the MS method for seismic data processing. Inspired by speech processing, we suggest the subsequently modified workflow that uses a beamformed prestack volume as an initial guess and as a guide for an improved signal amplitude estimation. Step 1. Windowed Fourier transform Here, xðtÞ and sðtÞ are transformed into a TF domain using STFT. Step 2. Prepare input for MS Bakulin et al. (2022a) present the typical spectra of real data before and after beamforming. The beamformed spectrum experiences Figure 5. (a) Typical amplitude spectra before and after beamforming with reduced overall levels and accelerating loss of higher frequencies (Bakulin et al., 2022a) with the shaded area graphically showing residual PS jRðk; lÞj; an input to the minimum statistic method; (b) an improved estimate of the residual spectrum jRcor ðk; lÞj obtained after performing deterministic speckle compensation using equation 23. Bakulin et al. V378 Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 jRcorr ðk; lÞj2 ¼ jXðk; lÞj2 − jScorr ðk; lÞj2 : (26) In essence, we assume that local stacking suppressed the additive noise but suffered from exponential decay (equation 23) which we attempt to undo to derive a better mask. Such an additional step requires an a priori estimation of speckle noise parameters, i.e., standard deviations σ φ and σ τ from equation 23. More studies are required to directly establish the derivation of speckle noise parameters from the data. In this study, a simple trial-and-error estimation is performed to ensure that the corrected beamformed spectrum in Figure 5b remains reasonably flat and below the raw spectra. Such a ballpark estimation is similar to attenuation estimates invoked in the inverse Q filtering (Wang, 2008b). Data examples reveal that a reasonably wide range of σ φ and σ τ leads to similar signal estimates after STFM, suggesting the stability and low sensitivity of this additional step. However, further speckle compensation done during mask estimation leads to an improved perception by processors compared with the basic version using equation 25. Step 4. Reestimation of the signal power Reevaluate the signal PS by spectral subtraction with a revised noise estimate from the previous step jS~ est ðk;lÞj2 ¼ jXðk;lÞj2 −jN est ðk;lÞj2 . Step 5. Signal estimate smoothing At each frequency k, perform smoothing of jS~ est ðk; lÞj2 in the time direction using simple rank one recursion: jSest ðk; lÞj2 ¼ βjSest ðk; l − 1Þj2 þ ð1 − βÞjS~ est ðk; lÞj2 , β ¼ 0.7 − 0.95. Such smoothing is typical for speech processing (Martin, 1994, 2001), and we also find it useful for seismic data. Step 6. Calculation and application of the amplitude IRM mask Compute IRM (equation 24) using estimated jSest ðk; lÞj2 and jN est ðk; lÞj2 . Then, use equation 16 to perform filtering of the original trace TF spectrum using IRM. Step 3. Noise estimation with the MS method Step 7. Inverse windowed Fourier transform To estimate noise PS, the MS approach is applied to jRðk; lÞj2 (or jRcorr ðk; lÞj2 ). MS approach may be described in a simplified form as follows. At each frequency k 0, the PS of the noise is approximated as jN est ðk 0 ; lÞj2 ¼ min jRðk 0 ; l  ΔtMS Þj2 . In other words, noise is assumed constant and approximated by the minimum energy value in the interval with a width of 2ΔtMS . The noise is independently estimated at each frequency k. The time interval length used for the minimum tracking in step 3 of the workflow is an important parameter. In Figure 6, the residual PS of a field trace for one particular frequency of 20 Hz (in blue) is presented. For comparison, we display two versions of the noise PS for an IRM construction with different time intervals of 40 ms (in red) and 120 ms (in black) used for minimum tracking. If a shorter time interval is used, noise PS estimation will be closer to residual PS. This is because more residual PS components will be treated as noise. As a result, IRM will provide a result close to the initial signal guess (beamformed data). On the other hand, suppose a more extended time interval is used for minimum tracking. In that case, the output result of IRM filtering will be closer to the original data. In other words, IRM will pass more residual PS components to the output. Using a cleaned amplitude spectrum, recover the time-domain ^ lފ. ^ ¼ ISTFT½Sðk; signal estimate by applying an inverse STFT sðtÞ In contrast to speech processing, where TFM is directly applied to the noisy signal PS, STFM utilizes the MS method on the residual PS. Such an adaption incorporates robust a priori information available for multichannel data. Specifically, local stacking provides a reliable average signal estimate (Bakulin et al., 2018a, 2020a) that could be further corrected and enhanced using equation 23 to compensate for speckle noise. Therefore, it is reasonable to assume that a beamformed PS has to be fully included in the final signal estimate and supplemented by a residual recovered signal from the difference (equations 25 and 26). Let us demonstrate that such a goal is achieved with the proposed modification of the minimum statistic approach. To do this, let us consider two extreme cases of ΔtMS → 0 and ΔtMS → ∞ in more detail. If ΔtMS → 0, then the entire shaded area in Figure 5 is considered an additional residual signal and noise estimate in the MS approach jN est ðk 0 ; lÞj2 → 0. Therefore, IRM ¼ 1 in equation 16, which implies that IRM does not modify the amplitude spectra of the input data. As a result, we arrive at the data after the phase masking described by equation 21. If ΔtMS → ∞, then the entire PS spectra difference jRðk; lÞj2 ¼ jXðk; lÞj2 − jSðk; lÞj2 is considered as noise throughout the entire MS workflow. After signal PS reestimation (step 4), we obtain jS~ est ðk; lÞj2 ¼ jXðk; lÞj2 − jXðk; lÞj2 þ jSðk; lÞj2 ¼ jSðk; lÞj2 : Amplitude TFM will have a form IRM ¼ jSðk; lÞj=jXðk; lÞj. Substitution in equation 1 gives ^ lÞ ¼ IRMðk; lÞ · Xðk; lÞ ¼ ðjSðk; lÞj=jXðk; lÞjÞjXðk; lÞj Sðk; × expðiφx ðk; lÞÞ ¼ jSðk; lÞj expðiφx ðk; lÞÞ: Figure 6. Residual PS computed for a field trace at a frequency of 20 Hz (shown in blue) and noise power spectra estimated by tracking the minimum energy in 40 ms (in red) and 120 ms (in black) intervals. (27) In other words, only the beamformed spectrum is recognized as a final signal estimate. In contrast, the PS difference spectrum (the shaded area in Figure 5) is assumed to have zero residual signal. Therefore, the amplitude STFM mask, by design, provides an output amplitude spectrum (estimated signal) that is situated between beamformed (in red) and original spectra (in blue) in Figure 5. STFM for speckle noise suppression V379 amplitude mask’s ability to accurately recover attenuated higher frequencies. Furthermore, the IRM mask is data-driven and does not rely on any specific knowledge about the speckle noise model. Phase masking improvements have been demonstrated on synLet us apply STFM to such a controlled synthetic example with thetic data obtained with FD modeling with a cluttered near-surface random multiplicative and also additive noises. Figure 7a displays the layer (Bakulin et al., 2020b, 2020c, 2022a). FD modeling replicated Klauder wavelet that represents the autocorrelation of the typical viall three experimental observations seen on the field data (Bakulin broseis sweep in land seismic data. Sweep parameters are υ1 ¼ 2 Hz et al., 2022a). PSM and PCM heal the distortions to a large extent (minimum frequency), υ2 ¼ 80 Hz (maximum frequency), T ¼ 12s and improve coherency. However, in those cases, there was no (time duration). Blackman window function was used as a taper. The ground truth to compare. Although we can see fewer distortions amplitude spectrum is a nearly constant function between 5 and after STFM, the resulting wavefield does not correspond to data 75 Hz (Figure 7b). The total time duration of the wavelet is approxobtained with a homogeneous near-surface layer. Instead, the corimately 160 ms. rected wavefield corresponds to new data in a “smoother” model Figure 8a shows a clean signal trace with three distinct arrivals, with some smaller heterogeneities removed but the remaining each described by a Klauder wavelet of different polarities. A clean medium-scale heterogeneity preserved. We cannot obtain a mathdata ensemble consists of 100 repetitions of this trace resulting in ematical description for this smoother model at this stage and hence three flat events embedded within the 5 s traces. To simulate realistic non-surface-consistent seismic speckle noise, each 100 ms can only perform a qualitative assessment. window from the clean ensemble was subject to a combination However, such a quantitative assessment is highly desired to asof multiplicative noises of types 1 and 2 described by equations 3 sess the performance of the newly proposed amplitude masking and 11, respectively. We stress that each channel experiences a scheme. As a result, in this study, we focus on a controlled synthetic unique realization of types 1 and 2 multiplicative noises. Likewise, example where multiplicative noise is introduced using equations 3 different time windows of a fixed channel also experience various and 11 so the results can be more easily judged. However, in conrealizations of multiplicative noises. Only statistical parameters, trast to previous studies, our primary focus is validating the IRM e.g., standard deviations σ φ and σ τ , remain constant throughout the ensemble. Finally, additive Gaussian white noise with the level of −1 dB is also superimposed (different for each channel). As a result, contaminated traces exhibit variable distortions for each of the three arrivals making them difficult to recognize on a single trace (Figure 8b) and are easier to identify in the ensemble (Figure 8c, only 50 traces out of 100 are shown). In addition, signals experience not only a variation in the arrival time but also an evident rapid wavelet variation due to phase perturbations (Figure 8c). Let us estimate the clean signal and contrast Figure 7. Klauder wavelet used in the synthetic experiments: (a) time domain and beamforming approach versus STFM using a (b) amplitude spectra. Zero-phase Klauder wavelet represents autocorrelation of a typ100-trace noisy ensemble as an input. STFM ical vibroseis sweep used in land seismic data. consists of the PSM (equation 19) and IRM Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 SYNTHETIC EXAMPLE WITH RANDOM MULTIPLICATIVE AND ADDITIVE NOISE Figure 8. (a) Clean input trace (in blue) consisting of three events, each with a Klauder wavelet; (b) after contamination with multiplicative and additive noise (in green), signals become scrambled and more difficult to locate; and (c) final ensemble of traces used for STFM application. Multiplicative noises are random and nonstationary, with parameters σ τ ¼ 4 ms and σ φ ¼ π=3, whereas Gaussian white noise has −1 dB. Red beamformed trace on (a) has lost higher frequencies, but all three events have become accurately positioned, validating phase cleanup during stacking. Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 V380 Bakulin et al. amplitude mask based on the preceding MS approach. Figure 9 displays two different signal estimates in three different domains. The clean signal (ground truth) is shown in blue. Arrivals of the noisy trace shown in green (Figure 9a or single realization from Figure 8c) are shifted by variable (non-surface-consistent) residual statics and damaged by unique (for each arrival) phase distortions and additive noise. As for the phase, Figure 9c validates that local stacking performs excellent phase “cleanup,” i.e., the stacked phase approaches the true signal phase (without multiplicative distortions) as predicted by theory (Bakulin et al., 2022a). In terms of amplitude, the beamforming estimates in red are clearly suboptimal (Figure 9a): the amplitude of all three arrivals is diminished, whereas vertical resolution is low. The same conclusion is evident in Figure 9b, which reveals the accelerating loss of higher frequencies due to residual statics. The STFM estimate (in black) corrects for residual statics, restores the amplitude level, and returns higher frequencies while also suppressing additive noise (Figure 9a and 9b). Moreover, these goals are perfectly achieved for all three target events despite time and spatially varying multiplicative noises (Figure 9a). Figure 9b shows that the amplitude spectra of the noisy input and clean signal have similar magnitudes making it impossible to distinguish the signal from the noise. It also highlights the treacherous role of the multiplicative speckle noise that redistributes the energy and scrambles the time-domain representation (Figure 8b and 8c). Unlike absorption, which attenuates higher frequencies directly on each measured trace, in the speckle noise case, “attenuation” only happens when ensembles are stacked to generate local pilots (Figures 2 and 5). The practical dilemma is as follows. Speckle noise complexity and unpredictability severely limit signal identification on individual traces (see Figure 8c; Bakulin et al., 2022a). In contrast, the signal can be extracted through the local stacking of ensembles. However, stacking introduces exponential attenuation of the signal amplitudes as per equations 10 and 12. STFM performs a detective balancing act of locating the signal and undoing the effects of multiple scattering on a single trace by assimilating the cumulative information from nearby traces. Recall that nearby traces experience conceptually similar distortions but with entirely different random realizations. STFM can be analogous to restoring a broken vase from small fragments. In the case of additive noise, the only challenge is to identify the order of the pieces that would fit and restore the vase. In the case of multiplicative noise, it is as if each chip underwent random scraping or alteration on all sides, making it challenging to fit them into their intended positions. Therefore, one has to reinstate each fragment to its original shape with additional material or glue to make a complete vase without holes. Beamforming identifies the order of the damaged pieces, whereas STFM, guided by the established order, goes on to reverse the local alterations to make the fragments fit again. Because we know the true signal in the synthetic example, the S/N for each estimation step can be computed using a simple subtraction of the clean signal as a direct measure of noise. The S/N increases from −3.6 dB on the input to 3.4 dB on the beamforming Figure 9. Comparison of (a) time domain, (b) amplitude spectra, and (c) phase spectra for a synthetic example. The green curve corresponds to one of the input traces from the contaminated input ensemble (Figure 8c). Signal estimate from beamforming (the red) suffers from reduced amplitude and attenuated higher frequencies. STFM estimate (the black) recovers a closer amplitude estimate and restores higher frequencies as observed in (a) and (b). The phase spectrum obtained after local stacking provides an excellent approximation to the clean signal phase. Note that on (c), the STFM and beamformed phase curves (black and red, respectively) are identical, making only the red curve visible. The STFM parameters are as follows: STFT window length 160 ms, window shift 12 ms (92.5% window overlap), minimum statistic window for noise estimation 2ΔtMS ¼ 24 ms, β ¼ 0.5, and beamformed spectrum was corrected using equation 23 with σ τ ¼ 4 ms and σ φ ¼ π=4 and used for the MS estimate (equation 26). Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 STFM for speckle noise suppression estimate and 7 dB on the STFM estimate. In addition to a higher S/N, STFM delivers the tremendous new benefit of a full frequency band. The success of the IRM mask (equation 24) hinges on accurately estimating the noise and signal’s local TF amplitude spectra. Because such estimation is purely data-driven in the MS approach, let us evaluate how well the MS works on this controlled example. Figure 10 compares ground-truth noise spectra with the MS estimates. The blue curve shows the TF slice of the ground-truth synthetic noise for one of the traces computed at 20 Hz. Ground truth is calculated by subtracting the known clean signal spectrum (Figure 8a) from the noisy input (Figure 8b). It absorbs multiplicative and additive noise components. Other curves in Figure 10 show noise estimates obtained at step 3 of the MS flow with various sizes of estimation half-window ΔtMS. The first conclusion is that all estimates provide a relatively good tracking of the actual total noise in the amplitude spectra. The second conclusion in this synthetic example is that noise estimation accuracy improves when the window shrinks. In the following two sections, let us validate STFM performance on real data using two different 3D land data sets from the desert environment. Figure 10. Comparison of ground-truth amplitude TF slice at 20 Hz (the blue) versus various estimates obtained using the MS approach with different window lengths 2ΔtMS (the other colors). V381 FIRST REAL DATA EXAMPLE FROM A DESERT ENVIRONMENT (PROCESSING) The first example is from the late processing stage of legacy data with 72-geophone groups. The data set was subject to traditional time processing with noise attenuation, statics, deconvolution, etc. However, reflected events are still faint. Representative common-midpoint (CMP) gather after NMO is displayed in Figure 11a. Traces from all azimuths are depicted as a function of the offset. Severe scattering noise results in largely invisible reflections with low coherency, similar to Figure 8c. Apart from a low S/N, this gather exhibits a nearoffset noise cone with a visible imprint of high amplitudes likely left after conventional processing (denoted by the white oval). Figure 11b shows the same CMP gather after enhancement with NLBF. Local ensembles of approximately 200 neighboring traces are used during beamforming with the apertures approximately 150 m × 150 m in CMP and offset domains. After beamforming, reflections become trackable in the entire offset range. However, the high-frequency content of the signal is attenuated due to speckle scattering noise, and amplitudes are considerably decreased. Figure 11c shows input data after applying only the PCM from equation 20. The coherency of reflections improves, but the noise cone imprint persists. In contrast, Figure 11d shows the data after applying only an amplitude IRM mask (equation 24) based on the MS approach with ΔtMS ¼ 40 ms. Reflections start to pop up more clearly; however, wavelet and phase distortions cannot be fixed by IRM. A combination of the phase (PCM) and amplitude (IRM) masks provides the most optimal result (Figure 11e). Let us examine the effect of each mask in more detail using a smaller window between 1.4 and 3.0 s from a processed CMP gather. Input data after standard processing remain distorted in terms of amplitude and phase (Figure 12a). The trapezoid marks the noise cone’s approximate location with an elevated amplitudes imprint after prior processing. Beamforming provides an excellent estimation of the signal resulting from local summation, but it is Figure 11. CMP gather (after NMO correction) processed by various approaches: (a) original gather after conventional processing; (b) gather after NLBF; (c) gather after applying PCM only (no amplitude mask); (d) gather after applying amplitude mask only (no phase corrections); (e) gather after applying amplitude and phase correction masks (PCM + IRM). The residual noise cone is encircled. STFM parameters are as follows: STFT window length 160 ms, window shift 12 ms, minimum statistic window for noise estimation 2ΔtMS ¼ 24 ms, β ¼ 0.5, and beamformed spectrum was corrected using equation 23 with σ T ¼ 4 ms and σ φ ¼ π=4 and used for the MS estimate (equation 26). Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 V382 Bakulin et al. overly smooth. As a result, the noise cone is largely eliminated (Figure 11b). Application of the PCM considerably improves coherency (Figure 12c), but the amplitude imprint remains untouched, as expected. Amplitude-only mask (IRM) can improve the S/N, so signal amplitudes stand up more clearly in places where the phase coherency is already acceptable (Figure 12b). However, IRM cannot improve coherency where significant phase distortions break it. Therefore, IRM alone does little to improve reflected events continuity in this example. Complete STFM application with phase and amplitude masks achieves the most optimal result (Figure 12d). STFM with PCM + IRM clears the noise cone and makes reflections coherent without oversmoothing. The final critical benefit of the STFM is the preservation of the higher frequencies, as shown in Figure 13, presenting averaged amplitude spectra of the entire CMP gathers. Beamforming provides the best visual S/N (Figure 11b) but attenuates frequencies above 40 Hz and oversmooths the data. STFM does less harsh noise suppression with less smoothing (Figure 11e) while keeping much more signal amplitudes above 40 Hz (Figure 13, in black). Band-pass-filtered gathers (40–80 Hz) in Figure 14 prove that energy preserved by STFM above 40 Hz does represent the reflected signals. Indeed, they are perfectly consistent with events previously recovered through NLBF with dominant frequencies below 40 Hz. At first glance, low amplitudes of the beamformed data above 40 Hz may imply a less reliable phase recovery at higher frequencies. However, such skepticism is not justified. First, equations 12–15 confirm that drastic amplitude roll-off is fully expected in the presence of speckle scattering noise. Second, controlled numerical experiments corroborate that the estimation of the clean signal phase through stacking remains robust up to the highest frequency in the spectra (Bakulin et al., 2022a). Figure 12. The target window from the NMO-corrected CMP gathers: (a) original data after conventional processing; (b) data after application of the IRM amplitude mask (no phase corrections); (c) data after PCM; (d) data after applying amplitude IRM and phase PCM masks. Amplitude mask only improves the S/N in places with good coherency as seen on (b). In contrast, phase mask repairs the coherency damaged by speckle scattering noise as observed on (c). Complete STFM with phase and amplitude masks delivers the most dramatic improvements as observed on (d), because improved coherency reveals increased S/N uniformly across the entire window, including places where IRM alone, seen on (b), did not appear to achieve good denoising. In addition, STFM successfully reduces the impact of the residual noise cone, as observed on the original data in (a) and outlined by the trapezoid. Figure 13. Comparison of average amplitude spectra from prestack gathers from Figure 11a, 11b, and 11e. Time (in blue) is shown in Figure 11. Observe a typical amplitude drop of approximately −12 dB and exponential loss of higher frequencies on beamformed data similar to the sketch of Figure 5a. STFM mitigates the amplitude drop and preserves higher frequencies. REAL DATA EXAMPLE FROM A DESERT ENVIRONMENT (IN-FIELD QUALITY CONTROL) The second example uses a 3D land data set and focuses on in-field quality control (QC). The 3D data were acquired using orthogonal geometry. Receiver sampling is 25 m along the lines and 150 m between the lines. Likewise, shot sampling is 25 m along the source lines and 100 m between the lines. Despite using nine geophone arrays and two vibrators in a source group, data quality is extremely challenging due to high levels of near-surface scattering noise (Figure 15a). As a result, in-field QC using the traditional approach is ineffective. Gathers after linear noise removal alone (Figure 15a) do not indicate the presence of the reflected events at either primary or secondary targets. Likewise, the brute stack produced from such data shows a highly speckled appearance with few faint reflectors (Figure 16a). Furthermore, in certain parts of the image (the right and bottom sides) of Figure 16a, reflectors disappear beneath intense scattering noise. Please note that the generation of brute stack hinges on legacy NMO functions from a previous survey. In contrast to an earlier example, the application of STFM here employs the most straightforward local stacking approach of supergrouping (Bakulin et al., 2018a) which also uses the same legacy NMO for local summation. After 4D supergrouping with the apertures of 600 m × 600 m, we observe clear reflected events throughout the entire time span (Figure 15b). However, amplitude spectra inserts reveal the domination of low frequencies with the rapid decay of higher frequencies. Similar behavior is observed on the brute stack of supergrouped data (Figure 16b). Most reflection events are lit up but dominated by lower frequencies (Figure 17, the red curve). The application of STFM with the PSM and IRM results in cleaned gathers that preserve Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 STFM for speckle noise suppression higher frequencies, but the existing noise is not completely eliminated (Figure 15c). Likewise, the brute stack of data after STFM shows significant improvement in vertical resolution (Figure 16c). Brute stack with STFM reveals a much more balanced spectrum (Figure 17, the black curve) that is below the original data (in blue) and above the beamformed spectrum (in red). Because legacy NMO information is almost always available, such a data-driven workflow with STFM based on supergrouping can be easily implemented in the field. Supergrouping is the simplest form of local stacking that is computationally inexpensive. It does not perform an estimation of local coherency as implemented in more advanced methods deployed for final processing, such as, for example, NLBF (Bakulin et al., 2020a). As a result, a new in-field QC workflow V383 with STFM can robustly answer critical questions for real-time QC of prestack and stacked data with low S/N: • • • • Are prestack reflections present from primary and secondary targets? What is the achieved frequency band for those signals? Can a robust time stack be produced from the specific field acquisition geometry at hand? What is the processable frequency band for the time window of the primary and secondary target? Any in-field QC based on depth imaging suffers similarly from scattering noise (Bakulin et al., 2022a). Therefore, preprocessing Figure 14. Same as Figure 12 but after band-pass filtering 40–80 Hz. Coherent reflections are revealed in (b–d) that remain untrackable in (a) due to seismic speckle noise. Reflections on (b–d) match events shown in Figure 12b–12d, respectively. The fact that band-passed reflections stand out on their own suggests that STFM achieved the goal of recovering a signal above 40 Hz, where beamforming struggled. Figure 15. Comparison of prestack seismic gathers at different stages: (a) raw field data, (b) data after applying 4D supergrouping with apertures of 600 m × 600 m, and (c) data after STFM with a phase substitution PSM mask and IRM amplitude mask. The inserts show amplitude spectra with their own normalization. Local stacking with supergrouping reveals underlying reflections but suffers from the loss of higher frequencies, as seen on (b). STFM greatly improves coherency and preserves higher frequencies while avoiding amplitude oversmoothing typical for local stacking. STFM parameters are as follows: STFT window length 160 ms, window shift 12 ms, minimum statistic window for noise estimation 2ΔtMS ¼ 24 ms, β ¼ 0.5, and beamformed spectrum was corrected using equation 23 with σ τ ¼ 4 ms and σ φ ¼ π=4 and used for MS estimate (equation 26). Figure 16. Brute stacks produced with different processing: (a) standard approach with only linear noise removal, (b) same but with additional supergrouping, and (c) same as (a) but with the additional application of STFM. Observe the speckled appearance on (a) due to massive scattering noise. Supergrouping and STFM provide significant improvements in coherency, as shown in (b) and (c); however, STFM preserves higher frequencies and avoids oversmoothing. Bakulin et al. Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 V384 Figure 17. Averaged amplitude spectra of the brute stacks from Figure 16. Similar to the sketch in Figure 5a, a massive amplitude drop and accelerating loss of higher frequencies are observed on locally stacked data (the red). STFM (the black) partially compensates for the amplitude loss and retains the higher frequencies. data with the STFM workflow is equally applicable and valuable for any QC based on depth imaging. We do not claim that STFM flow replaces the standard time or depth processing. However, we demonstrate that STFM flow enables significantly more robust fasttrack processing by addressing scattering noise. Such an advantage is of paramount importance in geologic environments with a complex near surface containing abundant small-scale scatterers; such is the case for the desert environment of the Middle East (Ramdani, 2022). Data with small arrays and single sensors from the desert environment are characterized by low coherency and a greatly reduced S/N of −30 to −60 dB or less (Bakulin et al., 2022c). Speckle noise is the primary culprit for such behavior. Therefore, doing QC, processing, or imaging of such data require algorithms such as STFM to address scattering noise currently not cured by the standard processing toolbox. After STFM, such data become processable with more traditional workflows. CONCLUSION We introduced a new STFM approach to suppress multiplicative speckle noise caused by small-scale near-surface scattering. We focused on two types of multiplicative noise: residual statics and random phase perturbations. Although residual statics introduces an arrival time variation, phase perturbations distort the waveforms and break the event coherency. In the absence of a priori noise information, it is almost impossible to correct for multiplicative noise, thus severely limiting the efficiency of standard processing flows relying on local coherency and wavelet stability along the events. Local stacking of nearby traces suppresses multiplicative noise and provides a reliable estimate of the clean signal phase. However, stacked amplitude suffers from the accelerating loss of higher frequencies and oversmoothing. Because scattering noise is timevarying and non-surface-consistent, local stacking in small windows along the reflected event provides the required prerequisites to attack multiplicative speckle noise. It delivers local pilots at every point of the 3D prestack volume of interest. Those pilots change with the time and spatial coordinates, providing local “guides” that could be used to remove multiplicative noise. The amalgamation of these pilots was a previous best estimate of the enhanced data through local stacking or beamforming. If only additive noise is of concern, such a local stacking estimate is optimal. However, in the presence of multiplicative noise, the beamforming pilots are compromised because the signal phase and amplitude follow a different transformation. Their independent unraveling is required to undo speckle noise effects. Therefore, we invoke time-frequency masking from speech processing to perform separate manipulations of phase and amplitude TF spectra. Because the previous study proved that the stacked phase provides an estimate of the clean signal phase, we introduce two different phase masks. PSM replaces the raw phase with the beamformed phase. A more subtle PCM uses a beamformed phase as a guide to correct the raw phase only for polarity reversals. Phase masking compensated for local distortions caused by speckle noise; however, amplitude noise remains. We further introduce one type of amplitude TF filter called the ideal ratio mask that suppresses noise in the PS. This study only describes the data-driven IRM based on the MS method borrowed from speech processing. We modified it to use beamformed amplitude information as a guide. A combination of phase and amplitude masks using local stacking as the guide was called STFM. Controlled synthetic examples validated the accurate recovery of clean signals in the presence of residual statics, phase perturbations, and additive white noise. In addition, culminating field examples with severe speckle noise from a desert environment revealed the critical role played by STFM in making data processable for conventional algorithms. Just like speech processing, it is possible to introduce many other versions of phase and amplitude masks. However, this study is not focused on covering their variety. Instead, it presents the STFM concept based on an additional guide from local stacking or beamforming. Such a concept and construct are novel for speech and seismic processing. Furthermore, we stress the critical role of phase masks in efficiently counteracting the seismic speckle noise caused by smallscale scattering that existing processing approaches cannot cure. DATA AND MATERIALS AVAILABILITY Data associated with this research are confidential and cannot be released. REFERENCES Abbot, J. G., and F. L. Thurstone, 1979, Acoustic speckle: Theory and experimental analysis: Ultrasonic Imaging, 1, 303–324, doi: 10.1016/01617346(79)90024-5. Baeten, G., and H. van der Heijden, 2008, Improving S/N for high frequencies: The Leading Edge, 27, 144–153, doi: 10.1190/1.2840359. Bagaini, C., and C. Barajas-Olalde, 2004, Assessment and compensation of inconsistent coupling conditions in point-receiver land seismic data: 74th Annual International Meeting, SEG, Expanded Abstracts, 75–78, doi: 10 .1190/1.1845295. Bagaini, C., T. Bunting, A. El-Emam, A. Laake, and C. Strobbia, 2010, Land seismic techniques for high-quality data: Oilfield Review, 22, 28–39. Bakulin, A., M. Dmitriev, D. Neklyudov, and I. Silvestrov, 2020c, Importance of phase guides from beamformed data for processing multichannel data in highly scattering media: The Journal of the Acoustical Society of America, 147, EL447–EL552, doi: 10.1121/10.0001330. Bakulin, A., P. Golikov, M. Dmitriev, D. Neklyudov, P. Leger, and V. Dolgov, 2018b, Application of supergrouping to enhance 3D prestack seismic data from a desert environment: The Leading Edge, 37, 200–207, doi: 10 .1190/tle37030200.1. Downloaded 07/22/23 to 139.64.45.52. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms DOI:10.1190/geo2022-0779.1 STFM for speckle noise suppression Bakulin, A., D. Neklyudov, and I. Silvestrov, 2020b, Prestack data enhancement with phase corrections in time-frequency domain guided by local multidimensional stacking: Geophysical Prospecting, 68, 1811–1818, doi: 10.1111/1365-2478.12956. Bakulin, A., D. Neklyudov, and I. Silvestrov, 2021, Targeted noise removal by seismic time-frequency masking (STFM) and minimum statistics approach: 91st Annual International Meeting, SEG, Expanded Abstracts, 2879–2884, doi: 10.1190/segam2021-3581846.1. Bakulin, A., D. Neklyudov, and I. Silvestrov, 2022a, Multiplicative seismic noise caused by small-scale near-surface scattering and its transformation during stacking: Geophysics, 87, no. 5, V419–V435, doi: 10.1190/ geo2021-0830.1. Bakulin, A., D. Neklyudov, and I. Silvestrov, 2022b, Seismic speckle as multiplicative noise explaining land reflections distorted by near-surface scattering: 92nd Annual International Meeting, SEG, Expanded Abstracts, 2601–2605, doi: 10.1190/image2022-3735790.1. Bakulin, A., I. Silvestrov, M. Dmitriev, D. Neklyudov, M. Protasov, K. Gadylshin, and V. Dolgov, 2020a, Nonlinear beamforming for enhancement of 3D prestack land seismic data: Geophysics, 85, no. 3, V283– V296, doi: 10.1190/geo2019-0341.1. Bakulin, A., I. Silvestrov, M. Dmitriev, D. Neklyudov, M. Protasov, K. Gadylshin, V. Tcheverda, and V. Dolgov, 2018a, Nonlinear beamforming for enhancing prestack data with challenging near surface or overburden: First Break, 36, 121–126, doi: 10.3997/1365-2397.n0143. Bakulin, A., I. Silvestrov, and M. Protasov, 2022c, Signal-to-noise ratio computation for challenging land single-sensor seismic data: Geophysical Prospecting, 70, 629–638, doi: 10.1111/1365-2478.13178. Berni, A. J., and W. L. Roever, 1989, Field array performance: Theoretical study of spatially correlated variations in amplitude coupling and static shift and case study in the Paris Basin: Geophysics, 54, 451–459, doi: 10.1190/1.1442671. Boll, S. F., 1979, Suppression of acoustic noise in speech using spectral subtraction: IEEE Transactions on Acoustics, Speech, and Signal Processing, 27, 113–120, doi: 10.1109/TASSP.1979.1163209. Buzlukov, V., and E. Landa, 2013, Imaging improvement by prestack signal enhancement: Geophysical Prospecting, 61, 1150–1158, doi: 10.1111/ 1365-2478.12047. Cary, P. W., and G. A. Lorentz, 1993, Four-component surface-consistent deconvolution: Geophysics, 58, 383–392, doi: 10.1190/1.1443421. Damerjian, V., O. Tankyevych, N. Souag, and E. Petit, 2014, Speckle characterization methods in ultrasound images — A review: IRBM (Innovation and Research in BioMedical Engineering), 35, 202–213, doi: 10 .1016/j.irbm.2014.05.003. Doblinger, G., 1995, Computationally efficient speech enhancement by spectral minima tracking in subbands: Proceedings of the 4th European Conference Speech, Communication, and Technology, 1513–1516. Erdogan, H., J. R. Hershey, S. Watanabe, and J. Le Roux, 2015, Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks: IEEE International Conference on Acoustics, Speech and Signal Processing, 708–712. Fink, M., and A. Derode, 1998, Correlation length of ultrasonic speckle in anisotropic random media: Application to coherent echo detection: The Journal of the Acoustical Society of America, 103, 73–82, doi: 10.1121/1.421080. Goodman, J. W., 2020, Speckle phenomena in optics: Theory and applications, 2nd ed.: SPIE Press, 468. V385 Ji, Y., E. Kragh, and C. Bagaini, 2010, Noise attenuation methods for pointreceiver land seismic data: 80th Annual International Meeting, SEG, Expanded Abstracts, 3545–3549, doi: 10.1190/1.3513586. Khalil, A., and N. Gulunay, 2011, Intra array statics (IAS) in the crossspread domain: 73rd Annual International Conference and Exhibition, EAGE, Extended Abstracts, I037, doi: 10.3997/2214-4609.20149308. Kim, G., 2022, Review of time–frequency masking approach for improving speech intelligibility in noise: IETE Technical Review, 39, 623–634, doi: 10.1080/02564602.2021.1886610. Koutsogiannaki, M., O. Simantiraki, G. Degottex, and Y. Stylianou, 2014, The importance of phase on voice quality assessment: 15th Annual Conference of the International Speech Communication Association, 1653–1657. Liang, S., W. Liu, W. Jiang, and W. Xue, 2013, The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio: The Journal of the Acoustical Society of America, 134, EL452–EL458, doi: 10.1121/1.4824632. Lichman, E., 1999, Automated phase-based moveout correction: 69th Annual International Meeting, SEG, Expanded Abstracts, 1150–1153, doi: 10.1190/1.1820706. Martin, R., 1994, Spectral subtraction based on minimum statistics: Proceedings of 7th European Signal Processing Conference, 1182–1185. Martin, R., 2001, Noise power spectral density estimation based on optimal smoothing and minimum statistics: IEEE Transaction on Speech and Audio Processing, 9, 504–512, doi: 10.1109/89.928915. Meunier, J., 2011, Seismic acquisition from yesterday to tomorrow: 2011 distinguished instructor short course: SEG, doi: 10.1190/1 .9781560802853. Oppenheim, A. V., and J. S. Lim, 1981, The importance of phase in signals: Proceedings of the IEEE, 69, 529–541, doi: 10.1109/PROC.1981.12022. Ramdani, A., 2022, High-fidelity outcrop-analog model of the Hanifa Reservoir: Ph.D. thesis, King Abdullah University of Science and Technology. Regone, C., M. Fry, and J. Etgen, 2015, Dense sources vs. dense receivers in the presence of coherent noise: A land modeling study: 85th Annual International Meeting, SEG, Expanded Abstracts, 12–16, doi: 10.1190/ segam2015-5833924.1. Reilly, J. M., A. P. Shatilo, and Z. J. Shevchek, 2010, The case for separate sensor processing: Meeting the imaging challenge in a producing carbonate field in the Middle East: The Leading Edge, 29, 1240–1249, doi: 10 .1190/1.3496914. Stork, C., 2020, How does the thin near surface of the earth produce 10–100 times more noise on land seismic data than on marine data? First Break, 38, 67–75, doi: 10.3997/1365-2397.fb2020062. Wang, D., 2008a, Time-frequency masking for speech separation and its potential for hearing aid design: Trends in Amplification, 12, 332–353, doi: 10.1177/1084713808326455. Wang, D., and J. S. Lim, 1982, The unimportance of phase in speech enhancement: IEEE Transactions on Audio Speech Language Processing, 30, 679–681, doi: 10.1109/TASSP.1982.1163920. Wang, Y., 2008b, Seismic inverse Q filtering: Wiley-Blackwell, 248. Yilmaz, O., and S. Rickard, 2004, Blind separation of speech mixtures via time-frequency masking: IEEE Transactions on Signal Processing, 52, 1830–1847, doi: 10.1109/TSP.2004.828896. Biographies and photographs of the authors are not available.