Fundamentals of Geophysical Data Processingcap1
The first step in data analysis is to learn how to represent and manipulate waveforms
in a digital computer. Time and space are ordinarily regarded as continuous, but for
purposes of computer analysis we must discretize them. This discretizing is also
called digitizing or sampling. Discretizing continuous functions may at first be
regarded as an evil that is necessary only because our data are not always known
analytic functions. However, after gaining some experience with sampled func-
tions, one realizes that many mathematical concepts are easier with sampled time
than with continuous time. For example, in this chapter the concept of the Z
transform is introduced and is shown to be equivalent to the Fourier transform.
The Z transform is readily understood on a basis of elementary algebra, whereas the
Fourier transform requires substantial experience in calculus.
FIGURE 1-1
A continuous time function sampled at uniform time intervals.
or observe b(t) at a uniform spacing of points in time. For this example, such a
digital approximation to the continuous function could be denoted by the vector
Of course if time points were taken more closely together we would have a more
accurate approximation. Besides a vector, a function can be represented as a poly-
nomial where the coefficients of the polynomial represent the values of b(t) at
successive time points. In this example we have
FIGURE 1-2
Coefficients of Z B(Z) are a shifted version of the coefficients of B(Z).
TRANSFORMS 3
FIGURE 1-3
Response to two explosions.
Another value of the delay operator is that it may be used to build up more
complicated time functions from simpler ones. Suppose b(t) represents the acoustic
pressure function or the seismogram observed after a distant explosion. Then b(t)
is called the impulse response. If another explosion occurs at t = 10 time units
after the first, we expect the pressure function y(t) depicted in Fig. 1-3.
In terms of Z transforms this would be expressed as $Y(Z) = B(Z) + Z^{10}B(Z)$.
If the first explosion were followed by an implosion of half strength, we would have
$B(Z) - \tfrac{1}{2}Z^{10}B(Z)$. If pulses overlap one another in time [as would be the case if
B(Z) was of degree greater than 10], the waveforms would just add together in the
region of overlap. The supposition that they just add together without any
interaction is called the linearity assumption. This linearity assumption is very often true
in practical cases. In seismology we find that, although the earth is a very
heterogeneous conglomeration of rocks of different shapes and types, seismic
waves (of usual amplitude) traveling through the earth do not interfere with one
another. They satisfy linear superposition. The plague of nonlinearity arises from
large amplitude disturbances. Nonlinearity does not arise from geometrical
complications.
Now suppose there was an explosion at t = 0, a half-strength implosion at
t = 1, and another, quarter-strength explosion at t = 3. This sequence of events
determines a "source" time series, $x_t = (1, -\tfrac{1}{2}, 0, \tfrac{1}{4})$. The Z transform of the
source is $X(Z) = 1 - \tfrac{1}{2}Z + \tfrac{1}{4}Z^3$. The observed $y_t$ for this sequence of explosions
and implosions through the seismometer has a Z transform Y(Z) given by
$$Y(Z) = B(Z) - \tfrac{1}{2}ZB(Z) + \tfrac{1}{4}Z^3B(Z) = \left(1 - \tfrac{1}{2}Z + \tfrac{1}{4}Z^3\right)B(Z) = X(Z)B(Z)$$
The last equation illustrates the underlying basis of linear-system theory that the
output Y(Z) can be expressed as the input X(Z) times the impulse response B(Z).
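The claim that the output is the input times the impulse response can be checked numerically. The following is a minimal Python sketch, not taken from the text: the impulse response `b` is a made-up example (its values are an assumption), and the helper name `polymul` is hypothetical. It builds the output once by adding delayed, scaled copies of `b` (superposition) and once by multiplying the polynomials X(Z) and B(Z).

```python
from fractions import Fraction as F

def polymul(x, b):
    """Coefficients of X(Z)*B(Z), lowest power of Z first."""
    y = [0] * (len(x) + len(b) - 1)
    for i, xi in enumerate(x):
        for j, bj in enumerate(b):
            y[i + j] += xi * bj
    return y

b = [1, 2, 0, -1, -1]                  # hypothetical impulse response (assumed values)
x = [F(1), F(-1, 2), F(0), F(1, 4)]    # source: explosion, half implosion, quarter explosion

# Superposition: each source event launches a delayed, scaled copy of b.
y_super = [F(0)] * (len(x) + len(b) - 1)
for delay, strength in enumerate(x):
    for j, bj in enumerate(b):
        y_super[delay + j] += strength * bj

# Polynomial product Y(Z) = X(Z) B(Z).
y = polymul(x, b)
print(y == y_super)
```

Under the linearity assumption the two constructions agree coefficient by coefficient, whatever impulse response is substituted for `b`.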
There are many examples of linear systems. A wide class of electronic
circuits consists of linear systems. Complicated linear systems are formed by
taking the output of one system and plugging it into the input of another. Suppose
FIGURE 1-4
Two equivalent filtering systems.
we have two linear systems characterized by B(Z) and C(Z), respectively. Then the
question arises whether the two combined systems of Fig. 1-4 are equivalent.
The use of Z transforms makes it obvious that these two systems are equivalent
since products of polynomials commute, i.e.,
$Y_1(Z) = [X(Z)B(Z)]C(Z) = XBC$ (1-1-3)
$Y_2(Z) = [X(Z)C(Z)]B(Z) = XCB = XBC$ (1-1-4)
Consider a system with an impulse response $B(Z) = 2 - Z - Z^2$. This polynomial
can be factored into $2 - Z - Z^2 = (2 + Z)(1 - Z)$, and so we have the three equiv-
alent systems in Fig. 1-5. Since any polynomial can be factored (over the complex
numbers), any impulse response can be simulated by a cascade of two-term filters
(impulse responses whose Z transforms are linear in Z).
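The equivalence of the cascaded systems can be illustrated with a short Python sketch. The helper names `polymul` and `cascade` and the test input `x` are assumptions introduced here for illustration only; the filters are the ones named in the text.

```python
def polymul(x, b):
    """Coefficients of X(Z)*B(Z), lowest power of Z first."""
    y = [0] * (len(x) + len(b) - 1)
    for i, xi in enumerate(x):
        for j, bj in enumerate(b):
            y[i + j] += xi * bj
    return y

def cascade(signal, *filters):
    """Pass the signal through each filter in turn."""
    for f in filters:
        signal = polymul(signal, f)
    return signal

x = [1, 0, 3]                                  # arbitrary test input (an assumption)
direct = cascade(x, [2, -1, -1])               # B(Z) = 2 - Z - Z^2
two_then_one = cascade(x, [2, 1], [1, -1])     # (2 + Z) followed by (1 - Z)
one_then_two = cascade(x, [1, -1], [2, 1])     # (1 - Z) followed by (2 + Z)
print(direct, two_then_one, one_then_two)
```

All three outputs are the same sequence because multiplication of polynomials commutes.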
What do we actually do in a computer when we multiply two Z transforms
together? The filter 2 + Z would be represented in a computer by the storage in
memory of the coefficients (2, 1). Likewise, for 1 - Z the numbers (1, -1) are
stored. The polynomial multiplication program should take these inputs and
produce the sequence (2, -1, -1). Let us see how the computation proceeds in a
general case, say
X(Z)B(Z) = Y(Z) (1-1-5)
FIGURE 1-5
Three equivalent filtering systems.
C     Y(1), ..., Y(LX+LB-1) MUST BE SET TO ZERO BEFORE THESE LOOPS
      DO 20 I = 1,LX
      DO 20 J = 1,LB
   20 Y(I+J-1) = Y(I+J-1) + X(I)*B(J)
FIGURE 1-6
A computer program to do convolution.
Equation (1-1-8) is called a convolution equation. Thus, we may say that the
product of two polynomials is another polynomial whose coefficients are found by
convolution. A simple Fortran computer program which does convolution, includ-
ing end effects on both ends, is shown in Fig. 1-6. The reader should notice that
X(Z) and Y(Z) need not strictly be polynomials; they may contain both positive
and negative powers of Z; that is,
$$X(Z) = \cdots + \frac{x_{-2}}{Z^2} + \frac{x_{-1}}{Z} + x_0 + x_1 Z + x_2 Z^2 + \cdots$$
The effect of using negative powers of Z in X(Z) and Y(Z) is merely to indicate that
data are defined before t = 0. The effect of using negative powers of Z in the filter is
quite different. Inspection of (1-1-8) shows that the output $y_k$ which occurs at time
k is a linear combination of current and previous inputs; that is, ($x_i$, $i \le k$). If the
filter B(Z) had included a term like $b_{-1}/Z$, then the output $y_k$ at time k would be a
linear combination of current and previous inputs and $x_{k+1}$, an input which really
has not arrived at time k. Such a filter is called a nonrealizable filter because it
could not operate in the real world where nothing can respond now to an excitation
which has not yet occurred. However, nonrealizable filters are occasionally useful
in computer simulations where all of the data are prerecorded.
EXERCISES
1 Let $B(Z) = 1 + Z + Z^2 + Z^3 + Z^4$. Graph the coefficients of B(Z) as a function of
the powers of Z. Graph the coefficients of $[B(Z)]^2$.
2 If $x_t = \cos\omega_0 t$, where t takes on integral values, $b_t = (b_0, b_1)$, and Y(Z) = X(Z)B(Z),
what are A and B in $y_t = A\cos\omega_0 t + B\sin\omega_0 t$?
3 Deduce that, if $x_t = \cos\omega_0 t$ and $b_t = (b_0, b_1, \ldots, b_n)$, then $y_t$ always takes the form
$A\cos\omega_0 t + B\sin\omega_0 t$.
reduces to the sum (1-2-2) when b(t) is not a continuous function of time but is
defined as
"identification of coefficients" be the same as the apparently more complicated
operation of inverse Fourier integrals? The inverse Fourier integral is
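First notice that the integration of $Z^n$ about the unit circle, or of $e^{in\omega}$ over
$-\pi \le \omega < +\pi$, gives zero unless n = 0 because cosine and sine are oscillatory; that
is,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{in\omega}\,d\omega = \frac{1}{2\pi}\int_{-\pi}^{\pi}(\cos n\omega + i\sin n\omega)\,d\omega = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{if } n = \text{nonzero integer} \end{cases}$$ (1-2-6)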
In terms of our discretized time functions, the inverse Fourier integral (1-2-5) is
Of all the terms in the integrand (1-2-7) we see by (1-2-6) that only the term with $b_t$
will contribute to the integral; all the rest oscillate and cancel. In other words, it is
only the coefficient of Z to the zero power which contributes to the integral,
reducing (1-2-7) to
This shows how inverse Fourier transformation is just like identifying coefficients of
powers of Z.
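As a numerical illustration of this "identification of coefficients", the sketch below approximates the inverse Fourier integral $\frac{1}{2\pi}\int B(\omega)e^{-i\omega t}\,d\omega$ by a uniform sum over $-\pi \le \omega < \pi$ and recovers individual coefficients. The function name `coefficient`, the sample count, and the test coefficients are assumptions made for this example, with $Z = e^{i\omega}$ as in the text.

```python
import cmath
import math

def coefficient(b, t, npts=64):
    """Estimate b_t from B(w) = sum_k b_k Z^k by averaging B(w) exp(-i w t) over w."""
    total = 0
    for m in range(npts):
        w = -math.pi + 2 * math.pi * m / npts
        Z = cmath.exp(1j * w)
        B = sum(bk * Z ** k for k, bk in b.items())
        total += B * cmath.exp(-1j * w * t)
    return total / npts      # equals (1/2pi) * sum * (2pi/npts)

b = {-1: 3.0, 0: 1.0, 2: -2.0}   # assumed coefficients, keyed by power of Z
print(coefficient(b, 2), coefficient(b, 1))
```

Every term except the one whose power of Z cancels against $e^{-i\omega t}$ averages to zero, so the sum returns the stored coefficient (or zero where no coefficient exists).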
In this book and many others, it is common to assume that the time span
between data samples is unity, $\Delta t = 1$. To adapt given equations to other values of
$\Delta t$, one need only replace $\omega$ by $\omega\,\Delta t$; that is,
FIGURE 1-7a
If a high-frequency sinusoid is sampled insufficiently often, it becomes indistin-
guishable from a lower-frequency sinusoid. For this reason $\omega_{\max} = \pi/\Delta t$ is said to
be the folding frequency, as higher frequencies are folded down to look like lower
frequencies. In practice, quasi-sinusoidal waves are always sampled more fre-
quently than twice per wavelength. Good theoretical reasons for sampling eight
or more points per wavelength are developed on pp. 44 to 47.
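Folding can be seen directly in a few lines of Python. This is only a sketch under assumed values: the sampling interval, the choice of the low frequency, and all variable names are illustrative. A cosine above the folding frequency and its mirror image below it produce the same samples.

```python
import math

dt = 1.0                        # sampling interval (assumed)
w_fold = math.pi / dt           # folding frequency
w_low = 0.25 * w_fold           # a frequency below the folding frequency (assumed)
w_high = 2 * w_fold - w_low     # its mirror image above the folding frequency

samples_low = [math.cos(w_low * t * dt) for t in range(16)]
samples_high = [math.cos(w_high * t * dt) for t in range(16)]
max_diff = max(abs(a - b) for a, b in zip(samples_low, samples_high))
print(max_diff)
```

The difference is at rounding level, so after sampling the two sinusoids cannot be told apart.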
A function is the sum of its even and odd parts. By adding (1-2-10) and (1-2-11), we
get
Consider a simple, real, even time function such as $(b_{-1}, b_0, b_1) = (1, 0, 1)$. Its
transform $Z + 1/Z = 2\cos\omega$ is an even function of $\omega$ since $\cos\omega = \cos(-\omega)$. Con-
sider the real, odd time function $(b_{-1}, b_0, b_1) = (-1, 0, 1)$. Its transform
$Z - 1/Z = 2i\sin\omega$ is imaginary and odd, since $\sin\omega = -\sin(-\omega)$. Likewise, the transform
of the imaginary even function (i, 0, i) is the imaginary even function $2i\cos\omega$ and the
transform of the imaginary odd function (-i, 0, i) is real and odd. Let r and i refer
to real and imaginary, e and o refer to even and odd, and lower-case and upper-case
refer to time and frequency functions. A summary of the symmetries of Fourier
transformation is shown in Fig. 1-7b.
More elaborate time functions can be made up by adding together the two
point functions we have considered. Since sums of even functions are even, and so
on, the table of Fig. 1-7b applies to all time functions. Note that an arbitrary time
function takes the form $b_t = (re + ro)_t + i(ie + io)_t$. On transformation of $b_t$, each
of the four individual parts transforms according to the table.
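The four symmetry rules can be checked numerically for the three-point examples used above. The sketch below is illustrative only: `transform` and `classify` are hypothetical helper names, the test frequency is arbitrary, and `classify` assumes its input transforms to something purely real or purely imaginary.

```python
import cmath

def transform(b, w):
    """B(w) = sum_t b_t Z^t with Z = exp(i w); b maps the time index t to b_t."""
    return sum(bt * cmath.exp(1j * w * t) for t, bt in b.items())

def classify(b, w=0.7, tol=1e-12):
    """Return 'RE', 'RO', 'IE' or 'IO' for the transform of b (assumes it is pure)."""
    plus, minus = transform(b, w), transform(b, -w)
    part = "R" if abs(plus.imag) < tol else "I"
    parity = "E" if abs(plus - minus) < tol else "O"
    return part + parity

cases = {
    "re": {-1: 1, 1: 1},        # real even       (1, 0, 1)
    "ro": {-1: -1, 1: 1},       # real odd        (-1, 0, 1)
    "ie": {-1: 1j, 1: 1j},      # imaginary even  (i, 0, i)
    "io": {-1: -1j, 1: 1j},     # imaginary odd   (-i, 0, i)
}
table = {name: classify(b) for name, b in cases.items()}
print(table)
```

The printed pairs reproduce the pairings described in the text: re with RE, ro with IO, ie with IE, and io with RO.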
  re <--> RE
  ro <--> IO
  ie <--> IE
  io <--> RO
FIGURE 1-7b
Mnemonic table illustrating how even/odd and real/imaginary properties are
EXERCISES
1 Normally a function is specified entirely in the time domain or entirely in the fre-
quency domain. When one is known, the other is determined by transformation.
Now let us give half the information in the time domain by specifying that $b_t = 0$ for
t < 0, and half in the frequency domain by giving the real part RE + RO in the
frequency domain. How can you determine the rest of the function?
real $\omega$, then we have plotted the transform. (Note that for real $\omega$, Z is of unit
magnitude; i.e., on the unit circle.) Since $\omega$ is a continuous variable and everything
in a computer is finite, how do we select a finite number of values $\omega_k$ for plotting?
The usual choice is to take evenly spaced frequencies. The lowest frequency can be
zero. [Note $Z(\omega = 0) = e^{i0} = 1$.] A frequency as high as $\omega = 2\pi$ [note
$Z(\omega = 2\pi) = e^{i2\pi} = 1$ also] need not be considered, since (1-3-1) gives the same value for it as for
zero frequency. Choosing uniformly spaced frequencies between these limits we
have
where
$W = e^{2\pi i/N}$ (1-3-4)
It is not essential to choose N = M as we have done in (1-3-3), but it is a convenience.
There is no loss of generality because one may always append zeros to a time func-
tion before inserting it into (1-3-3). A convenience of the choice N = M is that the
matrix in (1-3-3) will then be square and there will be an exact inverse. In fact, the
inverse to (1-3-3) may be easily shown to be
Since 1/W is the complex conjugate of W, the matrices of (1-3-3) and (1-3-5)
are, apart from the scale factor, just complex conjugates of one another. In fact, one
observes no fundamental
mathematical difference between time functions and frequency functions. This
"duality" would be even more complete if we had used a scale factor of $N^{-1/2}$ in
each of (1-3-3) and (1-3-5) rather than 1 in (1-3-3) and 1/N in (1-3-5). Note also
that time functions and frequency functions could be interchanged in the mnemonic
table describing symmetries. In fact, our earlier observation that the product of
two frequency functions amounts to a convolution of the corresponding two time
functions has a dual statement that the product of two time functions corresponds
to the convolution of the corresponding two frequency functions. We will not
"prove" this duality as it is standard fare in both mathematics and systems theory
books. However we will occasionally call upon the reader to realize that in any
theorem the meanings of "time" and "frequency" may be interchanged.
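The sampled transform and its inverse can be written out as explicit matrix products. The following Python sketch assumes the conventions stated in the text ($W = e^{2\pi i/N}$, scale factor 1 on the forward transform and 1/N on the inverse); the helper names `dft_matrix` and `matvec` and the test series are hypothetical.

```python
import cmath

def dft_matrix(n, sign=+1):
    """n x n matrix with entries W**(j*k), where W = exp(sign * 2*pi*i / n)."""
    w = cmath.exp(sign * 2j * cmath.pi / n)
    return [[w ** (j * k) for j in range(n)] for k in range(n)]

def matvec(m, v):
    """Matrix times vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

n = 4
b = [1.0, 2.0, 0.0, -1.0]                       # assumed time function
B = matvec(dft_matrix(n, +1), b)                # forward: powers of W
b_back = [value / n for value in matvec(dft_matrix(n, -1), B)]   # inverse: powers of 1/W, scaled by 1/N
print(b_back)
```

Applying the conjugate matrix and dividing by N returns the original series, which is the sense in which the two matrices are inverses of one another.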
In making a plot of the transform $B_k$ for (k = 0, 1, ..., M - 1) the frequency
axis ranges as $0 \le \omega_k < 2\pi$. It is often more natural to display the interval
$-\pi \le \omega < \pi$. Since the transform is periodic with period $2\pi$, values of $B_k$ on the
interval $\pi \le \omega < 2\pi$ may simply be moved to the interval $-\pi \le \omega < 0$ for display.
Thus, for N = 8 one might plot successively
$(B_4, B_5, B_6, B_7, B_0, B_1, B_2, B_3)$
One advantage of this display interval is that for continuous time series which are
sampled sufficiently densely in time the transform values $B_k$ get small on both ends.
If the time series is real, the real part of $B_k$ has even symmetry about $B_0$; the imagin-
ary part has odd symmetry about $B_0$. Then, one need not bother to display half
the values. Choice of an odd value of N would enable us to put $\omega = 0$ exactly in the
middle of the interval, but the reader will soon see why we stick to an even number
of data points.
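The reordering for display amounts to moving the second half of the transform in front of the first half. A small Python sketch follows; the function name `display_order` and the use of string labels in place of actual transform values are assumptions made for illustration.

```python
import math

def display_order(values):
    """Move the second half (pi <= w < 2*pi) in front of the first half."""
    half = len(values) // 2
    return values[half:] + values[:half]

n = 8
labels = ["B%d" % k for k in range(n)]
omegas = [2 * math.pi * k / n for k in range(n)]

shifted_labels = display_order(labels)
# Frequencies at or above pi are relabeled by subtracting the period 2*pi.
shifted_omegas = [w - 2 * math.pi if w >= math.pi else w
                  for w in display_order(omegas)]
print(shifted_labels)
print(shifted_omegas)
```

After the shift the frequency axis increases monotonically from $-\pi$ and zero frequency sits just right of center.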
The matrix times vector operation in (1-3-3) requires $N^2$ multiplications and
additions. The rest of this section describes a trick method, called the fast Fourier
transform, of accomplishing the matrix multiplication in $N\log_2 N$ multiplications
and additions. Since, for example, $\log_2 1024$ is 10, this is a tremendous saving in
effort.
A basic building block in the fast Fourier transform is called doubling. Given
a series $(x_0, x_1, \ldots, x_{N-1})$ and its sampled Fourier transform $(X_0, X_1, \ldots, X_{N-1})$
and another series $(y_0, y_1, \ldots, y_{N-1})$ and its transform $(Y_0, Y_1, \ldots, Y_{N-1})$, one finds
the transform of the interlaced double-length series
$(x_0, y_0, x_1, y_1, \ldots, x_{N-1}, y_{N-1})$
The process of doubling is used many times during the process of computing a fast
Fourier transform. As the word doubling might suggest, it will be convenient to
suppose that N is an integer formed by raising 2 to some integer power. Suppose
$N = 8 = 2^3$. We begin by dividing our eight-point series $x_0, x_1, \ldots, x_7$ into eight
different series of one point each. The Fourier transform of each of the one-point
series is just the point. Next, we use doubling four times to get the transforms of the
four different two-point series $(x_0, x_4)$, $(x_1, x_5)$, $(x_2, x_6)$, and $(x_3, x_7)$. We use
doubling twice more to get the transforms of the two different four-point series
$(x_0, x_2, x_4, x_6)$ and $(x_1, x_3, x_5, x_7)$. Finally, we use doubling once more to get the
transform of the original eight-point series $(x_0, x_1, x_2, \ldots, x_7)$.
It remains to look into the details of the doubling process.
Let
The transform of the interlaced series $z_j = (x_0, y_0, x_1, y_1, \ldots, x_{N-1}, y_{N-1})$ is by
definition
We split the sum into two parts, noting that $x_j$ multiplies even powers of V and $y_j$
multiplies odd powers.
      SUBROUTINE FORK(LX,CX,SIGNI)
C FAST FOURIER                                  2/15/69
C                          LX
C   CX(K) = SQRT(1/LX) SUM (CX(J)*EXP(2*PI*SIGNI*I*(J-1)*(K-1)/LX))
C                          J=1     FOR K=1,2,...,(LX=2**INTEGER)
      COMPLEX CX(LX),CARG,CEXP,CW,CTEMP
      J=1
      SC=SQRT(1./LX)
      DO 30 I=1,LX
      IF(I.GT.J) GO TO 10
      CTEMP=CX(J)*SC
      CX(J)=CX(I)*SC
      CX(I)=CTEMP
   10 M=LX/2
   20 IF(J.LE.M) GO TO 30
      J=J-M
      M=M/2
      IF(M.GE.1) GO TO 20
   30 J=J+M
      L=1
   40 ISTEP=2*L
      DO 50 M=1,L
      CARG=(0.,1.)*(3.14159265*SIGNI*(M-1))/L
      CW=CEXP(CARG)
      DO 50 I=M,LX,ISTEP
      CTEMP=CW*CX(I+L)
      CX(I+L)=CX(I)-CTEMP
   50 CX(I)=CX(I)+CTEMP
      L=ISTEP
      IF(L.LT.LX) GO TO 40
      RETURN
      END
FIGURE 1-8
A program to do fast Fourier transform. Modified from Brenner. Calling this
program twice returns the original data. SIGNI should be +1. on one call and
-1. on the other. LX must be a power of 2.
$= X_m - V^m Y_m$
$Z_k = X_{k-N} - V^{k-N} Y_{k-N}$ (k = N, N + 1, ..., 2N - 1) (1-3-9)
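The doubling step can be sketched in Python and applied recursively to give a complete fast transform. This is an illustrative sketch, not a transcription of the FORK subroutine: it assumes $V = e^{2\pi i/2N}$ (so that $V^N = -1$), uses no scale factor, and the names `slow_transform`, `double`, and `fast_transform` are hypothetical. The first half of the output uses $X_k + V^k Y_k$ and the second half uses $X_k - V^k Y_k$.

```python
import cmath

def slow_transform(z):
    """Direct N**2 evaluation: Z_k = sum_j z_j exp(2*pi*i*j*k/n)."""
    n = len(z)
    return [sum(zj * cmath.exp(2j * cmath.pi * j * k / n) for j, zj in enumerate(z))
            for k in range(n)]

def double(X, Y):
    """Transform of the interlaced series (x0, y0, x1, y1, ...) from X and Y."""
    n = len(X)
    V = cmath.exp(2j * cmath.pi / (2 * n))
    first_half = [X[k] + V ** k * Y[k] for k in range(n)]
    second_half = [X[k] - V ** k * Y[k] for k in range(n)]   # uses V**n = -1
    return first_half + second_half

def fast_transform(z):
    """Recursive doubling; len(z) must be a power of 2."""
    if len(z) == 1:
        return list(z)           # the transform of a one-point series is the point
    return double(fast_transform(z[0::2]), fast_transform(z[1::2]))

z = [1.0, 2.0, 0.0, -1.0, -1.0, 0.5, 3.0, 0.0]   # assumed eight-point test series
print(fast_transform(z))
```

The recursion splits the series into even-indexed and odd-indexed halves until one-point series remain, then rebuilds the full transform by repeated doubling.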
The first machine computation with this algorithm known to the author
was done by Vern Herbert, who used it extensively in the interpretation of reflection
seismic data. He programmed it on an IBM 1401 computer at Chevron Standard
Ltd., Calgary, Canada in 1962. Herbert never published the method. It was
rediscovered and widely publicized by Cooley and Tukey in 1965. Thus it has come
to be known as the Cooley and Tukey algorithm. (A good reference to literature
on the subject is Ref. [9].)
EXERCISES
1 Verify that for an arbitrary N x N case the matrix of (1-3-5) is indeed the inverse of
the matrix of (1-3-3).
FIGURE 1-9
A sinusoid $\sin\omega t$ goes into a filter and a delayed sinusoid $\sin(\omega t - \phi)$ comes out.
FIGURE 1-10
A graph of $\cos\omega_1 t + \cos\omega_2 t$ looks like an amplitude-modulated cosine of the
average frequency.
we see that the sum of two cosines looks like a cosine of the average frequency
multiplied by a cosine of half the difference frequency. Since the frequencies are
taken close together, the difference frequency factor represents a slowly variable
amplitude on the average frequency. Now let us take the output of the filter $y_t$ to be
$y_t = \cos(\omega_1 t - \phi_1) + \cos(\omega_2 t - \phi_2)$ (1-4-4)
In taking the output of the filter to be of the form of (1-4-4), we have assumed that
neither frequency was attenuated. To allow differential attenuation of the two
frequency components would greatly complicate the discussion. Utilizing the same
trigonometric identity on (1-4-4), we get
Here $(x_0, y_0)$ is the location of the filter input and (x, y) is where the phase is
observed (like the filter output). The symbols $k_x$ and $k_y$ denote the "spatial fre-
quencies," that is, $k_x$ is $2\pi$ divided by the wavelength measured along the x axis.
Methods of theoretical physics provide a relationship between $\omega$ and $k_x$ and $k_y$.
Often it can be explicitly given in the form
$\omega = \omega(k_x, k_y)$ (1-4-10)
Since velocity is distance divided by time we can define the phase velocity
along the x direction as
$$(v_{\text{phase}})_x = \frac{x - x_0}{\text{phase delay}} = (x - x_0)\,\frac{\omega}{\phi}$$ (1-4-11)
Say $y = y_0$; then (1-4-9) reduces to
which gives
In observational geophysics the velocity one deals with is nearly always the
group velocity. It is the velocity with which bundles of energy move. In the example
shown in Fig. 1-11 there is an excessive amount of "noise" (not unusual in observa-
tional geophysics); however, it can be seen that the disturbance first displays the
long-period oscillations and then the shorter-period oscillations. The group
velocity is found by dividing the distance by the time of arrival. One could observe
phase velocities by having two observation stations near each other and measuring
the time delay of some particular zero crossing. The reason for having the stations
near one another is that the waveforms are steadily changing, and if the stations are
too far apart, it may not be possible to tell which zero crossings are to be compared.
FIGURE 1-11
An example of a wave packet in which different frequencies may be seen propa-
gating at different speeds. This example is of two air-pressure waves thought to
result from nuclear explosions in Asia; they were recorded in California on one
of the author's microbarographs. (Figure annotation: the first arrival, 10:03 PST,
10/14/70.)
real part RE and an imaginary odd part IO. Taking the squared magnitude, one
has $(RE + iIO)(RE - iIO) = (RE)^2 + (IO)^2$. The square of an even function is
obviously even and the square of an odd function is also even. Thus, the spectrum
of a real time function is even so that its values at plus frequencies are the same as
its values at minus frequencies. In other words, there is no special meaning to be
attached to negative frequencies.
Although most time functions which arise in applications are real time
functions, a discussion of correlation and spectra is not mathematically complete
without considering complex-valued time functions. Furthermore, complex-valued
time functions can be extremely useful in many physical problems in which rotation
occurs. For example consider two vector-component wind-speed indicators: one
pointing north, recording $n_t$, and the other pointing west, recording $w_t$. Now if one
makes up a complex-valued time series $v_t = n_t + iw_t$, the magnitude and phase
angle of the complex numbers have obvious physical interpretation. The (RE + iIO)
part of the transform relates to $n_t$ and the (RO + iIE) part relates to $w_t$. The
spectrum, however, is $(RE + RO)^2 + (IE + IO)^2$, which is neither even nor odd,
and the fact that $V(+\omega) \ne V(-\omega)$ must have some interpretation. Indeed it does,
and the meaning is that $+\omega$ corresponds to rotation in one sense (counterclockwise)
and $-\omega$ to rotation in the other direction. To see this, suppose $n_t = \cos(\omega_0 t + \phi)$
and $w_t = -\sin(\omega_0 t + \phi)$. Then $v_t = e^{-i(\omega_0 t + \phi)}$. The transform is
FIGURE 1-12
Spectrum of the complex time series $e^{-i(\omega_0 t + \phi)}$. (Axes: $V(\omega)$ versus $\omega$, with $\omega_0$ marked.)
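The contrast between the even spectrum of a real series and the one-sided spectrum of the rotating complex series can be checked numerically. The sketch below is illustrative: the series length, the frequency index `k0`, the phase, and the helper name `transform` are assumptions, and the transform uses $Z = e^{i\omega}$ with no scale factor.

```python
import cmath
import math

def transform(v):
    """V_k = sum_t v_t exp(+2*pi*i*k*t/n), i.e. Z = exp(i*w) at w = 2*pi*k/n."""
    n = len(v)
    return [sum(vt * cmath.exp(2j * math.pi * k * t / n) for t, vt in enumerate(v))
            for k in range(n)]

n, k0, phi = 16, 3, 0.4                  # assumed length, frequency index, phase
w0 = 2 * math.pi * k0 / n
north = [math.cos(w0 * t + phi) for t in range(n)]
west = [-math.sin(w0 * t + phi) for t in range(n)]
v = [a + 1j * b for a, b in zip(north, west)]    # v_t = n_t + i w_t

spec_real = [abs(c) ** 2 for c in transform(north)]
spec_complex = [abs(c) ** 2 for c in transform(v)]
print(spec_real[k0], spec_real[n - k0])          # equal: even spectrum
print(spec_complex[k0], spec_complex[n - k0])    # all energy at +w0, none at -w0
```

Index `n - k0` stands for the frequency $-\omega_0$ because the sampled transform is periodic with period $2\pi$.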
It is of interest to multiply out the polynomials $\bar{B}(1/Z)$ with B(Z) in order to examine
the coefficients of R(Z).
The coefficient $r_k$ of $Z^k$ is given by
$$r_k = \sum_i \bar{b}_i\, b_{i+k}$$ (1-5-9)
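The coefficient formula can be compared with an explicit polynomial multiplication. In this Python sketch the wavelet `b` and the helper names `autocorrelation` and `polymul` are assumptions; the product $\bar{B}(1/Z)B(Z)$ is formed by multiplying B(Z) by the conjugated coefficients in reverse order, which shifts every power of Z up by n - 1.

```python
def autocorrelation(b):
    """r_k = sum_i conj(b_i) * b_(i+k) for k = -(n-1), ..., n-1, returned as a dict."""
    n = len(b)
    r = {}
    for k in range(-(n - 1), n):
        r[k] = sum(complex(b[i]).conjugate() * b[i + k]
                   for i in range(n) if 0 <= i + k < n)
    return r

def polymul(x, b):
    """Coefficients of X(Z)*B(Z), lowest power of Z first."""
    y = [0] * (len(x) + len(b) - 1)
    for i, xi in enumerate(x):
        for j, bj in enumerate(b):
            y[i + j] += xi * bj
    return y

b = [1, 2, -1]                                   # assumed wavelet
r = autocorrelation(b)
# Reversed, conjugated coefficients represent Z**(n-1) * B-bar(1/Z).
product = polymul([complex(c).conjugate() for c in reversed(b)], b)
print(r)
print(product)
```

Entry `product[m]` is the coefficient of $Z^{m-(n-1)}$, so it lines up with $r_{m-(n-1)}$.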
If some particular coefficient $c_k$ in C(Z) is greater than any of the others, then it
may be said that the waveform $a_t$ most resembles the waveform $b_t$ if one is delayed k
time units with respect to the other.
EXERCISES
1 Suppose a wavelet is made up of complex numbers. Is the autocorrelation relation
$r_k = r_{-k}$ true? Is $r_k$ real or complex? Is $R(\omega)$ real or complex?
2 Let $x_t$ be some real time function. Let $y_t = x_{t+}$ be
another real time function. Sketch
the phase as a function of frequency of the cross spectrum $\bar{X}(1/Z)Y(Z)$ as computed
by a computer which put all arctangents in the principal quadrants $-\pi/2 < \arctan < \pi/2$.
$$\approx \frac{2}{\Delta t}\,\mathrm{Im}\,\frac{u_t - u_{t-1}}{u_t + u_{t-1}}$$
Now that we have some idea what a 90° phase shift filter can be used for,
let us find out the numerical values of $q_t$. The time derivative operation has the
desired 90° phase-shifting property we seek. The trouble with a differentiator is that
higher frequencies are amplified with respect to lower frequencies. Specifically
$$f(t) = \int F(\omega)e^{-i\omega t}\,d\omega$$
$$\frac{df}{dt} = \int -i\omega F(\omega)e^{-i\omega t}\,d\omega$$
Thus we see that time differentiation corresponds to the weight factor $-i\omega$ in the
frequency domain. The weight $-i\omega$ has the proper phase but the wrong amplitude.
The desired weight factor is $Q(\omega) = -i\omega/|\omega|$. It is the step function shown in
Fig. 1-13.
FIGURE 1-13
Frequency response of 90° phase-shifting filter. (Vertical axis: $iQ(\omega)$, which is real.)
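FIGURE 1-14
Quadrature filter: $q_n = 0$ for n even, $q_n = -2/\pi n$ for n odd.
$$q_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} Q(\omega)e^{-i\omega n}\,d\omega = \frac{i}{2\pi}\int_{-\pi}^{0} e^{-i\omega n}\,d\omega - \frac{i}{2\pi}\int_{0}^{\pi} e^{-i\omega n}\,d\omega = \begin{cases} 0 & \text{for } n \text{ even} \\ -2/\pi n & \text{for } n \text{ odd} \end{cases}$$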
The result is shown in Fig. 1-14.
Since the filter does not vanish for negative n, this is obviously a nonrealizable
filter (one which requires future inputs to create its present output). If the discussion
were in continuous time rather than sampled time, the filter would be of the form
1/t, a function which has a singularity at t = 0 and whose integral over positive t is diver-
gent. Convolution with the filter coefficients $q_n$ is therefore very awkward because
the infinite sequence drops off very slowly. Convolution with the filter q is called
Hilbert transformation.
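The filter coefficients can be estimated by integrating the frequency response numerically. The sketch below is illustrative only: it assumes $Q(\omega) = -i\omega/|\omega|$ and the inverse-transform sign convention $e^{-i\omega n}$ used in this section, approximates the integral with a midpoint rule, and the function name and number of points are arbitrary choices.

```python
import cmath
import math

def quadrature_coefficient(n, npts=4096):
    """Midpoint-rule estimate of q_n = (1/2pi) * integral of Q(w) exp(-i w n) dw."""
    dw = 2 * math.pi / npts
    total = 0j
    for m in range(npts):
        w = -math.pi + (m + 0.5) * dw      # midpoints never land on w = 0
        Q = -1j if w > 0 else 1j           # Q(w) = -i w / |w|
        total += Q * cmath.exp(-1j * w * n) * dw
    return total / (2 * math.pi)

q = {n: quadrature_coefficient(n) for n in range(-5, 6)}
for n in sorted(q):
    print(n, round(q[n].real, 5))
```

The estimates vanish for even n and fall off only as 1/n for odd n, which is the slow decay that makes time-domain convolution with this filter awkward.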
Let us return to the filter 1 + iQ(Z) mentioned earlier. As shown in Fig. 1-15,
this filter is simply a step function in the frequency domain. A cheap way to achieve
the 90° phase shift operation is to do it in the frequency domain. One begins with
$x_t + i \cdot 0$ and transforms it to the frequency domain. Then multiply by the step of
Fig. 1-15. Finally, inverse transformation gives $x_t + iy_t$. The progress of even,
odd, real, and imaginary parts is detailed in Fig. 1-16.
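The frequency-domain recipe can be sketched in Python with a slow transform. Several choices here are assumptions rather than the book's code: the helper names, the test signal, the forward sign convention $Z = e^{i\omega}$, and the weight of 1 given to the zero and folding frequencies (where the sign of $\omega$ is ambiguous). The step weight is 2 for $0 < \omega < \pi$ and 0 for negative frequencies, which is $1 + iQ$ with $Q = -i\omega/|\omega|$.

```python
import cmath
import math

def transform(x, sign):
    """Slow transform with kernel exp(sign * 2*pi*i*k*t/n), no scale factor."""
    n = len(x)
    return [sum(xt * cmath.exp(sign * 2j * math.pi * k * t / n) for t, xt in enumerate(x))
            for k in range(n)]

def analytic_signal(x):
    """Return x_t + i*y_t by applying the step weight 1 + iQ in the frequency domain."""
    n = len(x)
    X = transform([complex(xt, 0.0) for xt in x], +1)     # start from x_t + i*0
    step = [1.0 if k in (0, n // 2) else (2.0 if k < n // 2 else 0.0)
            for k in range(n)]
    Y = [s * Xk for s, Xk in zip(step, X)]
    return [value / n for value in transform(Y, -1)]      # inverse transform

n, k0 = 16, 2                                             # assumed length and frequency index
w0 = 2 * math.pi * k0 / n
x = [math.cos(w0 * t) for t in range(n)]
z = analytic_signal(x)
print([round(c.imag, 6) for c in z])
```

With these conventions a cosine input yields an imaginary part equal to minus the sine of the same frequency, that is, the input shifted in phase by 90°.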
FIGURE 1-15
The filter 1 + iQ(Z) is real and one-sided
in the frequency domain but complex
and two-sided in the time domain.
FIGURE 1-16
Hilbert transform or quadrature filtering
by step weight in the frequency domain.
FIGURE 1-17
Impulse plus i times a 90° phase-shift filter becomes a real step in the frequency
domain.
The function 1 + iQ plays a special role in theoretical time series analysis
which, in later chapters, will be shown to be related to the principle of causality.
For future reference we summarize the properties of this function in Fig. 1-17.
EXERCISES
1 By means of partial fractions convolve the waveform
$(2/\pi)(\ldots, -\tfrac{1}{5}, 0, -\tfrac{1}{3}, 0, -1, 0, 1, 0, \tfrac{1}{3}, 0, \tfrac{1}{5}, \ldots)$
with itself. What is the interpretation of the fact that the result is $(\ldots, 0, 0, -1, 0,
0, \ldots)$? (HINT: $\pi^2/8 = 1 + \tfrac{1}{9} + \tfrac{1}{25} + \tfrac{1}{49} + \cdots$.)
2 In terms of the fast Fourier transform matrix the quadrature filter $Q(\omega)$ may be
represented by the column vector
Multiply this into the inverse transform matrix to show that the transform is propor-
tional to $(\cos \pi k/N)/(\sin \pi k/N)$. What is the scale factor? Sketch it for k < N indicat-
ing the limit $N \to \infty$. [HINT: $1 + x + x^2 + \cdots + x^N = (1 - x^{N+1})/(1 - x)$.]