Digital Signal Processing 20 (2010) 359–378
Contents lists available at ScienceDirect: www.elsevier.com/locate/dsp
Spectral analysis of nonuniformly sampled data – a review

Prabhu Babu*, Petre Stoica
Department of Information Technology, P.O. Box 337, Uppsala University, SE-751 05 Uppsala, Sweden

Article history: Available online 1 July 2009
Keywords: Nonuniform sampling; Periodogram; Interpolation techniques; Slotted resampling; Missing data case

Abstract
In this paper, we present a comprehensive review of methods for spectral analysis of nonuniformly sampled data. For a given finite set of nonuniformly sampled data, a reasonable way to choose the Nyquist frequency and the resampling time is discussed. The various existing methods for spectral analysis of nonuniform data are grouped and described under four broad categories: methods based on least squares; methods based on interpolation techniques; methods based on slotted resampling; and methods based on continuous time models. The performance of the methods under each category is evaluated on simulated data sets. The methods are then classified according to their capabilities to handle different types of spectrum, signal models and sampling patterns. Finally, the performance of the different methods is evaluated on two real life nonuniform data sets. Apart from the spectral analysis methods, methods for exact signal reconstruction from nonuniform data are also reviewed.
© 2009 Elsevier Inc. All rights reserved.
1. Introduction
Spectral analysis of nonuniformly sampled data is an old and well-known area of research. The nonuniformity in the data is generally due to reasons like the unavailability of samples at specific instants of uniformly sampled data (commonly called the missing data problem), a random sampling device following a specific distribution, a sampling device with arbitrary jitter around the regular uniform grid, a sampling device following a periodic sampling pattern but arbitrary within each period, or a sampling device with a completely arbitrary sampling scheme. The problem of spectral analysis of nonuniformly sampled data is well motivated both by its theoretical significance and by its widespread application in fields like astronomy [1], seismology [2], paleoclimatology [3], genetics [4] and laser Doppler velocimetry [5]. The simplest way to deal with this problem is to compute the normal periodogram by neglecting the fact that the data samples are nonuniform; this results in a dirty spectrum. A method called CLEAN [6] was then proposed in the astronomical literature to perform iterative deconvolution in the frequency domain and obtain the clean spectrum from the dirty one. A periodogram related method, also proposed in the astronomical literature, is the least squares periodogram (also called the Lomb–Scargle periodogram) [1,7], which estimates the different sinusoids in the data by fitting them to the nonuniform data. More recently, [8,9] introduced a new efficient method called the Iterative Adaptive Approach (IAA), which relies on solving an iterative weighted least squares problem. In [10] and [11], consistent spectral estimation methods have been proposed by assuming the sampling patterns are known to be either uniformly distributed or Poisson distributed. Apart from these nonparametric methods there are also methods based on parametric modeling of the continuous time spectrum, such as:
* Corresponding author. Fax: +46 18 511925. E-mail address: prabhu.babu@it.uu.se (P. Babu).
1051-2004/$ – see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.dsp.2009.06.019
- maximum likelihood (ML) fitting of a continuous time ARMA (CARMA) model to the nonuniformly sampled data [12],
- methods based on estimating the parameters of the continuous time model by approximating the derivative operators in the continuous time model [13],
- methods based on transforming the underlying continuous time model to an auxiliary discrete time model [14] and estimating the parameters of the continuous time model by identifying the parameters of the discrete time model.

There are also methods that use spectral analysis tools from the uniform sampling domain by:

- obtaining the samples on a uniform grid by resampling and interpolation techniques,
- estimating the auto-covariance sequence of the data on a regular grid through slotting techniques [15,16] or using suitable kernels [17–19],
- converting the nonuniform data problem into a missing data problem by techniques like slotted resampling, and estimating the underlying spectrum either by parametric modeling [20–22] or by nonparametric methods [23,24].
Although our primary interest in this paper is in the methods for spectral analysis of nonuniform data, we will also briefly look into the methods for exact reconstruction of the continuous time signal from the nonuniform data; once the signal is reconstructed, spectral analysis can be done on the reconstructed signal. These reconstruction methods rely on some assumptions on the underlying continuous time signal, like the signal being periodic bandlimited [25], bandlimited and sparse [26], or known to belong to a specific functional space like a shift invariant space [27], etc. For example, using the assumption that the underlying signal is periodic bandlimited with a known bandwidth, the continuous time domain signal can be effectively reconstructed from its finite nonuniform samples by solving a linear system. Together with the bandlimited assumption, if the signal is assumed to be sparse in the Fourier domain, then the continuous time signal can be reconstructed using tools from the recently developed field of compressed sensing or compressive sampling. The idea of signal reconstruction from nonuniform samples for signals from a specific functional space has its deep roots in the area of functional analysis. For example, [28] proposes various iterative algorithms for the reconstruction of the continuous time signal from a finite number of nonuniform samples. Apart from this, there are a few signal reconstruction methods [29,30] specific to sampling patterns like periodic nonuniform sampling, which is common in high speed analog to digital converters (ADCs). In [29], the authors have given formulas for exact reconstruction of signals sampled in a periodic nonuniform fashion, under the assumption that the underlying signals are multiband with a known band structure. The paper [30] deals with the same problem but without knowledge of the band structure.
Although the literature on nonuniformly sampled data is quite rich and diverse, the tools and techniques employed by the different methods have not been classified and compared. One of the main goals of this paper is to discuss and classify the different methods based on:

- the signal model (nonparametric or parametric),
- the sampling pattern,
- the spectrum of the signal (discrete or continuous).
The different methods are then compared based on their performance on simulated or real life nonuniform data sets.

The outline of the paper is as follows. In Section 2, we will start with the various notations used in the paper, discuss preliminaries like signal models and sampling patterns, and introduce tools like least squares, maximum likelihood, and resampling and interpolation techniques, which will be used in the later sections of this paper.

In Section 3, we present various methods for exact signal reconstruction from nonuniform samples and compare their complexities. We end that section with a subsection about the choice of parameters like the Nyquist limit, frequency resolution and resampling time in the nonuniform data case.

Section 4 deals with the different methods for spectral analysis of nonuniformly sampled data, which are classified based on three factors: signal model, sampling pattern and the underlying spectrum. We compare the different methods discussed in this section by carrying out numerical simulations and by applying them to two real life data sets.

Section 5 contains the conclusions of this paper.
2. Notations and preliminaries
The following list contains some notations often used in the paper.

f(t)               Continuous time signal
t_s                Sampling period
f(nt_s)            Uniformly sampled signal
f*(·)              Complex conjugate of f(·)
F(ω)               Fourier transform of f(t)
r(·)               Autocovariance sequence
φ(·)               Power spectrum
Ω, ω               Analog/digital frequency
{t_k}_{k=1}^{N}    Nonuniform sampling time instants
E(·)               Expectation operator
p_X(x)             PDF of a random variable X
‖·‖_2              l_2 norm
‖·‖_1              l_1 norm
(·)^H              Conjugate transpose
W^†                (W^H W)^{-1} W^H
F                  Fourier transform
F^{-1}             Inverse Fourier transform
2.1. Signal models
In this subsection, we will introduce the signal models, both continuous and discrete time, used in this paper.
2.1.1. Discrete time models
The discrete time stationary stochastic signal, f(nt_s), can be modeled by a general equation of the form:

∑_{p=0}^{P} a_p f(n−p) = ∑_{m=0}^{M} b_m e(n−m) + ∑_{l=0}^{L} c_l u(n−l),  ∀n,   (1)

where e(n) represents a zero mean, unit variance white noise, and u(n) represents another stationary process which is independent of e(n); the t_s in f(nt_s) has been omitted for notational simplicity.

- If the coefficients {c_l}_{l=0}^{L} and {b_m}_{m=1}^{M} are set to zero, then Eq. (1) represents an autoregressive (AR) process of order P.
- If the coefficients {c_l}_{l=0}^{L} and {a_p}_{p=1}^{P} are set to zero, and a_0 = 1, then Eq. (1) represents a moving average (MA) process of order M.
- If the coefficients {b_m}_{m=1}^{M} are set to zero, then Eq. (1) represents an autoregressive process with exogenous input (ARX) of order (P, L).
- If the coefficients {c_l}_{l=0}^{L} are set to zero, then Eq. (1) represents an ARMA(P, M) process.
2.1.2. Continuous time models
The ARMA model can be generalized to the continuous time case as shown below:

(a_0 p^M + a_1 p^{M−1} + ··· + a_M) f(t) = (b_0 p^N + b_1 p^{N−1} + ··· + b_N) e(t)   (2)

where p denotes the differentiation operator d/dt, and e(t) denotes a continuous time white noise of unit intensity. Similarly to the discrete time case, if the coefficients {b_p}_{p=0}^{N−1} are set to zero, then Eq. (2) represents a continuous time autoregressive process (CAR(M)); otherwise it represents a continuous time ARMA process (CARMA(M, N)).
2.2. Sampling patterns
In this subsection, we will describe various sampling patterns that are used in the other sections.
2.2.1. Uniform sampling
The sampling instants are uniformly placed, i.e. for any k we have that t_k = k t_s, where t_s is the sampling period, given by π/Ω_max, with Ω_max being the maximum frequency present in the continuous time signal.
2.2.2. Sampling pattern with specific distribution
In some cases the sampling instants are random and follow some specific probability distribution like the Poisson, uniform, or stratified uniform distribution.

Uniform distribution. The PDF of uniformly distributed sampling instants within the interval [0, T] is given by

p(t) = { 1/T  when t ∈ [0, T],
         0    when t ∉ [0, T].   (3)

Stratified uniform distribution. As in the uniform distribution case, here too the sampling instants are uniformly distributed, but stratified. Let 0 = ξ_0 < ξ_1 < ··· < ξ_N = T denote an N-partition of the interval [0, T] defined by a probability density function h(t) through

∫_0^{ξ_j} h(t) dt = j/N,  j = 0, 1, . . . , N.   (4)

Then the sampling points {t_k}_{k=1}^{N} are selected such that each t_k is uniformly distributed in the subinterval [ξ_{k−1}, ξ_k).

Fig. 1. Periodic nonuniform sampling.

Poisson distribution. Here any sampling instant t_k is modeled in terms of the previous sampling instant as shown below:

t_0 = 0,
t_k = t_{k−1} + τ_k,  k = 1, 2, . . . ,   (5)

where τ_k, k = 1, 2, . . . , are i.i.d. random variables with a common exponential distribution p(τ) = λ exp(−λτ), where λ defines the mean sampling rate. This leads to the spacing between the sampling instants, given by (t_{s+k} − t_s), following a Poisson-process (k-th order interarrival) distribution with PDF

p_k(x) = { λ (λx)^{k−1} exp(−λx) / (k−1)!,  x ≥ 0,
           0,                                x < 0.   (6)
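As an illustration of the sampling scheme in (5), the following minimal Python sketch draws i.i.d. exponential gaps and accumulates them into sampling instants; the function name, the rate value and the seed are our own illustrative choices.

```python
import numpy as np

def poisson_sampling_times(lam, T, rng=None):
    """Generate sampling instants on [0, T] with i.i.d. exponential
    inter-sample gaps of rate lam (mean spacing 1/lam), as in Eq. (5)."""
    rng = np.random.default_rng() if rng is None else rng
    t, times = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam)   # tau_k ~ Exp(lam)
        if t > T:
            break
        times.append(t)
    return np.array(times)

# Example: mean sampling rate of 2 samples per second over 10 seconds
tk = poisson_sampling_times(lam=2.0, T=10.0)
print(len(tk), tk[:5])
```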
2.2.3. Periodic nonuniform sampling
Here the regular sampling grid with sampling time t_s is divided into periods of size T = M t_s, where M ∈ Z_+. Within each period, the sampling points are nonuniformly placed and they can be written as

t_k = nT + λ_k t_s,  k = 1, . . . , K,  with 0 ≤ λ_k < M,  n ∈ Z.   (7)

Fig. 1 shows a periodic nonuniform sampling grid for M = 6, K = 2.
2.2.4. Arbitrary sampling
We also consider arbitrary sampling schemes, which cannot be characterized by any simple probability distribution.
2.3. Resampling and interpolation techniques
We will now discuss resampling and interpolation techniques used in the later sections of the paper. They are generally
employed to interpolate the data on a uniform grid from nonuniformly placed samples.
2.3.1. Kernel based interpolation
Given the N nonuniformly placed samples {y(t_k)}_{k=1}^{N} of the continuous time signal y(t) within the interval [0, T], the signal y(t) can be interpolated by

y_i(t) = ∑_{k=1}^{N} y(t_k) K(t, t_k),  t ∈ [0, T],   (8)

where y_i(t) is the interpolated signal and K(·) denotes the interpolation kernel. Some of the commonly used kernels are:

Sinc kernel

K(t, t_k) = sin(π(t − t_k)/b_1) / (π(t − t_k))   (9)

where b_1 determines the mainlobe width of the sinc.

Gaussian kernel

K(t, t_k) = (1/√(2π b_2^2)) exp(−(t − t_k)^2 / (2 b_2^2))   (10)

where b_2 determines the bandwidth of the Gaussian function.
Fig. 2. Grids for slotted resampling technique.
Laplacian kernel

K(t, t_k) = (1/(2 b_3)) exp(−|t − t_k| / b_3)   (11)

where b_3 determines the bandwidth of the Laplacian function.

Rectangular kernel

K(t, t_k) = { 1/(2 b_4),  |t − t_k| ≤ b_4,
              0,          otherwise,   (12)

where b_4 determines the width of the rectangular window. In the literature, interpolation through a rectangular kernel is commonly called the slotting technique.

For all these kernels, the user must make the right choice of the parameters. Furthermore, none of the above mentioned kernel based interpolation techniques has the interpolation property: y_i(t_k) = y(t_k), k = 1, 2, . . . , N.
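To make (8) and (10) concrete, the sketch below evaluates the Gaussian-kernel interpolator on a uniform grid; the bandwidth, toy signal and variable names are our own arbitrary choices, and, as noted above, the output amplitude depends on the local sampling density since the weights in (8) are not normalized.

```python
import numpy as np

def gaussian_kernel_interp(tk, yk, t_grid, b2):
    """Kernel interpolation of Eq. (8) with the Gaussian kernel of Eq. (10)."""
    # K(t, t_k) = exp(-(t - t_k)^2 / (2 b2^2)) / sqrt(2 pi b2^2)
    diff = t_grid[:, None] - tk[None, :]
    K = np.exp(-diff**2 / (2 * b2**2)) / np.sqrt(2 * np.pi * b2**2)
    return K @ yk            # y_i(t) = sum_k y(t_k) K(t, t_k)

# Toy usage: 20 nonuniform samples of a 1 Hz sine on [0, 10]
rng = np.random.default_rng(0)
tk = np.sort(rng.uniform(0, 10, 20))
yk = np.sin(2 * np.pi * 1.0 * tk)
t_grid = np.linspace(0, 10, 200)
yi = gaussian_kernel_interp(tk, yk, t_grid, b2=0.3)   # b2 chosen ad hoc
```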
2.3.2. Nearest neighbor interpolation
In this case, the time interval [0, T] is divided into a regular grid with spacing between the grid points equal to t_r; the choice of t_r is discussed in Section 3. Then the sample value at any grid point m t_r is assigned the data value whose sampling instant is closest to m t_r.
2.3.3. Slotted resampling
As for the nearest neighbor interpolation, the time interval [0, T] is divided into a regular grid with spacing t_r, and around each grid point a slot of width t_w is placed (see Fig. 2); a reasonable choice for t_w is t_r/2. Then the sample value at any point m t_r is assigned the nearest sampled value within the slot around m t_r. If there is no sampled value falling within a slot, then the corresponding regular grid point is not assigned a sample value, which leads to a missing data problem.
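A minimal sketch of the slotted resampling step just described (our own code; the default slot width t_w = t_r/2 follows the suggestion above); missing grid points are marked with NaN.

```python
import numpy as np

def slotted_resample(tk, yk, T, tr, tw=None):
    """Slotted resampling onto the grid m*tr, m = 0..floor(T/tr).
    Each grid point takes the nearest sample within +/- tw; empty slots
    become NaN (missing data)."""
    tw = tr / 2 if tw is None else tw
    grid = np.arange(0, T + 1e-12, tr)
    out = np.full(grid.shape, np.nan)
    for m, tg in enumerate(grid):
        d = np.abs(tk - tg)            # distance of every sample to this grid point
        j = np.argmin(d)
        if d[j] <= tw:                 # nearest sample inside the slot
            out[m] = yk[j]
    return grid, out

# Usage: grid, f_missing = slotted_resample(tk, yk, T=10.0, tr=0.5)
```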
2.4. Least squares, maximum likelihood
In this subsection, we will briefly describe some estimation techniques that we will use in the remaining sections. Let {y_i}_{i=1}^{N} denote the N measurements of an experiment with a model

y_i = a_i^H x + n_i,  i = 1, . . . , N,   (13)

where x, of size M < N, denotes the vector of parameters of interest and n_i denotes the noise in the measurements.
2.4.1. Least squares (LS)
The least-squares (LS) estimate of x is given by

x̂_LS = argmin_x ‖y − Ax‖_2^2   (14)

where y = [y_1, . . . , y_N]^T and A = [a_1, . . . , a_N]^H is assumed to have full column rank. The solution to the minimization problem is then analytically given by x̂_LS = A^† y. If the covariance matrix of the noise in the measurements is known, then an optimal solution can be obtained by solving a weighted least squares (WLS) problem:

x̂_WLS = argmin_x ‖y − Ax‖_{R^{-1}}^2 = argmin_x (y − Ax)^H R^{-1} (y − Ax) = (A^H R^{-1} A)^{-1} A^H R^{-1} y   (15)

where R represents the covariance matrix of the noise.
2.4.2. Maximum likelihood (ML)
The maximum likelihood (ML) estimate of x is obtained by maximizing the likelihood function L(x):

x̂_ML = argmax_x [L(x) ≜ p(y|x)] = argmax_x (1/(π^N |R|)) e^{−(y−Ax)^H R^{-1} (y−Ax)}   (16)

where p(y|x) represents the PDF of the measurements. The second equality in the above equation follows from the assumption that the observations are jointly Gaussian distributed. So in this case, ML and WLS give the same estimate.
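To make (14)–(15) concrete, the following small numerical sketch fits a made-up linear model (the dimensions, noise levels and names are arbitrary illustrative choices); under Gaussian noise the WLS estimate below coincides with the ML estimate (16).

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 50, 3
A = rng.standard_normal((N, M))                 # rows play the role of a_i^H
x_true = np.array([1.0, -2.0, 0.5])
sigma = 0.1 + rng.uniform(0, 1, N)              # heteroscedastic noise levels
y = A @ x_true + sigma * rng.standard_normal(N)

# Ordinary LS, Eq. (14): x_LS = (A^H A)^{-1} A^H y
x_ls = np.linalg.lstsq(A, y, rcond=None)[0]

# Weighted LS, Eq. (15), with R = diag(sigma^2)
Rinv = np.diag(1.0 / sigma**2)
x_wls = np.linalg.solve(A.T @ Rinv @ A, A.T @ Rinv @ y)
```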
3. Signal reconstruction methods
In this section, we describe various methods for signal reconstruction from nonuniform samples. Table 1 summarizes
some of the known exact signal reconstruction methods.
In the table, Ω_max and the set S = supp(F(ω)) denote the maximum frequency in the spectrum and the support of the spectrum, respectively, |S| denotes the cardinality of the set S, and L_2(·) denotes the vector space of square integrable functions. A frame {φ_k} of a vector space V denotes a set of linearly dependent vectors spanning V which are such that, for every v ∈ V, there exist c and C with 0 < c ≤ C < ∞ such that

c‖v‖^2 ≤ ∑_k |v^H φ_k|^2 ≤ C‖v‖^2   (17)

where v^H φ_k denotes the inner product; furthermore, for any v ∈ V the dual frame vectors {φ̃_k} satisfy

∑_k (v^H φ_k) φ̃_k = ∑_k (v^H φ̃_k) φ_k = v.   (18)
As can be seen from the table, the methods require either an infinite number of samples or some restrictive assumptions on the signal to exactly recover it. However, in most practical situations we have only a finite number of nonuniform samples of the data corrupted with noise, and little or no prior knowledge about the signal spectrum, which makes these methods somewhat impractical. In the next section, we will describe various methods that can handle practical data.

Before we end this section, we will discuss the choice of various parameters like the Nyquist frequency, frequency resolution and resampling time that are needed for spectral estimation and interpolation of nonuniformly sampled data. For a given finite number of nonuniform samples, the Nyquist frequency indicates the rollover frequency beyond which the spectrum replicates; the frequency resolution determines the minimum frequency spacing that can be resolved; and the resampling time determines the uniform grid spacing over which the nonuniform data can be resampled or interpolated.
3.1. Nyquist frequency (Ω_nus)

The Nyquist frequency for uniformly sampled signals is well defined through the Nyquist–Shannon sampling theorem. However, for nonuniformly sampled signals there is no well accepted definition. For example, [1,6] define the Nyquist frequency in the nonuniform case as 1/2t_m, where t_m is the smallest time interval in the data, while [19] defines it as 1/2t_a, where t_a is the average sampling interval. However, the maximum frequency that can be inferred from nonuniform samples appears to be larger. In [35] (see also [19]), a more realistic way of defining the Nyquist or rollover frequency for nonuniform samples has been described. Given N nonuniform samples with sampling instants {t_k}_{k=1}^{N}, the spectral window at any frequency ω is defined as

W(ω) = |(1/N) ∑_{k=1}^{N} e^{iωt_k}|^2.   (19)

We can easily verify that W(0) = 1, and that for any ω ≠ 0, W(ω) ≤ 1. In the case of uniform sampling, the spectral window is a periodic function with period equal to twice the Nyquist frequency, that is, for any integer m and any ω, W(ω) = W(ω + 2mΩ_us), where Ω_us is the Nyquist frequency for the uniform sampling case. Similarly, the Nyquist frequency Ω_nus for nonuniform data can be defined as the smallest frequency for which W(2Ω_nus) ≈ 1.
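The sketch below evaluates the spectral window (19) on a frequency grid and returns half of the smallest frequency at which W comes back close to one, which can serve as an estimate of Ω_nus; the grid spacing and the threshold (0.99) are ad hoc choices of ours, not prescribed above.

```python
import numpy as np

def spectral_window(tk, omega):
    """W(omega) = |(1/N) sum_k exp(i*omega*t_k)|^2, Eq. (19)."""
    N = len(tk)
    return np.abs(np.exp(1j * np.outer(omega, tk)).sum(axis=1) / N) ** 2

def nyquist_nonuniform(tk, omega_max, d_omega, thresh=0.99):
    """Smallest omega > 0 with W(omega) ~ 1; then Omega_nus = omega / 2."""
    omega = np.arange(d_omega, omega_max, d_omega)
    W = spectral_window(tk, omega)
    idx = np.flatnonzero(W >= thresh)
    return omega[idx[0]] / 2 if idx.size else omega_max / 2
```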
3.2. Frequency resolution (Δω)

The frequency resolution determines the minimum frequency spacing between two sinusoids which can be resolved. In the uniform sampling case, for a given set of N uniform samples, the frequency resolution Δω is given by

Δω = 2π/(t_N − t_1) = 2π/(N t_s) = 4π/(N t_us) = 2Ω_us/N,

where t_s and t_us represent the uniform sampling and Nyquist time intervals, respectively. Similarly, for the nonuniform sampling case, the frequency resolution can be defined as Δω = 2π/(t_N − t_1).
Table 1
Exact signal reconstruction methods.

Lagrange-like interpolation [31,32]
  Assumptions: |t_k − k t_s| < t_s/4 for all k, where k ∈ Z and t_s = π/Ω_max.
  Description: f(t) = ∑_{k=−∞}^{∞} f(t_k) L(t) / (L'(t_k)(t − t_k)), with L(t) = (t − t_0) ∏_{k=1}^{∞} (1 − t/t_k)(1 − t/t_{−k}).
  Computational complexity: Requires an infinite number of flops.
  Comments: Requires an infinite number of samples for exact signal reconstruction.

Frame based reconstruction [31,32]
  Assumptions: {e^{iωt_k}} is a frame for L_2(−Ω_max, Ω_max) and S ⊆ [−Ω_max, Ω_max].
  Description: f(t) = ∑_{k=−∞}^{∞} f(t_k) R_k(t), with R_k(t) = ∫_{−Ω_max}^{Ω_max} γ_k(ω) e^{iωt} dω, where {γ_k(ω)} is the dual frame.
  Computational complexity: Requires an infinite number of flops.
  Comments: Requires an infinite number of samples for exact signal reconstruction.

Sparse approach [26]
  Assumptions: S ⊆ [−Ω_max, Ω_max] and f(t) is p-sparse, i.e. |S| ≤ p, and N > p.
  Description: min ‖f_Ω‖_1 s.t. f = Φ f_Ω, where f = [f(t_1), . . . , f(t_N)]^T, f_Ω = [F(ω_{−K}), . . . , F(ω_K)]^T, ω_k = Ω_max k/K, Φ_{nk} = (e^{i t_n ω_k}), K > N.
  Computational complexity: Requires solving a Linear Program (LP), which can be solved efficiently by well-developed software [33,34] in polynomial time.
  Comments: The sparsity p should be known. It requires only finite time samples.

Linear system approach [25]
  Assumptions: f(t) is periodic with period T and it is also bandlimited with a bandwidth of K, and N ≥ 2K + 1.
  Description: f(t) = ∑_{k=−K}^{K} F(k) e^{itk2π/T}, t = t_1, . . . , t_N; f = W f_Ω ⇒ f_Ω = W^† f, where f = [f(t_1), . . . , f(t_N)]^T, f_Ω = [F(−K), . . . , F(K)]^T, W_{nk} = (e^{i t_n k 2π/T}).
  Computational complexity: Requires solving a linear system, which takes O(N^3) flops.
  Comments: The bandwidth K should be known a priori.

Multiband spectrum reconstruction [29,30]
  Assumptions: f(t) is a multiband signal with no more than P bands of maximum size B, as shown in Fig. 3; {t_k} is a periodic nonuniform sampling pattern and the sampling instants in each period are given by t_k = nM t_s + λ_k t_s, k = 1, . . . , K, with 0 ≤ λ_k < M.
  Description: F_{λ_k}(ω) = (1/(M t_s)) ∑_{r=0}^{M−1} e^{i2πλ_k r} F(ω + 2πr/(M t_s)), ω ∈ Ω_0 = [0, 2π/(M t_s)), 1 ≤ k ≤ K, where F(ω) denotes the Fourier transform of f(t). In matrix form, f_F(ω) = A f_Ω(ω), ω ∈ Ω_0, where f_F(ω) = [F_{λ_1}(ω), . . . , F_{λ_K}(ω)]^T, [f_Ω(ω)]_j = F(ω + 2πj/(M t_s)) and A_{jk} = (1/(M t_s))(e^{i2πλ_k j}).
  Computational complexity: If the band structure is known, then for each ω ∈ Ω_0 the method requires O(K^3) flops to solve the linear system; if the band structure is unknown, then for each ω ∈ Ω_0 the method requires solving an LP as in the sparse approach.
  Comments: The linear system can be solved only when f_Ω(ω) is K-sparse for all ω ∈ Ω_0.
Fig. 3. Multiband spectrum.
(a) Nonuniform Samples (b) Spectral Window
Fig. 4. The sampling pattern and the spectral window.
Table 2
Spectral parameters.

Spectral parameter                                  Value
Average sampling time (t_a) and Ω_a/2π              0.4688 s, 1.06 Hz
Minimum time interval (t_m) and Ω_m/2π              0.0990 s, 5.05 Hz
Resampling time (t_r) and Ω_nus/2π                  0.001 s, 500 Hz
3.3. Resampling time (t_r)

The resampling time, denoted by t_r, determines the minimum spacing of the uniform grid over which the nonuniformly sampled data can be resampled or interpolated. It is given by π/Ω_nus.

As an example, we have generated 20 nonuniform sampling times in the interval 0–10 seconds. The sampling pattern and its spectral window are shown in Fig. 4. Table 2 shows the values of the minimum time interval, average time interval and resampling time. It can be seen from the table that the Nyquist frequency defined from t_m or from t_a is much smaller than the Nyquist frequency obtained from the spectral window. The symbols Ω_m and Ω_a in Table 2 denote the Nyquist frequencies calculated according to t_m and t_a, respectively.
4. Spectral analysis methods
In the first part of this section, we will describe different methods available in the literature for spectral analysis of nonuniform data and classify them based on the following:

- the signal model (nonparametric or parametric),
- the sampling pattern,
- the spectrum of the signal (discrete or continuous).

The performance of different methods on both simulated and real life nonuniform data sets is analyzed in the second part of this section. The methods will be described under four broad categories: methods based on least squares; methods based on interpolation techniques; methods based on slotted resampling; and methods based on continuous time models.
4.1. Methods based on least squares
In this section, we will describe spectral analysis methods for nonuniform data which are based on least squares. Let {f(t_k)}_{k=1}^{N} denote the sequence of nonuniformly spaced observations of f(t). The frequency interval [0, Ω_nus] is divided into M equally spaced points with spacing between them equal to Δω, where M = ⌊Ω_nus/Δω⌋ and ⌊x⌋ denotes the largest integer less than or equal to x. Any frequency on the grid can be denoted by ω_m = mΔω for some integer m ∈ [0, M].
4.1.1. Schuster periodogram
The classical or Schuster periodogram at any frequency ω_m, denoted here by S_p(ω_m), is obtained from the solution of the following least squares problem:

α̂_m = argmin_{α_m} ∑_{k=1}^{N} |f(t_k) − α_m e^{iω_m t_k}|^2;
S_p(ω_m) = |α̂_m|^2 = (1/N^2) |∑_{k=1}^{N} f(t_k) e^{−iω_m t_k}|^2.   (20)
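A direct implementation of (20) is straightforward; the following sketch (our code) evaluates it on a user-supplied frequency grid, e.g. the grid ω_m = mΔω introduced above.

```python
import numpy as np

def schuster_periodogram(tk, f, omegas):
    """S_p(omega_m) = |(1/N) sum_k f(t_k) exp(-i*omega_m*t_k)|^2, Eq. (20)."""
    N = len(tk)
    E = np.exp(-1j * np.outer(omegas, tk))   # M x N matrix of complex exponentials
    alpha = (E @ f) / N                      # LS amplitude at each omega_m
    return np.abs(alpha) ** 2
```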
4.1.2. Least-squares (also called Lomb–Scargle) periodogram
The LS periodogram at any frequency ω_m can also be expressed as a least squares problem [7,9]:

min_{α>0, φ∈[0,2π]} ∑_{k=1}^{N} [f(t_k) − α cos(ω_m t_k + φ)]^2.   (21)

The data here are assumed to be real valued, as the LS periodogram has been defined only for real-valued data. Using a = α cos(φ) and b = −α sin(φ) in (21), we get

min_{a,b} ∑_{k=1}^{N} [f(t_k) − a cos(ω_m t_k) − b sin(ω_m t_k)]^2.   (22)

Introducing the notations

R = ∑_{k=1}^{N} [cos(ω_m t_k), sin(ω_m t_k)]^T [cos(ω_m t_k), sin(ω_m t_k)],
r = ∑_{k=1}^{N} [cos(ω_m t_k), sin(ω_m t_k)]^T f(t_k),   (23)

the solution to the minimization problem in (22) can be written as

[â, b̂]^T = R^{-1} r.   (24)

The power spectral estimate, denoted by S_LS(ω_m), is given by

S_LS(ω_m) = (1/N) [â, b̂] R [â, b̂]^T = (1/N) r^T R^{-1} r.   (25)
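The 2×2 system (23)–(25) can be coded directly; a minimal sketch for real-valued data follows (our code; the grid is assumed to exclude ω = 0, where R becomes singular).

```python
import numpy as np

def ls_periodogram(tk, f, omegas):
    """Least-squares (Lomb-Scargle) periodogram via Eqs. (23)-(25)."""
    S = np.empty(len(omegas))
    for i, w in enumerate(omegas):
        c, s = np.cos(w * tk), np.sin(w * tk)
        R = np.array([[c @ c, c @ s],
                      [s @ c, s @ s]])          # Eq. (23)
        r = np.array([c @ f, s @ f])
        ab = np.linalg.solve(R, r)              # Eq. (24): [a, b]^T = R^{-1} r
        S[i] = r @ ab / len(tk)                 # Eq. (25): (1/N) r^T R^{-1} r
    return S
```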
4.1.3. Iterative adaptive approach (IAA)
Following [9], let us introduce the notations

f = [f(t_1), . . . , f(t_N)]^T;  a_m = [e^{iω_m t_1}, . . . , e^{iω_m t_N}]^T.   (26)

Using the above notations, a WLS fitting criterion, whose use is the central idea of the IAA algorithm, can be defined as follows:

min_{α_m} ‖f − a_m α_m‖_{R^{-1}}^2   (27)

where α_m represents the amplitude at the frequency ω_m and R represents an estimate of the covariance matrix, given by R = ∑_{m=1}^{M} |α_m|^2 a_m a_m^H. The solution to the above problem is given by

α̂_m = (a_m^H R^{-1} f) / (a_m^H R^{-1} a_m).   (28)
Table 3
Iterative adaptive approach (IAA) algorithm.

Initialization: Use the estimate in (28) with R = I as the initial value.
Iteration: At the i-th iteration, the estimate of the amplitude at any frequency ω_m is given by
  α̂_m^i = (a_m^H (R^i)^{-1} f) / (a_m^H (R^i)^{-1} a_m),  m = 1, . . . , M,  where R^i = ∑_{m=1}^{M} |α̂_m^{i−1}|^2 a_m a_m^H.
Termination: The iteration is terminated when the change in α̂_m, |α̂_m^i − α̂_m^{i−1}|^2, is less than 10^{-4}.
The final spectral estimate at frequency ω_m, denoted by S_IAA(ω_m), is given by S_IAA(ω_m) = |α̂_m^c|^2, where α̂_m^c denotes the estimate obtained at convergence.
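A compact sketch of the iteration in Table 3 is given below (our code); the iteration cap and the small diagonal loading of R are practical choices of ours, not prescribed by the table.

```python
import numpy as np

def iaa_spectrum(tk, f, omegas, max_iter=20, tol=1e-4):
    """Iterative Adaptive Approach of Eqs. (26)-(28) and Table 3."""
    N = len(tk)
    A = np.exp(1j * np.outer(tk, omegas))          # column m is a_m, Eq. (26)
    alpha = (A.conj().T @ f) / N                   # initialization: R = I in (28)
    for _ in range(max_iter):
        # R = sum_m |alpha_m|^2 a_m a_m^H  (tiny loading added for stability)
        R = (A * np.abs(alpha) ** 2) @ A.conj().T + 1e-10 * np.eye(N)
        Rf = np.linalg.solve(R, f)
        RA = np.linalg.solve(R, A)
        alpha_new = (A.conj().T @ Rf) / np.einsum('nm,nm->m', A.conj(), RA)
        if np.max(np.abs(alpha_new - alpha) ** 2) < tol:
            alpha = alpha_new
            break
        alpha = alpha_new
    return np.abs(alpha) ** 2                      # S_IAA(omega_m) = |alpha_m|^2
```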
(a) Sampling pattern (b) Schuster periodogram
(c) LS periodogram (d) IAA spectrum
Fig. 5. a) Sampling time instants of 20 samples. b) Schuster periodogram. c) LS periodogram. d) IAA spectrum, and the circles denote the amplitudes of the
true frequencies, 10, 15.5 and 16 Hz, present in the signal. The signal to noise ratio is 17 dB, and the frequency spacing is 0.1 Hz.
Since the covariance matrix R depends on the amplitudes {α_m}, an iterative algorithm is used to estimate {α_m} and R. The steps of the IAA algorithm are tabulated in Table 3.

Numerical simulations are carried out to compare the methods based on least squares analysis. The data used here are simulated by picking samples from a mix of three frequencies, 10, 15.5 and 16 Hz, with amplitudes 2, 3 and 4 respectively, in the presence of white noise. The sampling pattern is generated according to a Poisson distribution as described in Section 2.2. Fig. 5a shows the time instants of the 20 generated samples. It can be seen from Figs. 5b, 5c and 5d that IAA works better than the Schuster and LS periodograms: IAA clearly shows the peak at 10 Hz, while this peak is lost in the background noise in the periodogram based methods; IAA also clearly resolves the two closely spaced frequencies at 15.5 Hz and 16 Hz.
Table 4
Various combinations of uniform sampling methods and interpolation techniques; t_r represents the resampling time and ω lies in the interval [0, π/t_r] (SR – Slotted Resampling, NN – Nearest Neighbor interpolation, KCI – Kernel based Covariance Interpolation, KDI – Kernel based Data Interpolation).

Data interpolation (SR, NN and KDI): an interpolated data sequence {f(nt_r)}_{n=1}^{Ñ} is obtained from {f(t_n)}_{n=1}^{N}. From {f(nt_r)}, an estimate of the sample covariance matrix R_DI is obtained. In the case of SR, the missing sample values in {f(nt_r)} are assumed to be zero.

Covariance interpolation (KCI): an interpolated covariance sequence {r(nτ_r)}_{n=1}^{Ñ} is obtained from {f(t_n)}_{n=1}^{N}. A Toeplitz covariance matrix R_CI is then formed from {r(nτ_r)}_{n=1}^{Ñ}.

Periodogram:
  with data interpolation: (1/Ñ^2) |∑_{k=1}^{Ñ} f(kt_r) e^{−iωkt_r}|^2
  with covariance interpolation: a^H R_CI a, with a = (1/√Ñ) [e^{iωτ_r}, . . . , e^{iωÑτ_r}]^T

Capon:
  with data interpolation: 1/(a^H R_DI^{-1} a), with a = (1/√Ñ) [e^{iωt_r}, . . . , e^{iωÑt_r}]^T
  with covariance interpolation: 1/(a^H R_CI^{-1} a), with a = (1/√Ñ) [e^{iωτ_r}, . . . , e^{iωÑτ_r}]^T

Subspace methods (MUSIC, ESPRIT) [36]:
  with data interpolation: the subspace based spectral estimates are obtained from R_DI
  with covariance interpolation: the subspace based spectral estimates are obtained from R_CI
4.2. Methods based on interpolation
In this section, we describe different methods for spectral analysis that are based on interpolation in the lag domain. The primary interest of the paper in analyzing power spectra justifies the restriction to interpolation methods in the lag domain (however, the methods discussed below are also applicable to the sample domain). Once the covariance values are obtained on a regular grid through interpolation, spectral analysis methods for the uniform sampling case can be applied to them.

As described in Section 2.3, given a set of samples {f(t_k)}_{k=1}^{N} of f(t) within an interval [0, T], various kernels can be used to interpolate the covariance sequence r(τ) of f(t). The general structure of kernel based covariance interpolators is given by

r_I(τ) = ∑_{l=1}^{N} ∑_{k=1}^{N} f(t_l) f*(t_k) K(τ, t_l − t_k),  τ ∈ [0, T],   (29)

where r_I(τ) denotes the interpolated covariance sequence. For example, Gaussian, Laplacian [17], sinc [19], or rectangular kernels [15,16] can be used to interpolate the covariance function. However, in all these methods the user has to make the right choice of the kernel parameters, like the bandwidth, mainlobe width and window width. A further drawback of these kernel based covariance interpolators, except the one based on the sinc kernel, is that they do not ensure a positive semidefinite covariance sequence, i.e. the power spectrum obtained from the interpolated covariance sequence could be negative at some frequencies. To ensure the positive semidefiniteness of the interpolated covariance sequence, each estimator has to carry out an additional step, as shown below.

Let φ_I(ω) denote the power spectral density obtained from r_I(τ); then a positive semidefinite covariance sequence, denoted by r_P(τ), can be obtained as

r_I(τ)  →(F)  φ_I(ω),
φ_P(ω) = { φ_I(ω),  φ_I(ω) ≥ 0,
           0,        otherwise,   (30)
φ_P(ω)  →(F^{-1})  r_P(τ).
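A minimal sketch of (29)–(30), written for real-valued data and using the Gaussian kernel of (10) (the bandwidth is an arbitrary choice of ours, in line with the remark below about the lack of an optimal selection rule):

```python
import numpy as np

def kernel_cov_interp(tk, f, taus, b2):
    """Kernel covariance interpolator of Eq. (29) with the Gaussian kernel (10),
    for real-valued data; taus is the uniform lag grid on [0, T]."""
    lags = tk[:, None] - tk[None, :]               # all pairwise lags t_l - t_k
    prods = np.outer(f, f)                         # f(t_l) f(t_k)
    d = taus[:, None, None] - lags[None, :, :]
    K = np.exp(-d ** 2 / (2 * b2 ** 2)) / np.sqrt(2 * np.pi * b2 ** 2)
    return (K * prods).sum(axis=(1, 2))

def enforce_psd(r_I):
    """Eq. (30): form a symmetric two-sided lag sequence, clip negative
    spectral values, and transform back to a valid covariance sequence."""
    r2 = np.concatenate([r_I, r_I[-2:0:-1]])       # r(-tau) = r(tau) for real data
    phi = np.fft.fft(r2).real
    r_p = np.fft.ifft(np.maximum(phi, 0.0)).real
    return r_p[:len(r_I)]
```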
Numerical simulations are carried out to compare the various kernels used for covariance interpolation. A CARMA(3,2) process has been simulated, with AR and MA polynomials given by a(s) = s^3 + 0.7s^2 + 2.629s + 0.506 and b(s) = s^2 + 5s + 2.792, respectively. For the numerical simulations, 50 nonuniformly spaced samples within a time interval of 0–10 seconds are generated; see [19] for more details on CARMA process simulation. On those 50 samples of the process, the four kernel based methods for covariance interpolation are applied. Fig. 6 shows the interpolated sequence over the interval 0–10 seconds; as can be seen from Fig. 6a, the Gaussian and Laplacian kernels with bandwidths equal to 1 second give almost the same interpolated sequences. The plots in Fig. 6b show the interpolated covariance sequence obtained using the sinc and rectangular kernels with corresponding bandwidth and slot width equal to 4t_a and 2t_a respectively, where t_a represents the average sampling period. The bandwidth and slot width of the kernels are chosen arbitrarily for this simulation, as to our knowledge there is no optimal way of choosing them available in the literature.

Equispaced covariance lags can be obtained from r_P(τ) by sampling it over a uniform grid, and then any spectral analysis method designed for uniform sampling can be applied to the so-obtained uniform covariance sequence; the interested reader is referred to [19] for details. Table 4 summarizes how the different uniform sampling methods (parametric or nonparametric) can be applied to the data obtained from the different interpolation techniques.
(a) Gaussian and Laplacian kernels. (b) Sinc and rectangular kernels.
Fig. 6. Interpolated covariance sequence with a) Gaussian kernel of bandwidth b_2 = 1 s and Laplacian kernel of bandwidth b_3 = 1 s; b) sinc kernel of mainlobe width b_1 = 4t_a and rectangular kernel of slot width b_4 = 2t_a.
4.3. Methods based on slotted resampling
4.3.1. ML fitting of an AR model to nonuniform samples
Given a set of nonuniform samples within the interval [0, T], the technique of slotted resampling, as described in Section 2.3.3, can be applied to obtain samples on a uniform grid with some missing samples. In [20–22], various methods have been proposed for maximum likelihood fitting of a discrete time AR model to time series with missing observations. In [20], an approach based on state space modeling has been used to obtain the ML estimate of the AR parameters, which will be referred to here as ML-Jones, whereas in [22] an approximate but faster method named autoregressive finite interval likelihood (ARFIL) has been proposed.

Let {f(t_k)}_{k=1}^{N} denote the sequence of nonuniformly spaced observations of a stationary process sampled within the time interval [0, T], and let {f̃(nt_r)}_{n=1}^{Ñ} represent the sequence obtained after slotted resampling over a uniform grid of resolution t_r, where in general Ñ ≠ N. The resampled sequence is assumed to be obtained from a discrete time, zero mean, Gaussian distributed AR process of order K, such that

f̃(n) + a_1 f̃(n−1) + ··· + a_K f̃(n−K) = ε(n)   (31)

where ε(n) denotes a white Gaussian noise with zero mean and variance σ_ε^2, and {a_k}_{k=1}^{K} are the AR parameters. In the above equation, t_r in the resampled sequence has been dropped for notational simplicity. The Yule–Walker equations for the given AR(K) process are

r(n) + a_1 r(n−1) + ··· + a_K r(n−K) = 0,  n > 0,  r(−n) = r*(n)   (32)

where r(n) represents the covariance sequence of f̃(n). The probability density function of f = [f̃(1), . . . , f̃(Ñ)]^T is given by

p_F(f) = (1/(π^Ñ |R_F|)) e^{−f^H R_F^{-1} f}   (33)

which is also the likelihood function (LF) of f. In (33), [R_F]_{i,j} = E(f̃(i) f̃*(j)) = r(i − j).

Let a = [a_1, . . . , a_K]^T. The dependence of the LF on a is due to the fact that each element of R_F can be expressed in terms of the AR parameters via the Yule–Walker equations (32). The AR parameters can then be obtained by maximizing the LF using a nonlinear optimization tool. Due to the missing samples, in general R_F will not be Toeplitz and could in fact be singular; apart from that, the LF is highly nonlinear in the AR parameters and may have local maxima. Instead of directly maximizing the LF, the authors in [22] have expressed the LF in terms of conditional probability densities. Once the AR parameters are estimated, the spectrum of the signal can be obtained as

φ_F(ω) = σ_ε^2 / |1 + ∑_{k=1}^{K} a_k e^{−iωk}|^2.   (34)
4.3.2. Least squares fitting of an AR model to nonuniform samples
As described before, the nonuniform sampling problem can be converted into a missing data problem by a slotted resampler. Following the same notation as in the last section, the AR parameters can be estimated by minimizing the following criterion:
min_a ∑_{n=K+1}^{Ñ} |a_0 f̃(n) + a_1 f̃(n−1) + ··· + a_K f̃(n−K)|^2  s.t. a^H u = 1   (35)

where a = [a_0, a_1, . . . , a_K]^T and u = [1, 0, . . . , 0]^T; the constraint a^H u = 1 ensures that the first element of a is equal to one. By introducing a (K+1) × Ñ selection matrix M_n such that

M_n f = [f̃(n), f̃(n−1), . . . , f̃(n−K)]^T,   (36)

the minimization becomes:

min_a  a^H [∑_{n=K+1}^{Ñ} M_n f f^H M_n^T] a  s.t. a^H u = 1   (37)

where f = [f̃(1), . . . , f̃(Ñ)]^T. If all the samples in f are available, then the AR parameters can be obtained analytically as

a = Q^{-1} u / (u^H Q^{-1} u)   (38)

where Q = ∑_{n=K+1}^{Ñ} (M_n f f^H M_n^T). Computing Q, however, is not possible if f has some missing samples. Let f_a and f_m represent the available and missing samples in f respectively, such that f = S_a f_a + S_m f_m, where S_a and S_m are semiunitary selection matrices such that f_a = S_a^T f and f_m = S_m^T f. The cost function in (37) is also quadratic in the missing samples, as shown below:

∑_{n=K+1}^{Ñ} a^H M_n f f^H M_n^T a = ∑_{n=K+1}^{Ñ} (S_a f_a + S_m f_m)^H M_n^T a a^H M_n (S_a f_a + S_m f_m)
  = (S_a f_a + S_m f_m)^H [∑_{n=K+1}^{Ñ} M_n^T a a^H M_n] (S_a f_a + S_m f_m)
  = (S_a f_a + S_m f_m)^H Q̃ (S_a f_a + S_m f_m)   (39)

where Q̃ = ∑_{n=K+1}^{Ñ} M_n^T a a^H M_n. If the AR parameters in Q̃ were known, then an estimate of f_m could be obtained in the same way as we obtained the estimate of the AR parameters in (38). Table 5 shows a cyclic iterative algorithm, called LS-AR, that estimates the AR parameters and the missing samples iteratively. In [37], the same sort of cyclic algorithm has been used to estimate an ARX model in the missing data case.

Table 5
LS-AR algorithm.

Initialization: Use the estimate (38) of a with f_m = 0 as the initial value, a^0 = (u^H Q^{-1} u)^{-1} Q^{-1} u.
Iteration: At the i-th iteration, the estimates of f_m and a are given by
  f_m^i = −(S_m^T Q̃_i S_m)^{-1} (S_m^T Q̃_i S_a) f_a,   a^i = (u^H Q_i^{-1} u)^{-1} Q_i^{-1} u,
  where Q̃_i = ∑_{n=K+1}^{Ñ} M_n^T a^{i−1} (a^{i−1})^H M_n,  Q_i = ∑_{n=K+1}^{Ñ} (M_n f^i (f^i)^H M_n^T),  and f^i = S_a f_a + S_m f_m^i.
Termination: The iteration is terminated when the change in a, ‖a^i − a^{i−1}‖_2, is less than 10^{-4}.
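A sketch of the LS-AR iteration of Table 5 for real-valued data (our code; the indexing conventions, the iteration cap and the handling of the initial zero fill are our own practical choices). Note that very long gaps can make the quadratic form in the missing samples ill conditioned, in line with the remarks made later about data sets with big gaps.

```python
import numpy as np

def ls_ar_missing(f, miss, K, max_iter=50, tol=1e-4):
    """LS-AR (Table 5): cyclic estimation of the AR(K) coefficients and the
    missing samples. f: length-Ntilde array (any values at missing positions);
    miss: boolean mask of missing positions."""
    f = f.astype(float).copy()
    f[miss] = 0.0                                  # initialization: f_m = 0
    Nt = len(f)
    u = np.zeros(K + 1); u[0] = 1.0

    def windows(g):
        # rows are M_n f = [f(n), f(n-1), ..., f(n-K)], n = K .. Nt-1 (0-based)
        return np.array([g[n::-1][:K + 1] for n in range(K, Nt)])

    a = None
    for _ in range(max_iter):
        W = windows(f)
        Q = W.T @ W                                # Q of Eq. (38)
        Qinv_u = np.linalg.solve(Q, u)
        a_new = Qinv_u / (u @ Qinv_u)              # AR step, Eq. (38)
        # Qtilde = sum_n M_n^T a a^T M_n, Eq. (39)
        Qt = np.zeros((Nt, Nt))
        for n in range(K, Nt):
            idx = np.arange(n, n - K - 1, -1)
            Qt[np.ix_(idx, idx)] += np.outer(a_new, a_new)
        A_mm = Qt[np.ix_(miss, miss)]
        b = Qt[np.ix_(miss, ~miss)] @ f[~miss]
        f[miss] = -np.linalg.solve(A_mm, b)        # missing-sample step
        if a is not None and np.linalg.norm(a_new - a) < tol:
            a = a_new
            break
        a = a_new
    return a, f
```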
Apart from the above two missing-data based algorithms, there are other missing data algorithms available in the literature. For example, the gapped data amplitude and phase estimation (GAPES) algorithm [38] estimates the amplitude spectrum from the available samples, and then estimates the missing samples by a least squares fit to the estimated amplitude spectrum. In the case of arbitrarily missing data, a method called missing data amplitude and phase estimation (MAPES) proposed in [23] estimates the spectrum and the missing samples cyclically via the expectation maximization (EM) algorithm. Somewhat similarly, the method called missing data iterative adaptive approach (MIAA) [24] estimates the spectrum and the missing samples cyclically via a weighted least squares approach.

A numerical simulation has been carried out to compare some of the methods. Two hundred nonuniform samples in the interval 0–10 seconds of two sine waves with frequencies 2 Hz and 5 Hz and corresponding amplitudes 0.5 and 1 are generated, with the sampling instants following a stratified uniform distribution as described in Section 2.2.2. Using a
slotted resampling technique with the mean sampling time as the resampling time (i.e. t_r = t_a) and half of the mean sampling time as the slot width (i.e. t_w = t_a/2), the nonuniform sampling problem has been transformed into a missing data problem. Fig. 7 shows the spectral estimates of the different methods; as can be seen from the figure, MIAA gives accurate estimates of the frequencies but the estimates of the amplitudes are biased; on the other hand, LS-AR gives more accurate estimates of the peak heights but poorer resolution than MIAA. When compared with MIAA and LS-AR, ARFIL fails to locate the peak at 2 Hz; the failure of ARFIL is mainly due to the presence of numerous local maxima of the likelihood function caused by the missing samples.

Fig. 7. Amplitude estimates for the simulated two-sine data from the methods based on slotted resampling.
4.4. Methods based on continuous time models
In this section we describe methods that are based on fitting a continuous time model to nonuniform samples. The methods are based either on approximating the continuous time model by a discrete time model [14,39] and estimating the parameters of the obtained discrete time model, or on approximating the derivative operation in the continuous time model [13] by a weighted summation. As an example, given N nonuniformly spaced samples {f(t_k)}_{k=1}^{N} of f(t), let us briefly describe some methods that try to fit these samples to a CAR(1) model:

(p + a) f(t) = e(t)   (40)

where p denotes the differentiation operator d/dt, and e(t) denotes a continuous time white noise of unit intensity. As described in [14,39], the CAR model can be approximated by the following discrete time model:

f(t_n) = e^{−a(t_n − t_{n−1})} f(t_{n−1}) + σ_f [1 − e^{−2a(t_n − t_{n−1})}]^{1/2} e(t_n),  n = 2, . . . , N,   (41)

where σ_f^2 denotes the variance of the continuous time process. In [39], a prediction error approach was proposed to estimate a from the model in (41).

The above described method is an indirect way of estimating the continuous time model, since it first translates the original model into a discrete time model and then estimates the parameters of the original model by estimating the parameters of the discrete time model. In [13], a more direct approach to estimating the continuous time model parameters has been studied, which approximates the differentiation operator p^j as

p^j f(t_k) ≈ D_j f(t_k) = ∑_{ν=0}^{j} β_k(j, ν) f(t_{k+ν})   (42)

where the coefficients β_k are chosen to meet the conditions shown below:

∑_{ν=0}^{j} β_k(j, ν) (t_{k+ν} − t_k)^ℓ = { 0,   ℓ = 0, . . . , j−1,
                                            j!,  ℓ = j.   (43)

Using this approximation of the derivative, the CAR model in (40) can be rewritten as

D_1 f(t_k) = −a D_0 f(t_k) + e(t_k),  k = 1, . . . , N−1.   (44)

From the above set of equations the parameter a can be obtained by least squares. The interested reader is referred to [13,14,39] and the references therein for technical details and numerical results on fitting continuous time models to nonuniform samples.
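As an illustration of the direct approach (42)–(44), the sketch below (ours) uses the first-order divided difference D_1 f(t_k) = (f(t_{k+1}) − f(t_k))/(t_{k+1} − t_k), which satisfies (43) for j = 1, and estimates a by least squares from (44).

```python
import numpy as np

def fit_car1(tk, f):
    """Estimate the CAR(1) parameter a in (p + a) f(t) = e(t) by approximating
    p f(t_k) with a first-order divided difference (Eq. (42)) and solving
    Eq. (44) in the least-squares sense."""
    D1 = (f[1:] - f[:-1]) / (tk[1:] - tk[:-1])   # ~ p f(t_k), k = 1..N-1
    D0 = f[:-1]                                   # D_0 f(t_k) = f(t_k)
    # D1 f(t_k) = -a D0 f(t_k) + e(t_k)  =>  LS estimate of a
    return -(D0 @ D1) / (D0 @ D0)
```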
Table 6
Classification of spectral analysis methods for nonuniform data based on the sampling pattern, signal model and type of spectrum (P – Parametric, NP – Nonparametric, SR – Slotted Resampling, NN – Nearest Neighbor interpolation, KCI – Kernel based Covariance Interpolation, KDI – Kernel based Data Interpolation).

Missing data case:
  NP, discrete spectra: IAA; MAPES; GAPES
  NP, continuous spectra: KDI, KCI + NP uniform sampling methods; NN + NP uniform sampling methods
  P, discrete spectra: CLEAN
  P, continuous spectra: KDI, KCI + P uniform sampling methods; ML-Jones; ARFIL; LS-AR

Arbitrary irregular sampling:
  NP, discrete spectra: Schuster periodogram; LS periodogram; IAA; SR + MIAA
  NP, continuous spectra: Sparse approach; Linear system approach
  P, discrete spectra: SR + ML-Jones; SR + ARFIL; SR + LS-AR
  P, continuous spectra: CAR, CARMA modeling
4.5. Classification and summary

Table 6 shows a classification of the different spectral analysis methods based on the sampling pattern, signal model and type of spectrum. Some of the methods listed in the table are new. For example, both MIAA and LS-AR were previously applied only to the missing data case, but here they are used in conjunction with slotted resampling on arbitrarily sampled data. Other methods, like the sparse approach and the linear system approach, which are listed under the exact signal reconstruction methods, can also be used for spectral analysis of finite-length nonuniform data.
4.6. Performance on real life data sets
In this subsection, the performance of some of the above methods on two real life data sets is evaluated. For each data set, the following results are provided:

- The Nyquist frequency for the sampling pattern, obtained from the spectral window.
- The spectral estimates from the applicable methods (according to the sampling pattern type). For example, in both real life examples the sampling pattern is arbitrary and only the following methods can be applied: methods based on least squares; missing data methods applied to slotted resampled data; uniform sampling methods like the periodogram, Capon (nonparametric) and ESPRIT (parametric), applied to the interpolated covariance sequence.

The sparse approach and the linear system approach are not applied here since the real life data sets are neither sparse nor periodic bandlimited. Regarding the IAA based methods, three variants are possible: IAA applied directly to the nonuniform data; IAA applied to slotted resampled data; and MIAA applied to slotted resampled data. Of these, MIAA applied to slotted resampled data is preferred over applying IAA to slotted resampled data, as the latter is the first step of the iteration in the former.
4.6.1. Radial velocity data
The data considered express the radial velocity of the star HD 102195. Based on these data, an extrasolar planet named ET-1 revolving around the star has been discovered [40]. The period of revolution of the planet can be calculated by analyzing the spectrum of the radial velocity measurements. The experimental data consist of 38 measurements of the radial velocity of the star taken nonuniformly over a span of 200 days, as shown in Fig. 8. Fig. 9 shows the sampling pattern and the spectral window of the data; as can be seen from the spectral window, the spectrum replicates after a frequency of about 1 cycle/day. Table 7 shows some spectral parameters of the radial velocity data; it can be seen from the table that although the rollover frequency of the spectrum calculated from t_m is much larger than that calculated from t_r, the Nyquist frequency is still taken to be Ω_nus. Figs. 10 and 11 show the spectral estimates of the radial velocity for the LS methods and the methods based on slotted resampling, and for the covariance interpolation methods, respectively. Among the methods based on slotted resampling, the LS-AR method cannot be applied to this data set due to the big gaps in the sampling pattern. Table 8 summarizes the results obtained from the different methods. Except for the ARFIL method, all the methods indicate a strong periodicity around 4.1 days (0.2439 cycles/day); so the period of rotation of the exo-planet ET-1 around the star HD 102195 has been estimated to be 4.1 days.
Table 7
Spectral parameters of the radial velocity data.

Spectral parameter                                  Value
Average sampling time (t_a) and Ω_a/2π              5.2266 days, 0.0957 cycles/day
Minimum time interval (t_m) and Ω_m/2π              0.0078 days, 64.0533 cycles/day
Resampling time (t_r) and Ω_nus/2π                  0.5 days, 0.5 cycles/day
Fig. 8. The radial velocity data of HD 102195.
(a) Sampling pattern of the data. (b) Spectral window.
Fig. 9. The sampling pattern and the spectral window of the radial velocity data.
Table 8
The main spectral peak for the radial velocity data selected by different methods.

Type                                                             Method                 Cycles/day
LS methods                                                       Schuster periodogram   0.2439
                                                                 LS periodogram         0.2439
                                                                 IAA                    0.2439
Methods based on slotted resampling                              MIAA                   0.2447
                                                                 ARFIL                  0.23
Covariance interpolation (sinc kernel) + nonparametric methods   Periodogram            0.2439
                                                                 Capon                  0.2439
Covariance interpolation (sinc kernel) + parametric method       ESPRIT                 0.2431
(a) LS methods. (b) Methods based on slotted resampling.
Fig. 10. Spectrum of the radial velocity data based on a) LS methods (the scales in the plot are changed to show the spectra from the different methods clearly), b) methods based on slotted resampling (for MIAA and ARFIL, the spacing between the samples is taken to be equal to t_r).
Fig. 11. Spectrum of the radial velocity data obtained by covariance interpolation using a sinc kernel with mainlobe width b_1 = t_r and then applying two nonparametric uniform sampling methods (Periodogram and Capon) and a parametric uniform sampling method (ESPRIT).
Fig. 12. Vostok ice core data.
Table 9
Spectral parameters of the Vostok ice core data.

Spectral parameter                                  Value
Average sampling time (t_a) and Ω_a/2π              1.4511 × 10^3 years, 3.4457 × 10^−4 cycles/year
Minimum time interval (t_m) and Ω_m/2π              44 years, 0.0114 cycles/year
Resampling time (t_r) and Ω_nus/2π                  0.5 years, 0.5 cycles/year
Table 10
The two most significant spectral peaks (three peaks for ESPRIT) in the Vostok ice core data picked by different methods.

Type                                                             Method                 Cycles/year
LS methods                                                       Schuster periodogram   {1.034, 2.412} × 10^−5
                                                                 LS periodogram         {1.034, 2.412} × 10^−5
                                                                 IAA                    {1.034, 2.412} × 10^−5
Methods based on slotted resampling                              MIAA                   {0.9648, 2.481} × 10^−5
Covariance interpolation (sinc kernel) + nonparametric methods   Periodogram            {1.071, 2.581} × 10^−5
                                                                 Capon                  {1.023, 2.484} × 10^−5
Covariance interpolation (sinc kernel) + parametric method       ESPRIT                 {0.9437, 1.08, 2.53} × 10^−5
(a) Sampling pattern. (b) Spectral window.
Fig. 13. The sampling pattern and the spectral window of the Vostok ice core data.
4.6.2. Vostok ice core data
The data used in this example are concentrations of CO2, measured in parts per million by volume (p.p.m.v.), obtained from ice core drilling at the Russian Vostok station in East Antarctica [41,42]. Analyzing these data is helpful for understanding the paleoclimatic properties of the glacial–interglacial periods, such as temperature, wind speed, changes in atmospheric gas composition, etc. The data consist of 283 samples of CO2 concentration ranging over a span of 420,000 years, as shown in Fig. 12. Fig. 13 shows the sampling pattern and its spectral window; as can be seen from the spectral window, the Nyquist frequency is around 0.5 cycles/year. Table 9 shows some spectral parameters of the Vostok ice core data. In the case of the methods based on slotted resampling, the results of the LS-AR and ARFIL methods are not shown; LS-AR cannot be applied here due to the large gaps, and ARFIL fails to locate any peak in the spectrum. Figs. 14 and 15 show the spectral estimates of the Vostok ice core data for the LS methods, the methods based on slotted resampling, and the covariance interpolation methods, respectively. Since the spectra obtained from all the methods do not show any periodicity beyond 7 × 10^−5 cycles/year, the spectra are shown only up to 7 × 10^−5 cycles/year. The mainlobe width of the sinc kernel in the case of the covariance interpolation techniques is chosen somewhat arbitrarily to be 3t_a. Table 10 summarizes the results obtained from the different methods. All the methods indicate a strong periodicity around 10^5 years (1 × 10^−5 cycles/year), and relatively weaker periodicities at 40,000 years (2.5 × 10^−5 cycles/year), 28,571 years (3.5 × 10^−5 cycles/year) and 22,222 years (4.5 × 10^−5 cycles/year).
(a) LS methods. (b) Methods based on slotted resampling.
Fig. 14. Spectrum of the Vostok ice core data based on a) LS methods, b) methods based on slotted resampling (for MIAA, the spacing between samples is taken to be equal to t_r).
Fig. 15. Spectrum of the Vostok ice core data obtained by covariance interpolation using a sinc kernel with mainlobe width b_1 = 3t_a and applying two nonparametric uniform sampling methods (Periodogram and Capon) and a parametric uniform sampling method (ESPRIT).

5. Conclusions

In this paper, we have reviewed different methods for spectral analysis of nonuniform data by describing, classifying and comparing them. Apart from methods for spectral analysis, methods for exact signal reconstruction from nonuniform samples were also reviewed. The choice of various spectral parameters, like the Nyquist frequency and the resampling rate, was also discussed. Finally, the performance of different spectral analysis methods on two real-life nonuniform data sets, one in astrophysics and the other in paleoclimatology, was evaluated. For both real life data sets it was observed that the nonparametric methods perform better than the parametric methods; in particular, the IAA based methods clearly revealed periodicities in the data which agree well with physical findings.
Acknowledgments
We thank Dr. N. Sandgren for providing us the Vostok ice core data and H. He for the radial velocity data. We are also
grateful to Prof. P.M.T. Broersen for the ARFIL simulation code.
References
[1] J.D. Scargle, Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data, Astrophys. J. 263 (1982) 835–853.
[2] S. Baisch, G.H.R. Bokelmann, Spectral analysis with incomplete time series: An example from seismology, Comput. Geosci. 25 (1999) 739–750.
[3] M. Schulz, K. Stattegger, Spectrum: Spectral analysis of unevenly spaced paleoclimatic time series, Comput. Geosci. 23 (9) (1997) 929–945.
[4] A.W.C. Liew, J. Xian, S. Wu, D. Smith, H. Yan, Spectral estimation in unevenly sampled space of periodically expressed microarray time series data, BMC Bioinformatics 8 (2007) 137–156.
[5] C. Tropea, Laser Doppler anemometry: Recent developments and future challenges, Meas. Sci. Technol. 6 (1995) 605–619.
[6] D.H. Roberts, J. Lehar, J.W. Dreher, Time series analysis with CLEAN. I. Derivation of a spectrum, Astronom. J. 93 (4) (1987) 968–989.
[7] N.R. Lomb, Least-squares frequency analysis of unequally spaced data, Astrophys. Space Sci. 39 (1) (1976) 10–33.
[8] T. Yardibi, J. Li, P. Stoica, M. Xue, A.B. Baggeroer, Iterative adaptive approach for sparse signal representation with sensing applications, IEEE Trans. Aerospace Electron. Syst., in press.
[9] P. Stoica, J. Li, H. He, Spectral analysis of nonuniformly sampled data: A new approach versus the periodogram, IEEE Trans. Signal Process. 57 (3) (2009) 843–858.
[10] A. Tarczynski, N. Allay, Spectral analysis of randomly sampled signals: Suppression of aliasing and sampler jitter, IEEE Trans. Signal Process. 52 (12) (2004) 3324–3334.
[11] E. Masry, Poisson sampling and spectral estimation of continuous time processes, IEEE Trans. Inform. Theory 24 (2) (1978) 173–183.
[12] R.H. Jones, Fitting a continuous time autoregression to discrete data, in: D.F. Findley (Ed.), Applied Time Series Analysis II, Academic, New York, 1981, pp. 651–682.
[13] E.K. Larsson, Identification of stochastic continuous-time systems, Ph.D. thesis, Uppsala University, 2004.
[14] R.J. Martin, Irregularly sampled signals: Theories and techniques for analysis, Ph.D. thesis, University College London, 1998.
[15] L. Benedict, H. Nobash, C. Tropea, Estimation of turbulent velocity spectra from laser Doppler data, Meas. Sci. Technol. (11) (2000) 1089–1104.
[16] W. Harteveld, R. Mudde, H.V. den Akker, Estimation of turbulence power spectra for bubbly flows from laser Doppler anemometry signals, Chem. Eng. Sci. (60) (2005) 6160–6168.
[17] O.N. Bjornstad, W. Falck, Nonparametric spatial covariance functions: Estimation and testing, Envir. Ecolog. Statist. 8 (2001) 53–70.
[18] P. Hall, N. Fisher, B. Hoffmann, On the nonparametric estimation of covariance functions, Ann. Statist. 22 (4) (1994) 2115–2134.
[19] P. Stoica, N. Sandgren, Spectral analysis of irregularly-sampled data: Paralleling the regularly-sampled data approaches, Digital Signal Process. 16 (6) (2006) 712–734.
[20] R.H. Jones, Maximum likelihood fitting of ARMA models to time series with missing observations, Technometrics 22 (3) (1980) 389–395.
[21] P.M.T. Broersen, Automatic spectral analysis with missing data, Digital Signal Process. 16 (2006) 754–766.
[22] P.M.T. Broersen, S. de Waele, R. Bos, Autoregressive spectral analysis when observations are missing, Automatica 40 (2004) 1495–1504.
[23] Y. Wang, P. Stoica, J. Li, T.L. Marzetta, Nonparametric spectral analysis with missing data via the EM algorithm, Digital Signal Process. 15 (2005) 192–206.
[24] P. Stoica, J. Li, J. Ling, Missing data recovery via a nonparametric iterative adaptive approach, IEEE Signal Process. Lett. 16 (4) (2009) 241–244.
[25] H.G. Feichtinger, K. Gröchenig, T. Strohmer, Efficient numerical methods in nonuniform sampling theory, Numer. Math. 69 (4) (1995) 423–440.
[26] S. Bourguignon, H. Carfantan, J. Idier, A sparsity-based method for the estimation of spectral lines from irregularly sampled data, IEEE J. Select. Topics Signal Process. 1 (4) (2007) 575–585.
[27] A. Aldroubi, K. Gröchenig, Nonuniform sampling and reconstruction in shift-invariant spaces, SIAM Rev. 43 (4) (2001) 585–620.
[28] H.G. Feichtinger, K. Gröchenig, Theory and practice of irregular sampling, in: J.J. Benedetto, W. Frazier (Eds.), Wavelets: Mathematics and Applications, CRC, Boca Raton, FL, 1993, pp. 305–363.
[29] R. Venkataramani, Y. Bresler, Perfect reconstruction formulas and bounds on aliasing error in sub-Nyquist nonuniform sampling of multiband signals, IEEE Trans. Inform. Theory 46 (6) (2000) 2173–2183.
[30] M. Mishali, Y.C. Eldar, Blind multi-band signal reconstruction: Compressed sensing of analog signals, IEEE Trans. Signal Process. 57 (3) (2009) 993–1009.
[31] J.R. Higgins, Sampling Theory in Fourier and Signal Analysis: Foundations, Clarendon, Oxford, UK, 1996.
[32] F. Marvasti, Nonuniform Sampling, Theory and Practice, Kluwer, Norwell, MA, 2001.
[33] M. Grant, S. Boyd, Y. Ye, CVX: Matlab software for disciplined convex programming, http://stanford.edu/boyd/cvx.
[34] J. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optim. Meth. Softw. 11 (1) (1999) 625–653.
[35] L. Eyer, P. Bartholdi, Variable stars: Which Nyquist frequency?, Astronom. Astrophys. Suppl. Ser. 135 (1999) 1–3.
[36] P. Stoica, R. Moses, Spectral Analysis of Signals, Pearson/Prentice Hall, 2005.
[37] R. Wallin, A. Isaksson, L. Ljung, An iterative method for identification of ARX models from incomplete data, in: Proceedings of the 39th IEEE Conference on Decision and Control, Sydney, Australia, 2000.
[38] P. Stoica, E. Larsson, J. Li, Adaptive filter-bank approach to restoration and spectral analysis of gapped data, Astronom. J. 120 (4) (2000) 2163–2173.
[39] R.J. Martin, Autoregression and irregular sampling: Spectral estimation, Signal Process. 77 (1999) 139–157.
[40] J. Ge, J. van Eyken, S. Mahadevan, C. DeWitt, S. Kane, R. Cohen, A. Vanden Heuvel, S. Fleming, P. Guo, G. Henry, et al., The first extrasolar planet discovered with a new-generation high-throughput Doppler instrument, Astrophys. J. 648 (1) (2006) 683–695.
[41] J. Petit, J. Jouzel, D. Raynaud, N. Barkov, J. Barnola, I. Basile, M. Bender, J. Chappellaz, M. Davis, G. Delaygue, et al., Climate and atmospheric history of the past 420,000 years from the Vostok ice core, Antarctica, Nature 399 (6735) (1999) 429–436.
[42] A. Mathias, F. Grond, R. Guardans, D. Seese, M. Canela, H. Diebner, G. Baiocchi, Algorithms for spectral analysis of irregularly sampled time series, J. Statist. Softw. 11 (2) (2004) 1–30.
Prabhu Babu is a Ph.D. student of electrical engineering with applications in signal processing at the Department of Information
Technology, Uppsala University, Sweden.
Petre Stoica is a Professor of System Modeling at Uppsala University, Uppsala, Sweden. Additional information is available at:
http://user.it.uu.se/ps/ps.html.