Slice and Dice Chunks of Radio Spectrum
DESIGNER'S TOOLKIT
WIRELESS SYSTEMS DESIGN, NOVEMBER/DECEMBER 2003
[ WIDEBAND CHANNELIZATION ]
N integrators with N combs. They are followed by a decimate-by-R block. Although such a scheme works, it can be greatly simplified by placing the comb after the decimator. In general, a CIC filter would have N integrator/comb pairs with N typically ranging from 3 to 6. A three-stage CIC is schematically illustrated in Figure 3.

The magnitude response of a CIC filter can be shown, for large R, to approximate a (sinc)^N function. Spectral nulls will appear at multiples of Fs/(MR) Hz, where Fs is the input sample rate and M is the integer number of delays in the comb section (typically 1). The stopband relative attenuation is a function of the number of stages that are used and equals approximately N x 13 dB at the first sidelobe. In contrast, the DC gain is a function of the decimation rate R and equals (RM)^N. When designing a CIC filter, it's critical to account for bit growth. Insufficient bit width would lead to an unstable filter.

To provide multiple channels, it's possible to have a number of DDCs in a parallel stack. The number of output channels is equal to the number of DDCs that are employed in such an architecture. Clearly, a linear relationship exists between the amount of silicon and the number of channels required.

It's possible, however, to optimize the DDC stack architecture by taking advantage of the changes in sample rate across the system. From each CIC decimator onward, the comb half of the CICs and the FIRs are clocked at a fraction of the input rate. As a result, there's potential for recycling all of the combs and FIRs into a pipelined version of each [2].

The next focus is the vast area of FFT techniques. These methods boast a multitude of algorithms for programmable DSP implementations and a number of COTS ASIC implementations that are readily available. Here, coverage is restricted to wideband, pipelined hardware solutions, particularly those that are suitable for FPGA realization [3].

FIGURE 4 shows an implementation of the Pipelined FFT (PFFT) [4]. This implementation is based on n successive stages, where 2^n is the size of the FFT. Each stage has switched delay elements and butterflies. The switches and delays re-order the data for processing at the next butterfly.

complex (I/Q) data. To achieve normally ordered frequency data, it requires a bit reverser.

The filter bank's performance also must be considered when weighing the benefits of the FFT. In FIGURE 5, the effective frequency response is shown for the unweighted FFT. The figure displays the sin(x)/x nature of the sidelobe structure. Compare it to the filter response of a typical DDC filter (85 dBc stop-band and low passband ripple). A clear advantage exists in the filter frequency response when using the DDC.

The standard approach to improving stop-band performance is to weight or window the time-domain data. For example, Figure 5 also shows the effect of Kaiser weighting. Here, the stop-band level is improved. Yet a significant price is paid: a marked widening of the passband. FIGURE 6 demonstrates this tradeoff a little more clearly. It shows the equivalent set of overlapping filters for a 32-point complex FFT with a Kaiser window. Obviously, a signal that's occupying a narrow frequency band (e.g., CW) will actually appear in a number of adjacent bins at decreasing, but still significant, levels.

5. The frequency response of a single FFT bin is compared for an unweighted FFT, a Kaiser weighted FFT, and a digital-downconverter filter. (Plot: response versus offset in Hz, from 0 to 3.00E+07.)

6. Shown here is an equivalent set of overlapping filters for a 32-point complex FFT with a Kaiser window. A narrow frequency-band signal (i.e., CW) appears in a number of adjacent bins.
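The windowing tradeoff described for Figures 5 and 6 can be checked numerically. The sketch below is only an illustration: it evaluates the response of one bin of a 32-point FFT at a fractional bin offset, for an unweighted (rectangular) window and for a Kaiser window. The Kaiser parameter beta = 8.0 and the helper names are assumptions chosen for the example; the article does not state which Kaiser parameter was used.

```python
import cmath
import math

def bessel_i0(x, terms=40):
    """Zeroth-order modified Bessel function, by its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kaiser_window(n_points, beta):
    """Symmetric Kaiser window (beta is an assumed design parameter)."""
    half = (n_points - 1) / 2.0
    window = []
    for n in range(n_points):
        r = (n - half) / half
        arg = beta * math.sqrt(max(0.0, 1.0 - r * r))
        window.append(bessel_i0(arg) / bessel_i0(beta))
    return window

def bin_response_db(window, offset_bins):
    """Response of one FFT bin to a tone offset_bins away from its center,
    in dB relative to the on-center response."""
    n_points = len(window)
    acc = sum(w * cmath.exp(-2j * math.pi * offset_bins * n / n_points)
              for n, w in enumerate(window))
    return 20.0 * math.log10(max(abs(acc) / sum(window), 1e-12))

def peak_db(window, start_bins, stop_bins, step=0.05):
    """Largest response found between two offsets (a crude stop-band level)."""
    count = int(round((stop_bins - start_bins) / step))
    return max(bin_response_db(window, start_bins + i * step)
               for i in range(count + 1))

if __name__ == "__main__":
    rect = [1.0] * 32
    kaiser = kaiser_window(32, beta=8.0)
    for name, win in (("unweighted", rect), ("Kaiser", kaiser)):
        print(name,
              "adjacent-bin level (dB):", round(bin_response_db(win, 1.0), 1),
              "peak beyond 4 bins (dB):", round(peak_db(win, 4.0, 16.0), 1))
```

With these assumed settings, the Kaiser window should show a much lower level far from the bin center, while a tone centered on the neighboring bin leaks into this bin at a level that is far from negligible. That is the passband widening the text refers to.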
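Returning to the CIC relationships quoted earlier (nulls at multiples of Fs/(MR), roughly N x 13 dB at the first sidelobe, and a DC gain of (RM)^N), the following is a minimal sketch of an integer CIC decimator with the combs placed after the decimator. The function names are hypothetical. The bit-growth figure uses the commonly quoted bound ceil(N log2(RM)), which the article does not state explicitly, so treat it as an assumption.

```python
import math

def cic_decimate(x, R, N=3, M=1):
    """N-stage CIC decimator for integer samples.

    N integrators run at the input rate, the stream is decimated by R,
    and N combs (differential delay M) run at the reduced output rate.
    """
    integrators = [0] * N
    decimated = []
    for i, sample in enumerate(x):
        acc = sample
        for s in range(N):
            integrators[s] += acc      # integrator: y[n] = y[n-1] + x[n]
            acc = integrators[s]
        if i % R == R - 1:             # keep every R-th integrator output
            decimated.append(acc)
    out = decimated
    for _ in range(N):                 # comb: y[m] = x[m] - x[m-M]
        out = [v - (out[j - M] if j >= M else 0) for j, v in enumerate(out)]
    return out

def cic_dc_gain(R, N=3, M=1):
    """DC gain (RM)^N, as given in the text."""
    return (R * M) ** N

def cic_bit_growth(R, N=3, M=1):
    """Extra bits needed above the input width (assumed bound)."""
    return math.ceil(N * math.log2(R * M))

def cic_response_db(f, fs, R, N=3, M=1):
    """Magnitude response in dB relative to DC.

    f must not be 0 or sit exactly on a null (a multiple of fs/(R*M)).
    """
    num = math.sin(math.pi * f * R * M / fs)
    den = R * M * math.sin(math.pi * f / fs)
    return 20.0 * N * math.log10(abs(num / den))

if __name__ == "__main__":
    fs, R, N = 102.4e6, 8, 3           # example values, chosen for illustration
    print("first null (Hz):", fs / R)
    print("DC gain:", cic_dc_gain(R, N))
    print("bit growth:", cic_bit_growth(R, N))
    print("level near first sidelobe (dB):",
          round(cic_response_db(1.5 * fs / R, fs, R, N), 1))
```

Python integers do not overflow, so this sketch cannot show the failure that insufficient bit width causes in hardware. It only shows how many extra bits the registers would need to avoid it.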
To achieve performance approaching that of a typical DDC, a window is needed in the time domain that matches the overall DDC filter impulse response. For instance, a 1024-bin filter bank might require a window some 4000 to 5000 samples long. Such a window could be achieved by using an FFT of this length and decimating the output by 4 or 5. But it would be very inefficient. It would be particularly costly in real-time operation, which would need several large FFTs running in parallel.

Fortunately, a much more elegant and efficient solution exists [5]. In its most general form, the Weighted Overlap and Add (WOLA) method is shown in FIGURE 7. The required filter shape is determined by the weighting function, which is L samples long. To match the DFT length, the weighted data is divided into blocks of K samples. The blocks are then added together before processing by the DFT. Next, the input data is shifted along by M samples and the process is repeated. In the simple case where M = K, a fresh result is attained every K samples. The system is then known as critically sampled (i.e., the sample rate just satisfies the Nyquist criterion). This method may be adequate for some processes, such as spectral analysis or analysis/synthesis fil-

PFT increases the number of bands by a factor of two (FIG. 10). This increase could be achieved, for example, by a simple tree structure (FIG. 11). The input is complex to preserve positive and negative frequencies. First, it is split into two equal bands using a complex downconverter (CDC) and a complex upconverter (CUC). Because the bandwidth of each one has been halved, it's possible to halve the sample rate for each of the sub-bands.

In practice, a degree of oversampling is required to avoid the image response problems caused by finite filter cutoff rates. At the output of the first stage, 2X oversampling is used. For all successive stages, the output is decimated by two. The

7. This image is a depiction of a typical Weighted Overlap and Add DFT filter-bank implementation. (Steps shown: weight the input data x[n] by the weighting function h[n] of length L; divide into blocks of K samples; overlap and add, which is the time-aliasing step; K-point DFT; adjust the time reference from sliding time to fixed time.)
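The WOLA steps described above (weight L samples, fold them into K-sample blocks, add, take a K-point DFT, then advance by M samples) can be sketched in a few lines. This is a simplified illustration rather than the article's implementation: the Hann-tapered sinc used as the weighting function and all function names are assumptions, a direct DFT stands in for an FFT, and the time-reference adjustment shown in Figure 7 is left out.

```python
import cmath
import math

def dft(x):
    """Direct DFT, adequate for the short lengths used here."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(x)) for k in range(n)]

def prototype(L, K):
    """Hypothetical weighting function: Hann-tapered sinc, one bin wide."""
    mid = (L - 1) / 2.0
    taps = []
    for n in range(L):
        t = (n - mid) / K
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        hann = 0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 0.5) / L)
        taps.append(sinc * hann)
    return taps

def wola_frame(segment, h, K):
    """One output frame: weight, fold into K samples, K-point DFT."""
    L = len(h)
    if len(segment) != L or L % K != 0:
        raise ValueError("segment must have L samples and L must be a multiple of K")
    weighted = [s * w for s, w in zip(segment, h)]
    folded = [sum(weighted[r::K]) for r in range(K)]   # add the L/K blocks
    return dft(folded)

def wola(x, h, K, M):
    """Slide along the input by M samples per frame (M = K is critical sampling)."""
    L = len(h)
    return [wola_frame(x[start:start + L], h, K)
            for start in range(0, len(x) - L + 1, M)]

if __name__ == "__main__":
    K, L = 8, 32
    h = prototype(L, K)
    tone = [cmath.exp(2j * math.pi * 3 * n / K) for n in range(64)]  # centered on bin 3
    first = wola(tone, h, K, M=K)[0]
    print([round(abs(v), 3) for v in first])
```

Each frame equals every (L/K)-th bin of an L-point DFT of the same weighted data, which is the inefficient long-FFT approach the text mentions. The folding step reaches the same values with only a K-point transform.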
similar. It would differ only in the signs of the adder/subtracter elements. Successive stages also would be quite similar except that the local oscillators would then be at Fx/8 (where Fx is the input sampling rate for the stage). In addition, the output would be decimated by two.

Fortunately, this architecture can be greatly simplified in several ways. With the tree system, the sampling rate drops by a factor of two at each stage. The result is inefficient use of the hardware, which is capable of running at the full rate, Fs. The most processing-intensive part of each stage lies in the low-pass filters. Because those filters take an identical form within any given stage, interleaving techniques may be used to regain full efficiency. This step involves interleaving the samples for each of the branches within a given stage. It also means modifying the filters (which are normally FIR filters) by adding extra delays between the coefficient multipliers (FIG. 12). Several other simplifications save silicon, including the avoidance of lookup tables and multipliers [2].

The PFT passband performance is very similar to that of the polyphase DFT that was illustrated in Figure 9. But differences exist in the stop band, because the PFT is a cascade of filters. Its form potentially allows higher stop-band attenuation over a large percentage of the broad band, an effect that increases with the number of stages. Simultaneous outputs also are available at each stage of the PFT. Each one gives a different resolution. Finally, IIR filters can be used at any stage of the PFT. Silicon may therefore be saved for applications in which linear phase isn't critical and/or low latency is required.

The tunable PFT actually makes use of the PFT cascaded structure where intermediate outputs are readily available. By modifying the PFT architecture, it's possible to extract frequency bands of the desired size while ensuring that those bands are centered at any given frequency. This level of tunability is achieved in two stages. First, the signals are coarsely tuned within the PFT stages. Then, they're fine tuned by a complex converter. That converter's local oscillator is a numerically controlled oscillator driven by the routing engine (FIG. 13).

By performing the tuning operation in two steps, the designer gains a reduction of size, for a given frequency resolution, of the LUT used for fine tuning. The tuning range that's required at each successive stage is reduced by a factor of two. In contrast, a DDC would need fine tuning over the whole input bandwidth. Overall, this structure is an ideal replacement for multiple DDCs in applications like multi-standard base stations, satellite communications, and intelligent antenna systems (FIG. 14).

A brief comparison of silicon usage for the different filter banks also must be made. Within the limited scope of this article, only a few examples can be considered. They are based on designs that have been placed and routed in Xilinx FPGAs. A comparison is done for filter banks with the following parameters (SEE TABLE):

Number of bins = 256, 512, or 1024
Filter stop band = 100 dB
Passband ripple = 0.1 dB
Filter overlap = 75%
Input bit width = 14
Sample rate = 102.4e6 complex, 2x oversampled
Device: Virtex 2-6000
LUTs = 67584
RAM = 18432 bits
18-b multipliers = 144

10. Here is a representation of the PFT principle of frequency-band splitting in its most fundamental form. (Diagram: the input band from -Fs/2 to +Fs/2 is split successively into filter bank A, B, and C outputs.)

11. This simple PFT tree structure, which uses complex up- and down-converters, would be impractical for a large number of channels. (CDC = complex downconverter; CUC = complex upconverter.)

12. This diagram of a simplified PFT depicts the use of interleaving techniques, which allow the FPGA to be run at maximum efficiency.
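The band-splitting step of the PFT tree (a CDC and a CUC, each followed by a low-pass filter and a halving of the sample rate) can be illustrated with the idealized split of Figure 10. The sketch below is an assumption-laden example, not the article's design: it uses a local oscillator at Fs/4 for a critically sampled input, a hypothetical 31-tap Hann-windowed low-pass filter, and hypothetical function names. The article's later stages use a local oscillator at Fx/8 because their inputs are 2X oversampled.

```python
import cmath
import math

def lowpass_taps(num_taps=31, cutoff=0.25):
    """Hann-windowed sinc low-pass filter; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2.0 * cutoff if t == 0 else math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        hann = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hann)
    return taps

def fir(x, taps):
    """Direct-form FIR filter with zero initial state."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]

def split_band(x, taps, decimate=2):
    """Split a complex stream into lower and upper half-bands.

    The CDC shifts the upper half-band down to baseband and the CUC shifts
    the lower half-band up to baseband. decimate=1 keeps the 2X oversampling
    that the article describes for the first stage.
    """
    down_lo = (1, -1j, -1, 1j)   # exp(-j*pi*n/2): shift down by Fs/4
    up_lo = (1, 1j, -1, -1j)     # exp(+j*pi*n/2): shift up by Fs/4
    upper = fir([v * down_lo[n % 4] for n, v in enumerate(x)], taps)[::decimate]
    lower = fir([v * up_lo[n % 4] for n, v in enumerate(x)], taps)[::decimate]
    return lower, upper

if __name__ == "__main__":
    tone = [cmath.exp(2j * math.pi * 0.3 * n) for n in range(256)]  # +0.3 Fs
    lower, upper = split_band(tone, lowpass_taps())
    print("upper-band power:", round(sum(abs(v) ** 2 for v in upper[16:]), 3))
    print("lower-band power:", round(sum(abs(v) ** 2 for v in lower[16:]), 6))
```

Applying `split_band` again to each output gives the tree of Figure 11, with the number of bands doubling at every stage. A local oscillator at a quarter of the sample rate only takes the values 1, j, -1, and -j, so this particular mixer needs no multipliers.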
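The interleaving idea mentioned for Figure 12 can be demonstrated directly: if the samples of several branches are interleaved into one stream, a single FIR filter with extra delays between its coefficient multipliers filters every branch independently. The sketch below is a small software model of that idea under assumed names. It says nothing about how the FPGA design is actually organized.

```python
def fir(x, taps):
    """Ordinary direct-form FIR filter with zero initial state."""
    return [sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]

def interleave(branches):
    """Merge equal-length branches sample by sample: a0, b0, a1, b1, ..."""
    return [v for group in zip(*branches) for v in group]

def interleaved_fir(stream, taps, ways):
    """FIR filter whose taps are spaced `ways` samples apart.

    The extra (ways - 1) delays between coefficient multipliers mean each
    output only ever combines samples that belong to the same branch.
    """
    return [sum(t * stream[n - k * ways]
                for k, t in enumerate(taps) if n - k * ways >= 0)
            for n in range(len(stream))]

if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [10, 20, 30, 40, 50]
    taps = [1, 2, 3]
    shared = interleaved_fir(interleave([a, b]), taps, ways=2)
    separate = interleave([fir(a, taps), fir(b, taps)])
    print(shared == separate)
```

One filter running at the full rate thus does the work of two filters running at half the rate, which is the efficiency gain the text describes.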
13. In this schematic of the tunable PFT architecture, the extraction of channels at intermediate outputs is followed by coarse tuning at the PFT stage. The channels are then fine-tuned by a complex converter. (Blocks shown: interleaver; CDC/CUC; coarse tuning by PFT stages; multi-rate fine tuning section; final shaping filter to match bandwidth and shape.)

14. Using the tunable PFT, this screen shot shows the extraction of the required channel frequency plan from the wideband. The channel plan can be reconfigured on demand.
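A rough piece of arithmetic shows why two-step tuning helps. If fine tuning must cover a range B at a resolution of df, it needs about B/df distinct frequency steps, and each PFT stage halves B. The sketch below counts the bits needed to address those steps. It assumes the number of fine-tuning steps is a fair proxy for the size of the fine-tuning LUT, which the article does not spell out. The sample rate comes from the comparison parameters, while the 1-kHz resolution and the function names are hypothetical.

```python
import math

def residual_range_hz(fs_hz, stages):
    """Tuning range left for the fine tuner after `stages` PFT stages,
    given that each stage halves the range."""
    return fs_hz / 2 ** stages

def fine_tuning_bits(tuning_range_hz, resolution_hz):
    """Bits needed to address every tuning step across the range."""
    return math.ceil(math.log2(tuning_range_hz / resolution_hz))

if __name__ == "__main__":
    fs = 102.4e6         # sample rate from the comparison parameters
    resolution = 1.0e3   # hypothetical frequency resolution
    for stages in range(0, 7):
        span = residual_range_hz(fs, stages)
        print(stages, "stages:", span, "Hz range,",
              fine_tuning_bits(span, resolution), "bits")
```

Under this assumption, each additional PFT stage removes one bit from the fine-tuning word, whereas a DDC tuning across the whole input bandwidth corresponds to the zero-stage case.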
Compared to the other two techniques, the most obvious conclusion is that the stacked DDC approach is very inefficient. To be fair, however, the particular design that was utilized didn't make use of the dedicated multipliers that are available in Xilinx's Virtex 2 devices. Even so, the use of stacked DDCs for more than about eight bins just isn't economical.

It's not easy to directly compare the polyphase DFT and PFT approaches. The PFT has been configured as a multiplier-less design. It doesn't make use of the dedicated multipliers even though it could. Plus, the PFT has outputs available at each stage. Those outputs make it very useful in certain applications. Furthermore, silicon efficiency is much improved if it's only necessary to output bins over selected portions of the broad band. The general conclusion is that for smaller numbers of bins (up to around 256), the silicon requirements are similar. For larger numbers of bins, the polyphase DFT gains rapidly, particularly in terms of memory. It becomes the preferred choice for single, fixed filter banks.

For tunable filter banks, the best comparison is between stacked DDCs and the tunable PFT. FIGURE 15 compares the logic requirements of the two approaches for up to 256 bins. Above about 16 bins, the TPFT wins rapidly. A similar comparison exists for the memory requirements.

Obviously, the most suitable design technique for any given application cannot be covered within a short paper. At the higher subsystem level, too many factors need to be considered: form factor, power consumption, weight, legacy systems, etc. At the board and chip level, one must factor in the major considerations of speed/sample rate, number of channels, dynamic range/filter performance, target device, etc. Only then can the engineer decide which architecture is the most appropriate to adopt.

At the device level, however, the situation is a bit clearer. If more than approximately eight fixed channels are required, the polyphase DFT approach provides a more efficient solution than digital downconverters. For tunable filter banks, the crossover point is at around 16 channels.

REFERENCES:
[1] Hogenauer, E.B., "An economical class of digital filters for decimation and interpolation," IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-29(2):155-162, 1981.
[2] "PFT Architecture and Comparisons with FFT/Digital Down-Converter Techniques," www.rfel.com/download/W02001-PFT White Paper.pdf.
[3] Rabiner, L.R., and Gold, B., Theory and Application of Digital Signal Processing, Prentice-Hall, 1975.
[4] "Pipelined FFT," www.rfel.com/download/W02004-Pipelined FFT White Paper.pdf.
[5] Crochiere, R.E., and Rabiner, L.R., Multirate Digital Signal Processing, Prentice-Hall, 1983.
[6] Gumas, C.C., "Window-presum FFT achieves high dynamic range, resolution," Personal Engineering & Instrumentation News, July 1997, pp. 58-64, www.chipcenter.com/dsp/DSP000315F1.html.
[7] "TPFT-Tuneable Pipelined Frequency Transform," www.rfel.com/download/W02003-Tuneable PFT White Paper.pdf.

John Lillington, CTO, RF Engines Ltd., Innovation Centre, St. Cross Business Park, Newport, Isle of Wight, PO30 5WB, UK; +44 (0)1983 550330, e-mail: john.lillington@rfel.com, www.rfel.com.
Copyright 2003 by Penton Media, Inc.