Module 1

Digital Signal Processing (DSP) involves converting analog signals to digital for processing and then back to analog, requiring ADC and DAC. DSP offers advantages such as greater accuracy, cost-effectiveness, and flexibility, but also has limitations like increased complexity and power consumption. Applications of DSP span various fields including consumer electronics, image processing, medical imaging, and communications.

Uploaded by

chiragbengre10
Copyright
© © All Rights Reserved
Introduction to Digital Signal Processing

Introduction
 DSP is the technique of performing mathematical operations on signals in the digital domain
 Since real-time signals are analog in nature, the analog signal must first be converted to digital form, processed in the digital domain, and then converted back to the analog domain
 Thus an ADC is required at the input side, whereas a DAC is required at the output end
 A typical DSP system is shown in Fig 1.1
Fig 1.1: A Typical DSP System
Need for DSP
Analog signal processing has the following drawbacks:
 Sensitivity to environmental changes
 Aging
 Uncertain performance in production units
 Variation in performance between units
 High system cost
 Poor scalability
Advantages and Disadvantages of DSP
 Greater accuracy
 Lower cost
 Flexibility in configuration: a DSP system can be reconfigured by changing the program, whereas reconfiguring an analog system involves redesigning the hardware
 Easy data storage on magnetic media without loss of fidelity
 Implementation of better algorithms for processing
 Adaptable to low frequency signal processing, where analog processors would require very large passive elements such as inductors and capacitors
 Feasibility of time sharing a signal processor among a number of signals
However, DSP suffers from a few limitations:
 System complexity is increased because of the need to convert real-life analog signals to digital and the processed digital signal back to analog
 Limited bandwidth due to the constraint on sampling rate
 Higher power consumption
Applications of DSP
 Consumer electronic appliances: TV, music synthesizers, stereos, recorders, etc
 Image processing techniques such as image compression, image enhancement, image analysis and recognition
 Process control
 Robotics
 Medical applications such as CT scans, X-ray scanning, MRI, ECG, EEG, etc
 Speech processing
 Earthquake monitoring
 Communication applications such as modems, cellular phones, FAX, etc


A Digital Signal Processing System
 A computer or a processor is used for digital signal processing
 The antialiasing filter is a low pass filter (LPF) that passes only signal components with frequency less than or equal to half the sampling frequency, in order to avoid the aliasing effect
 Similarly, at the other end, a reconstruction filter is used to reconstruct the analog signal from the staircase output of the DAC (Fig 1.2)
Fig 1.2: The Block Diagram of a DSP System
Signals that occur in a typical DSP system are shown in Fig 1.3
Fig 1.3: (a) Continuous time signal (b) Sampled Signal (c) Sampled Data Signal (d) Quantized Signal (e) DAC Output
The Sampling Process
 The ADC process involves sampling the signal and then quantizing each sample to a digital value
 In order to avoid the aliasing effect, the signal has to be sampled at a rate at least equal to the Nyquist rate
 The Nyquist criterion is given below,
$f_s \ge 2 f_{max}$
where $f_s$ is the sampling frequency and $f_{max}$ is the maximum frequency component in the message signal
 If the signal is sampled at a rate less than the Nyquist rate, the higher frequency components of the signal cannot be reconstructed properly
Example:
If $f_s = 20$ kHz, then $f_{max}$ should be ≤ 10 kHz. If the input signal has frequency components greater than $f_{max}$, these components are converted into low frequency ones, resulting in a noise called aliasing noise
Problem 1: Determine the Nyquist rate for the analog signal
$x(t) = 3\cos(100\pi t) - 6\sin(400\pi t) + 4\cos(60\pi t)$
Solution:
The frequencies present in x(t) are
$f_1 = 50$ Hz, $f_2 = 200$ Hz, $f_3 = 30$ Hz
The maximum frequency component is $f_{max} = f_2 = 200$ Hz. Hence, as per the sampling theorem, the Nyquist rate is $f_s = 2 f_{max} = 400$ Hz


Problem 2: The signal in Problem 1 is sampled at a rate of 300 Hz. After sampling, the signal is reconstructed from the samples. Determine the frequencies of the reconstructed analog signal. Comment on the aliasing error
Solution: The signal in Problem 1 has frequencies 50 Hz, 200 Hz and 30 Hz and is sampled at 300 Hz
The 30 Hz and 50 Hz components are below $f_s/2$ (i.e., 150 Hz) and hence are not affected by aliasing
The 200 Hz component of x(t) is above $f_s/2$ and is changed by the aliasing effect. In the reconstructed analog signal the 200 Hz component is aliased into $200 - f_s = 200 - 300 = -100$ Hz (the minus sign represents a phase shift of 180°)
In summary, the reconstructed analog signal has the frequency components 30 Hz, 50 Hz and 100 Hz
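The arithmetic of Problems 1 and 2 can be checked numerically. The following is a minimal sketch, not part of the original slides; the variable names are illustrative, and the folding formula assumes ideal sampling and ideal reconstruction over the band from -fs/2 to fs/2.

```matlab
% Frequencies (Hz) of the components of x(t) from Problem 1
f  = [30 50 200];
fNyquist = 2 * max(f);          % Nyquist rate, 400 Hz

% Problem 2: sample at fs = 300 Hz and fold each frequency
% into the principal range [-fs/2, fs/2]
fs = 300;
fa = f - fs * round(f / fs);    % a negative value means a 180 degree phase shift
fRec = abs(fa);                 % frequencies seen after reconstruction

disp(fNyquist);
disp(fa);
disp(fRec);
```

Only the 200 Hz component moves, because it is the only one above fs/2 = 150 Hz.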
Problem 3: Find the minimum sampling rate for y(t) such that no aliasing results. Find a suitable antialiasing filter (type and cut-off frequency) if the sampling rate must be 6 kHz
Solution:
The maximum frequency component of y(t) is taken as 5 kHz, since beyond this the amplitude is very low. Hence the minimum sampling rate for y(t) is $f_s = 2 f_{max} = 10$ kHz
Using a sampling rate of 6 kHz results in aliasing error, which can be avoided by using an antialiasing filter
For a 6 kHz sampling rate, the maximum frequency component in y(t) should be less than 6 kHz/2 = 3 kHz. Hence use a low pass filter with a cut-off of 3 kHz as the antialiasing filter
Discrete Time Sequences
 Consider an analog signal x(t) given by,
$x(t) = A\cos(2\pi f t)$
 If this signal is sampled at a sampling interval T, replacing t by nT in the above equation gives
$x(nT) = A\cos(2\pi f n T)$, where n = 0, 1, 2, ...
 For simplicity denote x(nT) as x(n)
$x(n) = A\cos(2\pi f n T)$, where n = 0, 1, 2, ...
 We have $f_s = 1/T$; also let $\theta = 2\pi f T$
 $x(n) = A\cos(2\pi f n T) = A\cos(2\pi f n / f_s) = A\cos(\theta n)$
The quantity θ is called the digital frequency
$\theta = 2\pi f T = 2\pi f / f_s$ radians
Fig 1.5 A Cosine Waveform

 A sequence that repeats itself after every period N is called a periodic sequence
 Consider a periodic sequence x(n) with period N
$x(n) = x(n+N)$, n = ..., -1, 0, 1, 2, ...
 The frequency response gives the frequency domain equivalent of a discrete time sequence
 It is denoted as $X(e^{j\theta})$, i.e.,
$X(e^{j\theta}) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\theta n}$
 The frequency response of a discrete sequence involves both a magnitude response and a phase response


Problem 4: A signal x(t) has frequency components of 50 Hz, 250 Hz and 600 Hz. It is sampled at a 4 kHz sampling rate. Determine the resulting digital frequencies (in radians)
Solution:
Digital frequency $\theta_n = 2\pi f_n / f_s$
$\theta_1 = 2\pi \times 50/4000 = 0.025\pi$ rad
$\theta_2 = 2\pi \times 250/4000 = 0.125\pi$ rad
$\theta_3 = 2\pi \times 600/4000 = 0.3\pi$ rad
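The same conversion can be scripted. This is a small sketch under the assumption that the digital frequency is defined as θ = 2πf/fs, as in the derivation above; the variable names are illustrative.

```matlab
fs = 4000;                    % sampling rate in Hz
f  = [50 250 600];            % analog frequencies in Hz
theta = 2*pi*f/fs;            % digital frequencies in radians
thetaOverPi = theta/pi;       % expressed as multiples of pi
disp(thetaOverPi);
```

Expressing the result as a multiple of π makes it easy to see that all three values lie below π, so none of the components alias.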
Problem 5: A signal is sampled with an 8 kHz sampling frequency. Determine the analog frequencies (in Hz and rad/s) for the digital frequencies 0, π/4, π/3, π/2, π and 3π/4 radians.
Solution:
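One way to work this out is to invert the digital frequency relation θ = 2πf/fs. The sketch below is an illustration rather than the original worked solution; it assumes the digital frequencies listed in the problem are 0, π/4, π/3, π/2, π and 3π/4 radians, and the variable names are illustrative.

```matlab
fs = 8000;                                   % sampling rate in Hz
theta = [0 pi/4 pi/3 pi/2 pi 3*pi/4];        % digital frequencies in radians
f = theta * fs / (2*pi);                     % analog frequencies in Hz
W = theta * fs;                              % analog frequencies in rad/s
disp(f);
disp(W / pi);                                % rad/s as multiples of pi
```

Under these assumptions the analog frequencies come out as 0, 1000, about 1333.3, 2000, 4000 and 3000 Hz.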
Discrete Fourier Transform (DFT) and Fast Fourier Transform (FFT)
DFT Pair
 The DFT is used to transform a time domain sequence x(n) to a frequency domain sequence X(k)
 The equations that relate the time domain sequence x(n) and the corresponding frequency domain sequence X(k) are called the DFT pair and are given by,
DFT (FFT):
$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}$, where k = 0, 1, ..., N-1
IDFT (IFFT):
$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\, e^{j2\pi nk/N}$, where n = 0, 1, ..., N-1
The Relationship between DFT and Frequency Response
 We have
$X(e^{j\theta}) = \sum_{n=0}^{N-1} x(n)\, e^{-j\theta n}$
Also
$X(k) = X(e^{j\theta})\big|_{\theta = 2\pi k/N}$, k = 0, 1, 2, ..., (N-1)
 From the above expression it is clear that the DFT can be used to find the frequency response of a discrete signal
 The spacing between the elements of X(k) is given as
$\Delta f = f_s/N = 1/(NT) = 1/T_0$
where $T_0$ is the signal record length
 It is clear from the expression for $\Delta f$ that, in order to minimize the spacing between the samples, N has to be large
 Although the DFT is a useful technique for obtaining the frequency response of a sequence, its direct computation requires a large number of complex operations such as additions and multiplications
 Thus many improvements over the direct DFT were proposed
 One such technique is to use the periodicity property of the twiddle factor $e^{-j2\pi/N}$
 These algorithms are called Fast Fourier Transform algorithms
 The following table shows the complexity involved in computing the DFT directly
Table 1.1 Complexity in DFT algorithm
Operations                  Number of Computations
Complex Multiplications     $N^2$
Complex Additions           $N(N-1)$
Real Multiplications        $4N^2$
Real Additions              $2N(2N-1)$
Trigonometric Functions     $2N^2$
FFT algorithms are classified into two categories viz
1. Decimation in Time FFT
2. Decimation in Frequency FFT
 In Decimation in Time FFT, the sequence is divided successively in the time domain until sequences of length 2 are reached
 In Decimation in Frequency FFT, the sequence X(k) is divided successively
 The complexity of computation is reduced considerably in the case of FFT algorithms
Problem 6: Compute the 5-point DFT of the sequence x[n] = [1 2 4] and hence plot its magnitude and phase spectra
Solution: x[n] = [1 2 4] is a 3-point sequence, so it is zero-padded to 5 points before the 5-point DFT is applied
The 5-point DFT is computed as
$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}$, where k = 0, 1, 2, 3, 4 and N = 5
x[0] = 1; x[1] = 2; x[2] = 4
$X(k) = x(0)\,e^{-j2\pi \cdot 0 \cdot k/N} + x(1)\,e^{-j2\pi k/N} + x(2)\,e^{-j2\pi \cdot 2k/N}$
$= 1 + 2e^{-j2\pi k/5} + 4e^{-j4\pi k/5}$
For k = 0, X(0) = 7
Similarly,
X(1) = 4.55∠-1.93
X(2) = 2.7∠1.3399
X(3) = 2.7∠-1.3399
X(4) = 4.55∠1.93
Magnitude of X(k) = |X(k)| = 7, 4.55, 2.7, 2.7, 4.55
Phase of X(k) = ∠X(k) = 0, -1.93, 1.3399, -1.3399, 1.93 (in radians)
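The result can be reproduced by evaluating the DFT sum directly and comparing it with the built-in fft. This is a minimal sketch with illustrative variable names; the loop is written for clarity, not speed.

```matlab
x = [1 2 4];
N = 5;
xp = [x zeros(1, N - length(x))];     % zero-pad x(n) to N points
n = 0:N-1;
X = zeros(1, N);
for k = 0:N-1
    % X(k) = sum over n of x(n) * exp(-j*2*pi*n*k/N)
    X(k+1) = sum(xp .* exp(-1j*2*pi*n*k/N));
end
magX   = abs(X);                      % magnitude spectrum
phaseX = angle(X);                    % phase spectrum in radians
Xfft   = fft(x, N);                   % built-in N-point FFT for comparison
disp([magX; phaseX]);
```

Calling stem(0:N-1, magX) and stem(0:N-1, phaseX) would give the magnitude and phase plots asked for in the problem.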
Problem 7: A signal x(t) is filtered and sampled using a sampling rate of 8 kHz. If 1024 samples of this signal are used to compute the discrete Fourier transform X(k), find the frequency spacing between adjacent X(k) elements. What is the analog frequency corresponding to k = 12, 30, 64, 120, 200, 320, 400?
Solution: The frequency spacing is
$\Delta f = f_s/N = 8\,\text{kHz}/1024 = 7.8125$ Hz
The analog frequency corresponding to k = 12 is
$\Delta f \times k = 7.8125 \times 12 = 93.75$ Hz
Similarly, the analog frequencies corresponding to k = 30, 64, 120, 200, 320 and 400 are 234.375 Hz, 500 Hz, 937.5 Hz, 1562.5 Hz, 2500 Hz and 3125 Hz respectively
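The bin-to-frequency mapping used in Problem 7 is a single multiplication. A short sketch, with illustrative variable names:

```matlab
fs = 8000;                            % sampling rate in Hz
N  = 1024;                            % number of DFT points
df = fs / N;                          % spacing between adjacent X(k), in Hz
k  = [12 30 64 120 200 320 400];      % DFT bin indices
fk = k * df;                          % analog frequency of each bin, in Hz
disp(df);
disp(fk);
```

All the listed indices are below N/2 = 512, so each maps to a frequency below fs/2.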
Fast Fourier Transform (FFT)
 The direct computation of the DFT and IDFT requires a large number of complex multiplications
 A number of algorithms have been developed to compute the DFT and IDFT efficiently
 These algorithms use a number of points that is a power of 2 and exploit the periodic nature of the complex exponential $e^{j2\pi nk/N}$ occurring in the DFT and IDFT equations
 The table below compares the complex multiplications needed to compute the DFT directly and by using an FFT algorithm called the radix-2 algorithm
 The radix-2 algorithm uses N that is an integer power of 2, such as 2, 4, 8, 16, etc
 It is possible to show that the direct DFT requires $N^2$ complex multiplications and the radix-2 FFT algorithm requires $(N/2)\log_2 N$
 This produces computational savings that grow with N

Table: Computational complexity of FFT algorithm
Saving is the reduction in complex multiplications; the ratio direct:FFT is $N^2/((N/2)\log_2 N) = 2N/\log_2 N$
N       Direct $N^2$          Radix-2 FFT $(N/2)\log_2 N$   Saving                Ratio direct:FFT
2       4                     1                             3                     4
4       16                    4                             12                    4
8       64                    12                            52                    5.33
16      256                   32                            224                   8
32      1024                  80                            944                   12.8
64      4096                  192                           3904                  21.33
256     65536                 1024                          64512                 64
1024    1048576 (≈ $10^6$)    5120                          1043456 (≈ $10^6$)    204.8
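The table entries follow directly from the two operation counts. The sketch below recomputes them; it assumes the counts $N^2$ for direct computation and $(N/2)\log_2 N$ for radix-2, as stated above, and the variable names are illustrative.

```matlab
N = [2 4 8 16 32 64 256 1024];        % DFT sizes (powers of 2)
direct = N.^2;                        % complex multiplications, direct DFT
fftMul = (N/2) .* log2(N);            % complex multiplications, radix-2 FFT
saving = direct - fftMul;             % multiplications saved
ratio  = direct ./ fftMul;            % equals 2N / log2(N)
disp([N' direct' fftMul' saving' ratio']);
```

For N = 1024 the FFT needs roughly 200 times fewer complex multiplications than direct computation.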
Linear Time Invariant Systems
 A system that satisfies the superposition theorem is called a linear system, and a system that has the same input-output relation at all times is called a time invariant system
 Systems that satisfy both properties are called LTI systems
 An LTI System
 An LTI system is characterized by its impulse response (unit sample response) in the time domain and by its system function in the frequency domain
Convolution
 Convolution is the operation that relates the input and output of an LTI system through its unit sample response
 The output y(n) of the system for the input x(n), with the impulse response of the system being h(n), is given as
$y(n) = x(n) * h(n) = \sum_{k=-\infty}^{\infty} x(k)\, h(n-k)$
where x(n) is the input of the system
h(n) is the impulse response of the system
y(n) is the output of the system
Problem 8: Find the convolution of the sequence x(n)=[1 2 3 4] and
h(n)=[3 2 1]
Solution:
y(n)= x(n) * h(n)=[3 8 14 20 11 4]
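To see where these numbers come from, the convolution sum can be written out as two loops. This is an illustrative sketch, not the original author's code; indices are shifted by one because MATLAB arrays start at 1.

```matlab
x = [1 2 3 4];
h = [3 2 1];
Lx = length(x);
Lh = length(h);
y = zeros(1, Lx + Lh - 1);            % output length is Lx + Lh - 1
for n = 1:length(y)
    for k = 1:Lx
        m = n - k + 1;                % index into h, i.e. h(n-k) in 0-based terms
        if m >= 1 && m <= Lh
            y(n) = y(n) + x(k) * h(m);
        end
    end
end
disp(y);
yRef = conv(x, h);                    % built-in linear convolution
```

The loop result matches the built-in conv, which is what the later MATLAB listings use.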
Problem 9: Compute and plot the sequence x(n) * h(n), where x(n) = [3 4 2 11 0 7 -1 10] and h(n) = [1.2 3.7 4.8]. Determine the length of the computed sequence
Solution:
y(n) = x(n) * h(n)
y(n) = [3.6000 15.9000 31.6000 39.8000 50.3000 61.2000 24.7000 41.9000 32.2000 48.0000]
 Length of the computed sequence y(n) = len x(n) + len h(n) - 1 = 8 + 3 - 1 = 10
Plot of sequence x(n) * h(n)
Matlab code is:
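x=[3 4 2 11 0 7 -1 10];   % input sequence x(n)
h=[1.2 3.7 4.8];          % impulse response h(n)
y=conv(x,h);              % linear convolution
figure(1);
stem(y);                  % plot y(n)
l=length(y);              % length of y(n), here 10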
Problem 10: Find the sequence X(k) and H(k) using FFT. Next
multiply two sequences to generate the sequence Y(k)=X(k).H(k).
Use IFFT to compute y(n)
Solution :
x=[3 4 2 11 0 7 -1 10];
h=[1.2 3.7 4.8];
xlen=length(x);
hlen=length(h);
ylen=xlen+hlen-1;
N=ylen;
Xk=fft(x,N);
Hk=fft(h,N);
Yk=Xk.*Hk;
y=ifft(Yk)
figure(1);
stem(y);
 X(k)= 36.0000,-5.8262 - 5.7921i, -7.6803 - 3.4410i, 9.8262 -
2.9919i, 14.6803 - 0.8123i, -28.0000, 14.6803 + 0.8123i, 9.8262 +
2.9919i, -7.6803 + 3.4410i, -5.8262 + 5.7921i 
 H(k)= 9.7000,5.6766 - 6.7399i, -1.5399 - 6.3403i,-3.8266 -

0.6975i,-0.3101 + 2.3903i, 2.3000, -0.3101 - 2.3903i,-3.8266 +


0.6975i,-1.5399 + 6.3403i, 5.6766 + 6.7399i
 Y(k)= 102x3.4920,-0.7211 + 0.0639i, -0.0999 + 0.5399i,-0.3969 +

0.0459i, -0.0261 + 0.3534i,-0.6440, -0.0261 - 0.3534i, -0.3969 -


0.0459i,-0.0999 - 0.5399i, -0.7211 - 0.0639i 
 y={ 3.6000 , 15.9000, 31.6000, 39.8000, 50.3000, 61.2000 ,
24.7000 , 41.9000, 32.2000, 48.0000

The result is same as the first problem


Problem 11: For the above problem use 8-point DFT and IDFT.
Compare these results with that obtained in above example
Solution:
x=[3 4 2 11 0 7 -1 10];
h=[1.2 3.7 4.8];
Xk=fft(x,8);
Hk=fft(h,8);
Yk=Xk.*Hk;
y=ifft(Yk)
 y = {35.8000 63.9000 31.6000 39.8000 50.3000 61.2000 24.7000 41.9000}
 The 8-point result is a circular convolution: the last two samples of y(n) = conv(x,h), i.e., 32.2 and 48, are added to the first two samples, 3.6 and 15.9, to give 35.8 and 63.9 in the circular convolution output
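The wrap-around can be demonstrated by folding the tail of the linear convolution back onto its start and comparing with the 8-point FFT result. A minimal sketch with illustrative variable names:

```matlab
x = [3 4 2 11 0 7 -1 10];
h = [1.2 3.7 4.8];
N = 8;                                 % DFT length, shorter than 8 + 3 - 1 = 10

yLin  = conv(x, h);                    % linear convolution, 10 samples
extra = length(yLin) - N;              % samples that do not fit in N points
yWrap = yLin(1:N);
yWrap(1:extra) = yWrap(1:extra) + yLin(N+1:end);   % fold the tail onto the start

yCirc = real(ifft(fft(x, N) .* fft(h, N)));        % 8-point circular convolution
disp([yWrap; yCirc]);
```

Choosing N at least equal to the linear convolution length, as in Problem 10, avoids this overlap.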
Z Transformation
 The Z transform is used to find the frequency response of the system
 The Z transform of a discrete sequence x(n) is given by,
$X(z) = \sum_{n=-\infty}^{\infty} x(n)\, z^{-n}$
The System Function
 An LTI system is characterized by its system function, also called the transfer function
 The system function of a system is the ratio of the Z transform of its output to that of its input
 It is denoted as H(z) and is given by
$H(z) = Y(z)/X(z)$
 The magnitude and phase of the transfer function H(z), evaluated at $z = e^{j\theta}$, give the frequency response of the system
 From the transfer function we can also get the zeros and poles of the system by finding the roots of its numerator and denominator respectively


Digital Filters
 Filters are used to remove the unwanted components in the sequence
 They are characterized by the impulse response h(n)
 The general difference equation for an Nth order filter is given by,
 A typical digital filter structure is as shown in figure below
Structure of a Digital Filter
 Values of the filter coefficients vary with the type of the filter
 Design of a digital filter involves determining the filter coefficients
 Based on the length of the impulse response, digital filters are classified into two categories, viz Finite Impulse Response (FIR) filters and Infinite Impulse Response (IIR) filters
FIR Filters
 FIR filters have impulse responses of finite length
 In FIR filters the present output depends only on the past and present values of the input sequence and not on previous outputs
 Thus they are non-recursive and hence inherently stable
 FIR filters can have an exactly linear phase response and hence are well suited to applications requiring linear phase
 The difference equation of an FIR filter is represented as
 The frequency response of an FIR filter is given as
 Also
 The major drawback of FIR filters is that they require a larger number of filter coefficients to realize a desired response as compared to IIR filters
 Thus the computation time required is also greater
Problem 12: Find the magnitude and phase response of an FIR filter represented by the difference equation
$y(n) = 0.5\,x(n) + 0.5\,x(n-1)$
Solution:
As $y(n) = 0.5\,x(n) + 0.5\,x(n-1)$,
$h(n) = 0.5\,\delta(n) + 0.5\,\delta(n-1) = [0.5 \;\; 0.5]$
$H(z) = 0.5 + 0.5z^{-1}$
$H(e^{j\theta}) = 0.5 + 0.5e^{-j\theta}$
$= 0.5 + 0.5\cos\theta - j\,0.5\sin\theta$
$= 0.5(1 + \cos\theta) - j\,0.5\sin\theta$
$= [0.5 \cdot 2\cos^2(\theta/2)] - j[0.5 \cdot 2\sin(\theta/2)\cos(\theta/2)]$
$= \cos^2(\theta/2) - j\sin(\theta/2)\cos(\theta/2)$
$|H(e^{j\theta})| = \sqrt{\cos^4(\theta/2) + \sin^2(\theta/2)\cos^2(\theta/2)}$
$= \sqrt{\cos^2(\theta/2)\,[\cos^2(\theta/2) + \sin^2(\theta/2)]}$
$= \cos(\theta/2)$, for $0 \le \theta \le \pi$
Similarly,
$\angle H(e^{j\theta}) = \tan^{-1}[-\sin(\theta/2)\cos(\theta/2)/\cos^2(\theta/2)]$
$= \tan^{-1}[-\tan(\theta/2)]$
$= -\theta/2$
The magnitude and phase response curves of the FIR filter are shown in the figure below
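The closed-form results can be compared against a direct numerical evaluation of $H(e^{j\theta})$. The sketch below uses only base functions; the grid of 101 points and the variable names are arbitrary choices for illustration.

```matlab
theta = linspace(0, pi, 101);          % digital frequency grid, 0 to pi
H = 0.5 + 0.5 * exp(-1j * theta);      % H(e^{j theta}) for h(n) = [0.5 0.5]
magH = abs(H);                         % numerical magnitude response
phH  = angle(H);                       % numerical phase response in radians

% Compare with the closed forms cos(theta/2) and -theta/2.
% The last point (theta = pi) is skipped for the phase, since H is zero there.
errMag = max(abs(magH - cos(theta/2)));
errPh  = max(abs(phH(1:end-1) + theta(1:end-1)/2));
disp([errMag errPh]);
```

Plotting magH and phH against theta would reproduce the response curves: a low pass magnitude that falls to zero at θ = π, and a linear phase.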
FIR Filter Design
 The frequency response of an FIR filter is given by the following expression,
 The design procedure of an FIR filter involves the determination of the filter coefficients $b_k$
IIR Filters
 Unlike FIR filters, IIR filters have an infinite number of impulse response samples
 They are recursive filters, as the output depends not only on the past and present inputs but also on the past outputs
 They generally do not have linear phase characteristics
 A typical system function of such filters is given by,
 The stability of IIR filters depends on the number and the values of the filter coefficients
 The major advantage of IIR filters over FIR filters is that they require fewer coefficients for the same desired response, thus requiring less computation time
Problem 13: Obtain the transfer function of the IIR filter whose difference equation is given by $y(n) = 0.9\,y(n-1) + 0.1\,x(n)$
Solution:
$y(n) = 0.9\,y(n-1) + 0.1\,x(n)$
Taking the Z transform of both sides,
$Y(z) = 0.9\,z^{-1}\,Y(z) + 0.1\,X(z)$
$Y(z)\,[1 - 0.9\,z^{-1}] = 0.1\,X(z)$
The transfer function of the system is given by the expression,
$H(z) = Y(z)/X(z) = 0.1/[1 - 0.9\,z^{-1}]$
 The realization of the IIR filter with the above difference equation is as shown in figure below
IIR Filter Structure
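The recursive nature of this filter can be illustrated by computing its impulse response in three ways: with the built-in filter function, by running the difference equation by hand, and from the closed form $h(n) = 0.1 \times 0.9^n$ that follows from H(z). This is a sketch with illustrative variable names; 20 samples is an arbitrary truncation of a response that never reaches exactly zero.

```matlab
b = 0.1;                               % numerator of H(z)
a = [1 -0.9];                          % denominator of H(z): 1 - 0.9 z^-1
Ns = 20;                               % number of samples to compute
d = [1 zeros(1, Ns-1)];                % unit sample input

hFilt = filter(b, a, d);               % impulse response via filter()

hRec = zeros(1, Ns);                   % same thing from the difference equation
yPrev = 0;                             % y(n-1), zero initial condition
for n = 1:Ns
    hRec(n) = 0.9 * yPrev + 0.1 * d(n);
    yPrev = hRec(n);
end

hClosed = 0.1 * 0.9 .^ (0:Ns-1);       % closed form h(n) = 0.1 * 0.9^n
disp(hFilt(1:5));
```

The single pole at z = 0.9 lies inside the unit circle, so the response decays and the filter is stable.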


IIR Filter Design
 IIR filters can be designed using two methods, viz analog filter design techniques and the direct method
 In the first approach, a digital filter is designed based on its equivalent analog filter
 An analog filter is designed first, using the analog specifications equivalent to the given digital specifications
 Then, using appropriate frequency transformations, a digital filter can be obtained
 The filter specifications consist of passband and stopband ripples in dB and passband and stopband frequencies in rad/sec
Low pass Filter Specifications
 Direct IIR filter design methods are based on a least squares fit to a desired frequency response
 These methods allow arbitrary frequency response specifications

Decimation and Interpolation


 Decimation and interpolation are two techniques used to alter the sampling rate of a sequence
 Decimation involves decreasing the sampling rate without violating the sampling theorem
 Interpolation increases the sampling rate of a sequence appropriately by considering its neighbouring samples
Decimation
 Decimation is the process of dropping samples without violating the sampling theorem
 The factor by which the signal is decimated is called the decimation factor and is denoted by M
 It is given by,
where

Decimation Process
Problem 14: Let x(n) = [3 2 2 4 1 0 -3 -2 -1 0 2 3] be decimated with a factor of 2. Let the filtered sequence be w(n) = [2.1 2 3.9 1.5 0.1 -2.9 -2 -1.1 0.1 1.9 2.9]. Obtain the decimated sequence y(m)
 Sequence y(m) can be obtained by dropping every alternate sample of w(n)
y(m) = [2 1.5 -2.9 -1.1 1.9]

Problem 15: Let x(n) = [1 0 2 4.1 5 6 7 3 5.2] be filtered by h(n) = [0.2 1.2 -0.3 0.4] and then decimated by a factor of 3. Obtain the decimated sequence
Solution:
 The output sequence after the filter is w(n) = [0.2 1.2 0.1 3.62 5.32 6.77 8.74 9.2 4.94 8.14 -0.36 2.08]
The decimated sequence is y(m) = [0.2 3.62 8.74 8.14]
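The filter-then-drop procedure of Problem 15 takes two lines. A minimal sketch using only base functions, with illustrative variable names:

```matlab
x = [1 0 2 4.1 5 6 7 3 5.2];           % input sequence
h = [0.2 1.2 -0.3 0.4];                % filter applied before decimation
M = 3;                                 % decimation factor

w = conv(x, h);                        % filtered sequence w(n)
y = w(1:M:end);                        % keep every M-th sample, starting at n = 0
disp(w);
disp(y);
```

Here the first retained sample is w(0); starting the pick at a different offset gives a different but equally valid decimated sequence.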


Interpolation
 Interpolation is the process of increasing the sampling rate by inserting new samples between the existing ones
 The input-output relation for interpolation, where the sampling rate is increased by a factor L, is given as,
$y(m) = \sum_k b_k\, w(m-k)$
where w(m) = x(m/L) for m = 0, ±L, ±2L, ... and w(m) = 0 for other values of m

Interpolation Process
Problem 16: Let x(n) = [0 3 6 9 12] be interpolated with L = 3. If the filter coefficients are $b_k$ = [1/3 2/3 1 2/3 1/3], obtain the interpolated sequence
Solution:
After inserting zeros,
w(m) = [0 0 0 3 0 0 6 0 0 9 0 0 12]
$b_k$ = [1/3 2/3 1 2/3 1/3], where the centre coefficient 1 is $b_0$
We have,
$y(m) = \sum_k b_k w(m-k) = b_{-2}w(m+2) + b_{-1}w(m+1) + b_0 w(m) + b_1 w(m-1) + b_2 w(m-2)$
Substituting the values of m, we get
$y(0) = b_{-2}w(2) + b_{-1}w(1) + b_0 w(0) + b_1 w(-1) + b_2 w(-2) = 0$
$y(1) = b_{-2}w(3) + b_{-1}w(2) + b_0 w(1) + b_1 w(0) + b_2 w(-1) = 1$
$y(2) = b_{-2}w(4) + b_{-1}w(3) + b_0 w(2) + b_1 w(1) + b_2 w(0) = 2$
Similarly we get the remaining samples as,
y(m) = [0 1 2 3 4 5 6 7 8 9 10 11 12]
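The zero-insertion and filtering steps can be sketched as follows. The code uses only base functions; trimming two samples from each end of the conv output is an assumption made here to account for the filter being centred on $b_0$, and the variable names are illustrative.

```matlab
x = [0 3 6 9 12];                      % input sequence
L = 3;                                 % interpolation factor

w = zeros(1, (length(x) - 1) * L + 1); % zero-stuffed sequence, 13 samples
w(1:L:end) = x;                        % L-1 zeros between input samples

b = [1/3 2/3 1 2/3 1/3];               % b(-2) ... b(2), centre tap is b(0)
yFull = conv(w, b);                    % causal convolution, delayed by 2 samples
y = yFull(3:end-2);                    % remove the delay to align with y(m)
disp(y);
```

The triangular coefficients perform linear interpolation, which is why the output is the straight ramp 0 to 12.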
Computational Accuracy in DSP Implementations
Introduction
 The issues related to the computational accuracy of algorithms when implemented using programmable digital signal processors are discussed
 Various formats of number representation and their effect on the dynamic range and precision of signals represented using these formats are discussed
 The various sources of error in the implementation of DSP algorithms, and how to control these errors while designing DSP systems, are discussed
The following topics are discussed:
1. Number formats for signals and coefficients in DSP systems
2. Dynamic range and precision
3. Sources of error in DSP implementations
4. A/D conversion errors
5. DSP computational errors
6. D/A conversion errors

Number formats for signals and coefficients in DSP systems


 In a digital signal processor, as in any other digital system, signals are represented as numbers right from the input, through the different stages of processing, to the output
 DSP structures, such as filters, also require numbers to specify coefficients
 There are various ways of representing these numbers, depending on the range and precision of signals and coefficients to be represented, hardware complexity, and speed requirements
 Two typical formats used to represent signals and coefficients in DSP systems are:
1. Fixed-point format
2. Floating-point format
Fixed point format
 The simplest scheme of number representation is the format in which the number is represented as an integer or fraction using a fixed number of bits
 An n-bit fixed-point signed integer $-2^{n-1} \le x \le 2^{n-1} - 1$ is represented as:
$x = -s \cdot 2^{n-1} + b_{n-2} \cdot 2^{n-2} + b_{n-3} \cdot 2^{n-3} + \dots + b_1 \cdot 2^1 + b_0 \cdot 2^0$ ---- (1)
where s represents the sign of the number (s = 0 for positive and s = 1 for negative)
 Fig 1.12 (a) Fixed point format to represent signed integers
 Fig 1.12 (b) Fixed point format to represent signed fractions

Example 1: What is the range of numbers that can be represented in a fixed point format using 16 bits if the numbers are treated as (a) signed integers (b) signed fractions?
Solution: (a). Using 16 bits, the range of integers that can be represented is determined by substituting n = 16 in equation 1 and is given as
$-2^{15}$ to $+2^{15} - 1$
i.e., -32,768 to +32,767
(b). The range of fractions, as determined from equation 1 using n = 16, is given as
$-1$ to $+(1 - 2^{-15})$
i.e., -1 to +0.999969482
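These limits are easy to tabulate for any word length. A short sketch, assuming two's complement representation as in equation 1; the variable names are illustrative.

```matlab
n = 16;                                % word length in bits

intMin  = -2^(n-1);                    % most negative signed integer
intMax  =  2^(n-1) - 1;                % most positive signed integer

fracMin = -1;                          % most negative signed fraction
fracMax =  1 - 2^(-(n-1));             % most positive signed fraction

fprintf('integers:  %d to %d\n', intMin, intMax);
fprintf('fractions: %d to %.9f\n', fracMin, fracMax);
```

Changing n to another word length, such as 24, gives the corresponding ranges directly.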
 In DSP implementations, multiplication of integers produces numbers that may require more bits to represent, and with a fixed number of available bits this may create wraparound
 The wraparound generates the most negative number after the most positive number, and vice versa
 The problem can be tackled by using fractional representation
 When two fractions are multiplied, the result is still a fraction
 The resulting fraction may use the same number of bits as the original fractions by discarding the less significant bits
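Wraparound and the fractional alternative can be illustrated with ordinary double arithmetic. The sketch below emulates 16-bit two's complement wraparound with mod, because MATLAB's own int16 type saturates instead of wrapping; the helper wrap16 and the sample values are hypothetical and chosen only for illustration.

```matlab
% Emulate 16-bit two's complement wraparound on double values
wrap16 = @(v) mod(v + 2^15, 2^16) - 2^15;

a = wrap16(32767 + 1);                 % most positive + 1 wraps to most negative
b = wrap16(-32768 - 1);                % most negative - 1 wraps to most positive
p = wrap16(300 * 200);                 % integer product 60000 does not fit

% Fractional (Q15) operands: the product is still a fraction
q = 0.5 * 0.25;                        % exact product, 0.125
qQ15 = floor(q * 2^15) / 2^15;         % keep 15 fractional bits, drop the rest
disp([a b p qQ15]);
```

The integer product lands on an unrelated negative value, whereas the fractional product stays in range and only loses low-order bits.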


Double Precision Fixed point format
 To increase the range of numbers that can be represented in fixed point format, one obvious approach is to increase its size
 If the size is doubled, the range of numbers increases substantially
 Simply doubling the size and still using the fixed point format creates a double precision fixed point format
 But such a format requires double the storage for the same data and may need double the number of accesses for the same size of data bus of the DSP device
Floating point format
 For DSP applications, if an algorithm involves summation of a large number of products, it requires a large number of bits to represent the signal to allow for adequate signal growth over the summation
 However, since a processor architecture will not allow for an unlimited number of bits, some processors choose a floating point format for signal processing computations
 A floating point number is made up of a mantissa $M_x$ and an exponent $E_x$ such that its value x is represented as
$x = M_x 2^{E_x}$
 If two floating point numbers x and y are multiplied, the product xy is given by
$xy = M_x M_y 2^{E_x + E_y}$
 An implementation of a floating point multiplier must contain a multiplier for the mantissas and an adder for the exponents
 An addition of floating point numbers requires normalization of the numbers to be added so that they have the same exponents
 A commonly used single-precision floating-point representation is the IEEE 754-1985 format given as:
$x = (-1)^S \times 2^{E - bias} \times 1.F$
IEEE-754 format for floating point numbers
 F represents the magnitude fraction of the mantissa, and the exponent E is an integer
 Further, in determining the mantissa, an implied 1 is placed immediately before the binary point of the fraction
 The sign bit provides the sign of the fractional part of the number
 That is to say, with n bits for F, the range of fractional numbers that can be represented in the mantissa is $-(2 - 2^{-n})$ to $+(2 - 2^{-n})$
 The bias depends upon the bits reserved for the exponent
 In the figure, the bias is 127, the largest positive number represented by 8 bits in signed form
 The value of E can be from 0 to 255
 In double precision representation, the exponent uses 11 bits, making the bias value 1023


Example 2: Find the decimal equivalent of the floating point binary number 1011000011100. Assume a format similar to IEEE-754 in which the MSB is the sign bit followed by 4 exponent bits followed by 8 bits for the fractional part
Solution: The number is negative, as the sign bit is 1
$F = 2^{-4} + 2^{-5} + 2^{-6} = 0.109375$
$E = 2^1 + 2^2 = 6$
With 4 exponent bits the bias is $2^3 - 1 = 7$. Thus the value of the number is
$x = -1.109375 \times 2^{(6-7)} = -0.5546875$
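The decoding steps can be written out field by field. This sketch assumes the 1 + 4 + 8 bit layout stated in the example and a bias of $2^{3} - 1 = 7$; the variable names are illustrative.

```matlab
bits = '1011000011100';                % sign | 4-bit exponent | 8-bit fraction

s = bits(1) - '0';                     % sign bit: 1 means negative
E = bin2dec(bits(2:5));                % exponent field as an integer
F = bin2dec(bits(6:13)) / 2^8;         % fraction field scaled to [0, 1)
bias = 2^(4-1) - 1;                    % bias for a 4-bit exponent, assumed 7

x = (-1)^s * (1 + F) * 2^(E - bias);   % implied leading 1 before the fraction
fprintf('E = %d, F = %g, x = %g\n', E, F, x);
```

The same lines decode any other bit pattern in this 13-bit format by changing the bits string.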
 The floating point format, used to increase the range of numbers that can be represented, suffers from the problem of speed reduction for DSP computation
 More steps are required to complete a floating point computation compared to a fixed point computation

Dynamic range and Precision


 The dynamic range of a signal is the ratio of the maximum value to the minimum value that the signal can take in the given number representation scheme
 Expressed in dB, the dynamic range of a signal is proportional to the number of bits used to represent it and increases by 6 dB for every additional bit used for the representation
 The number of bits used to represent a signal also determines the resolution or the precision with which the signal can be represented
 However, the time taken for certain operations such as the A/D conversion increases with the increase in the number of bits
 Resolution is the minimum value that can be represented using a number representation format
 For instance, if N bits are used to represent a number from 0 to 1, the smallest value it can take is the resolution and is given as
Resolution ≈ $1/2^N$ for large N
 The resolution of a number representation format is normally expressed as the number of bits used in the representation
 At times, it is also expressed as a percentage
 Precision is an issue related to the speed of DSP implementation
 In general, techniques to improve the precision of an implementation reduce its speed
 A larger word size improves the precision but may pose a problem with the speed of the processor, especially if its bus width is limited
 For example, if the 32 bit product of a 16x16 multiplication has to be preserved without loss of precision, two memory accesses are required to store and recall the product using a 16 bit bus
Example 3: Calculate the dynamic range and precision of each of the following number representation formats
a. 24 bit, single precision, fixed point format
b. 48 bit, double precision, fixed point format
c. A floating point format with a 16 bit mantissa and an 8 bit exponent
Solution: a. Since each bit gives a dynamic range of 6 dB, the total dynamic range is 24 × 6 = 144 dB
Percentage resolution is $(1/2^{24}) \times 100 = 6 \times 10^{-6}$ %
b. Since each bit gives a dynamic range of 6 dB, the total dynamic range is 48 × 6 = 288 dB
Percentage resolution is $(1/2^{48}) \times 100 = 4 \times 10^{-13}$ %
c. For floating point representation, the dynamic range is determined by the number of bits in the exponent. Since there are 8 exponent bits, the dynamic range is $(2^8 - 1) \times 6 = 1530$ dB
The percentage resolution depends on the number of bits in the mantissa. Since there are 16 bits in the mantissa, the resolution is $(1/2^{16}) \times 100 = 1.5 \times 10^{-3}$ %
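The figures in Example 3 can be recomputed as below. The sketch takes the 6 dB per bit rule used in the text as given (it approximates 20·log10(2) ≈ 6.02 dB); the variable names are illustrative.

```matlab
dbPerBit = 6;                          % rule of thumb used in the text

% (a) 24-bit fixed point
dr24  = 24 * dbPerBit;                 % dynamic range in dB
res24 = 100 / 2^24;                    % percentage resolution

% (b) 48-bit fixed point
dr48  = 48 * dbPerBit;
res48 = 100 / 2^48;

% (c) floating point, 16-bit mantissa and 8-bit exponent
drFloat  = (2^8 - 1) * dbPerBit;       % set by the exponent range
resFloat = 100 / 2^16;                 % set by the mantissa length

fprintf('%d dB, %.3g %%\n', dr24, res24);
fprintf('%d dB, %.3g %%\n', dr48, res48);
fprintf('%d dB, %.3g %%\n', drFloat, resFloat);
```

The floating point format trades resolution for range: it has far more dynamic range than either fixed point format but a coarser percentage resolution.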

Sources of error in DSP implementations


 A DSP system consists of an A/D converter, a DSP device, and a D/A converter
 The accuracy of a DSP implementation depends upon a number of factors contributed by the A/D and D/A conversions and by how the calculations are performed in the DSP device
 The error in the A/D and D/A converters in representing analog signals by a limited number of bits is called the quantization error
 The quantization error decreases with the increase in the number of bits used to represent signals in A/D and D/A converters
 The errors in the DSP calculations are due to the limited word length used
 These errors depend upon how the algorithm is implemented in a given DSP architecture
 These errors can be reduced by using a larger word length for data and by using rounding, instead of truncation, in calculations


A/D Conversion Errors
 Consider an A/D converter, shown in Fig, with b bits used to represent an unsigned signal value
 Its digital representation is of the form …xxx….x, where there are b bits after the assumed binary point
 In this kind of binary representation, the value of the least significant bit is given by
$\Delta = 2^{-b}$
 The maximum error due to quantization depends on b
 The quantization error for a given conversion, as shown in the model of Fig, is given by
$\varepsilon = x_q - x$
where x is the input and $x_q$ is the quantized output
 This error is called the truncation error if the signal value above the largest integral multiple of Δ is simply dropped
 It is called the rounding error if the value is rounded to the nearest integral multiple of Δ. In this way rounding limits the error to ±Δ/2
 The signal to noise ratio (SNR) is a measure that is used to evaluate the performance of the A/D converter
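The difference between truncation and rounding can be seen by quantizing a set of test values both ways. This is a sketch under the assumption of an unsigned input in the range 0 to 1 and b = 8 bits; both choices, and the variable names, are made only for illustration.

```matlab
b = 8;                                 % number of bits after the binary point
delta = 2^(-b);                        % value of the least significant bit

x = linspace(0, 1 - delta, 1000);      % test inputs in [0, 1)

xTrunc = floor(x / delta) * delta;     % truncation: drop the part above a multiple of delta
xRound = round(x / delta) * delta;     % rounding: nearest multiple of delta

eTrunc = xTrunc - x;                   % truncation error, between -delta and 0
eRound = xRound - x;                   % rounding error, within +/- delta/2

fprintf('truncation error range: %g to %g\n', min(eTrunc), max(eTrunc));
fprintf('rounding error range:   %g to %g\n', min(eRound), max(eRound));
```

Truncation errors are always of one sign, while rounding errors are spread about zero and have half the maximum magnitude.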


DSP Computational Errors
 The DSP computations involve using the digitized signal values and DSP structures represented by coefficients
 These numbers are typically represented in the signed fractional 2's complement form
 The computation almost always involves multiplications or multiply and accumulate (MAC) operations

D/A Conversion Errors


 A source of error in a D/A converter is the fact that, typically, a D/A converter uses fewer bits in conversion than the number of bits required by the computed result produced by the DSP device
 This is equivalent to the truncation or the rounding off error in the A/D converter and can be handled in the same way
