Image Processing

The document discusses digital image fundamentals including components of an image processing system and fundamental steps in digital image processing such as image acquisition, enhancement, restoration, representation and recognition. Digital image processing has applications in remote sensing, medical imaging, robotics and more.

Uploaded by

Manjeet Mrinal

Available Formats

Download as TXT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

17 views

Image Processing

Uploaded by

Manjeet Mrinal

Available Formats

Download as TXT, PDF, TXT or read online on Scribd

You are on page 1/ 57

n

DIGITAL IMAGE FUNDAMENTALS:

The field of digital image processing refers to processing digital images by
means of digital computer. Digital image is composed of a finite number of
elements, each of which has a particular location and value. These elements are
called picture elements, image elements, pels and pixels. Pixel is the term used
most
widely to denote the elements of digital image.
An image is a two-dimensional function that represents a measure of some
characteristic such as
brightness or color of a viewed scene. An image is a projection of a 3- D scene
into a 2D projection
plane.
An image may be defined as a two-dimensional function f(x,y), where x and y are
spatial (plane)
coordinates, and the amplitude tofat any pair of coordinates (x,y) is called the
intensity of the image
at thatpoint. The term gray levelis used often to refer to the intensity of
monochrome images.
Color images are formed by a combination of individual 2-D images. For example, the
RGB color
system, a color image consists of three (red, green and blue) individual component
images. For this
reason, many of the techniques developed for monochrome images can be extended to
color images
by processing the three component images individually.
An image may be continuous with respect to the x- and y- coordinates and also in
amplitude.
Converting such an image to digital form requires that the coordinates, as well as
the amplitude, be
digitized.
APPLICATIONS OF DIGITAL IMAGE PROCESSING:
Since digital image processing has very wide applications and almost all of the
technical fields are
impacted by DIP, we will just discuss some of the major applications of DIP.
Digital image processing has a broad spectrum of applications, such as
1. Remote sensing via satellites and otherspacecrafts
2. Image transmission and storage for businessapplications
3. Medicalprocessing
4. RADAR (Radio Detection and Ranging)
5. SONAR (Sound Navigation and Ranging)
6. Acoustic Image Processing (The study of underwater sound is known as Underwater
Acousticsor
HydroAcoustics)
7. Robotics and automated inspection of industrial parts Images acquired
by satellites are useful in trackingof
1. Earthresources
2. Geographical mapping
3. Prediction of agriculturalcrops
4. Urban growth and weathermonitoring
5. Flood and fire control and many other environmentalapplications Space image
applicationsinclude:
1. Recognition and analysis of objects contained in images obtained from deep
space-probemissions.
2. Image transmission and storage applications occur in broadcasttelevision
3. Teleconferencing
4. Transmission of facsimile images (Printed documents and graphics) for office
automation
5. Communication over computer networks
6. Closed-circuit television-based security monitoring systemsand
7. In military communications
Medical applications:
8. Processing of chest X-rays
9. Cineangiograms
10. Projection images of trans axial tomography and
11. Medical images that occur in radiology nuclear magnetic resonance (NMR)
12. Ultrasonic scanning
13. IMAGE PROCESSING TOOLBOX (IPT):
14. It is a collection of functions that extend the capability of the MATLAB
numeric computing
environment. These functions, and the expressiveness of the MATLAB language, make
many
image-processing operations easy to write in a compact, clear manner, thus
providing an ideal
software prototyping environment for the solution of image processing problem.
COMPONENTS OF IMAGE PROCESSING SYSTEM:
Fig: Components of Image processing System
Image Sensors: With reference to sensing, two elements are required to acquire
digital image. The
first is a physical device that is sensitive to the energy radiated by the object
we wish to image and
second is specialized image processing hardware.
Specialize Image Processing Hardware: It consists of the digitizer just mentioned,
plus hardware
that performs other primitive operations such as an arithmetic logic unit, which
performs arithmetic
such addition and subtraction and logical operations in parallel onimages.
Computer: It is a general-purpose computer and can range from a PC to a
supercomputer depending
on the application. In dedicated applications, sometimes specially designed
computer is used to
achieve a required levelof performance
Software: It consists of specialized modules that perform specific tasks a well-
designed package also
includes capability for the user to write code, asa minimum, utilizes the
specialized module. More
sophisticated software packages allow the integration of these modules.
Mass Storage: This capability is a must in image processing applications. An image
of size 1024
x1024 pixels, in which the intensity of each pixel is an 8- bit quantity requires
one Megabytes of
storage space if the image is not compressed. Image processing applications falls
into three principal
categories of storage.
 Short term storage for use duringprocessing
 On line storage for relatively fastretrieval
 Archival storage such as magnetic tapes anddisks
Image Display: Image displays in use today are mainly color TV monitors.These
monitors are driven
by the outputs of image and graphics displays cards that are an integral part of
computer system.
Hardcopy Devices: The devices for recording image include laser printers, film
cameras, heat
sensitive devices inkjet units and digital units such as optical and CD ROM disk.
Films provide the
highest possible resolution, but paper is the obvious medium of choice for written
applications.
Networking: It is almost a default function in any computer system in usetoday
because of the large
amount of data inherent in image processing applications. The key consideration in
image
transmission bandwidth.
FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING:
There are two categories of the steps involved in the image processing – 1. Methods
whose outputs are input areimages.
2. Methods whose outputs are attributes extracted from thoseimages.
Fig: Fundamental Steps in Digital Image Processing
Image Acquisition: It could be as simple as being given an image that is already in
digital form.
Generally, the image acquisition stage involves processing such scaling.
Image Enhancement: It is among the simplest and most appealing areas ofdigital
image processing.
The idea behind this is to bring out details that are obscured or simply to
highlight certain features of
interest in image. Image enhancement is a very subjective area of image processing.
Image Restoration: It deals with improving the appearance of an image. It is an
objective
approach, in the sense that restoration techniques tend to be based on mathematical
or
probabilistic models of image processing. Enhancement, on the other hand is based
on human
subjective preferences regarding what constitutes a “good” enhancement result.
Color Image Processing: It is an area that is been gaining importance because of
the use of digital
images over the internet. Color image processing deals with basically color models
and their
implementation in image processingapplications.
Wavelets and Multiresolution Processing: These are the foundation for representing
image in
various degrees of resolution.
Compression: It deals with techniques reducing the storage required to save an
image, or the
bandwidth required to transmit it over the network. It has to major approaches
1. Lossless Compression 2. Lossy Compression
Morphological Processing: It deals with tools for extracting image components that
are useful in
the representation and description of shapeand boundary of objects. It is majorly
used in automated
inspection applications.
Representation and Description: It always follows the output ofsegmentation step
that is, raw pixel
data, constituting either the boundary of an image or points in the region itself.
In either case
converting the datato a form suitable for computer processing is necessary.
Recognition: It is the process that assigns label to an object based on its
descriptors. It is the last step
of image processing which use artificialintelligence of software.
Knowledge Base: Knowledge about a problem domain is coded into an image processing
system in
the form of a knowledge base. This knowledge may be as simple as detailing regions
of an image
where the information of the interest in known to be located. Thus, limiting search
that has to be
conducted in seeking the information. The knowledge base also can be quite complex
such
interrelated list of all major possible defects in a materials inspection problem
or an image database
containing high resolution satellite images of a region in connection with change
detection
application.
Simple Image Model: An image is denoted by a two dimensional function of the form
f{x, y}. The
value or amplitude of f at spatial coordinates {x,y} is a positive scalar quantity
whose physical
meaning is determined by the source of the image. When an image is generated by a
physical
process, itsvalues are proportional to energy radiated by a physical source. As
aconsequence, f(x,y)
must be nonzero and finite; that is o<f(x,y) <co The function f(x,y) may be
characterized by two
components- The amount of the source illumination incident on the scene
beingviewed.
The amount of the source illumination reflected back by the objects in the scene
These are called
illumination and reflectance components and are denoted by i(x,y) an r (x,y)
respectively.
The functions combine as a product to form f(x,y). We call the intensity of a
monochrome image at
any coordinates (x,y) the gray level (l) of the imageat that point l= f (x, y.)
L min ≤ l ≤ Lmax Lminis to be positive and Lmax must be finite
Lmin =iminrmin
Lmax =imaxrmax
The interval [Lmin, Lmax] is called gray scale. Common practice is to shift this
interval numerically
to the interval [0, L-l] where l=0 is considered black and l= L-1 is considered
white on the gray scale.
All intermediate values are shades of gray of gray varying from black towhite.
SAMPLING AND QUANTIZATION:
To create a digital image, we need to convert the continuous sensed data into
digital from. This
involves two processes – sampling and quantization.An image may be continuous with
respect to the
x and y coordinates and also in amplitude. To convert it into digital form we have
to sample the
function in both coordinates and in amplitudes.
Digitalizing the coordinate values is called Sampling. Digitalizing the amplitude
values is called
Quantization. There is a continuous the image along the line segment AB. To simple
this function,
we take equally spaced samples along line AB. The location of each samples is given
by a vertical
tick back (mark) in the bottom part. The samples are shown as block squares
superimposed on
function the set of these discrete locations givesthe sampled function.In order to
form a digital, the
gray level values must also be converted (quantized) into discrete quantities. So,
we divide the gray
level scale into eight discrete levels ranging from eight level values. The
continuous gray levels are
quantized simply by assigning one of the eight discrete gray levels to each sample.
The assignment it
made depending on the vertical proximity of a simple to a vertical tick mark.
Starting at the top of
the image and covering out this procedure line by line produces a two- dimensional
digital image.
Digital ImageDefinition: A digital image f(m,n) described in a 2D discretespace is
derived from an
analog image f(x,y) in a 2D continuous space through a sampling process that is
frequently referred
to as digitization. The mathematics of that sampling process will be described in
subsequentChapters.
For now, we will look at some basic definitions associated with the digital image.
The effect of
digitization is shown in figure.
The 2D continuous image f(x,y) is divided into N rows and M columns.
Theintersection of a row and
a column is termed a pixel. The value assigned tothe integer coordinates (m,n) with
m=0,1,2..N-1
and n=0,1,2…N-1 is f(m,n). In fact, in most cases, is actually a function of many
variablesincluding
depth, color and time (t).
There are three types of computerized processes in the processing of image Low
level Process:
These involve primitive operations such as image processing to reduce noise,
contrast
enhancement and image sharpening.These kinds of processes are characterized by fact
the both
inputs andoutput areimages.
Mid-levelImage Processing: It involves tasks like segmentation, description of
those objects to
reduce them to a form suitable for computerprocessing, and classification of
individual objects. The
inputs to the process are generally images but outputs are attributes extracted
from images.
High level Processing: It involves “making sense” of an ensemble of recognized
objects, as in
image analysis, and performing the cognitive functions normally associated with
vision.
Representing Digital Images: The result of sampling and quantization is matrix of
real numbers.
Assume that an image f(x,y) is sampled so that the resulting digital image has M
rows and N
Columns. The values of thecoordinates (x,y) now become discrete quantities thus the
value of the
coordinates at origin become (x,y) = (0,0) The next Coordinates value alongthe
first signify the image
along the first row. it does not mean that these are the actual values of physical
coordinates when the
image was sampled. Thus, the right side of the matrix represents a digital element,
pixel or pel. The
matrix can be represented in the following form as well. The samplingprocess may be
viewed as
partitioning the XY plane into a grid with the coordinates of the center of each
grid being a pair of
elements from the Cartesian products Z2 which is the set of all ordered pair of
elements (Zi, Zj)with
Zi and Zj being integers from Z.
Hence f(x,y) is a digital image ifgray level (that is, a real number from the set
of real number R) to
each distinct pair of coordinates (x,y). This functional assignment is the
quantization process. If the
gray levels are also integers, Z replaces R, the and a digital image become a 2D
function whose
coordinates and she amplitude value are integers. Due to processing storage and
hardware
consideration, the number gray levels typically are an integer power of 2.
L=2k
Then, the number ‘b’ of bites required to store a digital image is
b=M *N* kWhen
M=N, the equation become
b=N2*k
When an image can have 2k gray levels, it is referred to as “k- bit”. Animage with
256 possible
gray levels is called an “8- bit image” (256=28).
Spatial and Gray level Resolution:
Spatial resolution is the smallest discernible details are an image. Suppose a
chart can be
constructed with vertical lines of width w with the space between the also having
width W, so a
line pair consists of one such line and its adjacent space thus. The width of the
line pair is 2w and
there is 1/2w line pair per unit distance resolution is simply the smallest number
ofdiscernible line pair
unitdistance.
Gray levels resolution refers to smallest discernible change in gray levels.
Measuring discernible
change in gray levels is a highly subjective process reducing the number of bits R
while repairing the
spatial resolution constant creates the problem of false contouring.It is caused by
the use of an
insufficient number of gray levels on the smooth areas of the digital image. It is
called so because the
rides resemble top graphics contours in amap. It is generally quite visible in
image displayed using
16 or less uniformly spaced gray levels.
Image Sensing and Acquisition: The types of images in which we are interested are
generated by
the combination of an “Illumination” sourceand the reflection or absorption of
energy from that
source by the elementsof the “scene” being imaged. We enclose illumination and
scene in quotes to
emphasize the fact that they are considerably more general than the familiar
situation in which a
visible light source illuminates a common everyday 3-D (three-dimensional) scene.
For example, the illumination may originate from a source of electromagnetic energy
such as radar,
infrared, or X-ray energy. But, as noted earlier, it could originate from less
traditional sources, such
as ultrasound or even a computer-generated illumination pattern. Similarly, the
scene elements
could be familiar objects, but they can just as easily be molecules, buried
rockformations,orahumanbrain.Wecouldevenimageasource,suchasacquirin gimages of the
sun.
Depending on the nature of the source, illuminationenergy is reflected from, or
transmitted through,
objects.
An example in the first category is light reflected from a planar surface. An
example in the second
category is when X-rays pass through a patient’s body for the purpose of generating
a diagnostic Xray film. In some applications, the reflected or transmitted energy
is focused onto a photo converter
(e.g., a phosphor screen), which converts the energy into visiblelight.
Electron microscopy and some applications of gamma imaging use this approach. The
idea is simple:
Incoming energy is transformed into a voltage by the combination of input
electrical power and
sensor material that is responsive to the particular type of energy being detected.
The output voltage
waveform is the response of the sensor(s), and a digital quantity is obtained from
each sensor by
digitizing its response. In this section, we look at the principal modalities for
image sensing and
generation.
Fig: Line Sensor
Fig: Single Image Sensor
The components of a single sensor, perhaps the most familiar sensor of thistype is
the photodiode,
which is constructed of silicon materials and whoseoutput voltage
waveform
isproportionaltolight.Theuseofafilterinfrontofasensorimprovesselectivity.F
orexample, a green (pass)
filter in front of a light sensor favors light in thegreen band of the color
spectrum. As a consequence,
the sensor output willbe stronger for green light than for other components in the
visiblespectrum.
Fig: Array sensor Image Acquisition using a Single Sensor
In order to generate a 2-D image using a single sensor, there has to be relative
displacements in both
the x- and y-directions between the sensor and the area to be imaged. Figure shows
an arrangement
used in high- precision scanning, where a film negative is mounted onto a drum
whose mechanical
rotation provides displacement in one dimension. The single sensor is mounted on a
lead screw that
provides motion in the perpendicular direction. Since mechanical motion can be
controlled with high
precision, this method is an inexpensive (but slow) way to obtain high-resolution
images. Other
similar mechanical arrangements use a flat bed, with the sensor moving in two
linear directions.
These types of mechanical digitizers sometimes are referred to as
microdensitometers.
Image Acquisition using a Sensor Strips:
A geometry that is used much more frequently than single sensors consists of an in-
line
arrangement of sensors in the form of a sensor strip, shows. The strip provides
imaging
elements in one direction. Motion perpendicular to the strip provides imaging in
the other
direction. This isthe type of arrangement used in most flatbed scanners. Sensing
devices with
4000 or more in-line sensors are possible. In-line sensors are used routinely in
airborne imaging
applications, in which the imaging system is mounted on an aircraft that flies at a
constant altitude
and speed over the geographical area to be imaged. One dimensional imaging sensor
strips that
respond to various bands of the electromagnetic spectrum are mounted perpendicular
to the
direction of flight. The imaging strip givesone line of an image at a time, and the
motion of the
strip completes the other dimension of a two-dimensional image. Lenses or other
focusing schemes
are used to project area to be scanned onto the sensors. Sensorstrips mounted in a
ring configuration
are used in medical and industrial imaging to obtain cross-sectional (“slice”)
images of 3-Dobjects.
Fig: Image Acquisition using linear strip and circular stripsImage Acquisition
using a
Sensor Arrays:
The individual sensors arranged in the form of a 2-D array. Numerous
electromagnetic and some
ultrasonic sensing devices frequently are arranged in an array format. This is also
the predominant
arrangement found in digital cameras. A typical sensor for these cameras is a CCD
array, which
can be manufactured with a broad range of sensing properties and can be packaged in
rugged
arrays of elements or more. CCD sensors are used widely in digital cameras and
other light
sensing instruments. The response of each sensor is proportional to the integral of
the light energy
projected onto the surface of the sensor, a property that is used in astronomical
and other applications
requiring low noise images. Noise reduction is achieved by letting the sensor
integrate the input light
signal over minutes or even hours. The two dimensional, its key advantage is that a
complete
image can be obtained by focusing the energy pattern onto the surface of the array.
Motion obviously
is not necessary, as is the case with the sensor arrangements This figure shows the
energy from an
illumination source being reflected from a scene element, but, as mentioned at the
beginning of this
section, the energy also could be transmitted through the scene elements. The first
function
performed by the imaging system is to collect the incoming energy and focus it onto
an image plane.
If the illumination is light, the front end of the imaging system is a lens, which
projects the viewed
scene onto the lens focal plane.The sensor array, which is coincident with the
focal plane, produces
outputs proportional to the integral of the light received at each sensor. Digital
andanalog circuitry
sweeps these outputs and convert them to a video signal, which is then digitized by
another section
of the imaging system.
Image sampling and Quantization:
To create a digital image, we need to convert the continuous sensed datainto
digital form. This
involves two processes.
1. Sampling and
2. Quantization
A continuous image, f(x, y), that we want to convert to digital form. An image may
be continuous
with respect to the x- and y- coordinates, and also in amplitude. To convert it to
digital form, we
have to sample the function in both coordinates and in amplitude.
Digitizing the coordinate values is called Sampling. Digitizing the amplitude
values is called
Quantization.
Fig: Sampling
Fig: Quantization
Digital Image Representation:
Digital image is a finite collection of discrete samples (pixels) of any observable
object. The pixels
represent a two- or higher dimensional “view” of the object, each pixel having its
own discrete value
in a finite range. The pixel values may represent the amount of visible light,
infra-red light,
absorption of x-rays, electrons, or any other measurable value such as ultrasound
wave impulses. The
image does not need to have any visual sense; it is sufficient that the samples
form a twodimensional spatial structure that may be illustrated as an image. The
images may be obtained
by a digital camera, scanner, electron microscope, ultrasound stethoscope, or any
other optical or
non-optical sensor.
Examples of digital image are Digital photographs, Satellite images,
Radiological images (x-rays, mammograms), Binary images, Fax images, Engineering
drawings,
Computer graphics, CAD drawings, and vector graphics in general are not considered
in this course
even though their reproduction is a possible source of an image. In fact, one goal
of intermediate
level image processing may be to reconstruct a model (Eg: Vector Representation)
for a given digital
image.
Relationship between Pixels:
We consider several important relationships between pixels in a digital image.
Neighbors of a Pixel:
A pixel p at coordinates (x,y) has four horizontal and whose coordinates are given
by:
(x+1,y), (x-1, y), (x, y+1), (x,y-1)
This set of pixels, called the 4-neighbors or p, is denoted by N4(p). Each pixel is
one unit
distance from (x,y) and some of the neighbors of p lie outside the digital image if
(x,y) is on the
border of the image. The four diagonal neighbors of p have coordinates and are
denoted by ND(p).
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1).
These points, together with the 4-neighbors, are called the 8-neighbors of p,
denoted by N8(p).
As before, some of the points in ND(p) and N8(p) fall outside the image if (x,y) is
on the border of
the image.
Adjacency and Connectivity:
Let v be the set of gray –level values used to define adjacency, in a binary image,
V={1}. In a grayscale image, the idea is the same, but Vtypically contains more
elements, for example, V= {180, 181,
182, …,200}.
If the possible intensity values 0 – 255, Vset can be any subset of these 256
values. if we are
reference to adjacency of pixel with value.
Three types of Adjacency:
4- Adjacency – two pixel P and Q with value from V are 4 –adjacency if A isin the
setN4(P)
8- Adjacency – two pixel P and Q with value from V are 8 –adjacency if A isin the
set N8(P) Madjacency –two pixel P and Q with value from V are m – adjacency if (i)
Q is in N4(p) or
(ii) Q is in ND(q) and the
(iii) Set N4(p) ∩ N4(q) has no pixel whose values are fromV.
Mixed adjacency is a modification of 8-adjacency. It is introduced toeliminate the
ambiguities that
often arise when 8-adjacency isused.
Forexample:
Fig: (a) Arrangement of pixels
(b) pixels that are 8-adjacent (shown dashed) to the center pixel
(c) m-adjacency
Types of Adjacency:
In this example, we can note that to connect between two pixels (finding a path
between
twopixels):
In 8-adjacency way, you can find multiple paths between twopixels While, in m-
adjacency, you
can find only one path between twopixels
So, m-adjacency has eliminated the multiple path connection that has been generated
by the8-
adjacency.
Two subsets S1 and S2 are adjacent, if some pixel in S1 is adjacent to somepixel in
S2. Adjacent
means, either 4-, 8- orm-adjacency.
Digital Path:
A digital path (or curve) from pixel p with coordinate (x,y) to pixel q with
coordinate (s,t) is a
sequence of distinct pixels with coordinates (x0,y0), (x1,y1), …, (xn, yn) where
(x0,y0) = (x,y) and
(xn, yn) = (s,t) and pixels (xi, yi)and (xi-1, yi-1) are adjacent for 1 ≤ i ≤n, n
is the length of thepath.
If (x0,y0) = (xn, yn), the path isclosed.
We can specify 4, 8or m-paths depending on the type of adjacency specified.
Return to the previous example:
Fig:(a) Arrangement of pixels
(b) pixels that are 8-adjacent (shown dashed) to the center pixel
(c) m-adjacency
In figure (b) the paths between the top right and bottom right pixels are 8-paths.
And the path
between the same 2 pixels in figure (c) is m-path
Connectivity:
Let S represent a subset of pixels in an image, two pixels p and q are said to be
connected in S if there
exists a path between them consisting entirelyof pixels inS.
For any pixel p in S, the set of pixels that are connected to it in S is calleda
connected component
of S. If it only has one connected component, then set S is called a connected set.
Region and Boundary:
Region: Let R be a subset of pixels in an image, we call R a region of the image
ifRis a connected
set.
Boundary: The boundary (also called border or contour) of a region R is the set of
pixels in the
region that have one or more neighbors that are not inR.
If R happens to be an entire image, then its boundary is defined as the set of
pixels in the first and
last rows and columns in the image. This extra definition is required because an
image has no
neighbors beyond its borders. Normally, when we refer to a region, we are referring
to subset ofan
image, and any pixels in the boundary of the region that happen to coincide with
the border of the
image are included implicitly as part of theregionboundary.
Distance Measures:
For pixel p,q and z with coordinate (x.y) ,(s,t) and (v,w) respectively D is
adistance function or
metric if
D [p.q] ≥ 0 {D[p.q] = 0iff p=q}
D [p.q] = D [p.q] and
D [p.q] ≥ 0 {D[p.q]+D(q,z) The
Euclidean Distance between p and q is definedas:
De(p,q) = [(x – s)2 + (y - t)2]1/2
Pixels having a distance less than or equal to some value r from (x,y) arethe
points contained in a
disk of radius „ r „centered at (x,y)
The D4distance (also called city-block distance) between p and q isdefinedas:
D4(p,q) = | x – s | + | y – t |
Pixels having a D4 distance from (x,y), less than or equal to some value rform a
Diamond centered
at (x,y)
Example:
The pixels with distance D4≤ 2 from (x,y) form the following contours of constant
distance.
The pixels with D4= 1 are the 4-neighbors of (x,y)
The D8distance (also called chessboard distance) between p and q isdefinedas:
D8(p,q) = max(| x – s |,| y – t |)
Pixels having a D8 distance from (x,y), less than or equal to some value r form a
square Centered at
(x,y).
Example:
D8distance ≤ 2 from (x,y) form the following contours of constant distance.
DmDistance:
It is defined as the shortest m-path between the points.In this case, the distance
between two pixels
will depend on the values of the pixels along the path, as well as the values of
their neighbors.
Example:
Consider the following arrangement of pixels and assume that p, p2, and p4 have
value 1 and that p1
and p3 can have can have a value of 0 or 1 Suppose that we consider the adjacency
of pixels values 1
(i.e. V ={1})
Now, to compute the Dmbetween points p and p4
Here we have 4 cases:
Case1: If p1 =0 and p3 = 0
The length of the shortest m-path (the Dm distance) is 2 (p, p2, p4)
Case2: If p1 =1 and p3 = 0
now, p1 and p will no longer be adjacent (see m-adjacency definition)then, the
length of the
shortest path will be 3 (p, p1, p2, p4)
Case3: If p1 =0 and p3 = 1
The same applies here, and the shortest –m-path will be 3 (p, p2, p3, p4)
Case4: If p1 =1 and p3 = 1
The length of the shortest m-path will be 4 (p, p1, p2, p3, p4)
UNIT -3
IMAGE ENHANCEMENT
Learning Objectives:
Image enhancement techniques are designed to improve the quality of an image as
perceived by a
human being. Image enhancement can be performed both in the spatial as well as in
the frequency
domain. After reading this chapter, the reader should have a basic knowledge about
the following
concepts:
1. Image enhancement in spatial and frequency domain
2. Point operations and mask operations in spatial domain
3. Different types of gray –level transformation
4. Histogram and histogram equalization
5. Frequency domain filtering
Introduction:
Image enhancement approaches fall into two broad categories: spatialdomain methods
and frequency
domain methods. The term spatial domain refers to the image plane itself, and
approaches in this
category are based on direct manipulation of pixels in an image.
Frequency domain processing techniques are based on modifying the Fourier transform
of an image.
Enhancing an image provides better contrast and a more detailed image as compare to
non-enhanced
image. Image enhancement has very good applications. It is used to enhancemedical
images, images
captured in remote sensing, images from satellite etc. As indicated previously, the
term spatial
domain refers to the aggregate of pixels composing an image. Spatial domain methods
are
procedures that operate directly on these pixels. Spatial domain processeswill be
denoted by the
expression.
g(x,y) = T[f(x,y)]
where f(x, y) is the input image, g(x, y) is the processed image, and T is an
operator on f, defined
over some neighborhoodof (x, y).
The principal approach in defining a neighborhood about a point (x, y) is to use a
square or rectangular
sub image area centered at (x, y), as Fig. 2.1 shows. The center of the sub image
is moved from pixel to
pixel starting, say, at the top left corner. The operator T is applied at each
location (x, y)to yield the
output, g, at that location. The process utilizes only the pixels in the area of
the image spanned by
the neighborhood.
The simplest form of T is when the neighborhood is of size 1*1 (that is, a single
pixel). In this case, g
depends only on the value of f at (x, y), and T becomes a gray-level (also called
an intensity or
mapping) transformationfunction of the form
s = T ( r )
where r is the pixels of the input image and s is the pixels of the output image. T
is a transformation
function that maps each value of „r‟ to each value of „s‟.
For example, if T(r) has the form shown in Fig. 2.2(a), the effect of this
transformation would be to
produce an image of higher contrast than the original by darkening the levels below
m and
brightening the levels above m in the original image. In this technique, known as
contrast stretching,
the values of r below m are compressed by the transformation function intoa narrow
range of s, toward
black. The opposite effect takes place for values of r above m.
In the limiting case shown in Fig. 2.2(b), T(r) produces a two-level (binary)image.
A mapping of this
form is called a thresholding function.
One of the principal approaches in this formulation is based on the use of so-
called masks (also referred to as filters, kernels, templates, or
windows).Basically, a mask is a small (say, 3*3) 2-D array,
such as the one shown in Fig. 2.1, in which the values of the mask coefficients
determine the nature
of the process, such as image sharpening. Enhancement techniques based on this type
of approach
often are referred to as mask processing orfiltering.
Fig. 2.2 Gray level transformation functions for contrast enhancement Image
enhancement can be
done through gray level transformations whichare discussed below.
BASIC GRAY LEVEL TRANSFORMATIONS:
1. Image negative
2. Log transformations
3. Power law transformations
4. Piecewise-Linear transformation functions
LINEAR TRANSFORMATION:
First, we will look at the linear transformation. Linear transformationincludes
simple identity and
negative transformation. Identity transformation has been discussed in ourtutorial
of image
transformation,but a brief description of this transformation has been given here.
Identity transition is shown by a straight line. In this transition, eachvalue of
the input image is
directly mapped to each other value of output image. That results in the same input
image and output
image. And hence is called identity transformation. It has been shown below:
Fig. Linear transformation between input and outputNEGATIVE
TRANSFORMATION:
The second linear transformation is negative transformation, which is invert of
identity
transformation. In negative transformation, each value ofthe input image is
subtracted from the L-1
and mapped onto the output image
IMAGE NEGATIVE:
The image negative with gray level value in the range of [0, L-1] isobtained by
negative
transformation given by S = T(r) or
S = L -1 – r
Where r= gray level value at pixel (x,y)
L is the largest gray level consists in the image
It results in getting photograph negative. It is useful when for enhancingwhite
details embedded in
dark regions of the image.
The overall graph of these transitions has been shown below.
Input gray level, r
Fig. Some basic gray-level transformation functions used for imageenhancement
In this case the following transition has been done.
S = (L – 1) – r
Since the input image of Einstein is an 8 bpp image, so the number of levels in
this image are256.
Putting 256 in the equation, we get this
S = 255 – r
So, each value is subtracted by 255 and the result image has been shown above. So,
what happens is
that, the lighter pixels become dark and the darker picture becomes light. And it
results in image
negative.
It has been shown in the graph below.
Fig. Negative transformations
LOGARITHMIC TRANSFORMATIONS:
Logarithmic transformation further contains two type of transformation. Log
transformation and
inverse log transformation.
LOG TRANSFORMATIONS:
The log transformations can be defined by this formula
S = c log(r + 1).
Where S and r are the pixel values of the output and the input image and c is a
constant. The value 1
is added to each of the pixel value of the input image because if there is a pixel
intensity of 0 in the
image, then log (0) is equal to infinity. So, 1 is added, to make the minimum value
at least 1.
During log transformation, the dark pixels in an image are expanded as compare to
the higher pixel
values. The higher pixel values are kind of compressed in log transformation. This
result in
following image enhancement.
ANOTHER WAY TO REPRESENT LOG TRANSFORMATIONS: Enhance
details in the darker regions of an image at the expense of detail in brighter
regions.
T(f) = C * log (1+r)
Here C is constantand r≥ 0
The shape of the curve shows that this transformation maps the narrow range of low
gray
level values in the input image into awider range of output image. The opposite is
true for high level values of input image.
Fig. Log Transformation Curve input vs output
POWER – LAW TRANSFORMATIONS:
There are further two transformation is power law transformations, that include nth
power and nth
root transformation. These transformations canbe given by the expression:
S=Crγ
This symbol γ is called gamma, due to which this transformation is also known as
gamma
transformation.
Variation in the value of γ varies the enhancement of the images. Differentdisplay
devices / monitors
have their own gamma correction, that’s why they display their image at different
intensity, where c
and g are positive constants. Sometimes Eq. (6) is written as
S = C (r +ε) γ
to account for an offset (that is, a measurable output when the input is zero).
Plots of s versus r for
various values of γ are shown in Fig. 2.10. As in the case of the log
transformation, power-law
curves with fractional values of γ map a narrow range of dark input values into a
wider range of
output values, with the opposite being true for higher values of input levels.
Unlike the log function,
however, we notice here a family of possibletransformation curves obtained simply
by varying γ.
In Fig that curves generated with values of γ>1 have exactly the opposite effect as
those generated
with values of γ<1. Finally, we Note that Eq. (6) reduces to the identity
transformation when c=γ=1.
Fig. 2.13 Plot of the equation S = Crγ for various values of γ
(C =1 in all cases)
This type of transformation is used for enhancing images for different typeof
display devices. The
gamma of different display devices is different. For example, Gamma of CRT lies in
between of
1.8 to 2.5, that means the image displayed on CRT is dark.
Varying gamma (γ) obtainsfamily of possible transformation curves
S = C* r γ
Here C and γ are positive constants. Plot of S versus r for various values ofγ is γ
> 1 compresses
dark values and expands bright valuesγ < 1 (similarto Log transformation) but
expands dark values
Compresses bright valuesWhen C = γ = 1, it reduces to identity transformation.
CORRECTING GAMMA:
S=Cr γ
S=Cr (1/2.5)
The same image but with different gamma values has been shown here.
PIECEWISE-LINEAR TRANSFORMATION FUNCTIONS:
A complementary approach to the methods discussed in the previous three sections is
to use
piecewise linear functions. The principal advantage of piecewise linear functions
over the types of
functions we have discussedthus far is that the form of piecewise functions can be
arbitrarily
complex.The principal disadvantage of piecewise functions is that their
specification requires
considerably more user input.
Contrast Stretching: One of the simplest piecewise linear functions is a contrast-
stretching
transformation. Low-contrast images can result from poor illumination, lack of
dynamicrange in the
imaging sensor, or even wrong setting of a lens aperture during image acquisition.
S= T(r )
Figure x(a) shows a typical transformation used for contrast stretching. The
locations of points (r1,
s1) and (r2, s2) control the shape of the transformation function. If r1=s1 and
r2=s2, the
transformation is a linearfunction that produces No changes in gray levels. If
r1=r2, s1=0and s2= L-
1, the transformation Becomes a thresholding function that creates a binary image,
as illustrated in
fig. 2.2(b).
Intermediate values of ar1, s1b and ar2, s2b produce various degrees of spread in
the gray levels of
the output image, thus affecting its contrast. In general, r1≤ r2 and s1 ≤ s2 is
assumed so that the
function is single valuedand Monotonically increasing.
Fig. x Contrast Stretching.
(a) Form of transformation function
(b) A low-contrast stretching.
(c) Result of contrast stretching
(d) Result of thresholding
Figure x(b) shows an 8-bit image with low contrast.
Fig. x(c) shows the result of contrast stretching, obtained by setting (r1, s1)
=(rmin, 0) and (r2, s2)=(rmax,L-1) where rmin and rmax denote the minimum and
maximum gray
levels in the image, respectively. Thus, the transformation function stretched the
levels linearly from
their original range to the full range[0, L-1].
Finally, Fig. x(d) shows the result of using the thresholding function defined
previously,with
r1=r2=m, the mean gray level in the image. The original image on which these
results are based is a
scanning electron microscope image of pollen, magnified approximately 700 times.
Gray-level Slicing:Highlighting a specific range of gray levels in an image often
is desired.
Applications include enhancing features such as masses ofwater in satellite imagery
and enhancing
flaws in X-ray images.There are several ways of doing level slicing, but most of
them are variations
of two basic themes. One approach is to display a high value for all gray levels in
the range of
interest and a low value for all other gray levels.
This transformation, shown in Fig. y(a), produces a binary image. The second
approach, based on the
transformation shown in Fig y(b), brightens
the desired range of gray levels but preserves the background and gray- level
tonalities in the image. Figure
y (c) shows a gray-scale image, and Fig. y(d) shows the result of using the
transformation in Fig. y(a).
Variations of the two transformations shown in Fig. are easy to formulate.
Fig. y (a)This transformation highlights range [A,B] of gray levels andreduces all
others to a
constant level
(b) This transformation highlights range [A,B] but preserves all otherlevels.
An image. (d) Result of using the transformation in (a).
BIT-PLANE SLICING:
Instead of highlighting gray-level ranges, highlighting the contribution made to
total image
appearance by specific bits might be desired. Suppose that each pixel in an image
is represented by
8 bits. Imagine that the image is composed of eight 1-bit planes, ranging from bit-
plane 0 for the
least significant bit to bit plane 7 for the most significant bit. In terms of 8-
bit bytes, plane 0
contains all the lowest order bits in the bytes comprising the pixels in the image
and plane 7
contains all the high-order bits.
Figure 3.12 illustrates these ideas, and Fig. 3.14 shows the various bit planes for
the image shown
in Fig. 3.13. Note that the higher-order bits
(especially the top four) contain the majority of the visually significant data.The
other bit planes
contribute to more subtle details in the image. Separating a digital image into its
bit planes is useful
for analyzing the relative importance played by each bit of the image, a process
that aids in
determining the adequacy of the number of bits used to quantize each pixel.
In terms of bit-plane extraction for an 8-bit image, it is not difficult to
showthat the (binary) image for
bit-plane 7 can be obtained by processing the input image with a thresholding gray-
level
transformation function that
(1) maps all levels in the image between 0 and 127 to one level (for example, 0);
and (2) maps all
levels between 129 and 255 to another (for example, 255).The binary image for bit-
plane 7 in Fig.
3.14 was obtained in just this manner. It is left as an exercise(Problem 3.3) to
obtain the gray- level
transformation functions that would yield the other bit planes.
HISTOGRAM PROCESSING:
The histogram of a digital image with gray levels in the range [0, L-1] is a
discrete function of the
form
H(rk)=nk
where rk is the kth gray level and nk is the number of pixels in the imagehaving
the level rk. A
normalized histogram is given by the equation
P(rk)=nk/n for k=0,1,2,…..,L-1
P(rk) gives the estimate of the probability of occurrence of gray level rk. Thesum
of all components of a
normalized histogram is equal to 1.
The histogram plots are simple plots of H(rk)=nk versus rk.
In the dark image the components of the histogram are concentrated onthe low (dark)
side of the
gray scale. In case of bright image, the histogram components are biased towards
the high side of the
gray scale. The histogram of a low contrast image will be narrow and will be
centered towards the
middle of the gray scale.
The components of the histogram in the high contrast image cover a broadrange of
the gray scale.
The net effect of this will be an image that shows a great deal of gray levels
details and has high
dynamic range.
HISTOGRAM EQUALIZATION:
Histogram equalization is a common technique for enhancing the appearance of
images. Suppose we
have an image which is predominantly dark. Then its histogram would be skewed
towards the lower
end of the grey scale and all the image detail are compressed into the dark end of
thehistogram. If we
could stretch out the grey levels at the dark end to produce a more uniformly
distributed histogram
then the image would become much clearer.
Let there be a continuous function with r being gray levels of the image to be
enhanced. The range of
r is [0, 1] with r=0 repressing black and r=1 representing white. The
transformation function is of the
form
S=T(r) where 0<r<1
It produces a level s for every pixel value r in the original image.
The transformation function is assumed to fulfill two condition T(r) is single
valued and
monotonically increasing in the internal 0<T(r)<1 for 0<r<1.The transformation
function should be
single valued so that the inverse transformations should exist. Monotonically
increasing condition
preserves the increasing order from black to white in the output image. The second
conditions
guarantee that the output gray levels will be in the same range as the input
levels. The gray levels of
the image may be viewedas random variables in the interval [0.1]. The most
fundamental descriptorof a
random variable is its probability density function (PDF) Pr(r) and Ps(s) denote
the probability density
functions of random variables r and s respectively. Basic results from an
elementary probability theory
states that if Pr(r) and Tr are known and T-1(s) satisfies conditions (a), then the
probability density
function Ps(s) of the transformed variable is given by the formula
Thus, the PDF of the transformed variable s is the determined by the graylevels PDF
of the input
image and by the chosen transformations function.A transformation function of a
particular
importance in image processing
This is the cumulative distribution function of r.
L is the total number of possible gray levels in the image.
IMAGE ENHANCEMENT IN FREQUENCY DOMAIN:
Blurring/Noise Reduction:Noise characterized by sharp transitions in image
intensity. Such
transitions contribute significantly to high frequency components of Fourier
transform. Intuitively,
attenuating certain high frequency components result in blurring and reduction of
image noise.
IDEAL LOW-PASS FILTER:
Cuts off all high-frequency components at a distance greater than a certain
distance from origin
(cutoff frequency).
H (u,v) = 1, if D(u,v) ≤ D0 0, if D(u,v) ˃ D0
Where D0 is a positive constant and D(u,v) is the distance between a point (u,v) in
the frequency
domain and the center of the frequency rectangle; that is
D(u,v) = [(u-P/2)2 + (v-Q/2)2] 1/2
Whereas P and Q are the padded sizes from the basic equations Wraparound error in
their
circular convolution can be avoided by paddingthese functions with zeros
Visualization: Ideal Low Pass Filter:
Aa shown in figure below
Fig: Ideal Low Pass Filter 3-D view and 2-D view and line graph
EFFECT OF DIFFERENT CUTOFF FREQUENCIES:
Fig.below(a) Test pattern of size 688x688 pixels, and (b) its Fourier spectrum. The
spectrum is
double the image size due to padding but is shown in half size so that it fits in
the page. The
superimposed circles have radii equal to 10, 30, 60, 160 and 460 with respect to
the full- size
spectrum image. These radii enclose 87.0, 93.1, 95.7, 97.8 and 99.2% of the padded
image power
respectively.
Fig: (a) Test pattern of size 688x688 pixels (b) its Fourier spectrum
Fig: (a) original image, (b)-(f) Results of filtering using ILPFs with
cutofffrequencies set at radii
values 10, 30, 60, 160 and 460, as shown in
fig.2.2.2(b). The power removed by these filters was 13, 6.9, 4.3, 2.2 and 0.8% of
the total,
respectively.
As the cutoff frequency decreases, image becomes more blurred, Noise increases.
Analogous to
larger spatial filter sizes the severe blurring in this image is a clear indication
that most of the sharp
detail informationin the picture is contained in the 13% power removed by the
filter. As the filter
radius is increases less and less power is removed, resulting in less blurring.
Fig. (c) throughare
characterized by “Ringing”, which becomes finer in texture as the amount of high
frequency content
removed decreases.
WHY IS THERE RINGING?
Ideal low-pass filter function is a rectangular function. The inverse Fourier
transform of a rectangular function is a sinc function.
Fig. Spatial representation of ILPFs of order 1 and 20 and correspondingintensity
profiles through the
center of the filters (the size of all cases is1000x1000 and the cutoff frequency
is 5) Observe how
ringing increases as a function of filter order.
BUTTERWORTH LOW-PASS FILTER:
Transfer function of a Butterworth lowpass filter (BLPF) of order n, andwith cutoff
frequency at a
distance D0 from the origin, is defined as
-Transfer function does not have sharp discontinuity establishing cutoffbetween
passed and
filtered frequencies.
Cut off frequency D0 defines point at which H(u,v) = 0.5
Fig. (a) perspective plot of a Butterworth lowpass-filter transfer function.
(b) Filter displayed as an image. (c)Filter radial cross sections of order 1through
4.
Unlike the ILPF, the BLPF transfer function does not have a sharp discontinuity
that gives a clear
cutoff between passed and filtered frequencies.
BUTTERWORTH LPF OF DIFFERENTFREQUENCIES:
Fig. (a) Original image. (b)-(f) Results of filtering using BLPFs of order 2,with
cutoff
frequencies at the radii
Fig. shows the results of applying the BLPF of eq. to fig.(a), with n=2 and D0
equal to the five radii
in fig.(b) for the ILPF, we note here a smooth transition in blurring as a function
of increasing cutoff
frequency. Moreover, no ringing is visible in any of the images processed with this
particular BLPF,
a fact attributed to the filter’s smooth transition betweenlow and high
frequencies.
A BLPF of order 1 has no ringing in the spatial domain. Ringing generallyis
imperceptible in
filters of order 2, but can become significant in filters of higher order.Figure
shows a comparison
between the spatial
Digital Image Processing SCSVMV Dept of ECE
representation of BLPFs of various orders (using a cutoff frequency of 5 inall
cases). Shown also is
the intensity profile along a horizontal scan line through the center of each
filter. The filter of order 2
does show mild ringing and small negative values, but they certainly are less
pronounced than in the
ILPF. A butter worth filters of order 20 exhibits characteristics similar to those
of the ILPF (in the
limit, both filters are identical).
Fig.2.2.7 (a)-(d) Spatial representation of BLPFs of order 1, 2, 5 and 20 and
corresponding
intensity profiles through the center of the filters (the size in all cases is 1000
x 1000 and the
cutoff frequency is 5) Observe how ringing increases as a function of filter order.
GAUSSIAN LOWPASS FILTERS:
The form of these filters in two dimensions is given by
This transfer function is smooth, like Butterworth filter.
Gaussian in frequency domain remains a Gaussian in spatial domainAdvantage: No
ringing artifacts.
Where D0 is the cutoff frequency. When D(u,v) = D0, the GLPF is down to 0.607 of
its maximum
value. This means that a spatial Gaussian filter, obtained by computing the IDFT of
above
equation., will have no ringing.
Page | 41
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Figure shows a perspective plot, image display and radial cross sections ofa GLPF
function.
Fig. (a) Perspective plot of a GLPF transfer function. (b) Filter displayed asan
image. (c). Filter
radial cross sections for various values of D0
Fig.(a) Original image. (b)-(f) Results of filtering using GLPFs with
cutofffrequencies at the
radii shown in fig.2.2.2. compare with fig.2.2.3 and fig.2.2.6
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Fig. (a) Original image (784x 732 pixels)
(b) Result of filtering using a GLPF with D0 = 100
(c) Result of filtering using a GLPF with D0 = 80. Note the reduction in fine skin
lines in the
magnified sections in (b) and (c)
Fig. shows an application of lowpass filtering for producing a smoother, softer-
looking result from a
sharp original. For human faces, the typical objective is to reduce the sharpness
of fine skin lines and
small blemished.
IMAGE SHARPENING USING FREQUENCY DOMAIN FILTERS:
An image can be smoothed by attenuating the high-frequency components of its
Fourier
transform. Because edges and other abrupt changes in intensities are associated
with high-frequency
components, image sharpening can be achieved in the frequency domain by high pass
filtering,
which attenuates the low-frequency components without disturbing high-frequency
information in
the Fourier transform.
The filter function H(u,v) are understood to be discrete functions of size PxQ;
that is the discrete
frequency variables are in the range
u = 0,1, 2…….,P-1 and
v = 0,1,2,…….,(Q-1)
The meaning of sharpening isedges and fine detail characterized by sharp
transitions in image
intensity. Such transitions contribute significantly to high frequency components
of Fourier
transform. Intuitively, attenuating certain low frequency components and preserving
high frequency
components result in sharpening.Intended goal is to do the reverse operation of
low-pass filters.
When low-pass filter attenuated frequencies, high-pass filter passes them.
Digital Image Processing SCSVMV Dept of ECE
Page | 43
When high-pass filter attenuates frequencies, low-pass filter passes them. A high
pass filter is
obtained from a given low pass filter using the equation.
Hhp (u,v) = 1- Htp (u,v)
Where Hlp (u,v) is the transfer function of the low-pass filter. That is when the
lowpass filter
attenuates frequencies, the high-pass filter passedthem, and vice-versa.
We consider ideal, Butter-worth, and Gaussian high-pass filters. As in the previous
section, we
illustrate the characteristics of these filters in both the frequency and spatial
domains. Fig shows
typical 3-D plots, image representations and cross sections for these filters. As
before, we see that the
Butter-worth filter represents a transition between the sharpness of the ideal
filter and the broad
smoothness of the Gaussian filter. Figdiscussed in the sections the follow,
illustrates what these
filters look like in the spatial domain. The spatial filters were obtained and
displayed by using the
procedure used.
Fig: Top row: Perspective plot, image representation, and crosssection of atypical
ideal high-pass filter.
Middle and bottom rows: The same sequence for typical butter-worth and Gaussian
high-pass filters.
IDEAL HIGH-PASS FILTER:
A 2-D ideal high-pass filter (IHPF) is defined as
H (u,v) = 0, if D(u,v) ≤ D0
H (u,v) = 1, if D(u,v) ˃ D0
Where D0 is the cutoff frequency and D(u,v) is given by eq. As intended,
Digital Image Processing SCSVMV Dept of ECE
Page | 43
the IHPF is the opposite of the ILPF in the sense that it sets to zero all
frequencies inside a circle o fradius
D0 while passing, without attenuation, all frequencies outside the circle. As in
case of the ILPF, the IHPF
is not physically realizable.
SPATIAL REPRESENTATION OF HIGHPASS FILTERS:
Fig Spatial representation of typical (a) ideal (b) Butter-worth and (c) Gaussian
frequency domain
high-passfilters, and corresponding intensityprofiles through their centers.
We can expect IHPFs to have the same ringing properties as ILPFs. This is
demonstrated clearly in
Fig. which consists of various IHPF results using the original image in Fig.(a)
with D0 set to 30, 60,
and 160 pixels, respectively. The ringing in Fig. (a) is so severe that it produced
distorted, thickened
object boundaries (E.g., look at the large letter “a”). Edges of thetop three
circles do not show well
because they are not as strong as the other edges in the image (the intensity of
these three objects is
much closerto the background intensity, giving discontinuities of smaller
magnitude).
FILTERED RESULTS OF IHPF:
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Fig. Results of high-pass filtering the image in Fig.(a) using an IHPF withD0 = 30,
60, and 160
The situation improved somewhat with D0 = 60. Edge distortion is quite evident
still, but now we
begin to see filtering on the smaller objects. Due to the now familiar inverse
relationship between the
frequency and spatialdomains, we know that the spot size of this filter is smaller
than the spotof the
filter with D0 = 30. The result for D0 = 160 is closer to what a high- pass
filtered image should look
like. Here, the edges are much cleaner and less distorted, and the smaller objects
have been filtered
properly.Of course, the constant background in all images is zero in these high-
pass filtered images
because high pass filtering is analogous to differentiation in the spatial domain.
BUTTER-WORTH HIGH-PASS FILTERS:
A 2-D Butter-worth high-pass filter (BHPF) of order n and cutoff frequencyD0 is
defined as
Where D(u,v) is given by Eq.(3). This expression follows directly from Eqs.(3) and
(6). The middle
row of Fig.2.2.11. shows an image and cross section of the BHPF function.Butter-
worth high-pass
filter to behave smoother than IHPFs. Fig.2.2.14. shows the performance of a BHPF
of order 2 and
with D0 set to the same values as in Fig.2.2.13. The boundaries are much less
distorted than in
Fig.2.2.13. even for the smallest value of cutoff frequency. FILTERED RESULTS OF
BHPF:
Fig. Results of high-pass filtering the image in Fig.2.2.2(a) using a BHPFof order
2 with D0 = 30, 60, and 160 corresponding to the circles in Fig.2.2.2(b). These
results are much smoother
than those obtained with anIHPF
GAUSSIAN HIGH-PASS FILTERS:
The transfer function of the Gaussian high-pass filter (GHPF) with cutoff frequency
locus at a
distance D0 from the center of the frequency rectangleis given by
Where D(u,v) is given by Eq.(4). This expression follows directly from Eqs.(2) and
(6). The third
row in Fig.2.2.11. shows a perspective plot, image and cross section of the GHPF
function.
Following the same format as for the BHPF, we show in Fig.2.2.15. comparable
results using
GHPFs. As expected, the results obtained are more gradual than with the previous
two filters.
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Fig. Results of high-pass filtering the image in fig.(a) using a GHPF withD0 =
30, 60 and 160, corresponding to the circles in Fig.(b)
UNIT-3
IMAGE RESTORATION ANDSEGMENTATION
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Learning Objectives:
The goal of image restoration is to reconstruct the original scene from a degraded
observation. After
reading this unit, the reader should be familiar with the following concepts:
1. Different types of image degradation
2. Linear image –restoration techniques
3. Nonlinear image restoration techniques
IMAGE RESTORATION:
Restoration improves image in some predefined sense. It is an objective process.
Restoration
attempts to reconstruct an image that has been degraded by using a priori knowledge
of the
degradation phenomenon. These techniques are oriented toward modeling the
degradation and then
applying the inverse process in order to recover the original image.Restoration
techniques are based
on mathematical or probabilistic models of image processing. Enhancement, on the
other hand is
based on human subjective preferences regarding what constitutes a “good”
enhancement result.
Image Restoration refers to a class of methods that aim to remove or reduce the
degradations that
have occurred while the digital image was being obtained. All-natural images when
displayed
have gone through some sort of degradation:
1. During display mode
2. Acquisition mode, or
3. Processing mode
4. Sensor noise
5. Blur due to camera mis focus
6. Relative object-camera motion
7. Random atmospheric turbulence
8. Others
DEGRADATION MODEL:
Degradation process operates on a degradation function that operates on
Digital Image Processing SCSVMV Dept of ECE
Page | 43
an input image with an additive noise term. Input image is represented byusing the
notation f(x,y),
noise term can be represented as η(x,y).These twoterms when combined gives the
result as g(x,y). If we are given g(x,y), some knowledge about the degradation
function H or J and some knowledge about the additive noise teem η(x,y), the
objective of restorationis to obtain an estimate f'(x,y) of the
original image. We want the estimate to be as close as possible to the original
image. The more we
know about hand η , the closer f(x,y) will be to f'(x,y). If it is a linear
position invariantprocess, then
degraded image is given in the spatial domain by
g(x,y)=f(x,y)*h(x,y)+η(x,y)
where h(x,y) isspatial representation of degradation function and symbol *
represents convolution. In frequency domain we may write this equation
asG(u,v)=F(u,v)H(u,v)+N(u,v)
The terms in the capital letters are the Fourier Transform of thecorresponding
terms in the spatial
domain.
Fig: A model of the image Degradation / Restoration ProcessNOISE MODELS:
The principal source of noise in digital images arises during image acquisition and
/or transmission.
The performance of imaging sensors is affected by a variety of factors, such as
environmental
conditions during image acquisition and by the quality of the sensing elements
themselves. Images
are corrupted during transmission principally due to interference in the channels
used for
transmission. Since main sources of noise presented in digital images are resulted
from atmospheric
disturbance andimage sensor circuitry, following assumptions can be made i.e. the
noise model is
spatial invariant (independent of spatial location). The noisemodel is uncorrelated
with the object
function.
Gaussian Noise:
Digital Image Processing SCSVMV Dept of ECE
Page | 43
These noise models are used frequently in practices because of its tractability in
both spatial and
frequency domain. The PDF of Gaussian random variable is as follows, where z
represents the gray
level, μ= mean of average value 0
Rayleigh Noise:
Unlike Gaussian distribution, the Rayleigh distribution is no symmetric. It is
given by the formula.
The mean and variance of this density is
Gamma Noise:
The PDF of Erlang noise is given by
The mean and variance of this density are given by
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Its shape is similar to Rayleigh disruption. This equation is referred to as gamma
density it is correct
only when the denominator is the gamma function.
Exponential Noise:
Exponential distribution has an exponential shape. The PDF of exponential noise is
given as
Where a>0. The mean and variance of this density are given by
Uniform Noise:
The PDF of uniform noise is given by
Digital Image Processing SCSVMV Dept of ECE
Page | 43
The mean and variance of this noise is
Impulse (Salt &Pepper) Noise:
In this case, the noise is signal dependent, and is multiplied to the image.The PDF
of bipolar
(impulse) noise is given by
If b>a, gray level b will appear as a light dot in image. Level a will appearlike a
dark dot.
Restoration in the Presence of Noise Only- Spatial Filtering:When the only
degradation present in an image is noise, i.e.
g(x,y)=f(x,y)+η(x,y)or
G(u,v)= F(u,v)+ N(u,v)
The noise terms are unknown so subtracting them from g(x,y) or G(u,v) is not a
realistic approach. In
the case of periodic noise it is possible to estimate N(u,v) from the spectrum
G(u,v).So, N(u,v)
can be subtracted
Digital Image Processing SCSVMV Dept of ECE
Page | 43
from G(u,v) to obtain an estimate of original image. Spatial filtering can bedone
when only additive
noise is present.
The following techniques can be used to reduce the noise effect:Mean Filter:
(a) Arithmetic Mean filter:
It is the simplest mean filter. Let Sxy represents the set of coordinates in the
sub image of size m*n
centered at point (x,y). The arithmetic mean filter computes the average value of
the corrupted image
g(x,y) in the areadefined by Sxy. The value of the restored image f at any point
(x,y) is the arithmetic
mean computed using the pixels in the region defined by Sxy.
This operation can be using a convolution mask in which all coefficients have value
1/mm A mean
filter soothes local variations in image Noise is reduced as a result of blurring.
For every pixel in
the image, the pixel value is replaced by the mean value of its neighboring pixels
with a weight.
This will be resulted in a smoothing effect in the image.
Geometric Mean Filter:
An image restored using a geometric mean filter is given by the expression
Here, each restored pixel is given by the product of the pixel in the subimage
window, raised to the power 1/mm. A geometric mean filter but it to loosen image
details in the process.
Harmonic Mean Filter:
The harmonic mean filtering operation is given by the expression
The harmonic mean filter works well for salt noise but fails for pepper noise. It
does well with
Gaussian noise also.
Order Statistics Filter:
Order statistics filters are spatial filters whose response is based on ordering
the pixel contained
in the image area encompassed by the filter.
The response of the filter at any point is determined by the ranking result.
Median filter:
It is the best order statistic filter; it replaces the value of a pixel by
themedian of gray levels in the
Neighborhood of the pixel.The original of thepixel is included in the computation
of the median
of the filter are quite possible because for certain types of random noise, the
provide excellent noise
reduction capabilities with considerably less blurring then smoothing filters of
similar size. These
are effective for bipolar and unipolar impulse noise.
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Max and Min Filter:
Using the l00th percentile of ranked set of numbers is called the max filter and is
given by the
equation.
It is used for finding the brightest point in an image. Pepper noise in the image
has very low values,
it is reduced by max filter using the maxselection process in the sublimated area
sky. The 0th
percentile filter is
min filter.
This filter is useful for flinging the darkest point in image. Also, it reducessalt
noise of the min
operation.
Midpoint Filter:
The midpoint filter simply computes the midpoint between the maximum and minimum
values in the
area encompassed by its comeliness the order statistics and averaging. This filter
works best for
randomly distributed noise like Gaussian or uniform noise.
Periodic Noise by Frequency Domain Filtering:
These types of filters are used for this purpose.
Band Reject Filters:It removes a band of frequencies about the origin of theFourier
transformer.
Ideal Band reject Filter: An ideal band reject filter is given by the
Digital Image Processing SCSVMV Dept of ECE
Page | 43
expression
D(u,v)- the distance from the origin of the centered frequency rectangle, W-the
width of the band and Dothe radial center of the frequency rectangle.
Butterworth Band Reject Filter:
Gaussian Band Reject Filter:
These filters are mostly used when the location of noise component in the frequency
domain is
known. Sinusoidal noise can be easily removed by using these kinds of filters
because it shows two
impulses that are mirror images of each other about the origin. Of the frequency
transform.
Band Pass Filter:
The function of a band pass filter is opposite to that of a band reject filterIt
allows a specific
frequency band of the image to be passed and blocks therest of frequencies. The
transfer function of a
band pass filter can be obtained from a corresponding band reject filter with
transfer function
Hbr(u,v) by using the equation
These filters cannot be applied directly on an image because it may remove too much
details of an
image but these are effective in isolating the effectof an image of selected
frequency bands.
Digital Image Processing SCSVMV Dept of ECE
Page | 43
Notch Filters:
A notch filter rejects (or passes) frequencies in predefined neighborhoods
about a center frequency.Due to the symmetry of the Fourier transform notch filters
must appear in
symmetric pairs about the origin.The transferfunction of an ideal notch reject
filter of radius D0 with
centers a (u0,v0) and by symmetry at (-u0 , v0) is
Ideal, Butterworth, Gaussian Notch Filters:
Inverse Filtering:
The simplest approach to restoration is direct inverse filtering where wecomplete
anestimate
of the transform of the original image simply by dividing the transform of the
degraded image
G(u,v) by degradation
Page | 55
function H(u,v)
We know that
Therefore
From the above equation we observe that we cannot recover the undegraded image
exactly because
N(u,v) is a random function whose Fourier transform is not known. One approach to
get around the
zero or small-value problem is to limit the filter frequencies to values near the
origin.We know that
H(0,0) is equal to the average values of h(x,y). By Limiting the analysis to
frequencies near the
origin we reduse the probability of encountering zero values.
Minimum Mean Square Error (Wiener) Filtering:
The inverse filtering approach has poor performance. The wiener filtering approach
uses the
degradation function and statistical characteristics of noise into the restoration
process.The objective
is to find an estimate ofthe uncorrupted image f such that the mean square error
between them is
minimized.
The error measure is given by
Where E{.} is the expected value of the argument.
We assume that the noise and the image are uncorrelated one or the otherhas zero
mean.The gray
levels in the estimate are a linear function of the levels in the degraded image.
Here, H(u,v)= degradation function, H*(u,v)=complex conjugate
of H(u,v)
| H(u,v)|2=H* (u,v) H(u,v)
Sn(u,v)=|N(u,v)|2= power spectrum of the noise Sf(u,v)=|F(u,v)|2= power
spectrum of the underrated image
The power spectrum of the undegraded image is rarely known. An approach used
frequently
when these quantities are not known or cannotbe estimated then the expression used
is
Where K is a specified constant. Constrained Least Squares Filtering:
Page | 55
The wiener filter has a disadvantage that we need to know the power spectra of the
undegraded
image and noise. The constrained least square filtering requires only the knowledge
of only the mean
and variance of thenoise. These parameters usually can be calculated from a given
degraded image
this is the advantage with this method. This method produces an optimal result.
This method requires
the optimal criteria which is important we express thein vector-matrix form
The optimality criteria for restoration is based on a measure of smoothness, such
as the second
derivative of an image (Laplacian).
The minimum of a criterion function C defined as
Subject to the constraint
Where is a Euclidean vector norm is estimate of theundegraded image, is
Laplacian operator.
The frequency domain solution to this optimization problem is given by
Where γ is a parameter that must be adjusted so that the constraint is
satisfied. P(u,v) is the Fourier transform of the Laplacian operator
IMAGE SEGMENTATION:
Edge Detection:
Edge detection is a fundamental tool in image processing and computer vision,
particularly in the
areas of feature detection and feature extraction,which aim at identifying points
in a digital image at
which the imagebrightness changes sharply or more formally has discontinuities.
Motivation: Canny edge detection applied to a photograph
The purpose of detecting sharp changes in image brightness is to capture important
events and
changes in properties of the world. It can be shown that under rather general
assumptions for an
image formation model,discontinuities in image brightness are likely to correspond
to:
1. Discontinuities in depth,
2. Discontinuities in surface orientation,
3. Changes in material properties and
4. Variations in scene illumination
In the ideal case, the result of applying an edge detector to an image may lead to
a set of connected
curves that indicate the boundaries of objects, theboundaries of surface markings
as well as curves
that correspond to discontinuities in surface orientation. Thus, applying an edge
detection algorithm
Page | 55
to an image may significantly reduce the amount of data to be processed and may
therefore filter out
information that may be regarded as less relevant, while preserving the important
structural properties
of animage.
If the edge detection step is successful, the subsequent task of interpreting the
information
contents in the original image may therefore besubstantially simplified. However,
it is not always
possible to obtain suchideal edges from real life images of moderate complexity.
Edges extracted
from non-trivial images are often hampered by fragmentation, meaning that the edge
curves are
not connected, missing edge segments as well as false edges notcorresponding to
interesting
phenomena in the image – thus complicating the subsequent task of interpreting the
image data.
Edge detection is one of the fundamental steps in image processing, image analysis,
image pattern
recognition, and computer vision techniques. During recent years, however,
substantial (and
successful) research has also been made on computer vision methods that do not
explicitly rely on
edge detection as a pre-processing step.
Edge Properties:
The edges extracted from a two-dimensional image of a three-dimensionalscene can be
classified as
either viewpoint dependent or viewpoint independent. A viewpoint independent edge
typically
reflects inherent properties of the three-dimensional objects, such as surface
markings and surface
shape. A viewpoint dependent edge may change as the viewpoint changes, and
typically reflects the
geometry of the scene, such as objects occluding one another.
A typical edge might for instance be the border between a block of red colorand a
block of yellow. In
contrast a line(as can be extracted by a ridge detector) can be a small number of
pixels of a different
color on an otherwise unchanging background. For a line, there may therefore
usuallybe one edge on
each side of the line.
A Simple Edge Model:
Although certain literature has considered the detection of ideal step edges, the
edges obtained from
natural images are usually not at all ideal step edges. Instead they are normally
affected by one or
several of the following effects:
1. Focal blur caused by a finite depth-of-field and finite point spreadfunction.
2. Penumbral blur caused by shadows created by light sources of non-zero radius.
3. Shading at a smooth object
A number of researchers have used a Gaussian smoothed step edge (an error function)
as the simplest
extension of the ideal step edge model for modeling the effects of edge blur in
practical applications.
Thus, a one- dimensional image f which has exactly one
edge placed at x = 0 may be modeled as:
At the left side of the edge, the intensity is , and right of theedge it is . The
scale parameter σ is called the blur scale of the edge.
Why edge detection is a non-trivial task?
To illustrate why edge detection is not a trivial task, let us consider the problem
of detecting edges in
the following one-dimensional signal. Here, we may intuitively say that there
should be an
edge between the 4th and5th pixels.
Page | 55
If the intensity difference were smaller between the 4th and the 5th pixels and if
the intensity
differences between the adjacent neighboring pixels were higher, it would not be as
easy to say
that there should be an edge inthe corresponding region. Moreover, one could argue
that this
case is onein which there are several edges.
5 7 6 41113 148 149
Hence, to firmly state a specific threshold on how large the intensity change
between two
neighboring pixels must be for us to say that there should be an edge between these
pixels is not
always simple. Indeed, this is one of the reasons why edge detection may be a non-
trivial problem
unless the objects in the scene are particularly simple and the
illuminationconditions can be well
controlled (see for example, the edges extracted from the image with the girl
above).
Approaches to Edge Detection:
There are many methods for edge detection, but most of them can be grouped into two
categories,
search-based and zero-crossing based. The search-based methods detect edges by
first computing a
measure of edge strength, usually a first-order derivative expression such as the
gradient magnitude,
and then searching for local directional maxima of the gradient magnitude using a
computed estimate
of the local orientation of the edge, usually the gradient direction. The zero-
crossing based methods
search for zero crossings in a second-order derivative expression computed from the
image in order
to find edges, usually the zero-crossings of the Laplacian orthe zero-crossings of
a non-linear
differential expression.
As a pre-processing step to edge detection, a smoothing stage, typically Gaussian
smoothing, is
almost always applied (see also noise reduction).
The edge detection methods that have been published mainly differ in the
types of smoothing filters that are applied and the way the measures of edge
strength are computed.
As many edge detection methods rely on the computation of image gradients, they
also differ in the
types of filters usedfor computing gradient estimates in the x- and y-directions.
Canny Edge Detection:
John Canny considered the mathematical problem of deriving an optimal smoothing
filter given the
criteria of detection, localization and minimizingmultiple responses to a single
edge. He showed that
the optimal filter given these assumptions is a sum of four exponential terms. He
also showed that
this filter can be well approximated by first-order derivativesof Gaussians. Canny
also introduced the
notion of non-maximum suppression, which meansthat given the pre-smoothing filters,
edge pointsare
defined as points where the gradient magnitude assumes a local maximum in the
gradient direction.
Looking for the zero crossing of the 2nd derivative along the gradient direction
was first proposed by
Haralick. It took less than two decades to find a modern geometric variational
meaning for that
operator that links it to the Marr-Hildreth (zero crossing of the Laplacian) edge
detector. That
observation was presented by Ron Kimmel and Alfred Bruckstein.
Although his work was done in the early days of computer vision, the Canny edge
detector (including
its variations) is still a state-of-the-art edgedetector. Unless the preconditions
are particularly suitable,
it is hard to find an edge detector that performs significantly better than the
Canny edge detector.
The Canny-Deriche detector was derived from similar mathematical criteria as the
Canny edge
detector, although starting from a discreteviewpoint and then leading to a set of
recursive filters for
image smoothing instead of exponential filters or Gaussian filters.The differential
edge detector described below can be seen as a reformulation of Canny's methodfrom
the viewpoint of differential
invariants computed from a scale-space representation leading to a number of
advantages in terms of
both theoretical analysis and sub-pixel implementation.
Other First-Order Methods:
For estimating image gradients from the input image or a smoothed version of it,
different
gradient operators can be applied. The simplest approach is to use central
differences:
Page | 55
Page | 55
corresponding to the application of the following filter masksto the imagedata:
The well-known and earlier Sobel operator is
based on the following filters:
Given such estimates of first- order derivatives, the gradient magnitude isthen
computed as:
while the gradient orientation can be estimated as
Other first-order difference operators for estimating image gradient havebeen
proposed in the
Prewitt operator and Roberts cross.
Thresholding and Linking:
Once we have computed a measure of edge strength (typically the gradientmagnitude),
the next stage
is to apply a threshold, to decide whether edgesare present or not at an image
point. The lower the
threshold, the more edges will be detected, and the result will be increasingly
susceptible to noise
and detecting edges of irrelevant features in the image. Conversely ahigh threshold
may miss subtle
edges, or result in fragmented edges.
If the edge thresholding is applied to just the gradient magnitude image, the
resulting edges will
in general be thick and some type of edge thinning post-processing is necessary.
For edges detected
with non-maximum suppression however, the edge curves are thin by definition and
the edge pixels
can be linked into edge polygon by an edge linking (edge tracking) procedure. On a
discrete grid, the
non-maximum suppression stage can be implemented by estimating the gradient
direction using firstorder derivatives, then rounding off the gradient direction to
multiples of 45 degrees, and finally
comparing the values of the gradient magnitude in the
Page | 55
estimated gradient direction.
A commonly used approach to handle the problem of appropriate thresholds for
thresholding is by
using thresholding with hysteresis. This method uses multiple thresholds to find
edges. We begin by
using the upper threshold to find the start of anedge. Once we have a start point,
wethen trace the
path of the edge through the image pixel by pixel, marking an edge whenever we are
above the lower
threshold. We stop marking our edge only when the value falls below our lower
threshold. This
approach makes the assumption that edges are likely to be in continuous curves, and
allows us to
follow a faint section of an edge we have previously seen, without meaning that
every noisy pixel in
the image is marked down as anedge. Still, however, we have the problem of choosing
appropriate
thresholding parameters, and suitable thresholding values may vary over the image.
Edge Thinning:
Edge thinning is a technique used to remove the unwanted spurious points on the
edge of an
image. This technique is employed after the image has been filtered for noise
(using median,
Gaussian filter etc.), the edge operator has been applied (like the ones described
above) to detect
the edges and after the edges have been smoothed using an appropriate threshold
value. This
removes all the unwanted points and if applied carefully, results in one-pixel
thick edge elements.
Advantages:
1) Sharp and thin edges lead to greater efficiency in object recognition.
2) If you are using Hough transforms to detect lines and ellipses thenthinning
could give
much better results.
3) If the edge happens to be boundary of a region then, thinning couldeasily give
the image
parameters like perimeter without much algebra.
There are many popular algorithms used to do this, one such is describedbelow:
1. Choose a type of connectivity, like 8, 6 or 4.
2. 8 connectivity is preferred, where all the immediate pixels
surrounding a particular pixel are considered.
3. Remove points from North, south, east and west.
4. Do this in multiple passes, i.e. after the north pass, use the samesemi
processed image
in the other passes and so on.
Remove a point if:
1. The point has no neighbors in the North (if you are in the north pass,and
respective
directions for other passes.)
2. The point is not the end of a line. The point is isolated.
3. Removing the points will not cause to disconnect its neighbors in anyway.
4. Else keep the point. The number of passes across direction should bechosen
according to
the level of accuracy desired.
Second-Order approaches to Edge Detection:
Some edge-detection operators are instead based upon second-order derivatives of
the intensity. This
essentially captures the rate of change in the intensity gradient. Thus, in the
ideal continuous case,
detection of zero-crossings in the second derivative captures local maxima in the
gradient.
The early Marr-Hildreth operator is based on the detection of zero- crossings of
the Laplacian
operator applied to a Gaussian-smoothed image.It can be shown, however, that this
operator will also
return false edges corresponding to local minima of the gradient magnitude.
Moreover, this operator
Page | 55
will give poor localization at curved edges. Hence, this operator istoday mainly of
historical interest.
Differential Edge Detection:
A more refined second-order edge detection approach which automatically detects
edges with subpixel accuracy, uses the following differential approach of detecting
zero- crossings of the secondorder directional derivative in the gradient
direction. Following the differential geometric way of
expressing the requirement of non-maximum suppression proposed by Lindeberg, let us
introduce at
every image point a local coordinate system (u,v), with the v-direction parallel to
the gradient
direction.
Assuming that the image has been pre-smoothed by Gaussian smoothing and a scale-
space
representation L(x,y;t) at scale t has been computed, wecan require that the
gradient magnitude of
the scale-space representation, which is equal to the first-order directional
derivative in the vdirection Lv, should have its first order directional derivative
in the v- direction equal to zero
while the second-order directional derivative in the v-direction of Lv should be
negative, i.e.,
Written out as an explicit expression in terms of local partial derivatives Lx,
Ly ... Lyyy, this edge
definition can be expressed as the zero-crossing curves of the differential
invariant
that satisfy a sign-condition on the following differential invariant
where Lx, Ly ... Lyyy denote partial derivatives computed from a scale-
spacerepresentation L
obtained by smoothing the original image with a Gaussian kernel. In this way, the
edges will be
automatically obtained as continuous curves with subpixel accuracy. Hysteresis
thresholding can also
be applied to these differential and subpixel edge segments.
In practice, first-order derivative approximations can be computed by central
differences as
described above, while second-order derivatives can be computed from the scale-
space
representation L according to:
corresponding to the following filter masks:
Higher-order derivatives for the third-order sign condition can be obtained in an
analogous fashion.
Phase Congruency-based Edge Detection:
A recent development in edge detection techniques takes a frequency domain approach
to finding edge locations. Phase congruency (also knownas phase coherence) methods
attempt to find locations
in an image where all sinusoids in the frequency domain are in phase. These
locations will generally
correspond to the location of a perceived edge, regardless of whether the edge is
represented by a
large change in intensity in the spatial domain. A key benefit of this technique is
that it responds
strongly to Mach bands, and avoids false positives typically found around
roofedges. A roof edge, is
a discontinuity in the first order derivative of a grey-level profile.
Thresholding:
Page | 55
Thresholding is the simplest method of image segmentation. From a grayscale image,
thresholding
can be used to create binary images. During the thresholding process, individual
pixels in an image
are markedas object pixels if their value is greater than some threshold value
(assuming an object to
be brighter than the background) and as
―background pixels otherwise. This convention is known as Threshold Above. Variants
include
threshold below, which is opposite of threshold above; threshold inside, where a
pixel is labeled
"object" if its value is between two thresholds; and threshold outside, which is
the opposite of
threshold inside (Shapiro, et al. 2001:83). Typically, an object pixel is givena
value of ―1ǁ while
a background pixel is given a value of ―0. ǁ Finally, a binary image is created by
coloring each
pixel white or black, depending ona pixel's labels.
Threshold Selection:
The key parameter in the thresholding process is the choice of the threshold value
(or values, as
mentioned earlier). Several different methods for choosing a threshold exist; users
can manually
choose a threshold value, or a thresholding algorithm can compute a value
automatically, which is
known as automatic thresholding (Shapiro, et al. 2001:83). A simple method would be
to choose the
mean or median value,the rationale being that if the object pixels are brighter
than the background,
they should also be brighter than the average.
In a noiseless image with uniform background and object values, the meanor median
will work well
as the threshold, however, this will generally notbe the case. A more sophisticated
approach might be
to create a histogramof the image pixel intensities and use the valley point as the
threshold. The
histogram approach assumes that there is some average value for the background and
object pixels,
but that the actual pixel values have some variation around these average values.
However, this may
be computationally expensive, and image histograms may not have clearly defined
valley points,
often making the selection of an accurate threshold difficult.
One method that is relatively simple, does not require much specific knowledge of
the image, and is
robust against image noise, is the followingiterative method:
 An initial threshold (T) is chosen, this can be done randomly or
Page | 55
according to any other method desired.
 The image is segmented into object and background pixels as described above,
creating two
sets:
G1 = {f(m,n):f(m,n)>T} (object pixels) G2 =
{f(m,n):f(m,n) T} (background pixels)
Note: f(m,n) is the value of the pixel located in the mth column, nth rowThe
average of each
set is computed.
m1 = average value of G1m2 =
average value of G2
A new threshold is created that is the average of m1 and m2
T = (m1 + m2)/2
Go back to step two, now using the new threshold computed in step four, keep
repeating until the
new threshold matches the one before it (i.e. until convergence has been
reached).This iterative
algorithm is a special one- dimensional case of the k-means clustering algorithm,
which has been
proven to converge at a local minimum—meaning that a different initial threshold
may give a
different final result.
Adaptive Thresholding:
Thresholding is called adaptive thresholding when a different threshold isused for
different regions in
the image. This may also be known as local ordynamic thresholding (Shapiro, et al.
2001:89).Mples
are clustered in two parts as background and foreground (object), or alternately
are modeled as a
mixture of two Gaussiansentropy-based methods result in algorithms that use the
entropy of the
foreground and background regions, the cross- entropy between the original and
binarized image,
etc.object attribute- based methods search a measure of similarity between the
gray- level andthe
binarized images, such as fuzzy shape similarity, edge coincidence, etc.Spatial
methods [that] use
higher-order probability distribution and/or correlation between pixels. Local
methods adapt the
threshold value on each pixel to the local image characteristics.
Multiband Thresholding:
Colour images can also be thresholders. One approach is to designate a separate
threshold for each of
the RGB components of the image and then combine them with an AND operation. This
reflects the
way the camera works and how the data is stored in the computer, but it does not
correspond to the
way that people recognize color. Therefore, the HSL andHSV color models are more
often used. It is
also possible to use the CMYK color model (Pham et al., 2007).
Region Growing:
Region growing is a simple region-based image segmentation method. It isalso
classified as a pixelbased image segmentation method since it involves the
selection of initial seed points.This approach
to segmentation examines neighboring pixels of initial ―seed pointsǁ and determines
whether the
pixel neighbors should be added to the region. The process is iterated on, in the
same manner as
general data clustering algorithms.
Region-based Segmentation:
The main goal of segmentation is to partition an image into regions. Some
segmentation methods such as "Thresholding" achieve this goal by lookingfor the
boundaries
between regions based on discontinuities in gray levels or color properties.
Regionbasedsegmentation is a technique for determining the region directly. The
basic formulation for
Region-Based Segmentation is:
Page | 55
(b) Ri is a connected region, i = 1, 2, ...,n
(d)P(Ri) = TRUE for i = 1,2,...,n.
Region-based Segmentation:
The main goal of segmentation is to partition an image into regions. Some
segmentation methods
such as "Thresholding" achieve this goal by looking for the boundaries between
regions based on
discontinuities in gray levels or color properties. Region-basedsegmentation is a
technique for
determining the region directly. The basic formulation for Region-Based
Segmentation is:
(b)Ri is a connected region, i = 1, 2, ...,n(d)P(Ri) = TRUE
for i = 1,2,...,n.
P(Ri) is a logical predicate defined over the points in set P(Rk) and is the null
set, means that the
segmentation must be complete; that is, every pixel must be in a region requires
that points in a region
must be connectedin some predefined sense indicates that the regions must be dis
joint deals
with the properties that must be satisfied by the pixels in a segmented region.
For example P(Ri) = TRUE if all pixels in Ri have the same gray level indicates
that region Ri and
Rj are different in the sense of predicate P.
Basic Concept of Seed Points:
The first step in region growing is to select a set of seed points. Seed point
selection is based on some
user criterion (for example, pixels in a certain gray-level range, pixels evenly
spaced on a grid, etc.).
The initial region begins as the exact location of these seeds.The regions are then
grown fromthese
seed points to adjacent points depending on a region membership criterion. The
criterion could be,
for example, pixel intensity, gray level texture, or color. Since the regions are
grown on the basis of
the criterion, the image information itself is important. For example, if the
criterion were a pixel
intensity threshold value, knowledge of the histogram of the image would be of use,
as one could use
it to determine a suitable threshold value for the region membership criterion.
There is a very simple example followed below. Here we use 4-connected neighborhood
to grow
from the seed points. We can also choose 8-connected neighborhood for our pixel’s
adjacent
relationship. And the criteria we make here is the same pixel value. That is, we
keep examining the
adjacent pixels of seed points. If they have the same intensity value with the seed
points, we classify
them into the seed points. It is an iterated process until there are no change in
two successive iterative
stages. Of course, we can make other criteria, but the main goal is to classify the
similarity of the
image into regions.
Page | 55
Some Important Issues:
Then we can conclude several important issues about region growing. Thesuitable
selection of seed
points is important. The selection of seed points isdepending on the users. For
example, in a graylevel lightning image, we may want to segment the lightning from
the background. Then probably,
we can examine the histogram and choose the seed points from the highest range of
it.More
information of the image is better.Obviously, theconnectivity or pixel adjacent
information is helpful
for usto determine thethreshold and seed points.The value, “minimum area
threshold”, no region in
region growing method result will be smaller than this threshold in the segmented
image.The value,
“Similarity threshold value”, if the difference of pixel-value or the difference
value of average gray
level of a set of pixelsless than ―Similarity threshold valueǁ, the regions will be
considered as a
same region.
The criteria of similarities or so-called homogeneity we choose are also important.
It usually depends
on the original image and the segmentation result we want.Here are some criteria we
often useGray
level(average intensity or variance), color, and texture or shape.
Simulation Examples:
Here we show a simple example for region growing.Figure. 1 is the originalimage
which is a grayscale lightning image. The gray-scale value of this image is from 0
to 255. The purpose we apply
region growing on this imageis that we want to mark the strongest lightning part of
the image and we
Figures 1, 2 3 and 4
Fig. 4
also want the result is connected without being split apart. Therefore, we choose
the points having
the highest gray-scale value which is 255 as the seed points showed in the Figure.
2.
After determining the seed points, we have to determine the range of threshold.
Always keeps in
mind that the purpose we want to do is to markthe strongest light in the image. The
third figure is the
region growing result from choosing the threshold between 225 and the value of seed
points (which
is 255). It means we only want to mark out the points whosegray-scale values are
above 225.If we
make the range of threshold wider, we will get a result having a bigger area of the
lightning region
Page | 55
show as the Figure. 3 and the Figure. 4.
We can observe the difference between the last two figures which have different
threshold value
showed above. Region growing provides the ability for us to separate the part we
want connected.As
we can see in Figure. 3 to Figure. 5, the segmented result in this example are
seed- oriented
connected. That means the result grew from the same seed pointsare the same
regions. And the points
will not be grown without connected with the seed points in the beginning.
Therefore, we can
mention that there are still lots of points in the original image having the gray-
scale value above 155
which are not marked in Figure. 5. This characteristic ensures the reliability of
the segmentation and
provides the ability to resist noise. For this example, this characteristic
prevents us marking out the
non-lightning part in the image because the lightning is always connected as one
part.The advantages
and disadvantages of region growing. We briefly conclude the advantages and
disadvantages of
region growing.
Advantages
1. Region growing methods can correctly separate the regions that have
the same properties we define.
2. Region growing methods can provide the original images which haveclear edges the
good
segmentation results.
3. The concept is simple. We only need a small numbers of seed point torepresent
the
property we want, then grow the region.
4. We can determine the seed points and the criteria we want to make.
5. We can choose the multiple criteria at the same time.
6. It performs well with respect to noise.
Disadvantages
1. The computation is consuming, no matter the time or power.
2. Noise or variation of intensity may result in holes or over
segmentation.
3. This method may not distinguish the shading of the real images.
We can conquer the noise problem easily by using some mask to filter the holes or
outlier. Therefore,
the problem or noise actually does not exist. Inconclusion, it is obvious that the
most serious problem
of region growing isthe power and time consuming.
Unit IV
IMAGE COMPRESSION
LEARNING OBJECTIVES:
The rapid growth of digital imaging applications, including desktop publishing,
multimedia,
teleconferencing and high –definition television has increased the need for
effective and standard
image –compression techniques. After completing this unit, the reader is expected
to befamiliar
with the following concepts:
1. Need for image compression
2. Lossless and lossy image compression
3. Spatial domain and frequency domain image compression
4. Wavelet based compression
5. Image compression
The term data compression refers to the process of reducing the amount of data
required to
represent a given quantity of information. A clear distinction must be made between
data and
information. They are not synonymous. In fact, data are the means by which
information is
conveyed. Various amounts of data may be used to represent the same amount of
information. Such
might be the case, for example, if a long- winded individual and someone who is
short and to the
point where to relate the same story. Here, the information of interest is the
story; wordsare the data
used to relate the information. If the two individuals use a different number of
words to tell the
same basic story, two different versions of the story are created, and at least one
includes
nonessential data. That is, it contains data (or words) that either provide no
relevant information or
simply restate that which is already known. It is thus said to contain data
redundancy.
Data redundancy is a central issue in digital image compression. It is not
an abstract concept but a mathematically quantifiable entity. If n1 and n2 denote
the number of
information-carrying units in two data sets that represent the same information,
the relative data
redundancy RD of the first data set (the one characterized by n1) can be defined as
where CR, commonly called the compression ratio, is
Page | 55
For the case n2 = n1, CR = 1 and RD = 0, indicating that (relative to the second
data set) the first
representation of the information contains no redundant data. When n2<< n1, CRtends
to ∞ and RD
tends to 1, implying significant compression and highly redundant data. Finally,
when n2>>n1,
CR tends to 0 and RDtends to ∞, indicating that the second data set contains much
more data than
the original representation. This, of course, is the normally undesirable case of
data expansion. In
general, CRand RD lie in the open intervals (0,∞) and (-∞, 1), respectively. A
practical
compression ratio, such as 10 (or 10:1), means that the first dataset has 10
information carrying units
(say, bits) for every 1 unit in the second or compressed data set. The
corresponding redundancy of
0.9 implies that 90% of the data in the first data set is redundant.
In digital image compression, three basic data redundancies can be identified and
exploited: coding
redundancy, interpixel redundancy, and psychovisual redundancy. Data compression is
achieved
when one or more of these redundancies are reduced or eliminated.
CODING REDUNDANCY:
In this, we utilize formulation to show how the gray-level histogram of animage
also can provide a
great deal of insight into the construction of codes to reduce the amount of data
used to represent
it.Let us assume, once again, that a discrete random variable rk in the interval
[0, 1] represents the
gray levels of an image and that each rk occurs with
probability pr (rk).
where L is the number of gray levels, nk is the number of times that the kth gray
level
appears in the image, and n is the total number of pixels in the image. If the
number of bits used
to representeach value of rk is l (rk), then the average number of bits required to
represent each
pixel is
That is, the average length of the code words assigned to the various gray-level
values is found by
summing the product of the number of bits
used to represent each gray level and the probability that the gray level occurs.
Thus, the total
number of bits required to code an M X N image is MNLavg.
INTERPIXEL REDUNDANCY:
Consider the images shown in Figs. 1.1(a) and (b). As Figs. 1.1(c) and (d) show,
these images have
virtually identical histograms. Note also that both histograms are trimodal,
indicating the presence
of three dominant ranges of gray-level values. Because the gray levels in these
images are not
equally probable, variable-length coding can be used to reduce the coding
redundancy that would
result from a straight or natural binary encoding of their pixels. The coding
process, however, would
not alter thelevel of correlation between the pixels within the images. In other
words,the codes used
to represent the gray levels of each image have nothing todo with the correlation
between pixels.
These correlations result from the structural or geometric relationships between
the objects in the
image.
Page | 55
Fig. 5.1 Two images and their gray-level histograms and normalizedautocorrelation
coefficients
along one line.
Figures 5.1(e) and (f) show the respective autocorrelation coefficients computed
along one line of
each image.
Page | 55
where
The scaling factor in Eq. above accounts for the varying number of sum terms that
arise for each
integer value of n. Of course, n must be strictly less than N, the number of pixels
on a line. The
variable x is the coordinate of the line used in the computation. Note the dramatic
difference between the shape of the functions shown in Figs. 1.1(e) and (f).Their
shapes can be qualitatively
related to the structure in the images in Figs. 1.1(a) and (b). This relationship
is particularly
noticeable in Fig. 1.1 (f), where the high correlation between pixels separated by
45 and 90 samples
can be directly related to the spacing between the vertically oriented matches of
Fig. 1.1(b). In
addition, the adjacent pixels of both images are highly correlated. When n is 1, γ
is 0.9922 and
0.9928 for the images of Figs. 1.1 (a) and (b), respectively. These values are
typical of most
properly sampled television images.
These illustrations reflect another important form of data redundancy—one directly
related to the
interpixel correlations within an image. Because the value of any given pixel can
be reasonably
predicted from the value of its neighbors, the information carried by individual
pixels is relatively
small. Much of the visual contribution of a single pixel to an image is redundant;
it could have
been guessed on the basis of the valuesof its neighbors. A variety of names,
including spatial
redundancy, geometric redundancy, and inter frame redundancy, have been coined
torefer to these
inter pixel dependencies. We use the term inter pixel redundancy to encompass them
all.
In order to reduce the inte rpixel redundancies in an image, the 2-D pixel array
normally used for
human viewing and interpretation must be transformed into a more efficient (but
usually
"nonvisual") format. For example, the differences between adjacent pixels can be
used to representan
image. Transformations of this type (that is, those that remove inter pixel
redundancy) are referred
to as mappings. They are called reversible mappings if the original image elements
can be
reconstructed from the transformed data set.
PSYCHOVISUAL REDUNDANCY:
The brightness of a region, as perceived by the eye, depends on factors other than
simply the light
reflected by the region. For example, intensityvariations (Mach bands) can be
perceived in an area
of constant intensity.Such phenomena result from the fact that the eye does not
respond with equal
sensitivity to all visual information. Certain information simply hasless relative
importance than
other information in normal visual processing. This information is said to be
psychovisual
redundant. It can be eliminated without significantly impairing the quality of
image perception.
That psychovisual redundancies exist should not come as a surprise,because human
perception of
the information in an image normally does not involve quantitative analysis of
every pixel value in
the image. In general, an observer searches for distinguishing features such as
edges ortextural regions and mentally combines them into recognizable groupings.The
brain then correlates these
groupings with prior knowledge in orderto complete the image interpretation
process. Psychovisual
redundancy isfundamentally different from the redundancies discussed earlier.
Unlike coding and interpixel redundancy, psychovisual redundancy is associated with
real or quantifiable visual
information. Its elimination is possible only because the information itself is not
essential for
Page | 55
Page | 55
normal visualprocessing. Since the elimination of psychovisual redundant data
results in a loss of
quantitative information, it is commonly referred to as quantization.
This terminology is consistent with normal usage of the word, whichgenerally means
the
mapping of a broad range of input values to a limitednumber of output values. As it
is an
irreversible operation (visualinformation is lost), quantization results in lossy
data compression.
fidelity criterion.The removal of psychovisual redundant data results in aloss of
real or
quantitative visual information. Because information ofinterest may be lost, a
repeatable or
reproducible means of quantifyingthe nature and extent of information loss is
highly desirable.
Two generalclasses of criteria are used as the basis for such an assessment:
Objective fidelity criteria and
SUBJECTIVE FIDELITY CRITERIA:
When the level of information loss can be expressed as a function of the original
or input image
and the compressed and subsequently decompressed output image, it is said to be
based on an
objective fidelity criterion. A good example is the root-mean-square (rms) error
between an input
and output image. Let f(x, y) represent an input image and let f(x,y) denote an
estimate or
approximation of f(x, y) that results from compressing and subsequently
decompressing the input.
For any value of x and y, the error e(x, y) between f (x, y) and f^ (x, y) can be
defined as
so that the total error between the two images is
where the images are of size M X N. The root-mean-square error, erms, between f(x,
y) and f^(x,y)
then is the square root of the squared error averaged over the M X N array, or
A closely related objective fidelity criterion is the mean-square signal-to- noise
ratio of the
compressed-decompressed image. If f^ (x, y) is considered to be the sum of the
original
image f(x,y) and a noise signal e(x, y), the mean-square signal-to-noise ratio of
the output
image, denoted
SNRrms, is
The rms value of the signal-to-noise ratio, denoted SNRrms, is obtained by taking
the square root
of Eq. above.
Although objective fidelity criteria offer a simple and convenient mechanism for
evaluating
information loss, most decompressed images ultimately are viewed by humans.
Consequently,
measuring image quality by the subjective evaluations of a human observer often is
more
appropriate. This can be accomplished by showing a "typical" decompressed image to
an
appropriate cross section of viewers and averaging their evaluations. The
evaluations may be made
using an absolute rating scale or by means of side-by-side comparisons of f(x, y)
and f^(x, y).
IMAGE COMPRESSION MODELS:
Page | 55
Fig. 3.1 shows, a compression system consists of two distinct structural blocks: an
encoder and a
decoder. An input image f(x, y) is fed into the encoder, which creates a set of
symbols from the
input data. After transmission over the channel, the encoded representation is fed
to the decoder,
where a reconstructed output image f^(x, y) is generated. In general, f^(x, y) may
or may not be an
exact replica of f(x, y). If it is, the system is error free or information
preserving; if not, some level
of distortion is present in the reconstructed image. Both the encoder and decoder
shown in Fig. 3.1
consist of two relatively independent functionsor sub blocks. The encoder is made
up of a source
encoder, which removes input redundancies, and a channel encoder, which increases
the noise
immunity of the source encoder's output. As would be expected, the decoder includes
a channel
decoder followed by a source decoder. If the channel between the encoder and
decoder is noise free
(not prone to error), the channel encoder and decoder are omitted, and the general
encoder and
decoder become the source encoder and decoder, respectively. Fig.A general
compression system modelTHE
SOURCE ENCODER AND DECODER:
The source encoder is responsible for reducing or eliminating any coding,
interpixel, or psycho
visual redundancies in the input image. The specific application and associated
fidelity requirements
dictate the best encoding approach to use in any given situation. Normally, the
approach can be
modeled by a series of three independent operations. As Fig. 3.2 (a)shows, each
operation is
designed to reduce one of the three redundancies. Figure 3.2 (b) depicts the
corresponding source decoder. Inthe first stage of the source encoding process, the
mapper transforms theinput data into a
(usually nonvisual) format designed to reduce inter pixelredundancies in the input
image. This
operation generally is reversible
and may or may not reduce directly the amount of data required torepresent the
image.
Fig.(a)Source encoder and (b) source decoder model
Run-length coding is an example of a mapping that directly results indata
compression in this
initial stage of the overall source encoding process. The representation of an
image by a set of
transform coefficientsis an example of the opposite case. Here, the mapper
transforms the image
into an array of coefficients, making its inter pixel redundancies more accessible
for compression in
later stages of the encoding process. The second stage, or quantizer block in Fig.
(a), reduces the
accuracy of the mapper's output in accordance with some pre established fidelity
criterion. This
stage reduces the psycho visual redundancies of theinput image. This operation is
irreversible. Thus,
Page | 55
it must be omitted whenerror-free compression is desired.
In the third and final stage of the source encoding process, the symbol coder
creates a fixed- or
variable-length code to represent the quantizer output and maps the output in
accordance with the
code. The term symbol coder distinguishes this coding operation from the overall
sourceencoding
process. In most cases, a variable-length code is used to represent the mapped and
quantized data
set. It assigns the shortest code words to the most frequently occurring output
values and thus
reduces coding redundancy. The operation, of course, is reversible. Upon completion
of the symbol
coding step, the input image has been processedto remove each of the three
redundancies.
Figure(a) shows the source encoding process as three successiveoperations, but all
three operations
are not necessarily included in every compression system. Recall, for example, that
the quantizer
must be omitted when error-free compression is desired. In addition, some
compression techniques normally are modeled by merging blocks that are physically
separate in Fig(a). In the predictive
compression systems, for instance, the mapper and quantizer are often represented
by a single
block, which simultaneously performs both operations. The source decoder shown in
Fig(b)
contains only two components: a symbol decoder and aninverse mapper. These blocks
perform, in
reverse order, the inverse operations of the source encoder's symbol encoder and
mapper blocks.
Because quantization results in irreversible information loss, an inverse quantizer
block is not
included in the general source decoder model shown in Fig(b).
THE CHANNEL ENCODER AND DECODER:
The channel encoder and decoder play an important role in the overall encoding-
decoding process
when the channel of Fig. 3.1 is noisy or proneto error. They are designed to reduce
the impact
of channel noise by inserting a controlled form of redundancy into the source
encoded data.
As the output of the source encoder contains little redundancy, it wouldbe highly
sensitive to
transmission noise without the addition of this "Controlled Redundancy."
One of the most useful channels encoding techniques was devised by R.
W. Hamming (Hamming [1950]). It is based on appending enough bits to the data being
encoded to
ensure that some minimum number of bits must change between valid code words.
Hamming
showed, for example, that if 3 bits of redundancy are added to a 4-bit word, so
that the distance
between any two valid code words is 3, all single-bit errors can be detected and
corrected. (By
appending additional bits of redundancy, multiple-bit errors can be detected and
corrected.) The 7-
bit Hamming (7, 4) code word h1, h2, h3…., h6, h7 associated with a 4-bit
binary number b3,b2,b1,b0 is
where denotes the exclusive OR operation. Note that bits h1, h2, and h4 are even-
parity bits for
the bit fields b3 b2 b0, b3b1b0, and b2b1b0, respectively. (Recall that a string of
binary bits has
even parity if the number of bits with a value of 1 is even.) To decode a Hamming
encoded result,
the channel decoder must check the encoded value for odd parity over the bit fields
in which even
parity was previously established. A single-bit error is indicated by a nonzero
parity word c4c2c1,
where
If a nonzero value is found, the decoder simply complements the code word bit
position
indicated by the parity word. The decoded binary valueis then extracted from the
corrected code
word ash3h5h6h7.
VARIABLE-LENGTH CODING:
The simplest approach to error-free image compression is to reduce only coding
redundancy.
Coding redundancy normally is present in any natural binary encoding of the gray
levels in an
image. It can be eliminated by coding the gray levels. To do so requires
construction of a variablelength code that assigns the shortest possible code words
to themost probable gray levels. Here, we
examine several optimal and near optimal techniques for constructing such a code.
These
techniques are formulated in the language of information theory. In practice, the
source symbols
may be either the gray levels of an image or the output of a gray-level mapping
operation (pixel
differences, run lengths, and so on).
HUFFMAN CODING:
The most popular technique for removing coding redundancy is due to Huffman
(Huffman [1952]).
When coding the symbols of an information source individually, Huffman coding
yields the
smallest possible number of code symbols per source symbol. In terms of the
noiseless coding
theorem, the resulting code is optimal for a fixed value of n, subject to
theconstraint that the source
symbols be coded one at a time.
The first step in Huffman's approach is to create a series of source reductions by
ordering the
probabilities of the symbols under consideration and combining the lowest
probability symbols into
a singlesymbol that replaces them in the next source reduction. Figure 4.1
illustrates this process for
binary coding (K-ary Huffman codes can also be constructed). At the far left, a
hypothetical set of
source symbols and their probabilities are ordered from top to bottom in terms of
decreasing
probability values. To form the first source reduction, the bottom two
probabilities,0.06 and 0.04,
are combined to form a "compound symbol" with probability 0.1. This compound symbol
and its
associated probabilityare placed in the first source reduction column so that the
probabilities of
Page | 55
Page | 92
the reduced source are also ordered from the most to the least probable. This
process is then repeated until
a reduced source with two symbols (atthe far right) is reached.
The second step in Huffman's procedure is to code each reduced source, starting
with the smallest
source and working back to the original source.The minimal length binary code for a
two-symbol
source, of course, is thesymbols 0 and 1. As Fig. 4.2 shows, these symbols are
assigned to the two
symbols on the right (the assignment is arbitrary; reversing the order of the 0 and
1 would work just
as well). As the reduced source symbol with probability 0.6 was generated by
combining two
symbols in the reduced source to its left, the 0 used to code it is now assigned to
both of these
symbols, and a 0 and 1 are arbitrarily appended to each to distinguish them from
each other. Fig. Huffman source reductions.
Fig. Huffman code assignment procedure.
This operation is then repeated for each reduced source until the original source
is reached. The
final code appears at the far left in Fig.The average length of this code is
and the entropy of the source is 2.14 bits/symbol. The resulting Huffman
code efficiency is 0.973.
Huffman's procedure creates the optimal code for a set of symbols and probabilities
subject to the
constraint that the symbols be coded one at a time. After the code has been
created, coding and/or
decoding is accomplished in a simple lookup table manner. The code itself is an
instantaneous
uniquely decodable block code. It is called a block code because each source symbol
is mapped
into a fixed sequence of code symbols. It is instantaneous, because each code word
in a string of
code symbols can be decoded without referencing succeeding symbols. It is uniquely
decodable,
because any string of code symbols can be decoded in only one way. Thus, any string
of Huffman
encoded symbols can be decoded by examining the individual symbols of the string in
a left
to right manner. For the binary code of Fig. a left-to-right scan of the encoded
string
010100111100 reveals that the first valid code word is 01010, which is the
code for symbol a3. The next valid code is 011, which corresponds to symbol a1.
Continuing in
this manner reveals the completely decoded message to be a3a1a2a2a6.
ARITHMETIC CODING:
Unlike the variable-length codes described previously, arithmetic coding generates
non-block
codes. In arithmetic coding, which can be traced to the work of Elias, a one-to-one
correspondence
between source symbols and code words does not exist. Instead, an entire sequence
of source
symbols (or message) is assigned a single arithmetic code word. The codeword itself
defines an
interval of real numbers between 0 and 1. As the number of symbols in the message
increases, the
interval used to represent it becomes smaller and the number of information units
(say, bits)
required to represent the interval becomes larger. Each symbol of the message
reduces the size of
the interval in accordance with its probability of occurrence. Because the
technique does not require,
as does Huffman's approach, that each source symbol translate into an integral
number of code
symbols (that is, that the symbols be coded one at a time),it achieves (but only in
theory) the bound
established by the noiseless coding theorem.
Fig.Arithmetic Coding Procedure
Figure illustrates the basic arithmetic coding process. Here, a five-symbolsequence
or message,
a1a2a3a4, from a four-symbol source is coded. At thestart of the coding process,
the message is
assumed to occupy the entire half-open interval [0, 1). As Table shows, this
interval is initially
subdivided into four regions based on the probabilities of each source symbol.
Symbol ax, for
example, is associated with subinterval [0, 0.2). Because it is the first symbol of
the message being
coded, the message interval is initially narrowed to [0, 0.2). Thus in Fig. [0,
0.2) is expanded to the
full height of the figure and its end points labeled by the values of the narrowed
range. The
narrowed range is then subdivided in accordance with the original source symbol
probabilities and
the processcontinues with the next message symbol.
Table: Arithmetic coding example
In this manner, symbol a2 narrows the subinterval to [0.04, 0.08), a3 further
narrows it to [0.056,
Page | 92
0.072), and so on. The final message symbol,which must be reserved as a special
end-of-message
indicator, narrows the range to [0.06752, 0.0688). Of course, any number within
this
subinterval—for example, 0.068—can be used to represent the message.
In the arithmetically coded message of Fig. three decimal digits are used to
represent the fivesymbol message. This translates into 3/5 or 0.6 decimal digits
per source symbol and compares
favorably with the entropy of the source, which is 0.58 decimal digits or 10- ary
units/symbol. As
the length of the sequence being coded increases, the resulting arithmetic code
approaches the
bound established by the noiseless coding theorem.
In practice, two factors cause coding performance to fall short of the bound: (1)
the addition of the
end-of-message indicator that is needed to separate one message from an-other; and
(2) the use of
finite precision arithmetic. Practical implementations of arithmetic coding address
the latter
problem by introducing a scaling strategy and a rounding strategy (Langdon and
Rissanen [1981]).
The scaling strategy renormalizes each subinterval to the [0, 1) range before
subdividing it in
accordance with the symbol probabilities. The rounding strategy guarantees that the
truncations
associated with finite precision arithmetic do not prevent the coding subintervals
from being
represented accurately.
LZW CODING:
The technique, called Lempel-Ziv-Welch (LZW) coding, assigns fixed- length code
words to
variable length sequences of source symbols but requires no a priori knowledge of
the probability
of occurrence of the symbols to be encoded. LZW compression has been integrated
into a variety of
mainstream imaging file formats, including the graphic interchange format (GIF),
tagged image file
format (TIFF), and the portable document format (PDF).
LZW coding is conceptually very simple (Welch [1984]). At the onset of the coding
process, a
codebook or "dictionary" containing the source symbols to be coded is constructed.
For 8-bit
monochrome images, the first 256 words of the dictionary are assigned to the gray
values 0, 1, 2...,
and 255. As the encoder sequentially examines the image's pixels, gray- level
sequences that are not
in the dictionary are placed in algorithmically determined (e.g., the next unused)
locations. If the
first two pixels of the image are white, for instance, sequence “255- 255” might be
assigned to
location 256, the address following the locations reserved for gray levels 0
through 255. The next
time that two consecutive white pixels are encountered, code word 256, the address
of the location
containing sequence 255-255, is used to represent them. If a 9-bit, 512- word
dictionary is employed
in the coding process, the original (8 + 8) bitsthat were used to represent the two
pixels are replaced
by a single 9-bit code word. Cleary, the size of the dictionary is an important
system parameter. If
it is too small, the detection of matching gray-level
Page | 92
sequences will be less likely; if it is too large, the size of the code words will
adversely affect
compression performance.
Consider the following 4 x 4, 8-bit image of a vertical edge:
Table:Details the steps involved in coding its 16 pixels.
A 512-word dictionary with the following starting content is assumed:
Locations 256 through 511 are initially unused. The image is encoded by processing
its pixels in a
left-to-right, top-to-bottom manner. Each successive gray-level value is
concatenated with a
variable—column 1 of Table 6.1 —called the "currently recognized sequence." As can
be seen, this
variable is initially null or empty. The dictionary is searched for
eachconcatenated sequence and if
found, as was the case in the first row of the table, is replaced by the newly
concatenated and
recognized (i.e., located in the dictionary) sequence. This was done in column 1 of
row 2.
Table: LZW Coding Example
Page | 92
Page | 92
No output codes are generated, nor is the dictionary altered. If the concatenated
sequence is not
found, however, the address of the currently recognized sequence is output as the
next encoded
value, the concatenated but unrecognized sequence is added to the dictionary, and
the currently
recognized sequence is initialized to the current pixel value.This occurred in row
2 of the table. The
last two columns detail the gray- level sequences that are added to the dictionary
when scanning the
entire 4 x 4 image. Nine additional code words are defined. At the conclusion of
coding, the
dictionary contains 265 code words and the LZW algorithm has successfully
identified several
repeating gray-level sequences— leveraging them to reduce the original 128-bit
image lo 90 bits
(i.e., 10 9- bit codes). The encoded output is obtained by reading the third column
from top to
bottom. The resulting compression ratio is 1.42:1.A unique feature of the LZW
coding just
demonstrated is that the coding dictionaryor code book is created while the data
are being encoded.
Remarkably, an LZW decoder builds an identical decompression dictionary as it
decodes
simultaneously the encoded data stream.
Although not needed in this example, most practical applications require a strategy
for handling
dictionary overflow. A simple solution is to flush or reinitialize the dictionary
when it becomes full
and continue coding with a new initialized dictionary. A more complex option isto
monitor
compression performance and flush the dictionary when it becomes poor or
unacceptable.
Alternately, the least used dictionary entries can betracked and replaced when
necessary.
BIT-PLANE CODING:
An effective technique for reducing an image's inter pixel redundancies is to
process the
image's bit planes individually. The technique, called bit-plane coding, is based
on the concept of
decomposing a multilevel (monochrome or color) image into a series of binary images
and
compressing each binary image via one of several well-known binary compression
methods.
BIT-PLANE DECOMPOSITION:
The gray levels of an m-bit gray-scale image can be represented in the form of the
base 2
polynomial
Based on this property, a simple method of decomposing the image into acollection
of binary
images is to separate the m coefficients of thepolynomial into m 1-bit bit planes.
The zerothorderbit plane is generatedby collecting the a0 bits of each pixel, while
the (m - 1)st -order bit
plane contains the am-1, bits or coefficients. In general, each bit plane
isnumbered from 0 to m-1
and is constructed by setting its pixels equal tothe values of the appropriate bits
or polynomial
coefficients from eachpixel in the original image.
The inherent disadvantage of this approach is that small changes in gray level can
have a significant
impact on the complexity of the bit planes. If a pixel of intensity 127 (01111111)
is adjacent to a
pixel of intensity 128 (10000000), for instance, every bit plane will contain a
corresponding 0 to1
(or 1 to 0) transition. For example, as the most significant bits of the two binary
codes for 127
and 128 are different, bit plane 7 will contain a zero-valued pixel next to a pixel
of value 1, creating
a 0 to 1 (or 1 to 0) transition at that point.
An alternative decomposition approach (which reduces the effect of small gray-level
variations) is
to first represent the image by an m-bit Gray code. The m-bit Gray code gm-1...
g2g1g0 that
corresponds to the polynomialin Eq. above can be computed from
Here, denotes the exclusive OR operation. This code has the unique property that
successive code
words differ in only one-bit position. Thus,small changes in gray level are less
likely to affect all m
bit planes. For instance, when gray levels 127 and 128 are adjacent, only the 7th
bit plane will
contain a 0 to 1 transition, because the Gray codes that correspond to 127 and 128
are 11000000
and 01000000, respectively.
LOSSLESS PREDICTIVE CODING:
The error-free compression approach does not require decomposition of an image into
a collection
of bit planes. The approach, commonly referred to as lossless predictive coding, is
based on
eliminating the inter pixel redundancies of closely spaced pixels by extracting and
coding only
thenew information in each pixel. The new information of a pixel is definedas the
difference
between the actual and predicted value of that pixel.
Figure shows the basic components of a lossless predictive coding system. The
system consists of
an encoder and a decoder, each containing an identical predictor. As eachsuccessive
pixel of the
input image, denoted fn, is introduced to the encoder, the predictor generates the
anticipatedvalue
of that pixel based on some number of past inputs. The output ofthe predictor is
then rounded to the
nearest integer, denoted f^n and used to form the difference or prediction error
which is coded
using a variable- length code (by the symbol encoder) to generate the next element
of the
compressed data stream.
Fig. A lossless predictive coding model: (a) encoder; (b) decoder
The decoder of Fig. 8.1 (b) reconstructs en from the received variable- length code
words
and performs the inverse operation
Page | 92
Page | 92
Various local, global, and adaptive methods can be used to generate f^n. In most
cases,
however, the prediction is formed by a linear combinationof m previous pixels. That
is,
where m is the order of the linear predictor, round is a function used to denote
the rounding
ornearest integer operation, and the αi, for i = 1,2,..., m are prediction
coefficients. In raster scan
applications, the subscript n indexes the predictor outputs in accordance with
their time of
occurrence.That is, fn, f^n and en in Eqns. above could be replaced with the more
explicit notation f
(t), f^(t), and e (t), where t represents time. In other cases, n is used as an
index on the
spatial coordinates and/or frame number (in a time sequence of images) of an image.
In 1-D
linear predictive coding, for example, Eq. above can be written as
where each subscripted variable is now expressed explicitly as a functionof spatial
coordinates x
and y. The Eq. indicates that the 1-D linear prediction f(x, y) is a function of
the previous pixels on
the current line alone. In 2-D predictive coding, the prediction is a function of
the previouspixels in a
left-to-right, top-to-bottom scan of an image. In the 3-D case, itis based on these
pixels and the
previous pixels of preceding frames. Equation above cannot be evaluated for the
first m pixels of
each line, so these pixels must be coded by using other means (such as a Huffman
code) and
considered as an overhead of the predictive coding process. A similar comment
applies to the
higher-dimensional cases.
LOSSY PREDICTIVE CODING:
In this type of coding, we add a quantizer to the lossless predictive model and
examine the resulting
trade-off between reconstruction accuracy and compression performance. As Fig.
shows, the
quantizer, which absorbs the nearest integer function of the error-free encoder, is
inserted between
the symbol encoder and the point at which the prediction error is formed.
Page | 92
It maps the prediction error into a limited range of outputs, denoted e^n which
establish the amount
of compression and distortion associated with lossy predictive coding.
Fig. A lossy predictive coding model: (a) encoder and (b) decoder.
In order to accommodate the insertion of the quantization step, the error-free
encoder of figure
must be altered so that the predictions generated bythe encoder and decoder are
equivalent. As
Fig.9 (a) shows, this isaccomplished by placing the lossy encoder's predictor
within feedback
loop, where its input, denoted f˙n, is generated as a function of pastpredictions
and the
corresponding quantized errors. That is,
This closed loop configuration prevents error buildup at the decoder's output. Note
from Fig. (b)
that the output of the decoder also is given by the above Eqn.
OPTIMAL PREDICTORS:
The optimal predictor used in most predictive coding applicationsminimizes the
encoder's meansquare prediction error
subject to the constraint that
and
That is, the optimization criterion is chosen to minimize the mean-square
prediction error,
thequantization error is assumed to be negligible (e˙n ≈ en), and the prediction is
constrained to a
linear combination of m previous pixels.1 These restrictions are not essential, but
they simplify the
analysis considerably and, at the same time, decrease the computational complexity
of the predictor. The resulting predictive codingapproach is referred to as
differential pulse code modulation (DPCM)
TRANSFORM CODING:
All the predictive coding techniques operate directly on the pixels of an image and
thus are
spatial domain methods. In this coding, we consider compression techniquesthat are
based on modifying the transform of animage. In transform coding, a reversible,
linear transform
(such as the Fig. A transform coding system: (a) encoder; (b) decoder
Fourier transform) is used to map the image into a set of transformcoefficients,
which are then
quantized and coded. For most natural images, a significant number of the
coefficients have small
magnitudes and can be coarsely quantized (or discarded entirely) with little image
distortion. A variety of
transformations, including the discrete Fourier transform (DFT), can be used to
transform the image data.
Figure shows a typical transform coding system. The decoder implementsthe inverse
sequence of
steps (with the exception of the quantization function) of the encoder, which
performs four
relatively straightforward operations: sub image decomposition, transformation,
quantization, and
coding. An N X N input image first is subdivided into sub images of sizen X n,
which are the n
transformed to generate (N/n) 2 sub images transform arrays, each of size n X n.
The goal of the transformation process is to de corre late the pixels of eachsub
image, or to pack as
much information as possible into the smallest number of transform coefficients.
The quantization
stage then selectively eliminates or more coarsely quantizes the coefficients that
carry the least
information. These coefficients have the smallest impact on reconstructed sub image
quality. The
encoding process terminates by coding (normally using a variable-length code) the
quantized
coefficients. Any or all of the transform encoding steps can be adapted to Digital
Image Processing
local image content, called adaptive transform coding, or fixed for all sub images,
called non
adaptive transform coding.
WAVELET CODING:
The wavelet coding is based on the idea that the coefficients of a transform that
decorrelates the
pixels of an image can be coded more efficiently than the original pixels
themselves. If the
transform's basis functions—in this case wavelets—pack most of the important visual
information
into a small number of coefficients, the remaining coefficientscan be quantized
coarsely or truncated
to zero with little image distortion.
Figure shows a typical wavelet coding system. To en2codea 2 X 2image, an analyzing
wavelet,
Ψ, and minimum decomposition level, J - P, are
selected and used to compute the image's discrete wavelet transform. If the wavelet
has a
complimentary scaling function φ, the fast wavelet transform can be used. In either
case, the
computed transform converts a large portion of the original image to horizontal,
vertical, and
diagonal decomposition coefficients with zero mean and Laplacian-like
distributions.
Fig. A wavelet coding system: (a) encoder; (b) decoder.
Since many of the computed coefficients carry little visual information, they can
be quantized
and coded to minimize inter- coefficient and coding redundancy. Moreover, the
quantization can be
adapted to exploit any positional correlation across the P decomposition levels.
One or more of
the lossless coding methods, including run-length, Huffman, arithmetic, and bit-
plane coding,
can be incorporated into the final symbol coding step. Decoding is accomplished by
inverting
the encoding operations— with the exception of quantization, which cannot be
reversed exactly.
The principal difference between the wavelet-based system and the transform coding
system is the
omission of the transform coder's sub image processing stages.Because wavelet
transforms are both
Page | 92
computationally efficient and inherently local (i.e., their basis functions are
limited in duration),
subdivision of the original image is unnecessary.
MULTIPLE CHOICE QUESTIONS
1. To convert a continuous sensed data into Digital form, which of thefollowing is
required?
a) Sampling
b) Quantization
Page | 92
c) Both Sampling and Quantization
d) Neither Sampling nor Quantization
2. For a continuous image f(x, y), Quantization is defined as
a) Digitizing the coordinate values
b) Digitizing the amplitude values
c) All of the mentioned
d) None of the mentioned
3. What is pixel?
a) Pixel is the elements of a digital image
b) Pixel is the elements of an analog image
c) Pixel is the cluster of a digital image
d) Pixel is the cluster of an analog image
4. Which is a color attribute that describes a pure color?
a) Saturation
b) Hue
c) Brightness
d) Intensity
5. Which is a color attribute that describes a pure color?
a) Saturation
b) Hue
c) Brightness
d) Intensity
6. A typical size comparable in quality to monochromatic TV image is ofsize
a) 256 X 256
b) 512 X 512
c) 1920 X 1080
d) 1080 X 1080
Page | 92
Page | 92
7. The number of grey values is integer powers of:
a) 4 b) 2 c) 8 d) 1 8. The section of the real plane spanned by the coordinates of
an image iscalled the
a) Spatial Domain
b) Coordinate Axes
c) Plane of Symmetry
d) None of the Mentioned
9. The procedure done on a digital image to alter the values of itsindividual
pixels is
a) Neighborhood Operations
b) Image Registration
c) Geometric Spatial Transformation
d) Single Pixel Operation
10. What is the first and foremost step in Image Processing?
a) Image restoration
b) Image enhancement
c) Image acquisition
d) Segmentation
11. How many numbers of steps are involved in image processing?
a) 10
b) 9 c) 11
d) 12
12. After digitization process a digital image with M rows and N columnshave to be
positive and
for the number, L, max gray levels i.e. an integer power of 2 for each pixel. Then,
the number
b, of bits required to store a digitized image is:
a) b=M*N*k
b) b=M*N*L
c) b=M*L*k
d) b=L*N*k
13. In digital image of M rows and N columns and L discrete gray levels,calculate
the bits
required to store a digitized image for M=N=32 and L=16.
a) 16384
b) 4096
c) 8192
d) 512
14. The quality of a digital image is well determined by
a) The number of samples
b) The discrete gray levels
c) All of the mentioned
d) None of the mentioned
15. The most familiar single sensor used for Image Acquisition is
a) Microdensitometer
b) Photodiode
c)CMOS
d) None of the Mentioned
16. The difference isintensity between the highest and the lowestintensity levels
in an
image is
a) Noise
b) Saturation
c)Contrast
d) Brightness
17. Which of the following expression is used to denote spatial domain?
a) g(x,y)=T[f(x,y)]
b) f(x+y)=T[g(x+y)]
c)g(xy)=T[f(x,y)]
d) g(x-y)=T[f(x-y)]
18. What is the general form of representation of power transformation?
a) s=crγ b) c=srγ c) s=rc
d) s=rcγ 19. What isthe general form of representation of log transformation?
Page | 92
Page | 92
a. s=clog10(1/r)
b. s=clog10(1+r)
c. s=clog10(1*r)
d. s=clog10(1-r)
20. Which expression is obtained by performing the negative transformation on the
negative of
an image with gray levels in the range[0, L-1]?
a) s=L+1-r
b) s=L+1+r
c) s=L-1-r
d) s=L-1+r
21. What is the full form for PDF, a fundamental descriptor of randomvariables i.e.
gray
values in an image?
a) Pixel distribution function b) Portable document format
c) Pel deriving function
d) Probability density function
22. What is the output of a smoothing, linear spatial filter?
a) Median of pixels
b) Maximum of pixels
c) Minimum of pixels
d) Average of pixels
23. Which of the following is the disadvantage of using smoothing filter?
a) Blur edges
b) Blur inner pixels
c) Remove sharp transitions
d) Sharp edges
24. Which of the following comes under the application of image blurring?
a) Object detection
b) Grossrepresentation c) Object motion
d) Image segmentation
25. Which of the following derivatives produce a double response at stepchanges in
gray
level?
a) First order derivative
Page | 92
b) Third order derivative
c) Second order derivative
d) First and second order derivatives
26. What is the name of process used to correct the power-law responsephenomena?
a) Beta correction
b) Alpha correction c) Gamma correction d) Pie correction
27. Which of the following transformation function requires muchinformation to be
specified at the time of input?
a) Log transformation
b) Power transformation
c) Piece-wise transformation
d) Linear transformation
28. In which type of slicing, highlighting a specific range of gray levels inan
image often is
desired?
a) Gray-level slicing
b) Bit-plane slicing
c) Contrast stretching
d) Byte-level slicing
29. In general, which of the following assures of no ringing in the output?
a) Gaussian Lowpass Filter
b) Ideal Lowpass Filter
c) Butterworth Lowpass Filter
d) All of the mentioned
30. The lowpass filtering process can be applied in which of the followingarea(s)?
a) The field of machine perception, with application of characterrecognition
b) In field of printing and publishing industry
c) In field of processing satellite and aerial images
d) All of the mentioned
31. The function of filters in Image sharpening in frequency domain is toperform
reverse
operation of which of the following Lowpass filter?
Page | 92
a) Gaussian Lowpass filter
b) Butterworth Lowpass filter
c) Ideal Lowpass filter
d) None of the Mentioned
32. Which of the following is a second-order derivative operator?
a) Histogram
b) Laplacian
c) Gaussian
d) None of the mentioned
33.Dark characteristics in an image are better solved using
a) Laplacian Transform
b) Gaussian Transform
c) Histogram Specification
d) Power-law Transformation
34. is used to detect diseasessuch as bone infection andtumors.
a) MRI Scan
b) PET Scan
c) Nuclear Whole-Body Scan
d) X-Ray
35. An alternate approach to median filtering is
a) Use a mask
b) Gaussian filter
c) Sharpening
d) Laplacian filter
36.Sobel is better than prewet in image
a) Sharpening
b) Blurring
c) Smoothing
d) Contrast
37.If the inner region of the object is textured then approach, we use is
a) Discontinuity
b) Similarity
c) Extraction
d) Recognition
38.Gradient magnitude images are more useful in
a) Point Detection
b) Line Detection
c) Area Detection d) Edge Detection 39.Diagonal lines are angles at
a) 0 b) 30
c) 45
d) 90
40.Transition between objects and background shows
a) Ramp Edges
b) Step Edges
c) Sharp Edges
d) Both A And B 41.For edge detection we use
a) First Derivative
b) Second Derivative
c) Third Derivative
42. Sobel gradient is not that good for detection of
a) Horizontal Lines
b) Vertical Lines
c) Diagonal Lines
d) Edges
43. Method in which images are input and attributes are output is called
a) Low Level Processes
b) High Level Processes
c) Mid-Level Processes
d) Edge Level Processes
44. Computation of derivatives in segmentation is also called
a) Spatial Filtering
b) Frequency Filtering
c) Low Pass Filtering
d) High Pass Filtering
Page | 92
45. Transition of intensity takes place between
a) Adjacent Pixels
b) Near Pixels
c) Edge Pixels
d) Line Pixels
46. Discontinuity approach of segmentation depends upon
a) Low Frequencies
b) Smooth Changes
c) Abrupt Changes
d) Contrast
47. Two regions are said to be adjacent if their union forms
a) Connected Set
b) Boundaries
c) Region
d) Image
47. Example of similarity approach in image segmentation is
a) Edge Based Segmentation
b) Boundary Based Segmentation
c) Region Based Segmentation
d) Both A And B 48. LOG stands for
a) Laplacian of Gaussian
b) Length of Gaussian
c) Laplacian of Gray Level
d) Length of Gray level
49.Double line effect is produced by
a) First Derivative
b) Second Derivative
c) Third Derivative
d) Both A And B 50.Intersection between zero intensity and extreme of second
derivative iscalled
a) Discontinuity
b) Similarity
Page | 92
c) Continuity
d) Zero Crossing
51. DWT stands for
a) Discrete Wavelet Transform
b) Discrete Wavelet Transformation
c) Digital Wavelet Transform
d) Digital Wavelet Transformation
52. Discarding every sample is called
a) Up Sampling
b) Filtering
c) Down Sampling A d) Blurring
53. High contrast images are considered as
a) Low Resolution
b) High Resolution
c) Intense
d) Blurred
54. In multiresolution processing * representsthe
a) Complete Conjugate Operation
b) Complex Conjugate Operation
c) Complete Complex Operation
d) Complex Complex Operation
55.Representing image in more than one resolution is
a) Histogram
b) Image Pyramid
c) Local Histogram
d) Equalized Histogram
56.Subband of input image, showing dD(m,n) is called
a) Approximation
b) Vertical Detail
c) Horizontal Detail
d) Diagonal Detail
57. Which of the following measures are not used to describe a region?
a) Mean and median of grey values
Page | 92
Page | 92
b) Minimum and maximum of grey values
c) Number of pixels alone A d) Number of pixels above and below mean
58. Which of the following techniques is based on the Fourier transform?
a) Structural
b) Spectral
c) Statistical
d) Topological
59. Which of the following of a boundary is defined as the lineperpendicular to the
major axis?
a) Equilateral axis
b) Equidistant axis
c) Minor axis A d) Median axis
60. Which of the following techniques of boundary descriptions have thephysical
interpretation of boundary shape?
a) Fourier transform
b) Statistical moments
c) Laplace transform
d) Curvature
Page | 92
REFERENCES
1. Rafael C. Gonzales, Richard E. Woods, “Digital Image Processing”,
Third Edition, Pearson Education, 2010.
2. Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins, “Digital Image
Processing Using MATLAB”, Third Edition Tata Mc Graw Hill Pvt.Ltd., 2011.
3. AnilJain K. “Fundamentals of Digital Image Processing”, PHI
Learning Pvt. Ltd., 2011.ACC.NO: B130746
4. Willliam K Pratt, “Digital Image Processing”, John Willey, 2002.
5. Malay K. Pakhira, “Digital Image Processing and Pattern Recognition”,
First Edition, PHI Learning Pvt. Ltd., 2011.