Module 3
Image Segmentation
by
Dr. S. D. Ruikar
Syllabus
Example 1
Segmentation based on greyscale
Very simple model of greyscale leads to
Example 2
Segmentation based on texture
Enables object surfaces with varying
Example 3
Segmentation based on depth
This example shows a range image, obtained with
a laser range finder
A segmentation based on the range (the object
distance from the sensor) is useful in guiding
mobile robots
Introduction to image segmentation
[Figure: original image, range image, and segmented image]
Principal approaches
Segmentation algorithms generally are based
on one of two basic properties of intensity
values:
discontinuity : to partition an image based on
sharp changes in intensity (such as edges)
similarity : to partition an image into regions
that are similar according to a set
of predefined criteria.
Detection of Discontinuities
detect the three basic types of gray-level
discontinuities:
points, lines, and edges
the common way is to run a mask through
the image
Goal: Extract Blobs
[Figure: AIBO RoboSoccer (VelosoLab): ideal segmentation and
actual result of segmentation]
Thresholding
Basic segmentation operation:

    mask(x,y) = 1 if im(x,y) > T

T is the threshold
    User-defined, or automatic
Equivalent to partitioning the histogram at T
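The basic operation above can be sketched in a few lines; the tiny array and the threshold value are illustrative.

```python
import numpy as np

# Threshold a greyscale image at T: pixels above T become 1
# (object), the rest 0, exactly as mask(x,y) = 1 if im(x,y) > T.
def threshold(im, T):
    return (im > T).astype(np.uint8)

im = np.array([[100, 120], [90, 200]], dtype=np.uint8)
mask = threshold(im, 110)
print(mask)  # [[0 1] [0 1]]
```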
As Edge Detection
Adaptive thresholding
Region growing
Detecting the Edge with the First Derivative
An edge in the intensity profile I(x,y) produces a peak in the
first derivative dI/dx; the edge is detected where the
derivative exceeds the threshold TRSH.
If TRSH is set too high, the derivative peak stays below it and
the edge is not detected.
[Figure: intensity profiles, their first derivatives, and the
threshold TRSH for the detected and not-detected cases]
Gradient Operators
The gradient of the image I(x,y) at
location (x,y) is the vector:

    grad I(x,y) = [ Gx, Gy ] = [ dI/dx, dI/dy ]

The Meaning of the Gradient
It represents the direction of the strongest variation in
intensity (horizontal, vertical, or generic edges).

    Edge strength:   |grad I| = sqrt( Gx^2 + Gy^2 )
    Edge direction:  theta(x,y) = atan( Gy / Gx )
The Sobel Edge Detector

    Gx mask:  -1 -2 -1        Gy mask:  -1  0  1
               0  0  0                  -2  0  2
               1  2  1                  -1  0  1

    Gx = (z7 + 2 z8 + z9) - (z1 + 2 z2 + z3)
    Gy = (z3 + 2 z6 + z9) - (z1 + 2 z4 + z7)

(z1..z9 are the pixels of the 3x3 neighbourhood, row by row.)
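The Sobel masks can be applied directly; a minimal sketch on a small array, using the |Gx| + |Gy| strength approximation (the test image is illustrative):

```python
import numpy as np

# Sobel gradient: kx matches Gx above (bottom row minus top row),
# ky matches Gy (right column minus left column).
def sobel(im):
    kx = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
    ky = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    h, w = im.shape
    out = np.zeros((h - 2, w - 2))
    for r in range(h - 2):
        for c in range(w - 2):
            win = im[r:r + 3, c:c + 3]
            gx = np.sum(kx * win)
            gy = np.sum(ky * win)
            out[r, c] = abs(gx) + abs(gy)  # |Gx| + |Gy| approximation
    return out

# A vertical step edge gives a strong response everywhere along it:
im = np.array([[0, 0, 9, 9]] * 4, dtype=float)
print(sobel(im))
```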
The Prewitt Edge Detector
    Gx mask:  -1 -1 -1        Gy mask:  -1  0  1
               0  0  0                  -1  0  1
               1  1  1                  -1  0  1

    Gx = (z7 + z8 + z9) - (z1 + z2 + z3)
    Gy = (z3 + z6 + z9) - (z1 + z4 + z7)
The Roberts Edge Detector
    Gx mask:   0  0  0        Gy mask:   0  0  0
               0 -1  0                   0  0 -1
               0  0  1                   0  1  0

    Gx = z9 - z5
    Gy = z8 - z6
In practice edges are blurred because of optics, sampling, and
other image acquisition imperfections.
Thick edge
The slope of the ramp is inversely proportional to the
degree of blurring in the edge.
We no longer have a thin (one pixel thick) path.
Instead, an edge point now is any point contained in
the ramp, and an edge would then be a set of such
points that are connected.
The thickness is determined by the length of the
ramp.
The length is determined by the slope, which is in
turn determined by the degree of blurring.
Blurred edges tend to be thick and sharp edges
tend to be thin
First and Second Derivatives
The gradient magnitude

    |grad f| = [ (df/dx)^2 + (df/dy)^2 ]^(1/2)

is commonly approximated as

    |grad f| ~ |Gx| + |Gy|

With this approximation the magnitude becomes nonlinear.
Gradient Masks
Diagonal edges with Prewitt
and Sobel masks
Examples
[Figures: edge detection examples with the gradient masks]
Laplacian
Laplacian operator (a linear operator):

    lap f(x,y) = d2f/dx2 + d2f/dy2

Discrete approximation:

    lap f = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)
            - 4 f(x,y)
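The discrete approximation above can be sketched directly; the 3x3 test array is illustrative.

```python
import numpy as np

# Discrete Laplacian: f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1)
# - 4 f(x,y), evaluated at each interior pixel.
def laplacian(im):
    h, w = im.shape
    out = np.zeros((h, w))
    for x in range(1, h - 1):
        for y in range(1, w - 1):
            out[x, y] = (im[x + 1, y] + im[x - 1, y]
                         + im[x, y + 1] + im[x, y - 1]
                         - 4 * im[x, y])
    return out

im = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
print(laplacian(im))  # an isolated bright point responds with -4
```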
Laplacian of Gaussian
Laplacian combined with smoothing to
find edges via zero-crossings:

    h(r) = - [ (r^2 - sigma^2) / sigma^4 ] * exp( - r^2 / (2 sigma^2) ),
    where r^2 = x^2 + y^2

a positive central term,
surrounded by an adjacent negative region (a function of distance r),
and a zero outer region:
the shape is often called the Mexican hat
Non-Maximum Suppression
At a candidate edge pixel q, the gradient magnitude must be
larger than the values interpolated at its neighbours p and r
along the gradient direction.
Examples:
[Figure: original image, gradient magnitude, and the result
with non-maxima suppressed]
Linking to the next edge point
Keep strong edges, plus weak edges that are connected to a
strong edge point.
[Figure: original image; strong edges only; weak edges; strong +
connected weak edges]
[Figure: fine scale, high threshold; coarse scale, high
threshold; coarse scale, low threshold]
Finding lines in an image
Option 1:
Search for the line at every possible
position/orientation
What is the cost of this operation?
Option 2:
Use a voting scheme: Hough transform
Finding lines in an image
A line in image space, y = m0 x + b0, maps to a single point
(m0, b0) in Hough space; conversely, a single point (x0, y0)
in image space maps to a line in Hough space.
[Figure: image space vs. Hough space, for a line and for a
point]
Basic algorithm, using the parameterisation
d = x cos(theta) + y sin(theta):
1. Initialise the accumulator H[d, theta] = 0
2. For each edge point (x,y) in the image, for each theta,
   compute d and increment H[d, theta] += 1
3. Find the value(s) of (d, theta) where H[d, theta] is maximum
4. The detected line in the image is given by
   d = x cos(theta) + y sin(theta)
What's the running time (measured in # votes)?
Extensions
Extension 1: Use the image gradient
1. same
2. for each edge point I[x,y] in the image
compute unique (d, ) based on image gradient at (x,y)
H[d, ] += 1
3. same
4. same
What's the running time measured in votes?
Extension 2
give more votes for stronger edges
Extension 3
change the sampling of (d, ) to give more/less resolution
Extension 4
The same procedure can be used with circles, squares, or any other
shape
Hough demos
Line : http://www/dai.ed.ac.uk/HIPR2/houghdemo.html
http://www.dis.uniroma1.it/~iocchi/slides/icra2001/java/hough.html
Circle : http://www.markschulze.net/java/hough/
Hough Transform for Curves
The H.T. can be generalized to detect
any curve that can be expressed in
parametric form:
    y = f(x, a1, a2, ..., ap)
a1, a2, ..., ap are the parameters
The parameter space is p-dimensional
The accumulating array is LARGE!
Thresholding
image with dark image with dark
background and background and
a light object two light objects
Multilevel thresholding
a point (x,y) belongs
    to an object class if T1 < f(x,y) <= T2
    to another object class if f(x,y) > T2
    to the background if f(x,y) <= T1
T depends on
    only f(x,y): only on gray-level values -> global threshold
    both f(x,y) and p(x,y): on gray-level values and those of
    the neighbours of (x,y) -> local threshold
When object and background are well separated, global
thresholding is easy to use; otherwise the image is difficult
to segment with a single threshold.
Basic Global Thresholding
use T midway
between the max
and min gray levels
generate binary image
Basic Global Thresholding
based on visual inspection of the histogram:
1. Select an initial estimate for T.
2. Segment the image using T. This produces two groups of
   pixels: G1, consisting of all pixels with gray level > T,
   and G2, consisting of pixels with gray level <= T.
3. Compute the average gray levels mu1 and mu2 of the pixels
   in G1 and G2.
4. Compute a new threshold value T = 0.5 (mu1 + mu2).
5. Repeat steps 2 through 4 until the difference in T between
   successive iterations is smaller than a predefined
   parameter T0.
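The iterative procedure above can be sketched as follows; the synthetic pixel values are illustrative.

```python
import numpy as np

# Iterative global threshold: split at T, take the two group
# means, set T to their midpoint, repeat until T stabilises.
def global_threshold(im, t0=0.5):
    T = im.mean()                        # step 1: initial estimate
    while True:
        g1 = im[im > T]                  # step 2: segment with T
        g2 = im[im <= T]
        mu1, mu2 = g1.mean(), g2.mean()  # step 3: group means
        T_new = 0.5 * (mu1 + mu2)        # step 4: new threshold
        if abs(T_new - T) < t0:          # step 5: convergence test
            return T_new
        T = T_new

im = np.array([10, 12, 11, 200, 205, 198], dtype=float)
T = global_threshold(im)
print(T)  # settles roughly midway between the two clusters
```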
Example: Heuristic method
note the clear valley of the histogram and the effectiveness
of the segmentation between object and background
T0 = 0
3 iterations, with result T = 125
Basic Adaptive Thresholding
subdivide original image into small
areas.
utilize a different threshold to segment
each subimage.
since the threshold used for each pixel
depends on the location of the pixel in
terms of the subimages, this type of
thresholding is adaptive.
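The subdivision idea above can be sketched by thresholding each tile at its own statistic; using the tile mean, and the tile size of 2, are illustrative choices.

```python
import numpy as np

# Adaptive thresholding: split the image into tiles and threshold
# each tile at its own mean, so the threshold varies with location.
def adaptive_threshold(im, tile=2):
    out = np.zeros(im.shape, dtype=np.uint8)
    h, w = im.shape
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            block = im[r:r + tile, c:c + tile]
            out[r:r + tile, c:c + tile] = block > block.mean()
    return out

# A strong illumination gradient defeats one global threshold,
# but each tile still separates its own dark and bright pixels:
im = np.array([[0, 10, 100, 110],
               [0, 10, 100, 110]], dtype=float)
print(adaptive_threshold(im))
```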
Example : Adaptive Thresholding
Further subdivision
a). Properly and improperly
segmented subimages from
previous example
b)-c). corresponding histograms
d). further subdivision of the
improperly segmented subimage.
e). histogram of small subimage at
top
f). result of adaptively segmenting
d).
Greylevel histogram-based
segmentation
We will look at two very simple image
segmentation techniques that are based
on the greylevel histogram of an image
Thresholding
Clustering
We will use a very simple object-background test image
We will consider zero, low and high noise versions of this image
Greylevel histogram-based
segmentation
[Figure: greylevel histograms h(i) of the noise free, low noise
and high noise test images]
Greylevel histogram-based
segmentation
We can define the input image signal-to-noise ratio in terms of
the mean greylevel value of the object pixels, the mean
greylevel value of the background pixels, and the additive
noise standard deviation:

    S/N = | mu_b - mu_o | / sigma
Greylevel histogram-based
segmentation
For our test images:
S/N (noise free) =
S/N (low noise) = 5
[Figure: histogram with the object and background peaks
separated by the threshold T]
Greylevel thresholding
We can define the greylevel
thresholding algorithm as follows:
    if the greylevel of pixel p <= T then
        pixel p is an object pixel
    else
        pixel p is a background pixel
Greylevel thresholding
This simple threshold test raises the
obvious question: how do we determine
the threshold T?
Many approaches possible
Interactive threshold
Adaptive threshold
Minimisation method
Greylevel thresholding
We will consider in detail a
minimisation method for determining
the threshold
Minimisation of the within group variance
Robot Vision, Haralick & Shapiro, volume
1, page 20
Greylevel thresholding
Idealized object/background image
histogram
[Figure: idealized bimodal histogram h(i) with the threshold T
between the object and background peaks]
Greylevel thresholding
Any threshold separates the histogram into
2 groups with each group having its own
statistics (mean, variance)
The homogeneity of each group is measured
by the within group variance
The optimum threshold is that threshold
which minimizes the within group variance
thus maximizing the homogeneity of each
group
Greylevel thresholding
Let group o (object) be those pixels with
greylevel <= T
Let group b (background) be those
pixels with greylevel > T
The prior probability of group o is

    p_o(T) = sum_{i=0..T} P(i)

and of group b

    p_b(T) = sum_{i=T+1..255} P(i)

where P(i) = h(i) / N is the normalised histogram
The group variances are

    sigma_o^2(T) = sum_{i=0..T} ( i - mu_o(T) )^2 P(i) / p_o(T)

    sigma_b^2(T) = sum_{i=T+1..255} ( i - mu_b(T) )^2 P(i) / p_b(T)
Greylevel thresholding
The within group variance is defined as:

    sigma_W^2(T) = sigma_o^2(T) p_o(T) + sigma_b^2(T) p_b(T)

[Figure: the histogram together with the within group variance
as a function of T; the optimum threshold lies at its minimum]
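The minimisation above can be sketched by trying every threshold exhaustively; the bimodal test data are illustrative.

```python
import numpy as np

# Within-group-variance threshold: for each T compute
# sigma_W^2(T) = sigma_o^2(T) p_o(T) + sigma_b^2(T) p_b(T)
# and keep the T that minimises it.
def optimal_threshold(im, levels=256):
    h, _ = np.histogram(im, bins=levels, range=(0, levels))
    P = h / h.sum()                    # P(i) = h(i) / N
    i = np.arange(levels)
    best_T, best_var = 0, np.inf
    for T in range(levels - 1):
        po, pb = P[:T + 1].sum(), P[T + 1:].sum()
        if po == 0 or pb == 0:
            continue
        mu_o = (i[:T + 1] * P[:T + 1]).sum() / po
        mu_b = (i[T + 1:] * P[T + 1:]).sum() / pb
        var_o = (((i[:T + 1] - mu_o) ** 2) * P[:T + 1]).sum() / po
        var_b = (((i[T + 1:] - mu_b) ** 2) * P[T + 1:]).sum() / pb
        w = var_o * po + var_b * pb    # within group variance
        if w < best_var:
            best_T, best_var = T, w
    return best_T

im = np.concatenate([np.full(100, 50), np.full(100, 200)])
print(optimal_threshold(im))  # any T between the peaks gives zero variance
```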
Greylevel thresholding
We can examine the performance of
this algorithm on our low and high
noise image
For the low noise case, it gives an optimum
threshold of T=124
Almost exactly halfway between the object
and background peaks
We can apply this optimum threshold to
both the low and high noise images
Greylevel thresholding
[Figures: object and background greylevel distributions p(x),
with means mu_o and mu_b and the threshold T, for the low and
high noise images; the two histograms overlap around T]
Greylevel thresholding
Easy to see that, in both cases, for any
value of the threshold, object pixels will
be misclassified as background and vice
versa
For greater histogram overlap, the pixel
misclassification is obviously greater
We could even quantify the probability of error
in terms of the mean and standard deviations
of the object and background histograms
Greylevel clustering
Consider an idealized
object/background histogram
[Figure: idealized bimodal histogram with the object cluster
centre c1 and the background cluster centre c2]
Greylevel clustering
    | g1(k) - c1 | < | g1(k) - c2 |    k = 1 .. N1
    | g2(k) - c2 | < | g2(k) - c1 |    k = 1 .. N2

In other words, all grey levels in set 1 are
nearer to cluster centre c1 and all grey levels
in set 2 are nearer to cluster centre c2
Greylevel clustering
Repeat
    assign each greylevel to the nearer cluster centre
    c1 = mean of pixels assigned to the object label
    c2 = mean of pixels assigned to the background label
Until the decrease in the error E(r) between successive
iterations is close to zero
Greylevel clustering
The error at iteration r measures how well the cluster centres
represent the two groups:

    E(r) = (1/N1) sum_{i=1..N1} ( g1(i) - c1(r) )^2
         + (1/N2) sum_{i=1..N2} ( g2(i) - c2(r) )^2
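The two-centre clustering loop above can be sketched directly on greylevels; the starting centres and the data are illustrative.

```python
import numpy as np

# Two-centre clustering: assign each greylevel to the nearest
# centre, recompute the centres as group means, and stop when
# the centres no longer move.
def two_means(g, c1=0.0, c2=255.0):
    while True:
        near1 = np.abs(g - c1) < np.abs(g - c2)
        new_c1, new_c2 = g[near1].mean(), g[~near1].mean()
        if new_c1 == c1 and new_c2 == c2:
            return c1, c2
        c1, c2 = new_c1, new_c2

g = np.array([10, 12, 14, 200, 202, 204], dtype=float)
print(two_means(g))  # -> (12.0, 202.0)
```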
Greylevel clustering
[Figure: final clustering of the histogram into groups g1 and
g2 with centres c1 and c2]
Relaxation labelling
Each pixel i carries a probability p(i) of having the object
label (and 1 - p(i) of the background label).
[Figure: object/background test image and a background pixel]
The 8-neighbourhood of pixel i:

    i1 i2 i3
    i4 i  i5
    i6 i7 i8

[Figure: initial p(i) as a function of greylevel, from 0 to
g_max]
Relaxation labelling
A neighbour consistent with the object label increments p(i);
an inconsistent one decrements p(i). The contribution of
neighbour i1 is

    q(i1) = cs p(i1) + cd (1 - p(i1))

We can now average all the contributions from the 8-neighbours
of i to get the total increment to p(i):

    delta p(i) = (1/8) sum_{h=1..8} ( cs p(ih) + cd (1 - p(ih)) )
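One update step of the averaged rule above can be sketched as follows; the coefficient values cs and cd and the test image are illustrative.

```python
import numpy as np

# One relaxation step: each interior pixel's probability is
# nudged by the average of cs*p(ih) + cd*(1 - p(ih)) over its
# 8 neighbours, then clipped back into [0, 1].
def relax_step(p, cs=0.2, cd=-0.2):
    h, w = p.shape
    out = p.copy()
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    ph = p[r + dr, c + dc]
                    total += cs * ph + cd * (1 - ph)  # q(ih)
            out[r, c] = np.clip(p[r, c] + total / 8, 0, 1)
    return out

p = np.full((4, 4), 0.6)   # weak object evidence everywhere
p = relax_step(p)
print(p[1, 1])             # neighbours agree, so p grows
```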
Relaxation labelling
[Figures: high noise circle image; result of the optimum
threshold; result after 20 iterations of relaxation labelling]
Relaxation labelling
[Figure: histogram of the image after 0 (original), 2, 5 and 10
iterations of relaxation labelling]
The Expectation/Maximization (EM)
algorithm
In relaxation labelling we have seen that we are
representing the probability that a pixel has a certain
label
In general we may imagine that an image comprises
L segments (labels)
    Within segment l the pixels (feature vectors) have a
    probability distribution represented by p_l(x | theta_l)
    theta_l represents the parameters of the data in segment l
        Mean and variance of the greylevels
        Mean vector and covariance matrix of the colours
        Texture parameters
The Expectation/Maximization (EM)
algorithm
Once again a chicken and egg problem arises
    If we knew theta_l : l = 1..L then we could obtain a
    labelling for each x by simply choosing the label which
    maximizes p_l(x | theta_l)
    If we knew the label for each x we could obtain
    theta_l : l = 1..L by using a simple maximum likelihood
    estimator
The EM algorithm is designed to deal with this type of
problem, but it frames it slightly differently
    It regards segmentation as a missing (or incomplete) data
    estimation problem
The Expectation/Maximization (EM)
algorithm
The incomplete data are just the measured pixel
greylevels or feature vectors
    We can define a probability distribution of the incomplete
    data as p_i(x; theta_1, theta_2, ..., theta_L)
The complete data are the measured greylevels or
feature vectors plus a mapping function f(.) which
indicates the labelling of each pixel
    Given the complete data (pixels plus labels) we can easily
    work out estimates of the parameters theta_l : l = 1..L
    But from the incomplete data no closed form solution exists
The Expectation/Maximization (EM)
algorithm
Once again we resort to an iterative strategy and hope that we
get convergence
The algorithm is as follows:
    Repeat
        Step 1: (E step)
        Estimate the labelling based on the current parameter
        estimates
        Step 2: (M step)
        Update the parameter estimates based on the current
        labelling
    Until convergence
The Expectation/Maximization (EM)
algorithm
A recent approach to applying EM to image
segmentation is to assume the image pixels
or feature vectors follow a mixture model
    Generally we assume that each component of the
    mixture model is a Gaussian
    A Gaussian mixture model (GMM):

    p(x | theta) = sum_{l=1..L} alpha_l p_l(x | theta_l)

    p_l(x | theta_l) = 1 / ( (2 pi)^(d/2) det(Sigma_l)^(1/2) )
                       * exp( -(1/2) (x - mu_l)^T Sigma_l^(-1) (x - mu_l) )

    sum_{l=1..L} alpha_l = 1
The Expectation/Maximization (EM)
algorithm
Our parameter space for our distribution now
includes the mean vectors and covariance matrices
for each component in the mixture plus the mixing
weights:

    theta = { alpha_1, mu_1, Sigma_1, ....., alpha_L, mu_L, Sigma_L }
The Expectation/Maximization (EM)
algorithm
Define a posterior probability P(l | x_j, theta_l) as the
probability that pixel j belongs to region l given the
value of the feature vector x_j
Using Bayes rule we can write the following equation:

    P(l | x_j, theta_l) = alpha_l p_l(x_j | theta_l)
                          / sum_{k=1..L} alpha_k p_k(x_j | theta_k)
The Expectation/Maximization (EM)
algorithm
The M step simply updates the parameter
estimates using maximum likelihood estimation:

    alpha_l^(m+1) = (1/n) sum_{j=1..n} P(l | x_j, theta_l^(m))

    mu_l^(m+1) = sum_{j=1..n} x_j P(l | x_j, theta_l^(m))
                 / sum_{j=1..n} P(l | x_j, theta_l^(m))

    Sigma_l^(m+1) = sum_{j=1..n} P(l | x_j, theta_l^(m))
                      (x_j - mu_l^(m)) (x_j - mu_l^(m))^T
                    / sum_{j=1..n} P(l | x_j, theta_l^(m))
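The E and M steps above can be sketched for a two-component 1-D GMM, where each Sigma_l reduces to a scalar variance; the initial values and the synthetic data are illustrative.

```python
import numpy as np

# EM for a 1-D two-component Gaussian mixture.
def em_gmm(x, iters=50):
    alpha = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()])
    for _ in range(iters):
        # E step: posterior P(l | x_j) via Bayes rule
        lik = np.stack([
            alpha[l] * np.exp(-0.5 * (x - mu[l]) ** 2 / var[l])
            / np.sqrt(2 * np.pi * var[l]) for l in range(2)])
        post = lik / lik.sum(axis=0)
        # M step: update alpha_l, mu_l, var_l from the posteriors
        for l in range(2):
            w = post[l]
            alpha[l] = w.mean()
            mu[l] = (w * x).sum() / w.sum()
            var[l] = (w * (x - mu[l]) ** 2).sum() / w.sum()
    return alpha, mu, var

np.random.seed(0)
x = np.concatenate([np.random.normal(50, 5, 200),
                    np.random.normal(200, 5, 200)])
alpha, mu, var = em_gmm(x)
print(mu)  # close to the true component means 50 and 200
```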
Boundary Characteristic for Histogram
Improvement and Local Thresholding
for a light object on a dark background:

    s(x,y) = 0  if grad f < T
             +  if grad f >= T and lap f >= 0
             -  if grad f >= T and lap f < 0
Gradient gives an indication of whether a pixel is on an edge
Laplacian can yield information regarding whether a given pixel lies
on the dark or light side of the edge
all pixels that are not on an edge are labeled 0
all pixels that are on the dark side of an edge are labeled +
all pixels that are on the light side an edge are labeled -
Example
Region-Based Segmentation - Region
Growing
Region Growing
criteria:
1. the absolute gray-
level difference
between any pixel
and the seed has to
be less than 65
2. the pixel has to be
8-connected to at
least one pixel in
that region (if
more, the regions
are merged)
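The two criteria above (absolute difference from the seed below 65, and 8-connectivity to the region) can be sketched as a breadth-first growth; the tiny test image is illustrative.

```python
import numpy as np
from collections import deque

# Seeded region growing: a pixel joins the region if its grey
# level differs from the seed's by less than thresh and it is
# 8-connected to a pixel already in the region.
def region_grow(im, seed, thresh=65):
    h, w = im.shape
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < h and 0 <= cc < w
                        and not region[rr, cc]
                        and abs(int(im[rr, cc]) - int(im[seed])) < thresh):
                    region[rr, cc] = True
                    queue.append((rr, cc))
    return region

im = np.array([[10, 20, 200],
               [15, 25, 210],
               [12, 18, 205]], dtype=np.uint8)
print(region_grow(im, (0, 0)).astype(int))  # grows over the dark half only
```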
Corner detection
Corners contain more edges than lines.
A point on a line is hard to match; a corner is easier.
Edge Detectors Tend to Fail at
Corners
Finding Corners
Intuition:
Right at corner, gradient is ill defined.
Near corner, gradient has two different
values.
Formula for Finding Corners
We look at the matrix

    C = sum over a small window of
        [ Ix^2    Ix Iy ]
        [ Ix Iy   Iy^2  ]

(Ix and Iy are the gradients with respect to x and y, summed
over a small region around the hypothetical corner.)

C is symmetric, so it can be diagonalised:

    C = R^(-1) [ lambda1    0     ] R
               [    0    lambda2  ]

with R a rotation matrix.
So every case is like one on the last slide.
So, to detect corners
Filter image.
Compute magnitude of the gradient
everywhere.
We construct C in a window.
Use linear algebra to find the eigenvalues lambda1 and lambda2.
If they are both big, we have a corner.
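The corner test above can be sketched as follows; the window size, the "big" cut-off, and the test image are illustrative choices.

```python
import numpy as np

# Build C from windowed gradient sums at (r, c) and check that
# both eigenvalues are large.
def is_corner(im, r, c, win=1, min_eig=1.0):
    # central-difference gradients; Ix[a, b] is at pixel (a+1, b+1)
    Ix = (im[:, 2:] - im[:, :-2])[1:-1, :] / 2.0
    Iy = (im[2:, :] - im[:-2, :])[:, 1:-1] / 2.0
    r, c = r - 1, c - 1                  # account for the cropping
    wx = Ix[r - win:r + win + 1, c - win:c + win + 1]
    wy = Iy[r - win:r + win + 1, c - win:c + win + 1]
    C = np.array([[np.sum(wx * wx), np.sum(wx * wy)],
                  [np.sum(wx * wy), np.sum(wy * wy)]])
    l1, l2 = np.linalg.eigvalsh(C)
    return l1 > min_eig and l2 > min_eig

# A bright square: its corner, an edge midpoint, a flat interior.
im = np.zeros((12, 12))
im[3:9, 3:9] = 10.0
print(is_corner(im, 3, 3), is_corner(im, 3, 6), is_corner(im, 6, 6))
```

Only at the corner are both eigenvalues large; on the edge one eigenvalue is zero, and in the flat interior both are.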
Image Segmentation
Segmentation divides an image into its
constituent regions or objects.
Segmentation of images is a difficult task in
image processing. Still under research.
Segmentation allows us to extract objects in
images.
Segmentation is unsupervised learning.
Model based object extraction, e.g., template
matching, is supervised learning.
What it is useful for
After successfully segmenting the image, the
contours of objects can be extracted using edge
detection and/or border following techniques.
Shape of objects can be described.
Based on shape, texture, and color objects can be
identified.
Segmentation Algorithms
Segmentation algorithms are based on one of
two basic properties of color, gray values, or
texture: discontinuity and similarity.
First category is to partition an image based
on abrupt changes in intensity, such as
edges in an image.
Second category is based on partitioning an
image into regions that are similar according
to a set of predefined criteria. The histogram
thresholding approach falls under this
category.
Domain spaces
spatial domain (row-column (rc) space)
histogram spaces
color space
texture space
I = imread('pout.tif');
figure, imshow(I);
figure, imhist(I)           % look at the hist to get a threshold, e.g., 110
BW = roicolor(I, 110, 255); % makes a binary image
figure, imshow(BW)          % all pixels in [110, 255] will be 1 and white,
                            % the rest is 0 which is black
Gray Scale Image - Multimodal
Histogram of lena
Segmented Image
After segmentation we get an outline of her face, hat, shadow, etc.
Color Image - bimodal
X = U_{i=1..N} R(i)
R(i) ∩ R(j) = ∅ for i ≠ j
P(R(i)) = TRUE for i = 1, 2, ..., N
P(R(i) U R(j)) = FALSE for i ≠ j
Split and Merge
The segmentation property is a logical
predicate of the form P(R,x,t)
x is a feature vector associated with
region R
t is a set of parameters (usually
thresholds). A simple segmentation
rule has the form:
P(R) : I(r,c) < T for all (r,c) in R
Split and Merge
In the case of color images the feature
vector x can be three RGB image
components (R(r,c),G(r,c),B(r,c))
A simple segmentation rule may have
the form:
P(R) : (R(r,c) <T(R)) && (G(r,c)<T(G))&&
(B(r,c) < T(B))
Region Growing (Merge)
A simple approach to image segmentation is
to start from some pixels (seeds) representing
distinct image regions and to grow them,
until they cover the entire image
For region growing we need a rule describing
a growth mechanism and a rule checking the
homogeneity of the regions after each growth
step
Region Growing
The growth mechanism: at each stage k and
for each region Ri(k), i = 1, ..., N, we check if
there are unclassified pixels in the 8-
neighbourhood of each pixel of the region
border
Before assigning such a pixel x to a region
Ri(k),we check if the region homogeneity:
P(Ri(k) U {x}) = TRUE , is valid
Region Growing Predicate
The arithmetic mean m and standard deviation std of
a region R having n = |R| pixels:

    m(R) = (1/n) sum_{(r,c) in R} I(r,c)

    std(R) = sqrt( (1/(n-1)) sum_{(r,c) in R} ( I(r,c) - m(R) )^2 )

The predicate

    P: |m(R1) - m(R2)| < k * min{ std(R1), std(R2) }

is used to decide if the merging of the two regions
R1, R2 is allowed: if the predicate holds, the
two regions R1, R2 are merged.
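The merge test above can be sketched on two regions given as flat arrays of grey levels; the value of k and the sample regions are illustrative.

```python
import numpy as np

# Merge predicate: allow the merge when the region means differ
# by less than k times the smaller standard deviation.
def can_merge(R1, R2, k=1.0):
    m1, m2 = R1.mean(), R2.mean()
    s1 = R1.std(ddof=1)   # ddof=1 gives the 1/(n-1) definition
    s2 = R2.std(ddof=1)
    return abs(m1 - m2) < k * min(s1, s2)

R1 = np.array([100.0, 104.0, 96.0, 100.0])
R2 = np.array([101.0, 99.0, 103.0, 97.0])
R3 = np.array([200.0, 204.0, 196.0, 200.0])
print(can_merge(R1, R2), can_merge(R1, R3))  # -> True False
```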
Split
The opposite approach to region growing is
region splitting.
It is a top-down approach and it starts with
the assumption that the entire image is
homogeneous
If this is not true, the image is split into four
sub images
This splitting procedure is repeated
recursively until we split the image into
homogeneous regions
Split
If the original image is square N x N, having
dimensions that are powers of 2 (N = 2^n):
All regions produced by the splitting algorithm are
squares having dimensions M x M, where M is a
power of 2 as well.
Since the procedure is recursive, it produces an
image representation that can be described by a tree
whose nodes have four sons each
Such a tree is called a Quadtree.
Split
[Figure: a split image and its quadtree, with regions R0..R3]
[Figure: watershed segmentation: local minima, watershed
points, and the dam built between basins]
Image
Segmentation
Background
First-order derivative

    df/dx = f'(x) = f(x+1) - f(x)

Second-order derivative

    d2f/dx2 = f''(x) = f(x+1) + f(x-1) - 2 f(x)
Characteristics of First and Second Order
Derivatives
First-order derivatives generally produce thicker edges in an image
The Laplacian

    lap f(x,y) = d2f/dx2 + d2f/dy2
               = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)
                 - 4 f(x,y)
The response of a 3x3 mask with weights w_k at a point is

    R = sum_{k=1..9} w_k z_k

and a discontinuity is flagged where |R| >= T:

    g(x,y) = 1 if |R(x,y)| >= T
             0 otherwise

where T is a nonnegative threshold
Detection of Discontinuities
Line Detection
First-order derivatives:
The gradient of an image f(x,y) at location (x,y) is
defined as the vector:

    grad f = [ Gx, Gy ] = [ df/dx, df/dy ]

The magnitude of this vector:

    |grad f| = mag(grad f) = ( Gx^2 + Gy^2 )^(1/2)

The direction of this vector:

    alpha(x,y) = atan( Gy / Gx )

The direction of the edge is perpendicular to the gradient
direction (at 90 degrees).
Basic Edge Detection by Using First-Order Derivative

    Edge normal:       grad f = [ gx, gy ] = [ df/dx, df/dy ]
    Edge unit normal:  grad f / mag(grad f)
Prewitt operators
Sobel operators
Detection of Discontinuities
Gradient Operators
    |grad f| ~ |Gx| + |Gy|
Detection of Discontinuities
Gradient Operators: Example
Detection of Discontinuities
Gradient Operators: Example
Detection of Discontinuities
Gradient Operators
The Laplacian of a Gaussian (LoG):

    h(r) = - [ (r^2 - sigma^2) / sigma^4 ] * exp( - r^2 / (2 sigma^2) ),
    where r^2 = x^2 + y^2
The Laplacian of a Gaussian sometimes is called the
Mexican hat function. It also can be computed by
smoothing the image with the Gaussian smoothing
mask, followed by application of the Laplacian mask.
Detection of Discontinuities
Gradient Operators
Detection of Discontinuities
Gradient Operators: Example
Sobel gradient
In this example, we can find the license plate candidate after
the edge linking process.
Edge Linking and Boundary Detection
Global Processing via the Hough Transform
Edge Linking and Boundary Detection
Hough Transform Example
Thresholding
An image f(x,y) is the product of an illumination component
i(x,y) and a reflectance component r(x,y):

    f(x,y) = i(x,y) r(x,y)
Thresholding
Basic Global Thresholding
Thresholding
Basic Global Thresholding
Thresholding
Basic Adaptive Thresholding
Thresholding
Basic Adaptive Thresholding
Answer: subdivision
Thresholding
Optimal Global and Adaptive Thresholding
Color image
Region-Based Segmentation