Digital Image Processing
Digital Image Processing - Introduction: Motivation and Perspective, Applications, Components of an Image
Processing System; Fundamentals: Elements of Visual Perception, A Simple Image Model, Sampling and
Quantization; Image Enhancement in the Spatial Domain: Introduction; Basic Gray Level Functions;
Piecewise-Linear Transformation Functions - Contrast Stretching;
Arithmetic/Logic Operations - Image Subtraction, Image Averaging; Basics of Spatial Filtering:
Smoothing - Mean Filter, Order Statistic Filter; Sharpening - The Laplacian
● Definition:
○ Digital image processing (DIP) deals with manipulating digital images using a digital
computer.
○ It focuses on developing a computer system capable of processing images efficiently.
● Applications:
○ DIP finds applications in various fields:
1. Gamma Ray Imaging:
■ Used in nuclear medicine and astronomical observations.
2. X-Ray Imaging:
■ Provides X-rays of the human body for medical diagnosis.
3. Ultraviolet Band:
■ Applied in lithography, industrial inspection, microscopy, and lasers.
4. Visual and Infrared Band:
■ Used in remote sensing applications.
5. Microwave Band:
■ Utilized for radar imaging.
Components of an Image Processing System
1. Image Sensors:
○ These physical devices are sensitive to the energy radiated by the object we wish to
image.
○ They capture the raw data from the scene.
2. Specialized Image Processing Hardware:
○ Includes digitizers (convert analog signals to digital form).
○ Performs primitive operations (e.g., arithmetic, logical) on images.
3. Computer:
○ A general-purpose computer (ranging from PCs to supercomputers) depending on the
application.
○ Specialized computers may be used for dedicated applications.
4. Software:
○ Specialized modules perform specific tasks.
○ Well-designed packages allow user-written code and module integration.
5. Mass Storage:
○ Essential for image processing applications.
○ An image of size 1024 x 1024 pixels with 8-bit intensity per pixel requires 1 megabyte of
storage space (uncompressed).
Example:
Suppose we’re building an image processing system for medical X-ray images. Let’s apply these
concepts:
1. Motivation:
○ Develop an efficient system to enhance X-ray images for accurate diagnosis.
○ Apply algorithms to remove noise and improve contrast.
2. Components:
○ Image sensors capture X-ray energy.
○ Specialized hardware digitizes and processes the raw data.
○ A powerful computer runs image processing algorithms.
○ Software modules enhance image quality.
○ Mass storage stores the processed X-ray images.
Contrast Stretching
● Motivation:
○ Images often suffer from low contrast due to improper illumination, non-linear response of
sensors, or a limited dynamic range.
○ Contrast stretching aims to enhance the dynamic range of gray levels in an image.
○ It improves visibility and reveals details that might be hidden in low-contrast regions.
● Mathematical Formulation:
○ Given an input intensity (r), the output intensity (s) is computed using a piecewise-linear
transformation:
■ For (0 \leq r < a): [ s = l \cdot r ]
■ For (a \leq r < b): [ s = m \cdot (r - a) + v ]
■ For (b \leq r \leq L-1): [ s = n \cdot (r - b) + w ]
○ Here:
■ (L) represents the total number of intensity levels (e.g., 256 for an 8-bit image).
■ (l), (m), and (n) are slopes.
■ (a) and (b) define the range of intensities to be stretched.
■ (v = l \cdot a) and (w = m \cdot (b - a) + v) are offsets chosen so that the mapping stays continuous.
● Example:
○ Suppose we have a grayscale image with intensity values ranging from 0 to 255.
○ We want to stretch the intensity levels from 50 to 200.
○ Using the contrast stretching formula, we calculate the new intensity values for each
pixel.
Original Intensity (r) | New Intensity (s)
0 | 0
50 | 0
100 | 85
150 | 170
200 | 255
● The resulting image will have improved contrast, with a wider range of intensity levels.
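As a concrete illustration, here is a minimal NumPy sketch of the stretch used in the table above (the function name contrast_stretch and the clipping behaviour outside [a, b] are illustrative choices, not a fixed standard):

```python
import numpy as np

def contrast_stretch(img, a=50, b=200, L=256):
    """Piecewise-linear contrast stretch: map [a, b] onto the full range [0, L-1].
    Values below a clip to 0 and values above b clip to L-1."""
    img = img.astype(np.float64)
    s = (img - a) * (L - 1) / (b - a)          # linear stretch of the middle segment
    return np.clip(s, 0, L - 1).astype(np.uint8)

# Reproduces the table above: r = 0, 50, 100, 150, 200 -> s = 0, 0, 85, 170, 255
r = np.array([0, 50, 100, 150, 200], dtype=np.uint8)
print(contrast_stretch(r))
```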
Histogram Equalization
● Purpose:
○ Histogram equalization is a fundamental technique in image processing.
○ It aims to improve the contrast of an image by redistributing the intensity levels.
○ The goal is to achieve a uniform histogram, making the image visually more appealing.
● Mathematical Formulation:
○ Given an input intensity (r), the output intensity (s) is computed using the cumulative
distribution function (CDF) of the original image: [ s = T(r) = \frac{L-1}{N} \sum_{k=0}^{r} h(k) ]
○ Where:
■ (L) represents the total number of intensity levels (e.g., 256 for an 8-bit image).
■ (N) is the total number of pixels in the image.
■ (h(k)) is the histogram value at intensity level (k).
● Example:
○ Suppose we have a grayscale image with the following histogram:
Intensity Level (k) | h(k)
0 | 10
1 | 15
2 | 20
… | …
255 | 5
○ Using the formula, we calculate the new intensity values for each pixel.
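A short NumPy sketch of the transformation above, assuming an 8-bit image (the helper name histogram_equalize is illustrative):

```python
import numpy as np

def histogram_equalize(img, L=256):
    """Map each intensity r to s = (L-1)/N * sum_{k<=r} h(k)."""
    h, _ = np.histogram(img, bins=L, range=(0, L))   # h(k), k = 0..L-1
    cdf = np.cumsum(h)                                # cumulative histogram
    s = np.round((L - 1) * cdf / img.size).astype(np.uint8)
    return s[img]                                     # look up the new value of every pixel

img = np.random.randint(40, 120, size=(64, 64), dtype=np.uint8)  # low-contrast test image
eq = histogram_equalize(img)
print(img.min(), img.max(), "->", eq.min(), eq.max())
```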
Local Enhancement
● Motivation:
1. Local enhancement techniques focus on improving specific regions of an image.
2. They adaptively adjust pixel intensities based on local context.
3. Useful for enhancing details in specific areas.
● Methods:
1. Local Histogram Equalization:
■ Apply histogram equalization to small local neighborhoods.
■ Enhances contrast locally.
2. Adaptive Contrast Stretching:
■ Stretch the intensity range based on local statistics (mean, standard deviation).
■ Useful for non-uniform illumination correction.
3. Local Gamma Correction:
■ Adjust gamma locally to enhance or suppress details.
■ Useful for medical imaging and low-light photography.
Example:
Suppose we have a medical X-ray image. Let’s apply these techniques:
1. Histogram Equalization:
○ Enhance the overall contrast of the entire image.
○ Achieve a uniform histogram.
2. Local Enhancement:
○ Apply local histogram equalization to specific regions (e.g., bone structures).
○ Improve visibility of subtle features.
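Below is a minimal (and deliberately slow) sketch of local histogram equalization from method 1 above: every pixel is replaced by the equalized value of its own neighbourhood. The window size and padding mode are arbitrary choices for illustration.

```python
import numpy as np

def local_hist_equalize(img, win=7, L=256):
    """Equalize the win x win neighbourhood of each pixel and keep only the
    new value of the centre pixel."""
    k = win // 2
    padded = np.pad(img, k, mode='reflect')
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            block = padded[y:y + win, x:x + win]
            h = np.bincount(block.ravel(), minlength=L)   # local histogram
            cdf = np.cumsum(h)
            out[y, x] = (L - 1) * cdf[img[y, x]] // block.size
    return out

img = np.random.randint(0, 256, size=(32, 32), dtype=np.uint8)
enhanced = local_hist_equalize(img)
```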
Arithmetic/Logic Operations
Arithmetic and logic operations play a crucial role in image processing. These operations manipulate pixel
values to achieve specific goals. Here are some key concepts:
Image Subtraction
Image subtraction is useful for detecting changes between two images. For instance, it can highlight
moving objects in video surveillance. The difference image is computed pixel by pixel as [ g(x, y) = f_1(x, y) - f_2(x, y) ], often taken as an absolute value so that negative differences are preserved.
Image Averaging
Image averaging is a simple yet effective technique for noise reduction. Averaging (K) noisy images of the
same scene reduces the noise variance by a factor of (K): [ \bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y) ] A minimal sketch of both operations follows.
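A small NumPy sketch of both operations on synthetic data (the function names are illustrative):

```python
import numpy as np

def image_difference(f1, f2):
    """Absolute difference of two frames; large values flag changed (moving) pixels."""
    return np.abs(f1.astype(np.int16) - f2.astype(np.int16)).astype(np.uint8)

def average_frames(frames):
    """Average K noisy images of the same scene; noise variance drops by a factor of K."""
    return np.mean(np.stack(frames).astype(np.float64), axis=0).astype(np.uint8)

# Synthetic demo: a constant scene plus independent Gaussian noise in each frame
scene = np.full((32, 32), 100, dtype=np.uint8)
frames = [np.clip(scene + np.random.normal(0, 10, scene.shape), 0, 255).astype(np.uint8)
          for _ in range(16)]
print(np.std(frames[0].astype(float)), np.std(average_frames(frames).astype(float)))
```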
Sharpening: The Laplacian
1. Laplacian Operator:
○ For a grayscale image I(x, y):
○ \text{Laplacian}(x, y) = \nabla^2 I(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}
2. Discrete Laplacian:
○ Approximated using finite differences:
○ \text{Laplacian}(x, y) = I(x+1, y) + I(x-1, y) + I(x, y+1) + I(x, y-1) - 4I(x, y)
Example:
Let’s consider a small 3x3 grayscale image with the following pixel values:
[ 50, 60, 70 ]
[ 55, 80, 65 ]
[ 75, 40, 90 ]
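Applying the discrete Laplacian above to the only interior pixel of this image (the centre value 80) gives 60 + 40 + 55 + 65 - 4*80 = -100. A quick check in NumPy:

```python
import numpy as np

img = np.array([[50, 60, 70],
                [55, 80, 65],
                [75, 40, 90]], dtype=np.float64)

def laplacian_at(img, y, x):
    """Discrete Laplacian: sum of the 4-neighbours minus four times the centre pixel."""
    return (img[y + 1, x] + img[y - 1, x] +
            img[y, x + 1] + img[y, x - 1] - 4 * img[y, x])

print(laplacian_at(img, 1, 1))   # -100.0
```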
MODULE 2
Image Enhancement in the Frequency Domain: The Fourier Transform and the Frequency Domain & Image
Restoration - Basic Filtering in the Frequency Domain: Filters, Low-Pass, High-Pass, Correspondence
Between Filtering in the Spatial and Frequency Domains, Smoothing Frequency Domain Filters - Gaussian
Low-Pass Filters, Sharpening Frequency Domain Filters - Gaussian High-Pass Filters; Homomorphic Filtering;
Image Restoration: A Model of the Restoration Process, Noise Models, Restoration in the Presence of Noise
Only - Spatial Filtering - Mean Filters: Arithmetic Mean Filter, Geometric Mean Filter, Order Statistic Filters -
Median Filter, Max and Min Filters; Periodic Noise Reduction by Frequency Domain Filtering - Bandpass
Filters; Minimum Mean Square Error Restoration
1. Fourier Transform:
○ The Fourier Transform decomposes an image into its frequency components.
○ For a 2D grayscale image I(x, y), the continuous Fourier Transform is given by:
○ F(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I(x, y) e^{-j2\pi(ux + vy)} \, dx \, dy
○ The inverse Fourier Transform reconstructs the image from its frequency components:
○ I(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u, v) e^{j2\pi(ux + vy)} \, du \, dv
2. Frequency Domain Filtering:
○ Enhancements can be applied by modifying the Fourier Transform.
○ High-pass filters emphasize edges and fine details, while low-pass filters reduce noise
and smooth the image.
○ Example: To enhance high-frequency components (edges), multiply the Fourier
Transform by a high-pass filter function.
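A sketch of this idea with NumPy's FFT, using a Gaussian mask so that either the low or the high frequencies are kept (the function name and the choice of a Gaussian shape are illustrative):

```python
import numpy as np

def frequency_filter(img, D0=30.0, highpass=False):
    """Multiply the centred 2-D spectrum by a Gaussian mask; highpass=True keeps
    edges and fine detail, highpass=False smooths the image."""
    F = np.fft.fftshift(np.fft.fft2(img))
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2           # squared distance from the centre
    H = np.exp(-D2 / (2 * D0 ** 2))                   # Gaussian low-pass mask
    if highpass:
        H = 1.0 - H                                   # complementary high-pass mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

img = np.random.rand(128, 128)
smoothed = frequency_filter(img, D0=15, highpass=False)
edges = frequency_filter(img, D0=15, highpass=True)
```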
Image Restoration
Image restoration aims to recover the original image from a degraded or noisy version. Here’s how it
works:
1. Degradation Model:
○ The observed image G(x, y) is related to the original image F(x, y) through a degradation
process:
○ G(x, y) = H(x, y) \ast F(x, y) + N(x, y)
■ H(x, y): Degradation function (blur kernel)
■ N(x, y): Additive noise
2. Inverse Filtering:
○ The goal is to estimate F(x, y) from G(x, y).
○ Inverse filtering attempts to invert the degradation process:
○ \hat{F}(u, v) = \frac{G(u, v)}{H(u, v)}
■ Note: Division in the frequency domain can amplify noise.
3. Wiener Filtering:
○ Wiener filtering balances noise reduction and image restoration:
○ \hat{F}(u, v) = \frac{H^*(u, v) G(u, v)}{|H(u, v)|^2 + K}
■ K: Regularization parameter (trade-off between noise and restoration)
Example:
Let’s consider a simple example:
1. Original Image:
○ A grayscale image with sharp edges (e.g., an image of text).
○ Fourier Transform: F(u, v)
2. Degradation:
○ Blurring due to motion (e.g., camera shake during capture).
○ Degradation function: H(x, y) (blur kernel)
○ Additive noise: N(x, y)
3. Observed Image:
○ G(x, y) = H(x, y) \ast F(x, y) + N(x, y)
4. Restoration:
○ Apply inverse filtering or Wiener filtering to estimate F(x, y) from G(x, y).
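A toy NumPy sketch of the restoration formulas above on a synthetic blur (the transfer function H and the constant K are invented for illustration; setting K = 0 reduces the Wiener filter to plain inverse filtering):

```python
import numpy as np

def wiener_restore(G, H, K=0.01):
    """Wiener deconvolution: F_hat = conj(H) * G / (|H|^2 + K)."""
    return np.conj(H) * G / (np.abs(H) ** 2 + K)

rng = np.random.default_rng(0)
f = rng.random((64, 64))                              # "original" image
u = np.fft.fftfreq(64)[:, None]
v = np.fft.fftfreq(64)[None, :]
H = np.exp(-(u ** 2 + v ** 2) * 200)                  # smooth, invertible blur
noise = 0.01 * rng.standard_normal((64, 64))
G = np.fft.fft2(f) * H + np.fft.fft2(noise)           # degraded spectrum
f_hat = np.real(np.fft.ifft2(wiener_restore(G, H, K=0.01)))
print(np.mean((f - f_hat) ** 2))                      # restoration error
```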
1. Low-Pass Filter:
○ A low-pass filter allows low-frequency components (such as smooth variations) to pass
through while attenuating high-frequency components (such as noise or fine details).
○ The ideal low-pass filter in the frequency domain is defined as:
○ H(u, v) = \begin{cases} 1, & \text{if } D(u, v) \leq D_0 \\ 0, & \text{otherwise} \end{cases}
■ (D(u, v)): Distance from the origin in the frequency domain
■ (D_0): Cutoff frequency (determines the balance between smoothing and preservation of details)
2. High-Pass Filter:
○ A high-pass filter enhances high-frequency components (such as edges or rapid intensity
changes) while suppressing low-frequency components.
○ The ideal high-pass filter is complementary to the low-pass filter:
○ H(u, v) = 1 - \text{Low-Pass Filter}(u, v)
1. Gaussian Filter:
○ The Gaussian filter is widely used for smoothing.
○ Its frequency domain representation is given by:
○ H(u, v) = e^{-\frac{D(u, v)^2}{2\sigma^2}}
■ (\sigma): Standard deviation (controls the amount of smoothing)
2. Butterworth Filter:
○ The Butterworth filter provides a smooth transition between passband and stopband.
○ Its frequency domain representation is:
○ H(u, v) = \frac{1}{1 + \left(\frac{D(u, v)}{D_0}\right)^{2n}}
■ (D_0): Cutoff frequency
■ (n): Order (higher values yield sharper transitions)
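A short sketch that builds both masks directly from the formulas above (grid size, sigma, D0 and n are arbitrary example values):

```python
import numpy as np

def distance_grid(rows, cols):
    """D(u, v): distance of each frequency sample from the centre of the spectrum."""
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    return np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)

def gaussian_lowpass(rows, cols, sigma):
    D = distance_grid(rows, cols)
    return np.exp(-D ** 2 / (2 * sigma ** 2))

def butterworth_lowpass(rows, cols, D0, n=2):
    D = distance_grid(rows, cols)
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

H_gauss = gaussian_lowpass(128, 128, sigma=20)
H_butter = butterworth_lowpass(128, 128, D0=20, n=2)   # higher n -> sharper transition
```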
Example:
Let’s consider a simple 4x4 grayscale image.
1. Laplacian Filter:
○ The Laplacian operator detects rapid intensity changes.
○ Its frequency domain representation is:
○ H(u, v) = -4\pi^2(u^2 + v^2)
○ Apply the Laplacian filter to emphasize edges.
Homomorphic Filtering
1. Logarithmic Transformation:
○ Take the logarithm of the original image, which turns the multiplicative illumination-reflectance model into an additive one:
○ \text{LogImage}(x, y) = \log(1 + \text{Image}(x, y))
2. Apply a Frequency-Domain Filter:
○ Classic homomorphic filtering uses a high-frequency-emphasis filter that attenuates low frequencies (slowly varying illumination) and boosts high frequencies (reflectance detail); a low-pass filter can instead be applied in the log domain to suppress noise.
3. Inverse Logarithmic Transformation:
○ Exponentiate the filtered result:
○ \text{EnhancedImage}(x, y) = e^{\text{FilteredLogImage}(x, y)} - 1
Example:
Let’s consider a simple 3x3 grayscale image with pixel values:
[ 50, 60, 70 ]
[ 55, 80, 65 ]
[ 75, 40, 90 ]
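A sketch of the three steps above as one function; the high-frequency-emphasis mask and the gamma parameters are illustrative choices:

```python
import numpy as np

def homomorphic_filter(img, D0=30.0, gamma_l=0.5, gamma_h=2.0):
    """Log-transform the image, attenuate low frequencies (illumination) by gamma_l
    and boost high frequencies (reflectance) by gamma_h, then exponentiate back."""
    log_img = np.log1p(img.astype(np.float64))        # step 1: log(1 + image)
    F = np.fft.fftshift(np.fft.fft2(log_img))
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = gamma_l + (gamma_h - gamma_l) * (1 - np.exp(-D2 / (2 * D0 ** 2)))
    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))   # step 2: filter in log domain
    return np.expm1(filtered)                          # step 3: undo the log transform

img = np.random.randint(1, 256, size=(64, 64)).astype(np.float64)
enhanced = homomorphic_filter(img)
```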
Noise Models
Noise in images can arise from various sources, such as sensor imperfections, transmission, or
digitization. Common noise models include Gaussian noise, impulse (salt-and-pepper) noise, uniform noise, and periodic noise.
1. Noise Estimation:
○ Estimate the noise level in the observed image.
○ Common methods include statistical analysis or visual inspection.
2. Regularization:
○ Regularization techniques (e.g., Tikhonov regularization) balance noise reduction and
image fidelity.
Example:
Let’s consider a small 3x3 grayscale image with the following pixel values:
[ 50, 60, 70 ]
[ 55, 80, 65 ]
[ 75, 40, 90 ]
Arithmetic Mean Filter
1. Filter Operation:
○ For a square neighborhood of size (N \times N), centered at pixel ((x, y)):
○ \text{Filtered}(x, y) = \frac{1}{N^2} \sum_{i=-k}^{k} \sum_{j=-k}^{k} \text{Image}(x+i, y+j)
○ (k = \frac{N-1}{2})
Geometric Mean Filter
1. Filter Operation:
○ Similar to the arithmetic mean filter, but using the geometric mean:
○ \text{Filtered}(x, y) = \left(\prod_{i=-k}^{k} \prod_{j=-k}^{k} \text{Image}(x+i, y+j)\right)^{\frac{1}{N^2}}
Order Statistic Filters
1. Median Filter:
○ The median filter replaces each pixel value with the median of its neighboring pixels.
○ It is robust against outliers and preserves edges.
○ For the same square neighborhood:
○ \text{Filtered}(x, y) = \text{Median}(\text{Image}(x-k:x+k, y-k:y+k))
2. Max and Min Filters:
○ The max filter replaces each pixel value with the maximum value in the neighborhood.
○ The min filter replaces each pixel value with the minimum value in the neighborhood.
Example:
Let’s consider a small 3x3 grayscale image with the following pixel values:
[ 50, 60, 70 ]
[ 55, 80, 65 ]
[ 75, 40, 90 ]
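For the 3x3 neighbourhood shown above, the different filters give the following values at the centre pixel (a quick NumPy check):

```python
import numpy as np

window = np.array([50, 60, 70, 55, 80, 65, 75, 40, 90], dtype=np.float64)

arithmetic_mean = window.mean()                        # 585 / 9 = 65.0
geometric_mean = np.exp(np.log(window).mean())         # 9th root of the product
median = np.median(window)                             # 65, robust to the outliers 40 and 90
maximum, minimum = window.max(), window.min()          # 90 and 40
print(arithmetic_mean, geometric_mean, median, maximum, minimum)
```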
Bandpass Filters
Bandpass filters are designed to selectively allow a certain range of frequencies to pass through while
attenuating others. They are particularly useful for isolating and reducing periodic noise in images. The ideal
bandpass filter passes frequencies whose distance (D(u, v)) from the origin lies between two cutoffs (D_1) and (D_2), and blocks everything else.
Minimum Mean Square Error Restoration
Minimum Mean Square Error (MMSE) restoration aims to estimate the original image from a degraded
version by minimizing the mean square error. Here’s the process:
1. Degradation Model:
○ The observed image G(x, y) is related to the original image F(x, y) through a degradation
process:
○ G(x, y) = H(x, y) \ast F(x, y) + N(x, y)
■ H(x, y): Degradation function (blur kernel)
■ N(x, y): Additive noise
2. MMSE Estimation:
○ Estimate the original image using MMSE:
○ \hat{F}(u, v) = \frac{H^*(u, v)\, G(u, v)}{|H(u, v)|^2 + \frac{S_N(u, v)}{S_F(u, v)}}
■ (S_N(u, v)) and (S_F(u, v)): Power spectral densities of the noise and of the original image
Example:
Let’s consider a simple 3x3 grayscale image with the following pixel values:
[ 50, 60, 70 ]
[ 55, 80, 65 ]
[ 75, 40, 90 ]
1. Bandpass Filtering:
○ Apply an ideal bandpass filter with (D_1 = 10) and (D_2 = 20):
■ Compute the Fourier Transform (F(u, v)).
■ Multiply (F(u, v)) by the bandpass filter in the frequency domain.
■ Inverse Fourier Transform to obtain the filtered image.
2. MMSE Restoration:
○ Assume a known degradation function (H(x, y)) (e.g., motion blur).
○ Estimate the original image using MMSE:
■ Compute the power spectral density of the noise (S_N(u, v)).
■ Apply the MMSE formula to estimate (\hat{F}(u, v)).
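A minimal sketch of step 1, building the ideal bandpass mask with D1 = 10 and D2 = 20 and applying it through the FFT (function names are illustrative):

```python
import numpy as np

def ideal_bandpass(rows, cols, D1=10.0, D2=20.0):
    """Ideal bandpass mask: 1 where D1 <= D(u, v) <= D2, 0 elsewhere."""
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    return ((D >= D1) & (D <= D2)).astype(np.float64)

def bandpass_filter(img, D1=10.0, D2=20.0):
    F = np.fft.fftshift(np.fft.fft2(img))              # centred spectrum
    H = ideal_bandpass(img.shape[0], img.shape[1], D1, D2)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

img = np.random.rand(128, 128)
band = bandpass_filter(img, D1=10, D2=20)              # isolates the chosen frequency band
```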
MODULE 3
Colour & Morphological Image Processing & Registration: Color Fundamentals; Color Models - Converting
Colors Between Different Models; Color Transformation, Smoothing and Sharpening, Color Segmentation;
Morphological Image Processing: Introduction, Logic Operations Involving Binary Images, Dilation and
Erosion, Opening and Closing, Morphological Algorithms - Boundary Extraction, Region Filling, Extraction
of Connected Components, Convex Hull, Thinning, Thickening
Color Fundamentals
Color Models
1. RGB (Red-Green-Blue):
○ Represents colors using combinations of red, green, and blue channels.
○ Each channel ranges from 0 to 255.
○ Example: RGB(255, 0, 0) represents pure red.
2. CMY (Cyan-Magenta-Yellow):
○ Used in subtractive color mixing (e.g., printing).
○ Complementary to RGB.
○ Example: CMY(255, 0, 0) represents cyan (the complement of RGB red).
3. HSV (Hue-Saturation-Value):
○ Describes colors based on perceptual attributes:
■ Hue: Dominant color (0 to 360 degrees).
■ Saturation: Intensity of color (0 to 100%).
■ Value: Brightness (0 to 100%).
1. RGB to HSV:
○ Convert RGB to normalized values (0 to 1).
○ Compute hue, saturation, and value.
2. HSV to RGB:
○ Compute red, green, and blue channels from HSV values.
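Python's standard colorsys module implements exactly these conversions on channels normalized to [0, 1]; a quick check with pure red:

```python
import colorsys

r, g, b = 255 / 255.0, 0 / 255.0, 0 / 255.0            # RGB(255, 0, 0), normalized
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(h * 360, s * 100, v * 100)                        # 0.0 degrees, 100%, 100%

r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)               # and back again
print(round(r2 * 255), round(g2 * 255), round(b2 * 255))   # 255 0 0
```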
Color Transformation
Smoothing and Sharpening
1. Smoothing:
○ Apply filters (e.g., Gaussian) to reduce noise and blur colors.
○ Example: Convolve the RGB channels with a 3x3 Gaussian kernel.
2. Sharpening:
○ Enhance edges and details.
○ Example: Use Laplacian or unsharp masking.
Color Segmentation
1. K-Means Clustering:
○ Cluster similar colors into groups.
○ Example: Segment an image into sky, grass, and buildings.
Example:
Let’s consider an RGB image with the following pixel values (each channel):
1. RGB to HSV:
○ Convert RGB to normalized values (0 to 1):
■ Red: (1, 0, 0)
■ Green: (0.25, 0.75, 0.125)
■ Blue: (0, 0.5, 1)
○ Compute hue, saturation, and value.
2. HSV to RGB:
○ Compute red, green, and blue channels from HSV values.
3. Smoothing:
○ Apply a 3x3 Gaussian filter to each channel.
4. Sharpening:
○ Enhance edges using Laplacian or unsharp masking.
5. Color Segmentation:
○ Use K-means clustering to segment the image into color regions.
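A self-contained sketch of step 5 using a few Lloyd's iterations of K-means on raw RGB values (no library clustering routine; the function name and iteration count are illustrative):

```python
import numpy as np

def kmeans_segment(img, k=3, iters=20, seed=0):
    """Cluster RGB pixels with plain Lloyd's iterations; returns a label per pixel."""
    pixels = img.reshape(-1, 3).astype(np.float64)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                   # nearest centre for each pixel
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    return labels.reshape(img.shape[:2])

rgb = np.random.randint(0, 256, size=(40, 40, 3), dtype=np.uint8)
segments = kmeans_segment(rgb, k=3)     # e.g. sky / grass / buildings in a real photo
```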
Morphological Image Processing
1. Binary Images:
○ Binary images consist of only two intensity levels (usually 0 and 1).
○ Objects of interest are typically represented by white pixels (1), while the background is
represented by black pixels (0).
Example:
Let’s consider a simple 5x5 binary image with the following pixel values (1 for white, 0 for black):
[ 0, 0, 0, 0, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 0, 0, 0, 0 ]
1. Logical AND:
○ Combine this image with another binary image using logical AND.
○ Example: If we have another binary image with all 1s, the result will be the same as the
original image.
2. Logical OR:
○ Combine this image with another binary image using logical OR.
○ Example: If we have another binary image with all 0s, the result will be the same as the
original image.
3. Logical NOT:
○ Invert the pixel values of this binary image.
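These three operations map directly onto NumPy's element-wise boolean operators; a quick check on the image above:

```python
import numpy as np

A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]], dtype=bool)
ones = np.ones_like(A)                       # a second image of all 1s
zeros = np.zeros_like(A)                     # a second image of all 0s

print(np.array_equal(A & ones, A))           # AND with all 1s leaves A unchanged -> True
print(np.array_equal(A | zeros, A))          # OR with all 0s leaves A unchanged  -> True
print((~A).astype(int))                      # NOT swaps foreground and background
```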
Boundary Extraction
1. Boundary of a Binary Object:
○ The boundary of a binary object (white region) is the set of pixels that have at least one
black neighbor.
○ It represents the contour of the object.
2. Boundary Extraction Algorithm:
○ Given a binary image (A):
■ Compute the boundary by subtracting the eroded image from the original image:
■ \text{Boundary}(A) = A - E(A)
■ (E(A)): Erosion of (A)
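A sketch of Boundary(A) = A - E(A) on the 5x5 image from the earlier example, assuming scipy.ndimage is available for the erosion:

```python
import numpy as np
from scipy.ndimage import binary_erosion

A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]], dtype=bool)

boundary = A & ~binary_erosion(A)    # pixels of A that disappear under erosion
print(boundary.astype(int))          # the ring of object pixels touching the background
```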
Region Filling
1. Region Filling Algorithm:
○ Given a seed point ((x, y)) inside a white region:
■ Initialize an empty set (R).
■ Add ((x, y)) to (R).
■ While there are neighboring white pixels:
■ Add them to (R).
■ The set (R) represents the filled region.
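A minimal breadth-first sketch of the seed-growing procedure described above (4-connected neighbours; the function name is illustrative):

```python
import numpy as np
from collections import deque

def flood_fill(img, seed):
    """Collect the 4-connected set of white pixels reachable from the seed point."""
    filled = np.zeros(img.shape, dtype=bool)
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        if (0 <= y < img.shape[0] and 0 <= x < img.shape[1]
                and img[y, x] == 1 and not filled[y, x]):
            filled[y, x] = True                          # add the pixel to the region R
            queue.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return filled

img = np.array([[0, 0, 0, 0],
                [0, 1, 1, 0],
                [0, 1, 1, 0],
                [0, 0, 0, 0]])
print(flood_fill(img, (1, 1)).astype(int))
```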
Convex Hull
1. Convex Hull:
○ The convex hull of a set of points is the smallest convex polygon that encloses all the
points.
○ Useful for shape analysis and pattern recognition.
2. Convex Hull Algorithm:
○ Given a set of points:
■ Compute the convex hull using algorithms like Graham’s scan or Jarvis march.
Example:
Let’s consider a simple 5x5 binary image with the following pixel values (1 for white, 0 for black):
[ 0, 0, 0, 0, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 1, 0, 1, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 0, 0, 0, 0 ]
1. Boundary Extraction:
○ Compute the boundary of the white region.
2. Region Filling:
○ Given a seed point inside the white region, find the filled region.
3. Connected Components:
○ Label the connected components in the image.
4. Convex Hull:
○ Compute the convex hull of the white pixels.
5. Thinning:
○ Apply a thinning algorithm to reduce the thickness of the white regions.
6. Thickening:
○ Apply a thickening algorithm (the morphological dual of thinning) to grow the white regions.
Image Registration
1. Motivation:
○ Combine information from multiple images.
○ Correct geometric distortions.
○ Enable comparison and analysis.
2. Geometric Transformation:
○ Transform one image to match the other.
○ Common transformations include translation, rotation, scaling, and shearing.
Example:
Let’s consider two stereo images of a simple scene. We want to register them and recover depth
information.
1. Affine Transformation:
○ Apply an affine transformation to align the images.
○ Example: Rotate the right image by 10 degrees.
2. Stereo Correspondence:
○ Use block matching to find corresponding points.
○ Compute the disparity map.
3. Depth Recovery:
○ Convert disparity to depth using triangulation: [ \text{Depth} = \frac{B \cdot f}{\text{Disparity}} ]
■ (B): Baseline (distance between cameras)
■ (f): Focal length
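A tiny worked example of the triangulation formula (the baseline, focal length and disparity values are hypothetical numbers chosen only to show the arithmetic):

```python
B = 0.1            # baseline between the two cameras, in metres (hypothetical)
f = 800.0          # focal length, in pixels (hypothetical)
disparity = 16.0   # disparity of a matched point, in pixels (hypothetical)

depth = B * f / disparity
print(depth)       # 5.0 metres
```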
MODULE 4
Segmentation & Object Recognition. Segmentation: Introduction, Region Extraction, Pixel-Based Approach,
Multi-Level Thresholding, Local Thresholding; Segmentation: Region-Based Approach, Edge Operators,
Pattern Fitting Approach, Edge Linking and Edge Following, Edge Elements;
Feature Extraction and Representation: Boundary-Based Description, Region-Based Description, Relationships;
Object Recognition: Deterministic Methods, Clustering, Statistical Classification, Syntactic Recognition,
Tree Search, Graph Matching
Segmentation: Introduction
1. What is Segmentation?:
○ Segmentation is the process of dividing an image into meaningful regions or objects.
○ It aims to group pixels with similar characteristics together.
Region Extraction
1. Region-Based Segmentation:
○ Divide an image into regions based on pixel intensity, color, or texture.
○ Common methods include:
■ Thresholding: Assign pixels to regions based on intensity thresholds.
■ Region Growing: Start with seed pixels and expand regions based on similarity
criteria.
Pixel-Based Approach
1. Pixel-Based Segmentation:
○ Assign each pixel to a specific class or region.
○ Example: Assign pixels as foreground or background.
Multi-Level Thresholding
1. Multi-Level Thresholding:
○ Divide an image into more than two regions using multiple intensity thresholds.
○ Mathematically:
■ Given an image (I(x, y)) and (N) thresholds (T_1 < T_2 < \ldots < T_N): [ R_i = \{(x, y) \mid T_{i-1} < I(x, y) \leq T_i\} ]
■ (R_i): Region of pixels whose intensity falls between thresholds (T_{i-1}) and (T_i)
Local Thresholding
1. Local Thresholding:
○ Adaptively compute thresholds for different image regions.
○ Example: applying Otsu’s method within local windows yields thresholds computed from local histograms.
Example:
Let’s consider a grayscale image with intensity values:
[ 50, 60, 70, 80, 90 ]
[ 55, 65, 75, 85, 95 ]
[ 60, 70, 80, 90, 100 ]
[ 65, 75, 85, 95, 105 ]
[ 70, 80, 90, 100, 110 ]
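On this 5x5 image, a single global threshold and a pair of thresholds give the following segmentations (the threshold values 80, 70 and 90 are arbitrary example choices):

```python
import numpy as np

img = np.array([[50, 60, 70, 80, 90],
                [55, 65, 75, 85, 95],
                [60, 70, 80, 90, 100],
                [65, 75, 85, 95, 105],
                [70, 80, 90, 100, 110]])

binary = (img > 80).astype(int)               # single global threshold: two regions
labels = np.digitize(img, bins=[70, 90])      # thresholds T1 = 70, T2 = 90: three regions
print(binary)
print(labels)
```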
Edge Operators
1. Edge Detection:
○ Edges represent significant changes in intensity or color within an image.
○ Edge detection algorithms highlight these transitions.
2. Gradient Operators:
○ Compute the gradient (rate of change) of intensity in an image.
○ Common gradient operators:
■ Sobel: Computes gradients in both horizontal and vertical directions.
■ Prewitt: Similar to Sobel but with different weightings.
Edge Elements
1. Canny Edge Detector:
○ Combines gradient information and non-maximum suppression.
○ Produces thin, well-localized edges (one pixel wide after non-maximum suppression).
Example:
Let’s consider a grayscale image.
1. Edge Detection:
○ Apply the Sobel operator to compute gradients.
○ Highlight edges by thresholding the gradient magnitude.
2. Pattern Fitting:
○ Detect straight lines using the Hough transform.
○ Fit lines to the detected edge segments.
3. Edge Linking and Following:
○ Connect adjacent edge pixels to form contours.
○ Use Freeman chain code to encode the contours.
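A compact sketch of step 1 with the Sobel kernels, assuming scipy.ndimage is available for the convolution (the threshold value is arbitrary):

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_edges(img, thresh=100.0):
    """Sobel gradients in x and y, gradient magnitude, then a simple threshold."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T                                       # vertical-gradient kernel
    gx = convolve(img.astype(np.float64), kx)
    gy = convolve(img.astype(np.float64), ky)
    magnitude = np.hypot(gx, gy)
    return magnitude, magnitude > thresh

img = np.zeros((20, 20))
img[:, 10:] = 255                                   # a vertical step edge
magnitude, edges = sobel_edges(img)
print(int(edges.sum()))                             # edge pixels cluster around column 10
```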
Boundary-Based Description
1. Boundary Extraction:
○ The boundary of an object represents its contour or outer edge.
○ It is obtained by detecting significant intensity changes or gradients.
○ Common methods:
■ Edge Detection: Sobel, Prewitt, Canny operators.
■ Contour Tracing: Freeman chain code, Moore-Neighbor tracing.
2. Boundary Descriptors:
○ Quantify the shape of the boundary.
○ Examples:
■ Perimeter: Total length of the boundary.
■ Curvature: Rate of change of direction along the boundary.
Region-Based Description
1. Region Properties:
○ Describe regions based on pixel values within the object.
○ Common properties:
■ Area: Number of pixels in the region.
■ Centroid: Geometric center of the region.
■ Compactness: Ratio of the squared perimeter to the area ((P^2 / A)).
2. Region Descriptors:
○ Quantify the characteristics of the entire region.
○ Examples:
■ Euler Number: Topological property indicating the number of holes.
■ Moments: Statistical measures of pixel intensities.
Example:
Let’s consider a binary image with a simple object:
[ 0, 0, 0, 0, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 1, 0, 1, 0 ]
[ 0, 1, 1, 1, 0 ]
[ 0, 0, 0, 0, 0 ]
1. Boundary Extraction:
○ Compute the boundary using contour tracing.
○ Calculate the perimeter.
2. Region-Based Description:
○ Compute the area of the object.
○ Find the centroid (average position of white pixels).
3. Shape Context:
○ Measure the curvature at different points along the boundary.
○ Compare the distribution of boundary points with the centroid.
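The area, centroid, and a crude boundary-pixel count for the object above can be computed directly (the 4-neighbour perimeter estimate is a simplification of the descriptors discussed earlier):

```python
import numpy as np

obj = np.array([[0, 0, 0, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 0, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 0, 0, 0]])

area = int(obj.sum())                             # number of white pixels -> 8
ys, xs = np.nonzero(obj)
centroid = (ys.mean(), xs.mean())                 # (2.0, 2.0), the geometric centre

# Crude perimeter: white pixels with at least one 4-connected black neighbour
padded = np.pad(obj, 1)
neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
              padded[1:-1, :-2] + padded[1:-1, 2:])
perimeter = int(np.sum((obj == 1) & (neighbours < 4)))   # every white pixel here -> 8
print(area, centroid, perimeter)
```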
Deterministic Methods
1. Template Matching:
○ Compares a template (small image) with regions of the larger image.
○ Measures similarity using correlation or sum of squared differences (a minimal sketch follows this list).
○ Example:
■ Given a template of a stop sign, find all occurrences in a traffic scene.
2. Feature Matching:
○ Detects distinctive features (corners, edges) in both template and target images.
○ Matches features based on descriptors (e.g., SIFT, SURF).
○ Example:
■ Match keypoints between two images for object recognition.
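Below is a minimal sketch of template matching by sum of squared differences (item 1 in the list above). The function name is illustrative, and the "template" is simply a patch cut out of a random image so the best match position is known in advance:

```python
import numpy as np

def match_template_ssd(image, template):
    """Slide the template over the image; the position with the smallest
    sum of squared differences is the best match."""
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            diff = image[y:y + th, x:x + tw].astype(np.float64) - template
            scores[y, x] = np.sum(diff ** 2)
    return np.unravel_index(scores.argmin(), scores.shape)

image = np.random.randint(0, 256, (60, 60))
template = image[20:30, 35:45].astype(np.float64)   # cut a patch out as the "template"
print(match_template_ssd(image, template))           # -> (20, 35)
```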
Clustering
1. K-Means Clustering:
○ Groups similar data points into clusters.
○ Example:
■ Cluster color histograms of images to recognize objects.
Statistical Classification
1. Bayesian Decision Theory:
○ Classifies objects based on probability distributions.
○ Uses Bayes’ theorem to estimate class probabilities.
○ Example:
■ Given features (e.g., texture, color), classify fruits as apples or oranges.
Syntactic Recognition
1. Syntactic Pattern Recognition:
○ Models patterns using formal grammars or automata.
○ Analyzes structural relationships.
○ Example:
■ Recognize license plates using context-free grammars.
Tree Search
1. Decision Trees:
○ Hierarchical structure for classification.
○ Each node represents a decision based on features.
○ Example:
■ Decide whether an email is spam or not based on keywords.
Graph Matching
1. Graph Matching Algorithms:
○ Match graphs (representing objects) based on topology.
○ Example:
■ Match hand-drawn sketches to predefined shapes.
Example:
Let’s consider an image recognition scenario:
1. Template Matching:
○ Given a template of a cat’s face, find all occurrences in a collection of cat images.
2. Feature Matching:
○ Detect keypoints (corners) in the template and target images.
○ Match keypoints based on descriptors (e.g., SIFT).
3. K-Means Clustering:
○ Cluster color histograms of images into categories (e.g., animals, vehicles).
4. Bayesian Decision Theory:
○ Estimate probabilities of an image belonging to different classes (e.g., dog, car).
5. Syntactic Recognition:
○ Define a grammar for recognizing handwritten digits (0-9).
6. Decision Trees:
○ Build a decision tree to classify images as indoor or outdoor scenes.
7. Graph Matching:
○ Match graphs representing social networks to identify similar users.