
CHAPTER 1

INTRODUCTION

1.1 GENERAL BACKGROUND

Images are considered one of the most important media for conveying information in the field of computer vision. An image is a visual representation of data, while a digital image is a binary representation of visual data. Images can take the form of photographs, graphics, and individual video frames. In this sense, an image is a picture that has been created or copied and stored in electronic form. Understanding images allows the extraction of useful information that can be utilized for a variety of other activities, such as aiding robot navigation, identifying an airport from remote sensing data, extracting malignant tissue from body scans, and detecting abnormalities in the human body.

1.2. IMAGE PROCESSING

An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that point. When x, y, and the amplitude values of F are all finite, discrete quantities, the image is called a digital image. In other words, an image can be represented by a two-dimensional array arranged in rows and columns.
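This array view can be made concrete with a short sketch (a NumPy illustration with arbitrary sample values, not drawn from any particular dataset):

```python
import numpy as np

# A tiny 3x4 digital image: F sampled at discrete spatial coordinates.
# Each entry is the intensity of the image at that (row, column) point.
F = np.array([
    [12,  50,  50, 12],
    [50, 200, 200, 50],
    [12,  50,  50, 12],
], dtype=np.uint8)

rows, cols = F.shape        # the image is a 2D array of rows and columns
intensity = int(F[1, 2])    # intensity at one pair of coordinates
```

Here the finite extent of the array and the finite range of `uint8` values (0 to 255) are exactly what make the image digital.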

Image processing is one of the most rapidly developing domains of computer science. Technological improvements in imaging, computer processors, and mass storage devices have fuelled its growth. In several domains of science and engineering, the processing of color and grayscale images and other two-dimensional signals has become an important tool for research and investigation. The massive volumes of images produced each day by these and other sources are beyond the limits of manual assessment. Extracting valuable information from images is the fundamental task of image processing. Theoretically, computers can accomplish this with negligible or no human involvement [117].

1.3. TYPES OF IMAGE PROCESSING

The two types of image processing technology, analog and digital image processing, are explained below.

1.3.1 Analog Image Processing

Analog image processing operates on analog signals, typically two-dimensional analog signals. In this type of processing, images are manipulated by electrical means, by varying the electrical signal. A common example is the television image. Over time, digital image processing has come to dominate analog image processing due to its wider range of applications.

1.3.2 Digital Image Processing

A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, or pixels; "pixel" is the term most widely used to denote the elements of a digital image. A computer interprets a digital image as a 2D or 3D matrix, where each value, or pixel, in the matrix represents an amplitude known as the "intensity" of the pixel. Typically, computers deal with 8-bit images, in which the amplitude value ranges from 0 to 255. Based on these intensity ranges, digital images are classified into the following types.

➢ Binary Image

➢ Grayscale Image

➢ RGB color image

(i) Binary Image

A binary image is one whose pixels can take exactly one of two colors, usually black and white. It is the simplest type of image: each pixel takes only one of two values, 0 or 1, where 0 represents black and 1 represents white. A binary image is a 1-bit image, since only one binary digit is needed to represent a pixel. Binary images are mostly used to represent general shape or outline.

(ii) Gray scale Image

Grayscale images are monochrome images: they contain no color information, and each pixel stores a single intensity value corresponding to one of the available gray levels. Grayscale, or 8-bit, images are composed of 256 distinct intensity levels, where a pixel intensity of 0 represents black and a pixel intensity of 255 represents white. The 254 values in between are different shades of gray.

(iii) RGB color image

Typical color images can be described using three colors, namely Red, Green, and Blue (RGB). A color image can be represented using three 2D arrays of the same size, one for each color channel: red (R), green (G), and blue (B). Each array element contains an 8-bit value indicating the amount of red, green, or blue at that point on a [0, 255] scale.
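The three image types above can be sketched as arrays (a minimal NumPy illustration; the pixel values are arbitrary):

```python
import numpy as np

# Binary image: 1 bit per pixel, 0 for black and 1 for white.
binary = np.array([[0, 1],
                   [1, 0]], dtype=np.uint8)

# Grayscale image: 8 bits per pixel, 256 levels from 0 (black) to 255 (white).
gray = np.array([[0, 128],
                 [200, 255]], dtype=np.uint8)

# RGB color image: three 2D channels of the same size, each value in [0, 255].
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = (255, 0, 0)    # a pure red pixel
```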

1.4. ADVANTAGES OF DIGITAL IMAGE PROCESSING

Digital image processing has many advantages over analog image

processing; it allows a much wider range of algorithms to be applied to input data,

and can avoid problems such as the build-up of noise and signal distortion during

processing. The advantages of Digital Image Processing are as follows.

➢ Enhanced Image Quality – One of the primary advantages of digital image

processing is the ability to enhance the quality of images. With the use of

algorithms, digital images can be sharpened, brightened, or color corrected to

produce a clearer and more visually appealing picture.

➢ Improved Medical Diagnosis – Digital image processing is also used in the

field of medicine to improve the accuracy of diagnosis. For example, medical

images like X-rays and MRIs can be processed to highlight areas of interest

or to differentiate between healthy and diseased tissues.

➢ Increased Efficiency – Digital image processing systems can process images

much faster than manual methods. This can help save time and resources in

industries like manufacturing, where inspection and quality control processes

are crucial.

➢ Enhanced Security – Digital image processing systems are also used for

security and surveillance purposes. For example, facial recognition

algorithms can be used to identify people or to detect unusual activity in

public spaces.

➢ Creative Applications – Lastly, digital image processing can be used in

creative applications like graphic design, video editing, and virtual reality. By

manipulating digital images, artists and designers can create new and unique

visual experiences.

1.5. FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING

The 11 fundamental steps in digital image processing are depicted in Figure 1.1, and each step is described in this section.

Figure 1.1. Fundamental steps of Digital Image Processing

➢ Image acquisition: This is the first fundamental step in digital image processing. Image acquisition could be as simple as being given an image that is already in digital form. Generally, the acquisition stage also involves pre-processing, such as scaling.

➢ Image enhancement: Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image, for example by changing brightness and contrast.

➢ Image restoration: Image restoration is an area that also deals with

improving the appearance of an image. However, unlike enhancement,

which is subjective, image restoration is objective, in the sense that

restoration techniques tend to be based on mathematical or probabilistic

models of image degradation.

➢ Color image processing: Color image processing is an area that has been gaining importance because of the significant increase in the use of digital images over the Internet. It may include color modeling and processing in a digital domain.

➢ Wavelets and Multi-resolution processing: Wavelets are the foundation for representing images at various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.

➢ Compression: Compression deals with techniques for reducing the storage required to save an image or the bandwidth needed to transmit it. Compression is particularly necessary for images transmitted over the Internet.

➢ Morphological processing: Morphological processing deals with tools for

extracting image components that are useful in the representation and

description of shape.

➢ Segmentation: Segmentation procedures partition an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A robust segmentation procedure brings the process a long way toward the successful solution of imaging problems that require objects to be identified individually.

➢ Representation and description: Representation and description almost

always follow the output of a segmentation stage, which usually is raw pixel

data, constituting either the boundary of a region or all the points in the

region itself. Choosing a representation is only part of the solution for

transforming raw data into a form suitable for subsequent computer

processing. Description deals with extracting attributes that result in some

quantitative information of interest or are basic for differentiating one class

of objects from another.

➢ Object Recognition: Recognition is the process that assigns a label, such as "vehicle," to an object based on its descriptors.

➢ Knowledge Base: Knowledge may be as simple as detailing regions of an

image where the information of interest is known to be located, thus limiting

the search that must be conducted in seeking that information. The

knowledge base also can be quite complex, such as an interrelated list of all

major possible defects in a materials inspection problem or an image

database containing high-resolution satellite images of a region in

connection with change-detection applications.

1.6. TECHNIQUES IN DIGITAL IMAGE PROCESSING

Some image processing techniques take an image as both input and output; others take an image as input but produce attributes of the image as output. Not all of these techniques are required at the same time: the selection of techniques is application specific. Some image processing techniques are detailed below.

1.6.1. Image Pre-processing

A fundamental step in image processing and computer vision is image preprocessing [45,46]. The aim of image preprocessing is the improvement of image data by enhancing some features while suppressing unwanted distortions; which features to enhance depends on the specific application. Image data recorded by sensors on a satellite, for example, contains errors related to the geometry and brightness values of the pixels. In image preprocessing, these errors are corrected using appropriate mathematical models, which are either deterministic or statistical. Image preprocessing also includes primitive operations such as noise reduction, contrast enhancement, image smoothing and sharpening, and advanced operations such as image segmentation. According to the size of the pixel neighborhood used for the calculation of a new pixel brightness, image preprocessing techniques can be categorized as:

➢ Pixel brightness transformations
➢ Geometric transformations
➢ Image restoration that requires knowledge about the entire image
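The first category, pixel brightness transformations, can be illustrated with a small sketch (a gamma correction written in NumPy; the function name and sample values are illustrative, not taken from the text):

```python
import numpy as np

def gamma_transform(image, gamma):
    """Pixel brightness transformation: each output pixel depends only on
    the corresponding input pixel, not on any neighborhood."""
    normalized = image.astype(np.float64) / 255.0
    return np.clip(normalized ** gamma * 255.0, 0, 255).astype(np.uint8)

dark = np.full((4, 4), 64, dtype=np.uint8)     # a uniformly dark image
brightened = gamma_transform(dark, gamma=0.5)  # gamma < 1 brightens
```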

1.6.2. Image Enhancement

The images captured by some conventional digital cameras, as well as satellite images, may lack contrast and brightness because of the limitations of the imaging subsystem and the illumination conditions at capture time. Image enhancement is one of the simplest and most appealing techniques for overcoming this difficulty: it brings out features that are concealed, or highlights certain features of interest, for subsequent analysis of an image. Image enhancement [45] techniques are categorized into two types. They are as follows

➢ spatial domain method

➢ frequency domain method.

Spatial-domain methods deal with the direct modification or aggregation of the pixels that form the image, while frequency-domain methods enhance the image by operating on its transform, typically through a linear, position-invariant operator. Some common image enhancement techniques are contrast stretching, noise filtering, and histogram modification.
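As a sketch of one spatial-domain technique named above, contrast stretching linearly remaps the observed intensity range onto the full [0, 255] scale (illustrative NumPy code, not taken from the original text):

```python
import numpy as np

def contrast_stretch(image):
    """Linearly stretch the pixel intensities to the full [0, 255] range."""
    lo, hi = float(image.min()), float(image.max())
    if hi == lo:                       # a flat image cannot be stretched
        return image.copy()
    out = (image.astype(np.float64) - lo) * 255.0 / (hi - lo)
    return out.astype(np.uint8)

low_contrast = np.array([[100, 110],
                         [120, 130]], dtype=np.uint8)
enhanced = contrast_stretch(low_contrast)   # now spans 0 to 255
```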

1.6.3. Image Restoration

The process of recovering a degraded or corrupted image by removing noise or blur, to improve its appearance, is called image restoration [46]. The degraded image is modeled as the convolution of the original image with a degradation function, plus additive noise. Restoration is performed with the help of prior knowledge of the noise or disturbance that caused the degradation. It can be done in two domains: the spatial domain and the frequency domain. In the spatial domain, the filtering that restores the image is applied directly to the pixels of the digital image; in the frequency domain, the filtering is applied after mapping the image from the spatial domain into the frequency domain by the Fourier transform. After filtering, the image is mapped back into the spatial domain by the inverse Fourier transform to obtain the restored image. Either domain can be chosen depending on the application.
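A minimal sketch of spatial-domain restoration, assuming simple additive noise, is mean (averaging) filtering, where each pixel is replaced by the mean of its neighborhood (illustrative code; this is only one of many restoration filters):

```python
import numpy as np

def mean_filter(image, size=3):
    """Spatial-domain restoration: replace each pixel with the mean of its
    size x size neighborhood, suppressing additive noise."""
    pad = size // 2
    padded = np.pad(image.astype(np.float64), pad, mode='edge')
    out = np.empty(image.shape, dtype=np.float64)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].mean()
    return out

noisy = np.array([[10.0, 10.0, 10.0],
                  [10.0, 100.0, 10.0],   # one noisy spike in the center
                  [10.0, 10.0, 10.0]])
restored = mean_filter(noisy)
```

The frequency-domain route would instead multiply the Fourier transform of the image by a filter and invert the transform; either domain can express the same linear filtering.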

1.6.4. Morphological Processing

Morphological processing [182] is a collection of non-linear operations used for extracting image components that are useful in the representation and description of shape. A structuring element is a small set used to probe an image under study: it is compared with the corresponding neighborhood of pixels by positioning it at all possible locations in the image. Some operations check whether the element "fits" within the neighborhood, while others check whether it "hits" or intersects the neighborhood. The basic morphological operations are dilation, erosion, and their combinations. Erosion is useful for removing structures of a certain shape and size, given by the structuring element, while dilation is useful for filling holes of a certain size and shape, given by the structuring element.
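Erosion and dilation can be sketched for binary images with a square structuring element (a NumPy illustration; real systems typically use library routines such as those in scipy.ndimage):

```python
import numpy as np

def erode(image, size=3):
    """Erosion: a pixel stays foreground only if the size x size square
    structuring element fits entirely inside the foreground there."""
    pad = size // 2
    padded = np.pad(image, pad, mode='constant', constant_values=0)
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].min()
    return out

def dilate(image, size=3):
    """Dilation: a pixel becomes foreground if the structuring element
    "hits" (intersects) the foreground anywhere in its window."""
    pad = size // 2
    padded = np.pad(image, pad, mode='constant', constant_values=0)
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].max()
    return out

square = np.zeros((5, 5), dtype=np.uint8)
square[1:4, 1:4] = 1          # a 3x3 foreground block
eroded = erode(square)        # shrinks the block to its center pixel
dilated = dilate(square)      # grows the block to fill the grid
```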

1.6.5. Object Recognition

Object recognition [125] is the process that assigns a label, e.g. "vehicle", to an object in a digital image or video based on its descriptors. Object recognition deals with training the computer to identify a particular object from various perspectives, in various lighting conditions, and against various backgrounds. The appearance of an object can vary due to scene clutter, photometric effects, and changes in the shape and viewpoint of the object. Object recognition has wide applications in monitoring and surveillance, robot localization and navigation, medical analysis, etc.

1.6.6. Image Compression

Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it, without degrading the quality of the image to an unacceptable level [45,46]. Compression makes it possible to store more images in a given amount of disk or memory space. It also reduces the time required for images to be transmitted over the Internet or downloaded from web pages. Compression techniques are of two types: lossy and lossless compression.

Lossy compression: Lossy compression is irreversible; some data from the original image file is lost. Lossy methods are especially suitable for natural images such as photographs, in applications where a minor loss of fidelity is acceptable to achieve a substantial reduction in bit rate.

Lossless compression: In lossless compression, we can reduce the size of an image

without any quality loss. Lossless compression is preferred for archival purposes

and often for medical imaging, technical drawings, clip art, or comics.

Some of the common image compression techniques are as follows

➢ Fractal

➢ Wavelets

➢ Chroma subsampling

➢ Transform coding

➢ Run-length encoding
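Run-length encoding, the last technique in the list, is simple enough to sketch in full: each run of repeated values is stored as a (value, count) pair, and decoding reproduces the input exactly, which makes the scheme lossless (illustrative code):

```python
def rle_encode(values):
    """Run-length encoding: compress a sequence into (value, count) runs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return runs

def rle_decode(runs):
    """Invert the encoding by expanding each run back into values."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [0, 0, 0, 0, 255, 255, 0, 0]    # one row of a mostly-black image
encoded = rle_encode(row)             # 8 values become 3 runs
decoded = rle_decode(encoded)
```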

1.6.7. Image Segmentation

Image Segmentation is the process of partitioning a digital image into

multiple segments, to simplify and/or change the representation of an image into

something that is more meaningful and easier to analyze. It may make use of

statistical classification, thresholding, edge detection, region detection, or any

combination of these techniques. Usually, a set of classified elements is obtained as

the output of the segmentation step. Segmentation techniques can be classified as

either region-based or edge-based. The former techniques rely on common patterns

in intensity values within a cluster of neighboring pixels, and the goal of the

segmentation algorithm is to group regions according to their anatomical or

functional roles. The edge-based techniques rely on discontinuities in image values

between distinct regions, and the goal of the segmentation algorithm is to accurately

demarcate the boundary separating these regions.
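Thresholding, the simplest of the techniques listed above, can be sketched as follows (illustrative NumPy code with an arbitrary global threshold):

```python
import numpy as np

def threshold_segment(image, threshold):
    """Partition an image into two segments by a global intensity
    threshold: 1 for object pixels, 0 for background."""
    return (image > threshold).astype(np.uint8)

image = np.array([[20, 30, 200],
                  [25, 210, 220],
                  [15, 20, 30]], dtype=np.uint8)
mask = threshold_segment(image, threshold=128)   # bright region = object
```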

1.6.8. Image classification

Image classification is the task of assigning a label or class to an entire image, where each image is expected to belong to exactly one class. Image classification models take an image as input and return a prediction about which class the image belongs to. Classification involves assigning a label or tag to an entire image based on pre-existing training data of already-labeled images. While the process may appear simple at first glance, it entails pixel-level image analysis to determine the most appropriate label for the overall image. This provides valuable data and insights, enabling informed decisions and actionable outcomes. Depending on the problem at hand, there are different types of image classification methodologies that can be employed: binary, multiclass, multilabel, and hierarchical.

Binary: Binary classification takes an either-or logic to labeling images and classifies unknown data points into two categories. Tasks such as categorizing a tumor as benign or malignant, analyzing product quality to determine whether it has defects, and many other problems that require yes/no answers are solved with binary classification.

Multiclass: While binary classification is used to distinguish between two classes of objects, multiclass classification, as the name suggests, categorizes items into three or more classes. It is very useful in many domains, such as NLP (sentiment analysis where more than two emotions are present) and medical diagnosis (classifying diseases into different categories).

Multilabel: Unlike multiclass classification, where each image is assigned to

exactly one class, multilabel classification allows the item to be assigned to multiple

labels. For example, you may need to classify image colors and there are several

colors. A picture of a fruit salad will have red, orange, yellow, purple, and other

colors depending on your creativity with fruit salads. As a result, one image will

have multiple colors as labels.

Hierarchical: Hierarchical classification is the task of organizing classes into a

hierarchical structure based on their similarities, where a higher-level class

represents broader categories and a lower-level class is more concrete and specific.

1.7. CHARACTERISTICS OF DIGITAL IMAGE PROCESSING

Some of the characteristics of Digital Image processing are as follows

➢ It is software-based, and some of the software is free of cost.
➢ It provides clearer images.
➢ It enables image enhancement, making it easier to recover information from images.
➢ It is widely used in many fields.
➢ It supports a better quality of life.

1.8. APPLICATIONS OF DIGITAL IMAGE PROCESSING

Digital image processing has a direct effect on everyday life and continues to grow over time, with new technologies, in almost all fields. A few significant applications of digital image processing in different fields are explained below.

➢ Image sharpening and restoration: This refers to processes that modify the look and feel of an image, manipulating it to achieve the desired output. It includes conversion, sharpening, blurring, edge detection, and retrieval and recognition of images.

➢ Robot Vision: Several robotic machines work on the basis of digital image processing. Through image processing techniques, robots find their way, as in hurdle-detection robots and line-follower robots.

➢ Pattern recognition: Pattern recognition combines image processing with artificial intelligence, so that computer-aided diagnosis, handwriting recognition, and image recognition can be readily implemented. Nowadays, image processing is widely used for pattern recognition.

➢ Video processing: Video processing is also one of the applications of digital image processing. A video is a collection of frames or pictures arranged in such a way that it produces the appearance of moving pictures. Video processing involves frame rate conversion, motion detection, noise reduction, color space conversion, etc.

➢ Medical Field: Several applications in the medical field depend on digital image processing. A few of them are gamma-ray imaging, PET scans, X-ray imaging, medical CT scans, and UV imaging.

1.9. MEDICAL IMAGE PROCESSING

Medical image processing encompasses the use and exploration of image

datasets of the human body to diagnose pathologies or guide medical interventions

such as surgical planning, or for research purposes. Medical image processing is

carried out by radiologists, engineers, and clinicians to better understand the

anatomy of either individual patients or population groups. Medical image

processing deals with the development of problem-specific approaches to the

enhancement of raw medical image data for the purposes of selective visualization

as well as further analysis [192]. The main benefit of medical image processing is

that it allows for in-depth, but non-invasive exploration of internal anatomy. 3D

models of the anatomies of interest can be created and studied to improve treatment

outcomes for the patient, develop improved medical devices and drug delivery

systems, or achieve more informed diagnoses. It has become one of the key tools

leveraged for medical advancement in recent years. There are many topics in

medical image processing: some emphasize general applicable theory and some

focus on specific applications.

1.9.1. Medical Imaging Modalities

Detection of disease at the initial stage, using various imaging modalities, is one of the most important factors in decreasing the mortality rate due to cancer and tumors. Modalities help radiologists and doctors study the internal structure of the detected disease and retrieve the required features. Various medical imaging modalities (I-Scan-2, CT scan, MRI, X-ray, mammogram, and electrocardiogram (ECG)) are used for classifying diseases in the primary studies. The following modalities, in particular, are used for the evaluation of medical data with different algorithms and techniques:

➢ Magnetic Resonance Imaging (MRI)

➢ Computed Tomography (CT)

➢ Mammogram

➢ Electrocardiogram (ECG)

➢ X Ray

(i) Magnetic Resonance Imaging (MRI)

MRI uses magnetic resonance to obtain electromagnetic signals. These signals are generated from human organs, and information about the organ structure is reconstructed from them [184]. High-resolution MRIs contain more structural detail, which is required for locating lesions and diagnosing disease.

(ii) Computed Tomography (CT)

CT is a technology that generates 3-D images from 2-D X-ray images using digital geometry [180].

(iii) Mammogram

Mammograms are used for effective breast cancer screening and early detection of abnormalities. Calcifications and masses are considered the most common abnormalities leading to breast cancer [20].

(iv) Electrocardiogram (ECG)

The ECG measures the electrical activity of the heart and is used to detect cardiac problems in humans [14, 154, 216].

(v) X Ray

X-rays are a form of electromagnetic radiation, similar to visible light.

Medical x-rays are used to generate images of tissues and structures inside the body.

If x-rays traveling through the body also pass through an x-ray detector on the other

side of the patient, an image will be formed that represents the “shadows” formed

by the objects inside of the body. Chest X-ray (CXR) images help to identify pulmonary diseases such as tuberculosis, pneumonia, and COVID-19 [210].

The process of medical image processing begins by acquiring raw data from

any of the above images and reconstructing them into a format suitable for use in

relevant software. A 3D bitmap of greyscale intensities stored on a voxel (3D pixel) grid forms the typical input for image processing. In a CT scan, the greyscale intensity depends on X-ray absorption, while in MRI it is determined by the strength of signals from proton particles during relaxation, after the application of very strong magnetic fields. Among all imaging modalities, the X-ray is one of the most preferred for disease detection, as it is inexpensive and its equipment is easy to install.

1.9.2. Medical Image Classification

Medical Image Classification is a task in medical image analysis that involves

classifying medical images, such as X-rays, MRI scans, and CT scans, into different

categories based on the type of image or the presence of specific structures or

diseases. The goal is to use computer algorithms to automatically identify and

classify medical images based on their content, which can help in diagnosis,

treatment planning, and disease monitoring. Medical image classification is one of

the most important problems in the image recognition area, and its aim is to classify

medical images into different categories to help doctors in disease diagnosis or

further research. Overall, medical image classification can be divided into two

steps. The first step is extracting effective features from the image. The second step

is using the features to build models that classify the image dataset. In the past,

doctors usually used their professional experience to extract features and classify medical images into different classes, which is a difficult, tedious, and time-consuming task, prone to instability and nonrepeatable outcomes. Research to date suggests that medical image classification applications have great merit [102].

Chest X-rays (CXR) are commonly used for the detection and screening of lung disorders [174], as the CXR is one of the most common and easily obtained medical tests used to diagnose common diseases of the chest. It is one of the most frequently used diagnostic modalities for detecting different lung diseases such as pneumonia or tuberculosis [31]. Detection and classification of lung diseases using chest X-ray images is a complex process for radiologists, so it has received significant attention from researchers aiming to develop automatic lung disease detection techniques [17,79,154]. Over the past decade, many computer-aided diagnosis (CAD) systems have been introduced for lung disease detection using X-ray images, but such systems failed to achieve the required performance for lung disease detection and classification. Lung infections associated with the recent COVID-19 pandemic have made these tasks very challenging for such CAD systems. It is essential to detect the appearance of diseases such as pneumonia, tuberculosis, and COVID-19 in the lungs and to classify them, which can be accomplished through the following tasks using CXR images.

➢ Feature Extraction

➢ Feature Selection

➢ Building a classification model

1.9.3. Feature Extraction

Feature extraction (FE) is an important step in image retrieval, image processing, data mining, and computer vision. It is the process of extracting relevant information from raw data [165], and is used to extract the most distinctive features present in images, which are then used to represent and describe the data. An image is a collection of prominent features; as the common saying goes, "a picture is worth a thousand words" [39]. Feature extraction is part of the dimensionality reduction process, in which an initial set of raw data is divided and reduced to more manageable groups. The most important characteristic of these large data sets is their large number of variables, which require substantial computing resources to process. Extracting features thus helps to obtain the best features from big data sets by selecting and combining variables into features, effectively reducing the amount of data. These features are easy to process, yet still able to describe the actual data set accurately and faithfully.

Feature extraction also refers to the process of transforming raw data into numerical

features that can be processed while preserving the information in the original data

set. It yields better results than applying machine learning or deep learning directly

to the raw data. Feature extraction can be accomplished using two methods.

➢ Manual feature extraction – Handcrafted features

➢ Automatic feature extraction – Deep features

(i) Manual Feature Extraction

Manual feature extraction requires identifying and describing the features

that are relevant for a given problem and implementing a way to extract those

features. In many situations, having a good understanding of the background or

domain can help make informed decisions as to which features could be useful.

Extracting features such as color, texture, and statistical features from images by explicitly applying algorithms is called manual feature extraction, and the features extracted in this way are called handcrafted features. These features are further used for image classification; handcrafted-feature-based classification can be done using machine learning models.
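A minimal sketch of manual feature extraction might compute a few statistical features by hand (the particular features chosen here, mean, standard deviation, and histogram entropy, are illustrative rather than a standard set):

```python
import numpy as np

def handcrafted_features(image):
    """Manually extract simple statistical features from a grayscale image."""
    pixels = image.astype(np.float64).ravel()
    hist, _ = np.histogram(pixels, bins=8, range=(0, 256))
    prob = hist / hist.sum()
    nonzero = prob[prob > 0]
    return {
        "mean": float(pixels.mean()),
        "std": float(pixels.std()),
        # Entropy of the intensity histogram, a crude texture measure.
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }

img = np.array([[0, 0],
                [255, 255]], dtype=np.uint8)
features = handcrafted_features(img)
```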

(ii) Automatic feature extraction

Automated feature extraction uses specialized algorithms or deep networks to extract features from images automatically, without human intervention. Automatic feature extraction can be performed using deep neural networks, often their earlier layers. Features extracted automatically by deep networks are called deep features, and deep-feature-based classification can be accomplished using a deep learning-based classification model.

1.9.4. Feature selection

Feature selection is a dimensionality reduction technique widely used for

data mining and knowledge discovery and it allows exclusion of redundant features,

concomitantly retaining the underlying hidden information [191]. Feature selection

can eliminate the irrelevant noisy features and thus improve the quality of the data

set and the performance of learning systems [107]. Although a large number of

features can be used to describe an image, only a small number of them are useful

and efficient for classification. Feature selection is typically done to choose a

condensed and pertinent feature subset in order to reduce the dimensionality of

feature space, which will eventually improve the classification accuracy and save

time. More features do not always provide better classification performance.

Feature selection techniques can be divided into three groups which are as follows

➢ Filter methods

➢ Wrapper methods

➢ Embedded methods

(i) Filter methods

In filter methods, features are selected based on general characteristics of the dataset (here, the images). These methods are generally applied during the pre-processing step. A filter method reduces the number of features independently of the classification model. It is faster and usually the better approach when the number of features is huge. It avoids overfitting but may sometimes fail to select the best features.
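As a concrete illustration, a variance-based filter can be sketched in a few lines of Python. This is a generic example, not part of the proposed method; the function name and toy data are hypothetical. Features whose values barely vary across the dataset carry little discriminative information, so they are dropped before any classifier is trained.

```python
def variance_filter(samples, k):
    """Rank features by variance across samples and keep the top k.

    A variance filter is a classic filter method: it looks only at the
    data itself, never at a classifier's predictions.
    """
    n = len(samples)
    d = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    variances = [sum((row[j] - means[j]) ** 2 for row in samples) / n
                 for j in range(d)]
    # Indices of the k highest-variance features, returned in order.
    ranked = sorted(range(d), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])

# Feature 1 is constant across all samples, so it is filtered out first.
data = [[1.0, 5.0, 0.2], [3.0, 5.0, 0.9], [2.0, 5.0, 0.1]]
print(variance_filter(data, 2))  # -> [0, 2]
```

Because no classifier is involved, the cost is a single pass over the data, which is why filter methods scale well to huge feature sets.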

(ii) Wrapper methods

Wrapper methods wrap the feature selection around the classification model and use the prediction accuracy of the model to iteratively select or eliminate a set of features. Features are added or removed based on conclusions drawn from models trained earlier in the search. The main advantage of wrapper methods over filter methods is that they provide an optimal set of features for training the model, thus resulting in better accuracy than filter methods, but they are computationally more expensive.
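A minimal wrapper method can be sketched as greedy forward selection. The evaluator below is a hypothetical stand-in for retraining and scoring an actual classifier on each candidate subset, which is exactly where the computational expense of wrapper methods comes from.

```python
def forward_selection(features, evaluate, max_features):
    """Greedy forward selection: a minimal wrapper method.

    Starting from an empty set, repeatedly add the single feature that
    most improves the score reported by `evaluate`, stopping when no
    candidate improves it or `max_features` is reached.
    """
    selected, best_score = [], float("-inf")
    while len(selected) < max_features:
        best_candidate = None
        for f in features:
            if f in selected:
                continue
            score = evaluate(selected + [f])  # "train and score" the subset
            if score > best_score:
                best_score, best_candidate = score, f
        if best_candidate is None:  # no remaining feature helps
            break
        selected.append(best_candidate)
    return selected

# Toy evaluator: pretend features 'a' and 'c' are the useful ones, with a
# small penalty per feature standing in for model complexity.
weights = {"a": 0.4, "b": 0.05, "c": 0.3}
score = lambda subset: sum(weights[f] for f in subset) - 0.1 * len(subset)
print(forward_selection(["a", "b", "c"], score, 3))  # -> ['a', 'c']
```

In a real wrapper, `evaluate` would fit the classifier on the candidate subset and return cross-validated accuracy, so the number of model fits grows quickly with the number of features.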

(iii) Embedded methods

In embedded methods, the feature selection process is an integral part of the classification model, which thus has its own built-in feature selection mechanism. Embedded methods overcome the drawbacks of filter and wrapper methods and merge their advantages: they are fast like filter methods, more accurate than filter methods, and also take combinations of features into consideration.
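The embedded idea can be illustrated with L1-penalized least squares, where selection happens inside training itself. This is a generic sketch; the step size, penalty strength and toy data are illustrative, not the technique used in this thesis.

```python
def lasso_select(X, y, lam=0.1, lr=0.01, steps=2000):
    """Embedded selection via L1-penalized least squares (an ISTA sketch).

    The L1 penalty drives the weights of unhelpful features exactly to
    zero while the model is being fitted, so selection happens inside
    training itself -- the defining trait of an embedded method.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Gradient of the mean squared error term.
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j in range(d):
                grad[j] += 2 * err * xi[j] / n
        # Gradient step followed by soft-thresholding (the L1 proximal map).
        for j in range(d):
            wj = w[j] - lr * grad[j]
            w[j] = max(abs(wj) - lr * lam, 0.0) * (1 if wj > 0 else -1)
    return [j for j in range(d) if abs(w[j]) > 1e-6]  # surviving features

# y depends on feature 0 only; feature 1 is pure noise and is zeroed out.
X = [[1.0, 0.3], [2.0, -0.2], [3.0, 0.1], [4.0, -0.4]]
y = [2.0, 4.0, 6.0, 8.0]
print(lasso_select(X, y))  # -> [0]
```

The surviving indices are read directly off the trained weights, so no separate selection pass is needed.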

1.9.5. Classification model

Classification is a type of supervised learning that categorizes input data into

predefined labels. It involves training a model on labeled examples to learn patterns

between input features and output classes. It also involves assigning a class label to

each instance in a dataset based on its features. The goal of classification is to build

a model that accurately predicts the class labels of new instances based on their

features. In classification, the model is fully trained using the training data, and then

it is evaluated on test data before being used to perform prediction on new unseen

data.

(i) Types of classification

There are two main types of classification: binary classification and multi-

class classification. Binary classification involves classifying instances into two

classes, such as “spam” or “not spam”, while multi-class classification involves

classifying instances into more than two classes. In addition to binary and multi-class classification, a few other types exist, namely multi-label and multi-level classification.

➢ Binary classification: Binary classification is a type of supervised learning

problem that requires classifying data into two mutually exclusive groups or

categories. Binary classification models are trained using a dataset that has been

labeled with the desired outcome. The training data in such a situation is labeled in

a binary format: true and false; positive and negative; 0 and 1; spam and not spam,

etc. depending on the problem being tackled.

➢ Multi-class classification: Multiclass classification is a type of supervised

learning problem that requires classifying data into three or more groups/categories.

Unlike binary classification, where the model is only trained to predict one of the

two classes for an item, a multiclass classifier is trained to predict one from three

or more classes for an item. For example, a multiclass classifier could be used to

classify images of animals into different categories such as dogs, cats, and birds.

Most binary classification algorithms can also be used for multi-class classification. A few of them are Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN) and Naive Bayes (NB).
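For instance, K-Nearest Neighbor handles three or more classes directly, with no reduction to binary problems. A minimal pure-Python sketch, using hypothetical toy data for the dog/cat/bird example, is:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """K-Nearest Neighbor prediction for any number of classes.

    `train` is a list of (feature_vector, label) pairs; the query point
    receives the majority label among its k nearest training points.
    """
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Three classes -- KNN votes among neighbors, so nothing special is
# needed to go beyond two classes.
train = [([0.0, 0.0], "cat"), ([0.1, 0.2], "cat"),
         ([5.0, 5.0], "dog"), ([5.1, 4.9], "dog"),
         ([0.0, 9.0], "bird"), ([0.2, 8.8], "bird")]
print(knn_predict(train, [5.0, 5.2]))  # -> 'dog'
```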

➢ Multi-label classification: Multi-label classification is a type of supervised learning algorithm that can assign zero or more labels to each data sample. For example, a multi-label classifier could label a single image as containing both a dog and a cat. In multi-label classification tasks, zero or more classes are predicted for each input example. There is no mutual exclusion, because an input example can have more than one label.
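A common way to realize this is to decide each label independently, for instance by thresholding per-label confidence scores. The scores below are hypothetical outputs of some per-label classifier:

```python
def multi_label_predict(scores, threshold=0.5):
    """Multi-label prediction: each label is decided independently, so a
    sample can receive zero, one, or several labels (no mutual exclusion).

    `scores` maps label -> confidence from some per-label classifier.
    """
    return sorted(label for label, s in scores.items() if s >= threshold)

# An image containing both a dog and a cat.
print(multi_label_predict({"dog": 0.9, "cat": 0.7, "bird": 0.1}))
# -> ['cat', 'dog']
```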

➢ Multi-level classification: Multi-level classification is a type of supervised learning in which classification is carried out at more than one level. The first level and the subsequent levels may each be binary or multi-class classification. For example, in the case of pneumonia prediction, the first level predicts whether pneumonia is present or not. If it is positive, the second level predicts whether it is bacterial pneumonia or viral pneumonia. Likewise, the classification can extend to two, three, four or more levels.
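The cascade structure above can be sketched directly. The two classifiers here are hypothetical rule-based stand-ins for trained models, as are the feature names:

```python
def multi_level_classify(sample, level1, level2):
    """Two-level classification cascade, as in the pneumonia example.

    `level1` is a binary classifier (pneumonia vs. normal); only
    positive cases are passed on to `level2`, which separates bacterial
    from viral pneumonia.
    """
    if level1(sample) == "negative":
        return "normal"
    return level2(sample)  # second level refines the positive class

# Hypothetical rule-based stand-ins for trained models.
level1 = lambda s: "positive" if s["opacity"] > 0.5 else "negative"
level2 = lambda s: "bacterial" if s["consolidation"] > 0.5 else "viral"

print(multi_level_classify({"opacity": 0.8, "consolidation": 0.2},
                           level1, level2))  # -> 'viral'
```

Deeper hierarchies simply chain more such stages, each one refining the decision of the stage before it.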

(ii) Machine learning based classification

With the rapid growth of machine learning, it is now easy to create a model, feed data to it, and wait until training is complete. With a machine learning model, it is much easier and faster to classify the category of input data. Machine learning can be broadly divided into four categories depending on the techniques and modes of learning: reinforcement learning, unsupervised learning, semi-supervised learning, and supervised learning. Classification techniques come under supervised learning. This indicates that a "labelled" dataset is used to train the machines, which then predict the output based on the training.

In machine learning, the prediction model contains two important steps, namely feature extraction and classification. Features are extracted from the input images and a feature vector is created. The extracted features are given as input to the classification stage, where machine learning algorithms are utilized to classify the input data. The workflow of classification using machine learning is shown in Figure 1.2.

[Figure: Input image → Feature Extraction → Classification (machine learning model) → Classified Output]

Figure 1.2. Machine learning based classification

Any input data can be given to the classification model, which follows the same flow to produce the predicted class as output.
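The two-step workflow of Figure 1.2 can be expressed as a simple composition. The extractor and the classifier below are hypothetical stand-ins (mean intensity as a handcrafted feature, a threshold rule as the "trained" model):

```python
def classify_pipeline(image, extract_features, classifier):
    """The two-step machine-learning workflow: handcrafted feature
    extraction followed by a trained classifier on the feature vector."""
    feature_vector = extract_features(image)
    return classifier(feature_vector)

# Hypothetical stand-ins: mean pixel intensity as the sole handcrafted
# feature, and a threshold rule in place of a trained model.
extract = lambda img: [sum(img) / len(img)]
model = lambda fv: "bright" if fv[0] > 0.5 else "dark"
print(classify_pipeline([0.9, 0.8, 0.7], extract, model))  # -> 'bright'
```

The point of the separation is that either stage can be swapped independently: a different feature extractor or a different classifier slots into the same flow.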

(iii) Deep learning-based classification

Deep learning is a subfield of machine learning, and it provides an alternative avenue for data classification. Classification based on deep learning extracts the features present in images by applying mathematical operations referred to as layers. A deep learning structure adds more hidden layers between the input layer and the output layer, extending traditional neural networks (NN) to model more complex and nonlinear relationships. The algorithms are constructed much as in machine learning, but they consist of many more layers. The input is fed into the input layer of the deep learning model, and the final prediction, or classified output, is obtained from the output layer.

In deep learning-based classification, the prediction model contains a neural network that carries out both feature extraction and classification together. The neural network extracts deep features from the input data and performs classification. The workflow of classification using deep learning is shown in Figure 1.3.

[Figure: Input image → Feature Extraction + Classification (deep learning model) → Classified Output]

Figure 1.3. Deep learning-based classification

(iv) Fuzzy classification

Fuzzy classification is the process of grouping elements having the same characteristics using fuzzy logic. Fuzzy logic admits multiple logical values: the truth values of a variable lie between 0 and 1. In a Boolean system, only two possibilities (0 and 1) exist, where 1 denotes the absolute truth value and 0 the absolute false value. In a fuzzy system, multiple possibilities exist between 0 and 1 that are partially false and partially true. The fundamental concept of Fuzzy Logic is the membership

function, which defines the degree of membership of an input value to a certain set

or category. The membership function is a mapping from an input value to a

membership degree between 0 and 1, where 0 represents non-membership and 1

represents full membership.
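One common choice of membership function is the triangular form, which rises linearly to full membership at a peak and falls back to zero. A small sketch (the temperature example is illustrative):

```python
def triangular_membership(x, a, b, c):
    """Triangular membership function: the degree of membership of x in
    a fuzzy set that peaks at b and falls to zero at a and c.

    Returns a value in [0, 1]: 0 means non-membership, 1 full membership.
    """
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# Fuzzy set "warm" peaking at 25 degrees, vanishing at 15 and 35.
print(triangular_membership(20, 15, 25, 35))  # -> 0.5
print(triangular_membership(25, 15, 25, 35))  # -> 1.0
print(triangular_membership(40, 15, 25, 35))  # -> 0.0
```

Other shapes (trapezoidal, Gaussian) follow the same pattern: any mapping from an input value to a degree in [0, 1] is a valid membership function.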

Fuzzy Logic is implemented using Fuzzy Rules, which are if-then

statements that express the relationship between input variables and output

variables in a fuzzy way. The output of a Fuzzy Logic system is a fuzzy set, which

is a set of membership degrees for each possible output value. The architecture of

fuzzy logic system is shown in Figure 1.4.

Figure 1.4. Architecture of fuzzy logic system

The steps in the architecture of fuzzy logic system are as follows.

➢ Rule Base: The rule base is the component that stores the set of rules and the if-then conditions given by experts for controlling the decision-making system. Many recent developments in fuzzy theory offer effective methods for designing and tuning fuzzy controllers, and these developments reduce the number of fuzzy rules required.

➢ Fuzzification: Fuzzification is the component that transforms the system inputs, i.e., it converts crisp numbers into fuzzy sets. Crisp numbers are the inputs measured by sensors; fuzzification passes them into the control system for further processing. In a typical Fuzzy Logic system, this component divides the input signals into the following five states: Large Positive (LP), Medium Positive (MP), Small (S), Medium Negative (MN) and Large Negative (LN).

➢ Inference Engine: This is the main component of any Fuzzy Logic System (FLS), because all the information is processed in the inference engine. It finds the matching degree between the current fuzzy input and the rules, and based on that degree it determines which rules to fire for the given input. When all the fired rules are combined, the control actions are developed.

➢ Defuzzification: Defuzzification is the component that takes the fuzzy set inputs generated by the inference engine and transforms them into a crisp value. It is the last step in the process of a fuzzy logic system. The crisp value is the kind of value acceptable to the user. Various techniques exist for this step, and the user has to select the one that best reduces the error.
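One of the most common defuzzification techniques is the centroid (center-of-gravity) method, sketched here over a discrete output universe; the toy values are illustrative:

```python
def centroid_defuzzify(values, memberships):
    """Centroid (center-of-gravity) defuzzification.

    Converts a discrete fuzzy output set -- candidate crisp values and
    their membership degrees -- into a single crisp value by taking the
    membership-weighted average. This is one common technique among
    several (mean-of-maxima, bisector, ...).
    """
    total = sum(memberships)
    if total == 0:
        raise ValueError("empty fuzzy set: no rule fired")
    return sum(v * m for v, m in zip(values, memberships)) / total

# Output universe 6..9 with memberships concentrated around 7-8.
print(round(centroid_defuzzify([6, 7, 8, 9], [0.2, 0.8, 0.8, 0.2]), 3))
# -> 7.5
```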

1.10. MOTIVATION OF THE RESEARCH

Medical image processing has attracted attention recently, as it makes a significant contribution to healthcare. Disease detection has become an important topic in medical image processing and medical imaging research. Prediction or detection of diseases can be achieved by analyzing medical images such as X-ray, CT and MRI scans using computer algorithms. In recent years, infectious respiratory illnesses have become one of the leading causes of death in the world. Pneumonia, Tuberculosis (TB) and COVID-19 are the most severe and prevalent infectious respiratory disorders; caused by bacteria and viruses, they typically affect the lungs and can even lead to death [77]. COVID-19 in particular has ranked among the highest causes of death in recent years, with a death toll crossing 6 million. The most interesting and complicating fact about these respiratory diseases is the similarity of their symptoms. It is therefore necessary to classify all three diseases, which can be accomplished by applying deep learning techniques to Chest X Ray (CXR) images

of patients. Most existing research works utilize machine learning algorithms for disease prediction, and recent research has been carried out using deep learning approaches. It is observed from the literature survey that deep learning algorithms perform better than machine learning algorithms. It is necessary to develop a deep learning model to detect and classify lung diseases with higher accuracy than existing research works. Motivated by this fact, a novel deep learning based multi-level classification is carried out in this research to detect whether the patient is affected by Pneumonia, Tuberculosis or COVID-19.

1.11. OUTLINE OF THE PROPOSED RESEARCH WORK

The proposed research work is developed for multi-level classification of

lung diseases such as pneumonia, tuberculosis and COVID-19 using Chest X Ray

(CXR) images. The outline of the proposed research work is shown in Figure 1.5.

[Figure: Input CXR image → Pre-processing and Data Augmentation → Feature extraction using "Fusion of Handcrafted and Deep features" (FHD) → Feature selection using Modified Moth Flame Optimization (MMFO) → Reduced feature vector → Fuzzy rank-based ensemble deep learning model using Gompertz function → Multi-level classification: Normal, Tuberculosis, Bacterial Pneumonia, COVID-19]

Figure 1.5. Outline of the proposed research work

The input Chest X Ray (CXR) images undergo pre-processing and data augmentation. The pre-processed CXR images are then given as input to the proposed feature extraction technique, namely "Fusion of Handcrafted and Deep features" (FHD), to extract features from the CXR images. After that, the extracted features are fed into the proposed Modified Moth Flame Optimization (MMFO) algorithm to reduce the number of features for classification. The reduced feature vector is given as input to the fuzzy rank-based ensemble deep learning model, in which three deep learning models, namely VGG-16, ResNet50 and Modified XceptionNet, are ensembled using the Gompertz function. At last, multi-level classification of lung diseases such as Pneumonia, Tuberculosis and COVID-19 is performed using the ensemble deep learning model. The performance of classification is evaluated using metrics such as Accuracy, Precision, Recall, Specificity, F1 score and Error rate.
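These evaluation metrics all derive from the confusion-matrix counts, and can be computed as follows (the example counts are purely illustrative, not results from this research):

```python
def metrics(tp, fp, fn, tn):
    """Standard classification metrics from confusion-matrix counts:
    accuracy, precision, recall, specificity, F1 score and error rate."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    error_rate = 1 - accuracy
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "error_rate": error_rate}

# Hypothetical counts for one class in a one-vs-rest evaluation.
m = metrics(tp=40, fp=10, fn=5, tn=45)
print(round(m["accuracy"], 3))  # -> 0.85
print(round(m["f1"], 3))        # -> 0.842
```

In the multi-class setting, these counts are taken per class in a one-vs-rest fashion and the per-class metrics are then averaged.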

1.12. RESEARCH CONTRIBUTIONS

The main objective of this research is to develop a novel fuzzy rank-based ensemble deep learning model for multi-level classification of lung diseases using

Chest X Ray (CXR) images. The significant contributions in this research are as

follows.

➢ An in-depth survey of machine learning and deep learning methods for diagnosing COVID-19 variants was carried out.

➢ Deep learning techniques for classification of COVID-19 were studied and their performance was compared to identify the best performing deep learning model.

➢ A novel feature extraction technique namely “Fusion of Handcrafted and

Deep features” (FHD) was developed to extract features from the Chest

X Ray images.

➢ An optimization technique, namely the "Modified Moth Flame Optimization" (MMFO) algorithm, was developed for the purpose of feature selection.

➢ A novel framework that uses a fuzzy rank-based ensemble of three pre-

trained deep learning models, namely, VGG-16, ResNet50 and Modified

XceptionNet using Gompertz function was developed for multi-level

classification of lung diseases using CXR images.
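The general shape of a fuzzy rank-based ensemble can be sketched as follows. This is only an illustrative outline under stated assumptions: the constants inside `gompertz_rank`, the toy softmax scores and both function names are hypothetical, and the exact re-parameterization of the Gompertz function used in the proposed framework is developed in Chapter 6.

```python
import math

def gompertz_rank(confidence):
    """Fuzzy rank of a class from one model's softmax confidence via a
    complemented Gompertz-type function. The constants 1.0 and 2.0 are
    illustrative only. Higher confidence -> rank closer to 0 (better)."""
    return 1.0 - math.exp(-math.exp(-2.0 * confidence))

def fuzzy_rank_ensemble(model_scores):
    """Combine per-model softmax vectors: sum each class's fuzzy ranks
    across models and predict the class with the lowest fused rank."""
    n_classes = len(model_scores[0])
    fused = [sum(gompertz_rank(scores[c]) for scores in model_scores)
             for c in range(n_classes)]
    return min(range(n_classes), key=lambda c: fused[c])

# Hypothetical softmax outputs of three models over four classes
# (Normal, Tuberculosis, Bacterial Pneumonia, COVID-19).
scores = [[0.10, 0.70, 0.10, 0.10],
          [0.20, 0.60, 0.10, 0.10],
          [0.25, 0.40, 0.20, 0.15]]
print(fuzzy_rank_ensemble(scores))  # -> 1  (Tuberculosis)
```

Because ranks rather than raw confidences are fused, a single over-confident model cannot dominate the ensemble decision.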

In addition to this, all the proposed techniques are compared with several existing techniques, and the improvements in the overall results are also discussed.

1.13. ORGANIZATION OF THE THESIS

The organization of this thesis is as follows:

➢ Chapter 2 presents a detailed survey of existing research works related

to disease prediction using image processing techniques. An in-depth analysis and

review of various techniques used for lung disease detection namely feature

extraction, feature selection and classification are also presented.

➢ Chapter 3 discusses the existing machine learning and deep learning techniques utilized for diagnosing lung diseases. Several techniques are implemented and their performance is compared to identify the best

technique. This chapter also discusses the limitations and advantages of these

implemented methods. This is necessary for the development of an efficient lung

disease classification model. The detailed explanation of each technique is also

described in this chapter.

➢ A novel feature extraction technique, namely "Fusion of Handcrafted and Deep features" (FHD), is proposed and explained in Chapter 4. In the proposed FHD, both handcrafted and deep features are extracted from the

Chest X Ray (CXR) images and both are concatenated for the process of

classification. Experiments for multi-level classification of lung diseases are carried

out using CXR images with four deep learning models namely VGG-16,

MobileNetV2, ResNet50 and Modified XceptionNet. The experimental results

obtained using these methods are presented and analyzed.

➢ Chapter 5 describes a novel optimization algorithm, namely the

“Modified Moth Flame Optimization” (MMFO) algorithm. MMFO is developed

based on Moth Flame Optimization Algorithm (MFO). MMFO selects the optimal

features for multi-level classification of lung diseases. The proposed MMFO is

evaluated using the deep learning classification models namely VGG-16,

ResNet50, MobileNetV2 and Modified XceptionNet.

➢ A novel fuzzy rank-based ensemble of three pre-trained deep

learning models, namely, VGG-16, ResNet50 and Modified XceptionNet using

Gompertz function is developed for multi-level classification of lung diseases and

is presented in Chapter 6.

➢ Finally, Chapter 7 briefly summarizes and concludes the research work. A few suggestions and recommendations to improve the efficiency and reliability of the developed method are presented.

1.14. SUMMARY

A novel multi-level classification of lung diseases using modified moth

flame optimization and fuzzy Gompertz based ensemble deep learning model was

developed in this research work. In order to construct a new model for lung diseases

detection, it is necessary to examine and evaluate the current research in this field.

Basic introduction about image processing along with the definition of image and

image processing techniques are presented in this Chapter. Notable techniques for

disease detection using image processing such as Feature Extraction, Feature

Selection and Classification are described in detail in this chapter. Introduction and

working procedure of Machine Learning and Deep Learning are also discussed.

Basic concepts of fuzzy logic along with the architecture of the Fuzzy Inference System (FIS) are explained in detail. The research gap is identified and the motivation of the
research is described. The outline of the proposed research work with a neat

diagram is presented. Moreover, the research contributions along with the

organization of this thesis are also presented.
