Chapter 1
INTRODUCTION
An image is a visual representation of data, while a digital image is a binary representation of visual data. These images can take the form of photographs, graphics, and individual video frames. For this purpose, an image is a picture that was created or copied and stored in electronic form. Understanding images allows for the extraction of useful information that can be used in several applications, such as identifying an airport using remote sensing data or extracting malign tissues from medical images. An image may be defined as a two-dimensional function F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of that image at that point. When x, y, and the amplitude values of F are finite, the image is called a digital image. In other words, a digital image can be defined by a two-dimensional array of picture elements arranged in rows and columns.

The rapid development of computers and large-capacity storage devices has fuelled the growth of digital image processing. In several domains of science and engineering, images and signals have become a momentous tool for research and investigation. These and other sources produce huge amounts of image data every day, far more than could ever be examined manually. Extracting useful information from these images therefore requires automated image processing techniques.
The two types of image processing technologies utilized are analog and digital image processing. Analog image processing is applied to two-dimensional analog signals. In this type of processing, the images are manipulated by electrical means, by varying the electrical signal; a common example is the television image. Digital image processing has dominated over analog image processing with the passage of time due to its wider range of applications.
A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, or pixels; pixel is the term most widely used to denote the elements of a digital image. Digital images are handled as matrices by a computer, where each value or pixel in the matrix represents the amplitude, known as the "intensity", of the pixel. Typically, computers deal with 8-bit images, wherein the amplitude value ranges from 0 to 255. Based on the pixel values, images can be classified into the following types.
➢ Binary Image
➢ Grayscale Image
➢ Color Image
(i) Binary Image
A binary image is one that consists of pixels that can have one of exactly
two colors, usually black and white. It is the simplest type of image. It takes only two values, 0 and 1, where 0 represents black and 1 represents white. A binary image is a 1-bit image, since only one binary digit is needed to represent each pixel.
(ii) Grayscale Image
Grayscale images are monochrome images; they have only one channel and do not contain any information about color. Each pixel value determines one of the available grey levels. Grayscale or 8-bit images are composed of 256 unique intensity levels, where a pixel intensity of 0 represents black and a pixel intensity of 255 represents white. All the other 254 values in between represent different shades of grey.
(iii) Color Image
Typical color images can be described using three primary colors, namely Red, Green and Blue (RGB). A color image can be represented using three 2D arrays of the same size, one for each color channel: red (R), green (G), and blue (B). Each array element contains an 8-bit value, indicating the amount of red, green, or blue at that pixel location.
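As a minimal illustration of the three image types described above (assuming NumPy and small hand-made arrays rather than real image files):

```python
import numpy as np

# Binary image: 1 bit of information per pixel (0 = black, 1 = white)
binary = np.array([[0, 1, 1],
                   [1, 0, 0],
                   [0, 0, 1]], dtype=np.uint8)

# Grayscale image: one 8-bit channel, intensities from 0 (black) to 255 (white)
gray = np.array([[0,  64, 128],
                 [32, 200, 255],
                 [90, 150, 10]], dtype=np.uint8)

# Color (RGB) image: three 2D arrays of the same size stacked into a
# height x width x 3 array, one 8-bit value per channel at each pixel
rgb = np.zeros((3, 3, 3), dtype=np.uint8)
rgb[..., 0] = 255          # red channel everywhere
rgb[1, 1] = (0, 255, 0)    # a single pure-green pixel

print(binary.shape, gray.dtype, rgb.shape)   # (3, 3) uint8 (3, 3, 3)
```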
Digital image processing allows a much wider range of algorithms to be applied to the image data and can avoid problems such as the build-up of noise and signal distortion during processing. Some notable advantages of digital image processing are given below.
➢ Enhanced Image Quality – One of the primary advantages of digital image processing is the ability to enhance the quality of images. With the use of suitable filters and algorithms, medical images like X-rays and MRIs can be processed to highlight areas of interest much faster than manual methods. This can help save time and resources in applications where speed and accuracy are crucial.
➢ Enhanced Security – Digital image processing systems are also used for security and surveillance applications, such as monitoring and analyzing images from public spaces.
➢ Creative Applications – Digital image processing also supports creative applications like graphic design, video editing, and virtual reality. By manipulating digital images, artists and designers can create new and unique visual experiences.
The fundamental steps of digital image processing are shown in Figure 1.1, and the description of each step is given in this section.

Figure 1.1. Fundamental steps of Digital Image Processing
➢ Color image processing: Color image processing is an area that has been gaining importance because of the significant increase in the use of digital images over the Internet. This may include color modeling and processing in a digital domain.
➢ Wavelets and multiresolution processing: Wavelets are used for representing images in various degrees of resolution, where images are subdivided successively into smaller regions for data compression and for pyramidal representation.
➢ Morphological processing: Morphological processing deals with tools for extracting image components that are useful in the representation and description of shape.
➢ Representation and description: Representation and description almost always follow the output of a segmentation stage, which usually is raw pixel data, constituting either the boundary of a region or all the points in the region itself. Description deals with extracting attributes that result in some quantitative information of interest or are basic for differentiating one class of objects from another.
➢ Object Recognition: Recognition is the process that assigns a label, such as "vehicle", to an object based on its descriptors. The knowledge base used for recognition can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem.
Some image processing techniques take an image as both input and output, while other techniques take an image as input but produce attributes of the image as output. All of these techniques are not required at the same time for processing an image. The errors that occur during image acquisition may distort the brightness values of the pixels. In image preprocessing, these errors are corrected using suitable models. Image preprocessing also includes primitive operations to reduce noise and prepare the image for further processing steps such as image segmentation. According to the size of the pixel neighborhood that is used for the calculation of a new pixel brightness, the image preprocessing methods can be classified into several categories, including:
➢ geometric transformations.
1.6.2. Image Enhancement
The images captured from some conventional digital cameras and satellite sensors may lack contrast and brightness because of the limitations of the imaging subsystems and illumination conditions. Image enhancement improves the visual quality of such images to overcome this difficulty; it enhances some features that are concealed or highlights certain features of interest. Image enhancement can be performed in two domains. The spatial domain method deals with the modification or aggregation of the pixels that form the image, while the frequency domain method enhances the image by applying a linear operator to its frequency-domain representation.
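As one concrete example of a spatial-domain enhancement technique, the sketch below (assuming NumPy and an 8-bit grayscale image; the synthetic low-contrast image is made up for illustration) applies global histogram equalization to stretch the intensity range:

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image."""
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf_masked = np.ma.masked_equal(cdf, 0)            # ignore empty bins
    cdf_scaled = (cdf_masked - cdf_masked.min()) * 255 / (cdf_masked.max() - cdf_masked.min())
    lut = np.ma.filled(cdf_scaled, 0).astype(np.uint8)  # lookup table: old intensity -> new intensity
    return lut[img]

# Example on a synthetic low-contrast image (values squeezed into 100..150)
img = np.random.randint(100, 151, size=(64, 64), dtype=np.uint8)
enhanced = equalize_histogram(img)
print(img.min(), img.max(), "->", enhanced.min(), enhanced.max())
```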
1.6.3. Image Restoration
The process of removing known degradations, such as noise or blur, from an image to improve its appearance is called image restoration [46]. The degraded image is modeled as the convolution of the original image with a degradation function, plus additive noise. Restoration of the image is done with the help of prior knowledge of the noise or the disturbance that causes the degradation in the image. It can be done in two domains: the spatial domain and the frequency domain. In the spatial domain, the filtering action for restoring the image is performed directly on the pixels of the digital image, while in the frequency domain the filtering action is performed by mapping the spatial domain into the frequency domain using the Fourier transform. After the filtering, the image is remapped into the spatial domain by the inverse Fourier transform to obtain the restored image. Either domain can be chosen based on the application.
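The degradation model described above can be written compactly as follows (a standard formulation given here only for illustration; the symbols g, f, h and the noise term are introduced for this sketch): the observed image g is the original image f convolved with the degradation function h plus additive noise, and convolution in the spatial domain becomes multiplication in the frequency domain.

\[
g(x, y) = h(x, y) * f(x, y) + \eta(x, y)
\quad\Longleftrightarrow\quad
G(u, v) = H(u, v)\,F(u, v) + N(u, v)
\]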
Morphological image processing is used for extracting image components that are useful in the representation and description of shape. A structuring element is applied to the input image by positioning it at all possible locations in the image and comparing it with the corresponding neighborhood of pixels. Some operations check whether the element "fits" within the neighborhood, while others check whether it "hits" or intersects the neighborhood. The most basic morphological operations are dilation, erosion, and their combinations. Erosion is useful for the removal of structures of a certain shape and size, which is given by the structuring element, while dilation is useful for filling holes of a certain size and shape, given by the structuring element.
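A short sketch of the two basic operations, using scipy.ndimage on a small binary image (the toy array and the 3x3 structuring element are assumptions made for the example):

```python
import numpy as np
from scipy import ndimage

img = np.zeros((9, 9), dtype=bool)
img[2:7, 2:7] = True                    # a solid 5x5 square of foreground pixels
holey = img.copy()
holey[4, 4] = False                     # the same square with a one-pixel hole

structure = np.ones((3, 3), dtype=bool)  # 3x3 structuring element

eroded = ndimage.binary_erosion(img, structure=structure)    # the square shrinks to 3x3
filled = ndimage.binary_dilation(holey, structure=structure) # dilation closes the hole (and grows the square)

print(img.sum(), eroded.sum(), holey.sum(), filled.sum())    # 25 9 24 49
```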
Object recognition is the process of identifying a particular object in a digital image or video based on its descriptors, e.g., a vehicle. Object recognition deals with training the computer to identify a particular object from various perspectives. Due to scene clutter, photometric effects, and changes in the shape and viewpoint of the object, the appearance of an object can vary. It has wide applications in fields such as robot vision, video surveillance, medical imaging, etc.
Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it, without degrading the quality of the image to an unacceptable level. Compression allows more images to be stored in a given amount of disk or memory space. It also reduces the time required for images to be transmitted over the Internet or downloaded from web pages. Compression techniques can be of two types: lossy and lossless compression.
Lossy compression: Lossy compression is irreversible; some data from the original image file is lost. Lossy methods are especially suitable for natural images such as photographs, in applications where a minor loss of fidelity is acceptable.
Lossless compression: In lossless compression, we can reduce the size of an image
without any quality loss. Lossless compression is preferred for archival purposes
and often for medical imaging, technical drawings, clip art, or comics.
Some of the commonly used compression methods are as follows.
➢ Fractal
➢ Wavelets
➢ Transform coding
➢ Run-length encoding
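As a simple illustration of one of the lossless methods listed above, the sketch below run-length encodes a row of binary pixel values (a toy example, not the exact scheme used by any particular image format):

```python
from itertools import groupby

def rle_encode(pixels):
    """Run-length encode a 1-D sequence of pixel values as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original pixel sequence."""
    return [value for value, count in pairs for _ in range(count)]

row = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
encoded = rle_encode(row)
print(encoded)                      # [(0, 4), (1, 2), (0, 3), (1, 5)]
assert rle_decode(encoded) == row   # lossless: decoding restores the exact data
```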
Image segmentation is the process of partitioning a digital image into multiple regions in order to change its representation into something that is more meaningful and easier to analyze. It may make use of the similarity in intensity values within a cluster of neighboring pixels, in which case the goal of the segmentation algorithm is to group such pixels into homogeneous regions, or it may rely on the discontinuity in intensity values between distinct regions, in which case the goal of the segmentation algorithm is to accurately locate the boundaries between them.

Image classification, in contrast, assigns a class label to an entire image, and images are expected to have only one class per image. Image classification models take an image as input and return a prediction about which class the image belongs to. It involves assigning a label or tag to an entire image based on preexisting training data of already labeled images. While the process may appear simple at first glance, it entails pixel-level image analysis to determine the most appropriate label for the overall image. This provides valuable data and insights. Depending on the problem at hand, there are different types of image classification methodologies, as described below.
Binary: Binary classification applies an either-or logic to label images and classifies unknown data points into two categories. Tasks such as categorizing tumors as benign or malignant, analyzing product quality to determine whether it has defects or not, and many other problems that require yes/no answers are solved with binary classification.
Multiclass: While binary classification deals with only two classes of objects, multiclass classification, as the name suggests, categorizes items into three or more classes. It is very useful in many domains like NLP (sentiment analysis where more than two emotions are present) and medical diagnosis (classifying diseases into several categories).
Multilabel: While multiclass classification assigns each item to exactly one class, multilabel classification allows an item to be assigned multiple labels. For example, you may need to classify image colors and there are several colors. A picture of a fruit salad will have red, orange, yellow, purple, and other colors depending on your creativity with fruit salads. As a result, one image will be associated with multiple labels.
Hierarchical: Hierarchical classification is the task of organizing classes into a hierarchy, similar to a tree structure, in which a higher-level class represents broader categories and a lower-level class is more concrete and specific.
Digital image processing affects many real-world applications and continues to grow over time with new technologies in almost all fields. A few significant applications are given below.
➢ Image sharpening and restoration: These techniques modify the look and feel of an image; they basically manipulate the captured images to achieve the desired result.
➢ Robot Vision: There are several robotic machines which work on digital image processing. Robots find their way using image processing in several ways, for example, the hurdle-detection robot and the line-follower robot.
➢ Video processing: It is also one of the applications of digital image processing, since a video is simply a sequence of image frames; it includes operations such as frame rate conversion, etc.
➢ Medical Field: There are several applications in the medical field which rely on digital image processing, such as the enhancement of raw medical image data for the purposes of selective visualization as well as further analysis [192]. The main benefit of medical image processing is that detailed models of the anatomies of interest can be created and studied to improve treatment outcomes for the patient, develop improved medical devices and drug delivery systems, or achieve more informed diagnoses. It has become one of the key tools leveraged for medical advancement in recent years. There are many topics in medical image processing: some emphasize generally applicable theory and some emphasize specific applications.
1.9.1. Medical Imaging Modalities
Detection of disease at the initial stage, using various modalities, is one of the most important factors in decreasing the mortality rate due to cancer and tumors. Modalities help radiologists and doctors to study the internal structure of the detected disease and retrieve the required features. Various medical imaging modalities, such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), mammogram, X Ray and Electrocardiogram (ECG), are used for classifying the diseases in the primary studies. As observed, the following modalities were used for the evaluation of medical data using different techniques.
➢ Magnetic Resonance Imaging (MRI)
➢ Computed Tomography (CT)
➢ Mammogram
➢ Electrocardiogram (ECG)
➢ X Ray
(i) Magnetic Resonance Imaging (MRI)
In MRI, signals are generated from human organs with the help of strong magnetic fields, and these signals are used to reconstruct information about the human organ structure [184]. MRIs with high resolution provide more structural detail.
(ii) Computed Tomography (CT)
It is a technology which generates 3-D images from 2-D X-Ray images using digital geometry processing.
(iii) Mammogram
For effective breast cancer screening and early detection of abnormalities in the body, mammograms are used. Calcifications and masses are considered the most common abnormalities found in mammograms.
(iv) Electrocardiogram (ECG)
It is used to measure the electrical activity of the heart and to detect cardiac abnormalities.
(v) X Ray
Medical x-rays are used to generate images of tissues and structures inside the body.
If x-rays traveling through the body also pass through an x-ray detector on the other
side of the patient, an image will be formed that represents the “shadows” formed
by the objects inside of the body. Chest X Ray (CXR) images are the images most commonly used to examine the chest region and the lungs.
The process of medical image processing begins by acquiring raw data from any of the above modalities and reconstructing it into a format suitable for further analysis. A two-dimensional grid of pixels creates the typical input for image processing. CT scan greyscale values represent the different tissue densities, while MRI intensities are reconstructed from signals emitted by proton particles during relaxation after the application of very strong magnetic fields. Among all imaging modalities, X Ray is one of the most preferred for disease detection, as it is economical and the images are easy to acquire.
1.9.2. Medical Image Classification
Medical image classification is the process of classifying medical images, such as X-rays, MRI scans, and CT scans, into different categories or classes. The goal is to classify medical images based on their content, which can help in diagnosis, treatment planning, and monitoring of diseases. It is one of the most important problems in the image recognition area, and its aim is to classify medical images into different categories to help doctors in disease diagnosis or further research. Overall, medical image classification can be divided into two steps. The first step is extracting effective features from the image. The second step is using the features to build models that classify the image dataset. In the past, doctors usually used their professional experience to extract features to classify the medical images into different classes, which is usually a difficult, tedious, and time-consuming task.
Chest X-rays (CXR) are commonly used for the detection and screening of lung disorders [174], as the CXR is one of the most common and easy-to-get medical tests used to diagnose common diseases of the chest. It is one of the most frequently used imaging examinations for diagnosing lung diseases such as pneumonia and tuberculosis [31]. Detection and classification of lung diseases using chest X-ray images is a challenging task, and several Computer Aided Diagnosis (CAD) systems have been introduced for lung disease detection using X-ray images. However, such systems failed to achieve the required performance for lung disease detection and classification. The recent COVID-19 related lung infections have made these tasks very challenging for such CAD systems. It is essential to develop an effective method for detecting infections in the lungs and classifying them, which can be accomplished by the following tasks.
➢ Feature Extraction
➢ Feature Selection
➢ Classification
1.9.3. Feature Extraction
Feature extraction is an important step in image processing, data mining and computer vision. It is the process of extracting relevant information from raw data [165]. It is used to extract the most distinct features present in the images, which are used to represent and describe the data. Feature extraction is a dimensionality reduction process, in which an initial set of the raw data is divided and reduced to more manageable groups. The most important characteristic of these large data sets is that they have a large number of variables, and these variables require a lot of computing resources to process. Extracting features from the data helps to obtain the best features from those big data sets by selecting and combining variables into features, thus effectively reducing the amount of data. These features are easy to process, but are still able to describe the actual data set with accuracy and originality.
Feature extraction also refers to the process of transforming raw data into numerical
features that can be processed while preserving the information in the original data
set. It yields better results than applying machine learning or deep learning directly
to the raw data. Feature extraction can be accomplished using two methods.
Manual feature extraction requires identifying the features that are relevant for a given problem and implementing a way to extract those features. A good understanding of the background or domain can help make informed decisions as to which features could be useful. Extracting features such as color features, texture features, statistical features, etc. from the images manually by applying algorithms is called manual feature extraction, and the features extracted manually are called handcrafted features. These features are further used for the purpose of image classification. Handcrafted feature extraction requires human effort and domain knowledge, whereas deep learning techniques can be used to extract features automatically from the images without the need for human intervention. Automatic feature extraction from the images can be done using deep neural networks or, most commonly, the first layers of a deep neural network. The features extracted automatically using the deep networks are called deep features.
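For illustration, the sketch below computes a few typical handcrafted features (an intensity histogram plus simple statistical and edge descriptors) from a grayscale image array; the feature choices and the synthetic stand-in image are assumptions made for this example, not features prescribed by this thesis:

```python
import numpy as np

def handcrafted_features(img: np.ndarray, bins: int = 16) -> np.ndarray:
    """Return a small handcrafted feature vector for an 8-bit grayscale image."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256), density=True)  # intensity distribution feature
    stats = np.array([img.mean(), img.std(), img.min(), img.max()])       # statistical features
    gy, gx = np.gradient(img.astype(float))
    edge_strength = np.array([np.mean(np.hypot(gx, gy))])                 # crude texture/edge feature
    return np.concatenate([hist, stats, edge_strength])

img = np.random.randint(0, 256, size=(128, 128), dtype=np.uint8)  # stand-in for a CXR image
features = handcrafted_features(img)
print(features.shape)   # (21,) -> 16 histogram bins + 4 statistics + 1 edge measure
```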
1.9.4. Feature Selection
Feature selection is an important task in data mining and knowledge discovery. It allows the exclusion of redundant features, can eliminate the irrelevant noisy features, and thus improves the quality of the data set and the performance of learning systems [107]. Although a large number of features can be used to describe an image, only a small number of them are useful for classification. Feature selection reduces the dimensionality of the feature space, which will eventually improve the classification accuracy and save computation time. Feature selection techniques can be divided into three groups, which are as follows.
➢ Filter methods
➢ Wrapper methods
➢ Embedded methods
Filter methods select features based on the general characteristics of the dataset, which here consists of images. These methods are generally used during the pre-processing step, and the selection of features is independent of the classification model. It is a faster and usually better approach when the number of features is huge. It avoids overfitting but may sometimes fail to select the best features.
Wrapper methods wrap the feature selection around the classification model and use the prediction accuracy of the model to iteratively select or eliminate a set of features. Based on the conclusions drawn from prior training of the model, features are added or removed. The main advantage of wrapper methods over the filter methods is that they provide an optimal set of features for training the model, thus resulting in better accuracy than the filter methods, but they are computationally more expensive.

Embedded methods perform feature selection as part of the training of the classification model, thus having their own built-in feature selection mechanisms. Embedded methods overcome the drawbacks of filter and wrapper methods and merge their advantages. These methods are faster, like the filter methods, more accurate than the filter methods, and take into consideration a combination of features as well.
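As an example of a filter-style method, the sketch below uses scikit-learn's SelectKBest (assumed to be available) to score each feature independently of any classifier and keep the top k; the random arrays stand in for a real feature matrix and labels:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Stand-in data: 100 samples, 50 extracted features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
y = rng.integers(0, 2, size=100)

# Filter method: rank features with a univariate ANOVA F-test and keep the 10 best.
# The selection depends only on the data, not on any particular classifier.
selector = SelectKBest(score_func=f_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)          # (100, 50) -> (100, 10)
print(selector.get_support(indices=True))      # indices of the retained features
```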
1.9.5. Classification
Classification is a supervised learning task in which the model learns the relationship between input features and output classes. It involves assigning a class label to each instance in a dataset based on its features. The goal of classification is to build a model that accurately predicts the class labels of new instances based on their features. In classification, the model is fully trained using the training data, and then it is evaluated on test data before being used to perform prediction on new unseen data.

There are two main types of classification: binary classification and multi-class classification. Binary classification involves classifying instances into two classes, while multi-class classification involves classifying instances into more than two classes. In addition to binary classification and multi-class classification, a few other types exist, namely multi-label and multi-level classification.
➢ Binary classification: Binary classification is a supervised learning problem that requires classifying data into two mutually exclusive groups or categories. Binary classification models are trained using a dataset that has been labeled with the desired outcome. The training data in such a situation is labeled in a binary format: true and false; positive and negative; 0 and 1; spam and not spam, and so on.
➢ Multi-class classification: Multi-class classification is a supervised learning problem that requires classifying data into three or more groups/categories. Unlike binary classification, where the model is only trained to predict one of two classes for an item, a multiclass classifier is trained to predict one from three or more classes for an item. For example, a multiclass classifier could be used to classify images of animals into different categories such as dogs, cats, and birds. Most of the binary classification algorithms can also be used for multi-class classification; a few of them are Support Vector Machine (SVM), Random Forest (RF), etc.
➢ Multi-label classification: Multi-label classification is a supervised learning approach that can be used to assign zero or more labels to each data sample. For example, an image may consist of more than one animal, such as a dog and a cat. In multi-label classification tasks, 0 or more classes are predicted for each input example. In this case, there is no mutual exclusion, because the input example can have more than one label.
➢ Multi-level classification: Multi-level classification is a type of supervised learning in which classification is done at more than one level. In this type of classification, the first level and the subsequent levels may be binary or multi-class. For example, the first level may predict whether an image shows pneumonia or not, and then the second level helps to predict whether it is bacterial pneumonia or viral pneumonia.
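A minimal sketch of this two-level idea (the class names, the opacity-score feature and the two stand-in models are hypothetical, not the models developed in this thesis): the first classifier decides whether an image is normal, and only abnormal cases are passed to a second, more specific classifier.

```python
class ThresholdModel:
    """Toy stand-in for a trained classifier with a scikit-learn style predict()."""
    def __init__(self, threshold, below, above):
        self.threshold, self.below, self.above = threshold, below, above
    def predict(self, samples):
        return [self.below if s[0] < self.threshold else self.above for s in samples]

def classify_two_level(features, level1_model, level2_model):
    """Level 1 decides normal vs. pneumonia; level 2 refines pneumonia into bacterial/viral."""
    if level1_model.predict([features])[0] == "normal":
        return "normal"
    return level2_model.predict([features])[0]

# Hypothetical single-feature example: features[0] is some opacity score
level1 = ThresholdModel(0.5, "normal", "pneumonia")
level2 = ThresholdModel(0.8, "bacterial", "viral")
for score in (0.2, 0.6, 0.9):
    print(score, "->", classify_two_level([score], level1, level2))
# 0.2 -> normal, 0.6 -> bacterial, 0.9 -> viral
```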
To classify images, one can create a model using machine learning, feed data to the model, and wait until the model training is complete. With a machine learning model, it is much easier and faster to predict the category of the input data. Machine learning can be broadly classified into supervised, unsupervised and reinforcement learning. In the supervised learning technique, a "labelled" dataset is used to train the machines, and the machines then predict the output based on the training.
Classification based on machine learning consists of two steps, namely feature extraction and classification. Features are extracted from the input images and a feature vector is created. The features extracted using the feature extraction technique are given as input to the classification step, where machine learning algorithms are utilized to classify the input data. The workflow of machine learning based classification is shown below.
Figure: Workflow of classification using machine learning (Input image → Feature Extraction → Classification using a machine learning model → Classified Output)
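The two-step workflow above can be sketched as follows with scikit-learn (assumed here for illustration; any feature extractor and any classical classifier could take these roles). Random arrays stand in for the images and labels:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def extract_features(img):
    """Step 1: turn an image into a fixed-length feature vector (here: a 16-bin histogram)."""
    hist, _ = np.histogram(img, bins=16, range=(0, 256), density=True)
    return hist

# Stand-in dataset: 200 random 64x64 "images" with binary labels
rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(200, 64, 64))
labels = rng.integers(0, 2, size=200)

X = np.array([extract_features(img) for img in images])          # feature vectors
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf")                                           # Step 2: machine learning classifier
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))                # ~0.5 on random data, by construction
```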
Any input data can be given to such a classification model, and the model predicts its class. Classification based on deep learning provides another avenue for data classification. Deep learning based classification extracts the features present in the images by applying mathematical operations that are organized into layers. A deep learning structure adds more hidden layers between the input layer and the output layer, and extends traditional neural networks (NN) to model more complex and nonlinear relationships. The algorithms are constructed much like those in machine learning, but consist of many more levels of processing. The input is fed into the input layer of the deep learning model, and the final output is produced at the output layer of the network, which carries out both feature extraction and classification together. The neural network thus extracts deep features from the input data and performs the classification.
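A minimal PyTorch sketch (PyTorch is an assumption made for this illustration) of a network that performs feature extraction in its convolutional layers and classification in its final fully connected layer, for single-channel 64x64 images and three classes:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        # Hidden layers: convolutional feature extraction
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
        )
        # Output layer: classification from the extracted (deep) features
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)
        return self.classifier(x)

model = SmallCNN()
batch = torch.randn(4, 1, 64, 64)        # four grayscale images
logits = model(batch)
print(logits.shape)                      # torch.Size([4, 3]), one score per class
```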
(iv) Fuzzy classification
Fuzzy classification groups the given data into classes based on their characteristics using fuzzy logic. Fuzzy logic contains multiple logical values, and these values are the truth values of a variable or problem, lying between 0 and 1. In the Boolean system, only two possibilities (0 and 1) exist, where 1 denotes the absolute truth value and 0 denotes the absolute false value. But in the fuzzy system, there are multiple possibilities between 0 and 1, which are partially false and partially true. The fundamental concept of Fuzzy Logic is the membership function, which defines the degree of membership of an input value to a certain set or category. Fuzzy rules are If-Then statements that express the relationship between input variables and output variables in a fuzzy way. The output of a Fuzzy Logic system is a fuzzy set, which is a set of membership degrees for each possible output value. The components in the architecture of a fuzzy logic system are as follows.
➢ Rule Base: The Rule Base is a component used for storing the set of rules and the If-Then conditions given by the experts, which are used for controlling the decision-making system. Recently, many updates have been made in fuzzy theory, which offer effective methods for designing and tuning fuzzy controllers.
➢ Fuzzification: Fuzzification is a component used for transforming the system inputs, i.e., it converts the crisp numbers into fuzzy sets. The crisp numbers are those inputs which are measured by the sensors, and fuzzification passes them into the control system for further processing. This component divides the input signals into the following five states in any Fuzzy Logic system: Large Positive (LP), Medium Positive (MP), Small (S), Medium Negative (MN), and Large Negative (LN).
➢ Inference Engine: This component is the main processing unit of the fuzzy logic system, also known as the Fuzzy Inference Engine. It allows users to find the matching degree between the current fuzzy input and the rules. Based on the matching degree, the system determines which rules are to be fired for the given input. When all rules are fired, they are combined to develop the control actions.
➢ Defuzzification: Defuzzification is a component that takes the fuzzy set inputs generated by the Inference Engine and transforms them into a crisp value. It is the last step in the process of a fuzzy logic system. The crisp value is a type of value which is acceptable by the user. Various techniques are available to do this, and the user has to select the one best suited to the application.
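To make the fuzzification, inference and defuzzification flow concrete, here is a small pure-Python sketch with made-up triangular membership functions and one If-Then rule per fuzzy set; it only illustrates the mechanism and is not the fuzzy system developed later in this thesis:

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_fan_speed(temperature):
    # Fuzzification: degree to which the crisp input belongs to each fuzzy set
    cold = tri(temperature, 0, 10, 20)
    warm = tri(temperature, 15, 25, 35)
    hot  = tri(temperature, 30, 40, 50)

    # Inference: one rule per set (IF temp is cold THEN speed is 10, ...),
    # each rule weighted by its firing strength
    rules = [(cold, 10.0), (warm, 50.0), (hot, 90.0)]

    # Defuzzification: weighted average of the rule outputs gives a crisp value
    total = sum(weight for weight, _ in rules)
    return sum(weight * out for weight, out in rules) / total if total else 0.0

for t in (5, 22, 38):
    print(t, "->", round(fuzzy_fan_speed(t), 1))
```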
1.10. MOTIVATION OF THE RESEARCH
The early detection of lung diseases can be achieved by analyzing medical images such as X-ray, CT and MRI. In recent years, respiratory illnesses have surpassed all other causes of death in the world. Pneumonia,
Tuberculosis (TB) and COVID-19 are the most severe and prevalent infectious respiratory disorders; they are caused by bacteria and viruses, typically affect the lungs, and can even lead to death [77]. Currently, COVID-19 ranks among the highest causes of death in recent years, as its death toll has crossed 6 million. The most interesting and complicating fact about these respiratory diseases is the similarity in their symptoms. So, it is necessary to classify all three diseases, which can be done with the help of computer aided diagnosis. Earlier studies employed machine learning algorithms for disease prediction, and recent research is carried out using deep learning approaches. It is observed from the literature survey that deep learning approaches provide better performance, and there is scope to develop a deep learning model to detect and classify lung diseases which can provide higher accuracy than existing research works. Motivated by this fact, a novel deep learning based multi-level classification is carried out in this research to detect and classify lung diseases using Chest X Ray (CXR) images.
1.11. OUTLINE OF THE PROPOSED RESEARCH WORK
The proposed research work focuses on the detection and multi-level classification of lung diseases such as pneumonia, tuberculosis and COVID-19 using Chest X Ray (CXR) images. The outline of the proposed research work is shown in Figure 1.5.
Figure 1.5. Outline of the proposed research work
The input Chest X Ray (CXR) images undergo pre-processing and data augmentation. Then, the pre-processed CXR images are given as input to the proposed FHD feature extraction technique, which combines handcrafted and deep features, to extract features from the CXR images. After that, the extracted features are fed into the proposed Modified Moth Flame Optimization (MMFO) algorithm to reduce the number of features for classification. The reduced feature vector is given as input to the fuzzy rank-based ensemble deep learning model. Three deep learning models, namely VGG-16, ResNet50 and Modified XceptionNet, are used to construct the ensemble. Finally, the multi-level classification of lung diseases such as Pneumonia, Tuberculosis and COVID-19 is performed using the proposed model.
The main objective of this research is to develop a novel fuzzy integral based
ensemble deep learning model for multi-level classification of lung diseases using
Chest X Ray (CXR) images. The significant contributions in this research are as
follows.
➢ Various deep learning models were implemented and their performance was compared to identify the best performing deep learning model.
➢ A novel feature extraction technique combining handcrafted and deep features, referred to as FHD, was developed to extract features from the Chest X Ray images.
➢ A Modified Moth Flame Optimization (MMFO) algorithm was proposed for feature selection.
➢ A fuzzy rank-based ensemble deep learning model was developed for the multi-level classification of lung diseases.
In addition to this, all the proposed techniques are compared with a few existing techniques, and the improvements in the overall results are also discussed.
➢ Chapter 2 presents a detailed review of the various techniques used for lung disease detection, namely the feature extraction, feature selection, machine learning and deep learning techniques utilized for diagnosing lung diseases. A few of these techniques are implemented and their performance is compared to identify the best technique. This chapter also discusses the limitations and advantages of these techniques.
➢ A novel feature extraction technique combining handcrafted and deep features, referred to as FHD, is proposed and explained in Chapter 4. In the proposed FHD, both handcrafted and deep features are extracted from the Chest X Ray (CXR) images and both are concatenated for the process of classification. The experiments are carried out using CXR images with four deep learning models, including VGG-16.
➢ Chapter 5 describes a novel optimization algorithm, namely Modified Moth Flame Optimization (MMFO), based on the Moth Flame Optimization (MFO) algorithm. MMFO selects the optimal features for classification.
➢ The proposed ensemble deep learning model for the multi-level classification of lung diseases is presented in Chapter 6.
➢ The final chapter concludes the research work. A few suggestions and recommendations to improve the efficiency and performance of the proposed work are also provided.
1.14. SUMMARY
A multi-level classification framework for lung diseases using modified moth flame optimization and a fuzzy Gompertz based ensemble deep learning model was developed in this research work. In order to construct a new model for lung disease detection, it is necessary to examine and evaluate the current research in this field. A basic introduction to image processing, along with the definition of an image and image processing techniques, is presented in this chapter. Notable techniques for Feature Extraction, Feature Selection and Classification are described in detail in this chapter. The introduction and working procedure of Machine Learning and Deep Learning are also discussed. Basic concepts of fuzzy logic, along with the architecture of the Fuzzy Inference System (FIS), are explained in detail. The research gap is identified and the motivation of the research is described. The outline of the proposed research work is also presented with a neat diagram.