
CHAPTER 1

INTRODUCTION

1.1 GENERAL BACKGROUND

Images are considered one of the most important media for conveying information in the field of computer vision. An image is a visual representation of data, while a digital image is a binary representation of visual data. Images can take the form of photographs, graphics, and individual video frames. In this sense, an image is a picture that has been created or copied and stored in electronic form. Understanding images allows the extraction of useful information that can be utilized for a variety of other activities, such as aiding robot navigation, identifying an airport from remote sensing data, extracting malignant tissue from body scans, and detecting abnormalities in the human body.

1.2. IMAGE PROCESSING

An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that point. When x, y, and the amplitude values of F are all finite, discrete quantities, the image is called a digital image. In other words, an image can be represented by a two-dimensional array arranged in rows and columns.
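This array view can be made concrete with a short sketch (a NumPy illustration with arbitrary sample values, not drawn from any particular dataset):

```python
import numpy as np

# A tiny 3x4 digital image: F sampled at discrete spatial coordinates.
# Each entry is the intensity of the image at that (row, column) point.
F = np.array([
    [12,  50,  50, 12],
    [50, 200, 200, 50],
    [12,  50,  50, 12],
], dtype=np.uint8)

rows, cols = F.shape        # the image is a 2D array of rows and columns
intensity = int(F[1, 2])    # intensity at one pair of coordinates
```

Here the finite extent of the array and the finite range of `uint8` values (0 to 255) are exactly what make the image digital.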

Image processing is one of the most rapidly developing domains of computer science. Technological improvements in imaging, computer processors, and mass storage devices have fuelled its growth. In several domains of science and engineering, the processing of color and grayscale images and other two-dimensional signals has become an important tool for research and investigation. The massive volumes of images produced each day by these and other sources are beyond the limits of manual assessment. Extracting valuable information from images is the fundamental task of image processing. Theoretically, computers can accomplish this with negligible or no human involvement [117].

1.3. TYPES OF IMAGE PROCESSING

The two types of image processing technology, analog and digital image processing, are explained below.

1.3.1 Analog Image Processing

Analog image processing operates on analog signals, typically two-dimensional analog signals. In this type of processing, images are manipulated by electrical means, by varying the electrical signal. A common example is the television image. Over time, digital image processing has come to dominate analog image processing due to its wider range of applications.

1.3.2 Digital Image Processing

A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, or pixels; "pixel" is the term most widely used to denote the elements of a digital image. A computer interprets a digital image as a 2D or 3D matrix, where each value, or pixel, in the matrix represents an amplitude known as the "intensity" of the pixel. Typically, computers deal with 8-bit images, in which the amplitude value ranges from 0 to 255. Based on these intensity ranges, digital images are classified into the following types.

➢ Binary Image

➢ Grayscale Image

➢ RGB color image

(i) Binary Image

A binary image is one whose pixels can take exactly one of two colors, usually black and white. It is the simplest type of image: each pixel takes only one of two values, 0 or 1, where 0 represents black and 1 represents white. A binary image is a 1-bit image, since only one binary digit is needed to represent a pixel. Binary images are mostly used to represent general shape or outline.

(ii) Gray scale Image

Grayscale images are monochrome images: they contain no color information, and each pixel stores a single intensity value corresponding to one of the available gray levels. Grayscale, or 8-bit, images are composed of 256 distinct intensity levels, where a pixel intensity of 0 represents black and a pixel intensity of 255 represents white. The 254 values in between are different shades of gray.

(iii) RGB color image

Typical color images can be described using three colors, namely Red, Green, and Blue (RGB). A color image can be represented using three 2D arrays of the same size, one for each color channel: red (R), green (G), and blue (B). Each array element contains an 8-bit value indicating the amount of red, green, or blue at that point on a [0, 255] scale.
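The three image types above can be sketched as arrays (a minimal NumPy illustration; the pixel values are arbitrary):

```python
import numpy as np

# Binary image: 1 bit per pixel, 0 for black and 1 for white.
binary = np.array([[0, 1],
                   [1, 0]], dtype=np.uint8)

# Grayscale image: 8 bits per pixel, 256 levels from 0 (black) to 255 (white).
gray = np.array([[0, 128],
                 [200, 255]], dtype=np.uint8)

# RGB color image: three 2D channels of the same size, each value in [0, 255].
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = (255, 0, 0)    # a pure red pixel
```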

1.4. ADVANTAGES OF DIGITAL IMAGE PROCESSING

Digital image processing has many advantages over analog image

processing; it allows a much wider range of algorithms to be applied to input data,

and can avoid problems such as the build-up of noise and signal distortion during

processing. The advantages of Digital Image Processing are as follows.

➢ Enhanced Image Quality – One of the primary advantages of digital image

processing is the ability to enhance the quality of images. With the use of

algorithms, digital images can be sharpened, brightened, or color corrected to

produce a clearer and more visually appealing picture.

➢ Improved Medical Diagnosis – Digital image processing is also used in the

field of medicine to improve the accuracy of diagnosis. For example, medical

images like X-rays and MRIs can be processed to highlight areas of interest

or to differentiate between healthy and diseased tissues.

➢ Increased Efficiency – Digital image processing systems can process images

much faster than manual methods. This can help save time and resources in

industries like manufacturing, where inspection and quality control processes

are crucial.

➢ Enhanced Security – Digital image processing systems are also used for

security and surveillance purposes. For example, facial recognition

algorithms can be used to identify people or to detect unusual activity in

public spaces.

➢ Creative Applications – Lastly, digital image processing can be used in

creative applications like graphic design, video editing, and virtual reality. By

manipulating digital images, artists and designers can create new and unique

visual experiences.

1.5. FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING

The 11 fundamental steps in digital image processing are depicted in Figure 1.1, and each step is described in this section.

Figure 1.1. Fundamental steps of Digital Image Processing

➢ Image acquisition: This is the first fundamental step in digital image processing. Image acquisition could be as simple as being given an image that is already in digital form. Generally, the acquisition stage also involves pre-processing, such as scaling.

➢ Image enhancement: Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image, for example by changing brightness and contrast.

➢ Image restoration: Image restoration is an area that also deals with

improving the appearance of an image. However, unlike enhancement,

which is subjective, image restoration is objective, in the sense that

restoration techniques tend to be based on mathematical or probabilistic

models of image degradation.

➢ Color image processing: Color image processing is an area that has been gaining importance because of the significant increase in the use of digital images over the Internet. It may include color modeling and processing in a digital domain.

➢ Wavelets and Multi-resolution processing: Wavelets are the foundation for representing images at various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.

➢ Compression: Compression deals with techniques for reducing the storage required to save an image or the bandwidth needed to transmit it. Compression is particularly necessary for images transmitted over the Internet.

➢ Morphological processing: Morphological processing deals with tools for

extracting image components that are useful in the representation and

description of shape.

➢ Segmentation: Segmentation procedures partition an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A robust segmentation procedure brings the process a long way toward the successful solution of imaging problems that require objects to be identified individually.

➢ Representation and description: Representation and description almost

always follow the output of a segmentation stage, which usually is raw pixel

data, constituting either the boundary of a region or all the points in the

region itself. Choosing a representation is only part of the solution for

transforming raw data into a form suitable for subsequent computer

processing. Description deals with extracting attributes that result in some

quantitative information of interest or are basic for differentiating one class

of objects from another.

➢ Object Recognition: Recognition is the process that assigns a label, such as "vehicle," to an object based on its descriptors.

➢ Knowledge Base: Knowledge may be as simple as detailing regions of an

image where the information of interest is known to be located, thus limiting

the search that must be conducted in seeking that information. The

knowledge base also can be quite complex, such as an interrelated list of all

major possible defects in a materials inspection problem or an image

database containing high-resolution satellite images of a region in

connection with change-detection applications.

1.6. TECHNIQUES IN DIGITAL IMAGE PROCESSING

Some image processing techniques take an image as both input and output; others take an image as input but produce attributes of the image as output. Not all of these techniques are required at the same time: the selection of techniques is application specific. Some image processing techniques are detailed below.

1.6.1. Image Pre-processing

A fundamental step in image processing and computer vision is image preprocessing [45,46]. The aim of image preprocessing is the improvement of image data by enhancing some features while suppressing unwanted distortions; which features to enhance depends on the specific application. Image data recorded by sensors on a satellite, for example, contains errors related to the geometry and brightness values of the pixels. In image preprocessing, these errors are corrected using appropriate mathematical models, which are either deterministic or statistical. Image preprocessing also includes primitive operations such as noise reduction, contrast enhancement, image smoothing and sharpening, and advanced operations such as image segmentation. According to the size of the pixel neighborhood used for the calculation of a new pixel brightness, image preprocessing techniques can be categorized as:

➢ Pixel brightness transformations
➢ Geometric transformations
➢ Image restoration that requires knowledge about the entire image
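The first category, pixel brightness transformations, can be illustrated with a small sketch (a gamma correction written in NumPy; the function name and sample values are illustrative, not taken from the text):

```python
import numpy as np

def gamma_transform(image, gamma):
    """Pixel brightness transformation: each output pixel depends only on
    the corresponding input pixel, not on any neighborhood."""
    normalized = image.astype(np.float64) / 255.0
    return np.clip(normalized ** gamma * 255.0, 0, 255).astype(np.uint8)

dark = np.full((4, 4), 64, dtype=np.uint8)     # a uniformly dark image
brightened = gamma_transform(dark, gamma=0.5)  # gamma < 1 brightens
```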

1.6.2. Image Enhancement

The images captured by some conventional digital cameras, as well as satellite images, may lack contrast and brightness because of the limitations of the imaging subsystem and the illumination conditions at capture time. Image enhancement is one of the simplest and most appealing techniques for overcoming this difficulty: it brings out features that are concealed, or highlights certain features of interest, for subsequent analysis of an image. Image enhancement [45] techniques are categorized into two types. They are as follows

➢ spatial domain method

➢ frequency domain method.

Spatial-domain methods deal with the direct modification or aggregation of the pixels that form the image, while frequency-domain methods enhance the image by operating on its transform, typically through a linear, position-invariant operator. Some common image enhancement techniques are contrast stretching, noise filtering, and histogram modification.
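As a sketch of one spatial-domain technique named above, contrast stretching linearly remaps the observed intensity range onto the full [0, 255] scale (illustrative NumPy code, not taken from the original text):

```python
import numpy as np

def contrast_stretch(image):
    """Linearly stretch the pixel intensities to the full [0, 255] range."""
    lo, hi = float(image.min()), float(image.max())
    if hi == lo:                       # a flat image cannot be stretched
        return image.copy()
    out = (image.astype(np.float64) - lo) * 255.0 / (hi - lo)
    return out.astype(np.uint8)

low_contrast = np.array([[100, 110],
                         [120, 130]], dtype=np.uint8)
enhanced = contrast_stretch(low_contrast)   # now spans 0 to 255
```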

1.6.3. Image Restoration

The process of recovering a degraded or corrupted image by removing noise or blur, to improve its appearance, is called image restoration [46]. The degraded image is modeled as the convolution of the original image with a degradation function, plus additive noise. Restoration is performed with the help of prior knowledge of the noise or disturbance that caused the degradation. It can be done in two domains: the spatial domain and the frequency domain. In the spatial domain, the filtering that restores the image is applied directly to the pixels of the digital image; in the frequency domain, the filtering is applied after mapping the image from the spatial domain into the frequency domain by the Fourier transform. After filtering, the image is mapped back into the spatial domain by the inverse Fourier transform to obtain the restored image. Either domain can be chosen depending on the application.
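A minimal sketch of spatial-domain restoration, assuming simple additive noise, is mean (averaging) filtering, where each pixel is replaced by the mean of its neighborhood (illustrative code; this is only one of many restoration filters):

```python
import numpy as np

def mean_filter(image, size=3):
    """Spatial-domain restoration: replace each pixel with the mean of its
    size x size neighborhood, suppressing additive noise."""
    pad = size // 2
    padded = np.pad(image.astype(np.float64), pad, mode='edge')
    out = np.empty(image.shape, dtype=np.float64)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].mean()
    return out

noisy = np.array([[10.0, 10.0, 10.0],
                  [10.0, 100.0, 10.0],   # one noisy spike in the center
                  [10.0, 10.0, 10.0]])
restored = mean_filter(noisy)
```

The frequency-domain route would instead multiply the Fourier transform of the image by a filter and invert the transform; either domain can express the same linear filtering.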

1.6.4. Morphological Processing

Morphological processing [182] is a collection of non-linear operations used for extracting image components that are useful in the representation and description of shape. A structuring element is a small set used to probe an image under study: it is compared with the corresponding neighborhood of pixels by positioning it at all possible locations in the image. Some operations check whether the element "fits" within the neighborhood, while others check whether it "hits" or intersects the neighborhood. The basic morphological operations are dilation, erosion, and their combinations. Erosion is useful for removing structures of a certain shape and size, given by the structuring element, while dilation is useful for filling holes of a certain size and shape, given by the structuring element.
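Erosion and dilation can be sketched for binary images with a square structuring element (a NumPy illustration; real systems typically use library routines such as those in scipy.ndimage):

```python
import numpy as np

def erode(image, size=3):
    """Erosion: a pixel stays foreground only if the size x size square
    structuring element fits entirely inside the foreground there."""
    pad = size // 2
    padded = np.pad(image, pad, mode='constant', constant_values=0)
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].min()
    return out

def dilate(image, size=3):
    """Dilation: a pixel becomes foreground if the structuring element
    "hits" (intersects) the foreground anywhere in its window."""
    pad = size // 2
    padded = np.pad(image, pad, mode='constant', constant_values=0)
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = padded[r:r + size, c:c + size].max()
    return out

square = np.zeros((5, 5), dtype=np.uint8)
square[1:4, 1:4] = 1          # a 3x3 foreground block
eroded = erode(square)        # shrinks the block to its center pixel
dilated = dilate(square)      # grows the block to fill the grid
```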

1.6.5. Object Recognition

Object recognition [125] is the process that assigns a label, e.g. "vehicle", to an object in a digital image or video based on its descriptors. Object recognition deals with training the computer to identify a particular object from various perspectives, in various lighting conditions, and against various backgrounds. The appearance of an object can vary due to scene clutter, photometric effects, and changes in the shape and viewpoint of the object. Object recognition has wide applications in monitoring and surveillance, robot localization and navigation, medical analysis, etc.

1.6.6. Image Compression

Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it, without degrading the quality of the image to an unacceptable level [45,46]. Compression makes it possible to store more images in a given amount of disk or memory space. It also reduces the time required for images to be transmitted over the Internet or downloaded from web pages. Compression techniques are of two types: lossy and lossless compression.

Lossy compression: Lossy compression is irreversible; some data from the original image file is lost. Lossy methods are especially suitable for natural images such as photographs, in applications where a minor loss of fidelity is acceptable to achieve a substantial reduction in bit rate.

Lossless compression: In lossless compression, we can reduce the size of an image

without any quality loss. Lossless compression is preferred for archival purposes

and often for medical imaging, technical drawings, clip art, or comics.

Some of the common image compression techniques are as follows

➢ Fractal

➢ Wavelets

➢ Chroma subsampling

➢ Transform coding

➢ Run-length encoding
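Run-length encoding, the last technique in the list, is simple enough to sketch in full: each run of repeated values is stored as a (value, count) pair, and decoding reproduces the input exactly, which makes the scheme lossless (illustrative code):

```python
def rle_encode(values):
    """Run-length encoding: compress a sequence into (value, count) runs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return runs

def rle_decode(runs):
    """Invert the encoding by expanding each run back into values."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [0, 0, 0, 0, 255, 255, 0, 0]    # one row of a mostly-black image
encoded = rle_encode(row)             # 8 values become 3 runs
decoded = rle_decode(encoded)
```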

1.6.7. Image Segmentation

Image Segmentation is the process of partitioning a digital image into

multiple segments, to simplify and/or change the representation of an image into

something that is more meaningful and easier to analyze. It may make use of

statistical classification, thresholding, edge detection, region detection, or any

combination of these techniques. Usually, a set of classified elements is obtained as

the output of the segmentation step. Segmentation techniques can be classified as

either region-based or edge-based. The former techniques rely on common patterns

in intensity values within a cluster of neighboring pixels, and the goal of the

segmentation algorithm is to group regions according to their anatomical or

functional roles. The edge-based techniques rely on discontinuities in image values

between distinct regions, and the goal of the segmentation algorithm is to accurately

demarcate the boundary separating these regions.
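Thresholding, the simplest of the techniques listed above, can be sketched as follows (illustrative NumPy code with an arbitrary global threshold):

```python
import numpy as np

def threshold_segment(image, threshold):
    """Partition an image into two segments by a global intensity
    threshold: 1 for object pixels, 0 for background."""
    return (image > threshold).astype(np.uint8)

image = np.array([[20, 30, 200],
                  [25, 210, 220],
                  [15, 20, 30]], dtype=np.uint8)
mask = threshold_segment(image, threshold=128)   # bright region = object
```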

1.6.8. Image classification

Image classification is the task of assigning a label or class to an entire image, where each image is expected to belong to exactly one class. Image classification models take an image as input and return a prediction about which class the image belongs to. Classification involves assigning a label or tag to an entire image based on pre-existing training data of already-labeled images. While the process may appear simple at first glance, it entails pixel-level image analysis to determine the most appropriate label for the overall image. This provides valuable data and insights, enabling informed decisions and actionable outcomes. Depending on the problem at hand, there are different types of image classification methodologies that can be employed: binary, multiclass, multilabel, and hierarchical.

Binary: Binary classification takes an either-or logic to labeling images and classifies unknown data points into two categories. Tasks such as categorizing a tumor as benign or malignant, analyzing product quality to determine whether it has defects, and many other problems that require yes/no answers are solved with binary classification.

Multiclass: While binary classification is used to distinguish between two classes of objects, multiclass classification, as the name suggests, categorizes items into three or more classes. It is very useful in many domains, such as NLP (sentiment analysis where more than two emotions are present) and medical diagnosis (classifying diseases into different categories).

Multilabel: Unlike multiclass classification, where each image is assigned to

exactly one class, multilabel classification allows the item to be assigned to multiple

labels. For example, you may need to classify image colors and there are several

colors. A picture of a fruit salad will have red, orange, yellow, purple, and other

colors depending on your creativity with fruit salads. As a result, one image will

have multiple colors as labels.

Hierarchical: Hierarchical classification is the task of organizing classes into a

hierarchical structure based on their similarities, where a higher-level class

represents broader categories and a lower-level class is more concrete and specific.

1.7. CHARACTERISTICS OF DIGITAL IMAGE PROCESSING

Some of the characteristics of Digital Image processing are as follows

➢ It is software-based, and some of the software is free of cost.
➢ It provides clearer images.
➢ It enables image enhancement, making it easier to recover information from images.
➢ It is widely used in many fields.
➢ It supports a better quality of life.

1.8. APPLICATIONS OF DIGITAL IMAGE PROCESSING

Digital image processing has a direct effect on everyday life and continues to grow over time, with new technologies, in almost all fields. A few significant applications of digital image processing in different fields are explained below.

➢ Image sharpening and restoration: This refers to processes that modify the look and feel of an image, manipulating it to achieve the desired output. It includes conversion, sharpening, blurring, edge detection, and retrieval and recognition of images.

➢ Robot Vision: Several robotic machines work on the basis of digital image processing. Through image processing techniques, robots find their way, as in hurdle-detection robots and line-follower robots.

➢ Pattern recognition: Pattern recognition combines image processing with artificial intelligence, so that computer-aided diagnosis, handwriting recognition, and image recognition can be readily implemented. Nowadays, image processing is widely used for pattern recognition.

➢ Video processing: Video processing is also one of the applications of digital image processing. A video is a collection of frames or pictures arranged in such a way that it produces the appearance of moving pictures. Video processing involves frame rate conversion, motion detection, noise reduction, color space conversion, etc.

➢ Medical Field: Several applications in the medical field depend on digital image processing. A few of them are gamma-ray imaging, PET scans, X-ray imaging, medical CT scans, and UV imaging.

1.9. MEDICAL IMAGE PROCESSING

Medical image processing encompasses the use and exploration of image

datasets of the human body to diagnose pathologies or guide medical interventions

such as surgical planning, or for research purposes. Medical image processing is

carried out by radiologists, engineers, and clinicians to better understand the

anatomy of either individual patients or population groups. Medical image

processing deals with the development of problem-specific approaches to the

enhancement of raw medical image data for the purposes of selective visualization

as well as further analysis [192]. The main benefit of medical image processing is

that it allows for in-depth, but non-invasive exploration of internal anatomy. 3D

models of the anatomies of interest can be created and studied to improve treatment

outcomes for the patient, develop improved medical devices and drug delivery

systems, or achieve more informed diagnoses. It has become one of the key tools

leveraged for medical advancement in recent years. There are many topics in

medical image processing: some emphasize general applicable theory and some

focus on specific applications.

1.9.1. Medical Imaging Modalities

Detection of disease at the initial stage, using various imaging modalities, is one of the most important factors in decreasing the mortality rate due to cancer and tumors. Modalities help radiologists and doctors study the internal structure of the detected disease and retrieve the required features. Various medical imaging modalities (I-Scan-2, CT scan, MRI, X-ray, mammogram, and electrocardiogram (ECG)) are used for classifying diseases in the primary studies. The following modalities, in particular, are used for the evaluation of medical data with different algorithms and techniques:

➢ Magnetic Resonance Imaging (MRI)

➢ Computed Tomography (CT)

➢ Mammogram

➢ Electrocardiogram (ECG)

➢ X Ray

(i) Magnetic Resonance Imaging (MRI)

MRI uses magnetic resonance to obtain electromagnetic signals. These signals are generated from human organs, and information about the organ structure is reconstructed from them [184]. High-resolution MRIs contain more structural detail, which is required for locating lesions and diagnosing disease.

(ii) Computed Tomography (CT)

CT is a technology that generates 3-D images from 2-D X-ray images using digital geometry [180].

(iii) Mammogram

Mammograms are used for effective breast cancer screening and early detection of abnormalities. Calcifications and masses are considered the most common abnormalities leading to breast cancer [20].

(iv) Electrocardiogram (ECG)

The ECG measures the electrical activity of the heart and is used to detect cardiac problems in humans [14, 154, 216].

(v) X Ray

X-rays are a form of electromagnetic radiation, similar to visible light.

Medical x-rays are used to generate images of tissues and structures inside the body.

If x-rays traveling through the body also pass through an x-ray detector on the other

side of the patient, an image will be formed that represents the “shadows” formed

by the objects inside of the body. Chest X-ray (CXR) images help to identify pulmonary diseases such as tuberculosis, pneumonia, and COVID-19 [210].

The process of medical image processing begins by acquiring raw data from

any of the above images and reconstructing them into a format suitable for use in

relevant software. A 3D bitmap of greyscale intensities stored on a voxel (3D pixel) grid forms the typical input for image processing. In a CT scan, the greyscale intensity depends on X-ray absorption, while in MRI it is determined by the strength of signals from proton particles during relaxation, after the application of very strong magnetic fields. Among all imaging modalities, the X-ray is one of the most preferred for disease detection, as it is inexpensive and its equipment is easy to install.

1.9.2. Medical Image Classification

Medical Image Classification is a task in medical image analysis that involves

classifying medical images, such as X-rays, MRI scans, and CT scans, into different

categories based on the type of image or the presence of specific structures or

diseases. The goal is to use computer algorithms to automatically identify and

classify medical images based on their content, which can help in diagnosis,

treatment planning, and disease monitoring. Medical image classification is one of

the most important problems in the image recognition area, and its aim is to classify

medical images into different categories to help doctors in disease diagnosis or

further research. Overall, medical image classification can be divided into two

steps. The first step is extracting effective features from the image. The second step

is using the features to build models that classify the image dataset. In the past,

doctors usually used their professional experience to extract features and classify medical images into different classes, which is a difficult, tedious, and time-consuming task, prone to instability and nonrepeatable outcomes. Research to date suggests that medical image classification applications have great merit [102].

Chest X-rays (CXR) are commonly used for the detection and screening of lung disorders [174], as the CXR is one of the most common and easily obtained medical tests used to diagnose common diseases of the chest. It is one of the most frequently used diagnostic modalities for detecting different lung diseases such as pneumonia or tuberculosis [31]. Detection and classification of lung diseases using chest X-ray images is a complex process for radiologists, so it has received significant attention from researchers aiming to develop automatic lung disease detection techniques [17,79,154]. Over the past decade, many computer-aided diagnosis (CAD) systems have been introduced for lung disease detection using X-ray images, but such systems failed to achieve the required performance for lung disease detection and classification. Lung infections associated with the recent COVID-19 pandemic have made these tasks very challenging for such CAD systems. It is essential to detect the appearance of diseases such as pneumonia, tuberculosis, and COVID-19 in the lungs and to classify them, which can be accomplished through the following tasks using CXR images.

➢ Feature Extraction

➢ Feature Selection

➢ Building a classification model

1.9.3. Feature Extraction

Feature extraction (FE) is an important step in image retrieval, image processing, data mining, and computer vision. It is the process of extracting relevant information from raw data [165], and is used to extract the most distinctive features present in images, which are then used to represent and describe the data. An image is a collection of prominent features; as the common saying goes, "a picture is worth a thousand words" [39]. Feature extraction is part of the dimensionality reduction process, in which an initial set of raw data is divided and reduced to more manageable groups. The most important characteristic of these large data sets is their large number of variables, which require substantial computing resources to process. Extracting features thus helps to obtain the best features from big data sets by selecting and combining variables into features, effectively reducing the amount of data. These features are easy to process, yet still able to describe the actual data set accurately and faithfully.

Feature extraction also refers to the process of transforming raw data into numerical

features that can be processed while preserving the information in the original data

set. It yields better results than applying machine learning or deep learning directly

to the raw data. Feature extraction can be accomplished using two methods.

➢ Manual feature extraction – Handcrafted features

➢ Automatic feature extraction – Deep features

(i) Manual Feature Extraction

Manual feature extraction requires identifying and describing the features

that are relevant for a given problem and implementing a way to extract those

features. In many situations, having a good understanding of the background or

domain can help make informed decisions as to which features could be useful.

Extracting features such as color, texture, and statistical features from images by explicitly applying algorithms is called manual feature extraction, and the features extracted in this way are called handcrafted features. These features are further used for image classification; handcrafted-feature-based classification can be done using machine learning models.
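A minimal sketch of manual feature extraction might compute a few statistical features by hand (the particular features chosen here, mean, standard deviation, and histogram entropy, are illustrative rather than a standard set):

```python
import numpy as np

def handcrafted_features(image):
    """Manually extract simple statistical features from a grayscale image."""
    pixels = image.astype(np.float64).ravel()
    hist, _ = np.histogram(pixels, bins=8, range=(0, 256))
    prob = hist / hist.sum()
    nonzero = prob[prob > 0]
    return {
        "mean": float(pixels.mean()),
        "std": float(pixels.std()),
        # Entropy of the intensity histogram, a crude texture measure.
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }

img = np.array([[0, 0],
                [255, 255]], dtype=np.uint8)
features = handcrafted_features(img)
```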

(ii) Automatic feature extraction

Automated feature extraction uses specialized algorithms or deep networks to extract features from images automatically, without human intervention. Automatic feature extraction can be performed using deep neural networks, often their earlier layers. Features extracted automatically by deep networks are called deep features, and deep-feature-based classification can be accomplished using a deep learning-based classification model.

1.9.4. Feature selection

Feature selection is a dimensionality reduction technique widely used for

data mining and knowledge discovery and it allows exclusion of redundant features,

concomitantly retaining the underlying hidden information [191]. Feature selection

can eliminate the irrelevant noisy features and thus improve the quality of the data

set and the performance of learning systems [107]. Although a large number of

features can be used to describe an image, only a small number of them are useful

and efficient for classification. Feature selection is typically done to choose a

condensed and pertinent feature subset in order to reduce the dimensionality of

feature space, which will eventually improve the classification accuracy and save

time. More features do not always provide better classification performance.

Feature selection techniques can be divided into three groups which are as follows

➢ Filter methods

➢ Wrapper methods

➢ Embedded methods

(i) Filter methods

In filter methods, features are selected based on general characteristics of the dataset (here, the images). These methods are generally applied during the pre-processing step. A filter method reduces the number of features independently of the classification model. It is faster and usually the better approach when the number of features is huge. It avoids overfitting but may sometimes fail to select the best features.
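As a concrete illustration, a variance-based filter can be sketched in a few lines of Python. This is a generic example, not part of the proposed method; the function name and toy data are hypothetical. Features whose values barely vary across the dataset carry little discriminative information, so they are dropped before any classifier is trained.

```python
def variance_filter(samples, k):
    """Rank features by variance across samples and keep the top k.

    A variance filter is a classic filter method: it looks only at the
    data itself, never at a classifier's predictions.
    """
    n = len(samples)
    d = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    variances = [sum((row[j] - means[j]) ** 2 for row in samples) / n
                 for j in range(d)]
    # Indices of the k highest-variance features, returned in order.
    ranked = sorted(range(d), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])

# Feature 1 is constant across all samples, so it is filtered out first.
data = [[1.0, 5.0, 0.2], [3.0, 5.0, 0.9], [2.0, 5.0, 0.1]]
print(variance_filter(data, 2))  # -> [0, 2]
```

Because no classifier is involved, the cost is a single pass over the data, which is why filter methods scale well to huge feature sets.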

(ii) Wrapper methods

Wrapper methods wrap the feature selection around the classification model and use the prediction accuracy of the model to iteratively select or eliminate a set of features. Features are added or removed based on conclusions drawn from models trained earlier in the search. The main advantage of wrapper methods over filter methods is that they provide an optimal set of features for training the model, thus resulting in better accuracy than filter methods, but they are computationally more expensive.
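A minimal wrapper method can be sketched as greedy forward selection. The evaluator below is a hypothetical stand-in for retraining and scoring an actual classifier on each candidate subset, which is exactly where the computational expense of wrapper methods comes from.

```python
def forward_selection(features, evaluate, max_features):
    """Greedy forward selection: a minimal wrapper method.

    Starting from an empty set, repeatedly add the single feature that
    most improves the score reported by `evaluate`, stopping when no
    candidate improves it or `max_features` is reached.
    """
    selected, best_score = [], float("-inf")
    while len(selected) < max_features:
        best_candidate = None
        for f in features:
            if f in selected:
                continue
            score = evaluate(selected + [f])  # "train and score" the subset
            if score > best_score:
                best_score, best_candidate = score, f
        if best_candidate is None:  # no remaining feature helps
            break
        selected.append(best_candidate)
    return selected

# Toy evaluator: pretend features 'a' and 'c' are the useful ones, with a
# small penalty per feature standing in for model complexity.
weights = {"a": 0.4, "b": 0.05, "c": 0.3}
score = lambda subset: sum(weights[f] for f in subset) - 0.1 * len(subset)
print(forward_selection(["a", "b", "c"], score, 3))  # -> ['a', 'c']
```

In a real wrapper, `evaluate` would fit the classifier on the candidate subset and return cross-validated accuracy, so the number of model fits grows quickly with the number of features.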

(iii) Embedded methods

In embedded methods, the feature selection process is an integral part of the classification model, which thus has its own built-in feature selection mechanism. Embedded methods overcome the drawbacks of filter and wrapper methods and merge their advantages: they are fast like filter methods, more accurate than filter methods, and also take combinations of features into consideration.
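The embedded idea can be illustrated with L1-penalized least squares, where selection happens inside training itself. This is a generic sketch; the step size, penalty strength and toy data are illustrative, not the technique used in this thesis.

```python
def lasso_select(X, y, lam=0.1, lr=0.01, steps=2000):
    """Embedded selection via L1-penalized least squares (an ISTA sketch).

    The L1 penalty drives the weights of unhelpful features exactly to
    zero while the model is being fitted, so selection happens inside
    training itself -- the defining trait of an embedded method.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Gradient of the mean squared error term.
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j in range(d):
                grad[j] += 2 * err * xi[j] / n
        # Gradient step followed by soft-thresholding (the L1 proximal map).
        for j in range(d):
            wj = w[j] - lr * grad[j]
            w[j] = max(abs(wj) - lr * lam, 0.0) * (1 if wj > 0 else -1)
    return [j for j in range(d) if abs(w[j]) > 1e-6]  # surviving features

# y depends on feature 0 only; feature 1 is pure noise and is zeroed out.
X = [[1.0, 0.3], [2.0, -0.2], [3.0, 0.1], [4.0, -0.4]]
y = [2.0, 4.0, 6.0, 8.0]
print(lasso_select(X, y))  # -> [0]
```

The surviving indices are read directly off the trained weights, so no separate selection pass is needed.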

1.9.5. Classification model

Classification is a type of supervised learning that categorizes input data into

predefined labels. It involves training a model on labeled examples to learn patterns

between input features and output classes. It also involves assigning a class label to

each instance in a dataset based on its features. The goal of classification is to build

a model that accurately predicts the class labels of new instances based on their

features. In classification, the model is fully trained using the training data, and then

it is evaluated on test data before being used to perform prediction on new unseen

data.

(i) Types of classification

There are two main types of classification: binary classification and multi-

class classification. Binary classification involves classifying instances into two

classes, such as “spam” or “not spam”, while multi-class classification involves

classifying instances into more than two classes. In addition to binary and multi-class classification, a few other types exist, namely multi-label and multi-level classification.

➢ Binary classification: Binary classification is a type of supervised learning

problem that requires classifying data into two mutually exclusive groups or

categories. Binary classification models are trained using a dataset that has been

labeled with the desired outcome. The training data in such a situation is labeled in

a binary format: true and false; positive and negative; 0 and 1; spam and not spam,

etc. depending on the problem being tackled.

➢ Multi-class classification: Multiclass classification is a type of supervised

learning problem that requires classifying data into three or more groups/categories.

Unlike binary classification, where the model is only trained to predict one of the

two classes for an item, a multiclass classifier is trained to predict one from three

or more classes for an item. For example, a multiclass classifier could be used to

classify images of animals into different categories such as dogs, cats, and birds.

Most binary classification algorithms can also be used for multi-class classification. A few of them are Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN) and Naive Bayes (NB).
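For instance, K-Nearest Neighbor handles three or more classes directly, with no reduction to binary problems. A minimal pure-Python sketch, using hypothetical toy data for the dog/cat/bird example, is:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """K-Nearest Neighbor prediction for any number of classes.

    `train` is a list of (feature_vector, label) pairs; the query point
    receives the majority label among its k nearest training points.
    """
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Three classes -- KNN votes among neighbors, so nothing special is
# needed to go beyond two classes.
train = [([0.0, 0.0], "cat"), ([0.1, 0.2], "cat"),
         ([5.0, 5.0], "dog"), ([5.1, 4.9], "dog"),
         ([0.0, 9.0], "bird"), ([0.2, 8.8], "bird")]
print(knn_predict(train, [5.0, 5.2]))  # -> 'dog'
```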

➢ Multi-label classification: Multi-label classification is a type of supervised learning algorithm that can assign zero or more labels to each data sample. For example, a multi-label classifier could label a single image as containing both a dog and a cat. In multi-label classification tasks, zero or more classes are predicted for each input example. There is no mutual exclusion, because an input example can have more than one label.
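A common way to realize this is to decide each label independently, for instance by thresholding per-label confidence scores. The scores below are hypothetical outputs of some per-label classifier:

```python
def multi_label_predict(scores, threshold=0.5):
    """Multi-label prediction: each label is decided independently, so a
    sample can receive zero, one, or several labels (no mutual exclusion).

    `scores` maps label -> confidence from some per-label classifier.
    """
    return sorted(label for label, s in scores.items() if s >= threshold)

# An image containing both a dog and a cat.
print(multi_label_predict({"dog": 0.9, "cat": 0.7, "bird": 0.1}))
# -> ['cat', 'dog']
```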

➢ Multi-level classification: Multi-level classification is a type of supervised learning in which classification is carried out at more than one level. The first level and the subsequent levels may each be binary or multi-class classification. For example, in the case of pneumonia prediction, the first level predicts whether pneumonia is present or not. If it is positive, the second level predicts whether it is bacterial pneumonia or viral pneumonia. Likewise, the classification can extend to two, three, four or more levels.
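The cascade structure above can be sketched directly. The two classifiers here are hypothetical rule-based stand-ins for trained models, as are the feature names:

```python
def multi_level_classify(sample, level1, level2):
    """Two-level classification cascade, as in the pneumonia example.

    `level1` is a binary classifier (pneumonia vs. normal); only
    positive cases are passed on to `level2`, which separates bacterial
    from viral pneumonia.
    """
    if level1(sample) == "negative":
        return "normal"
    return level2(sample)  # second level refines the positive class

# Hypothetical rule-based stand-ins for trained models.
level1 = lambda s: "positive" if s["opacity"] > 0.5 else "negative"
level2 = lambda s: "bacterial" if s["consolidation"] > 0.5 else "viral"

print(multi_level_classify({"opacity": 0.8, "consolidation": 0.2},
                           level1, level2))  # -> 'viral'
```

Deeper hierarchies simply chain more such stages, each one refining the decision of the stage before it.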

(ii) Machine learning based classification

With the rapid growth of machine learning, it is now easy to create a model, feed data to it, and wait until training is complete. With a machine learning model, it is much easier and faster to classify the category of input data. Machine learning can be broadly divided into four categories depending on the techniques and modes of learning: reinforcement learning, unsupervised learning, semi-supervised learning, and supervised learning. Classification techniques come under supervised learning. This indicates that a "labelled" dataset is used to train the machines, which then predict the output based on the training.

In machine learning, the prediction model contains two important steps, namely feature extraction and classification. Features are extracted from the input images and a feature vector is created. The extracted features are given as input to the classification stage, where machine learning algorithms are utilized to classify the input data. The workflow of classification using machine learning is shown in Figure 1.2.

[Figure: Input image → Feature Extraction → Classification (machine learning model) → Classified Output]

Figure 1.2. Machine learning based classification

Any input data can be given to the classification model, which follows the same flow to produce the predicted class as output.
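The two-step workflow of Figure 1.2 can be expressed as a simple composition. The extractor and the classifier below are hypothetical stand-ins (mean intensity as a handcrafted feature, a threshold rule as the "trained" model):

```python
def classify_pipeline(image, extract_features, classifier):
    """The two-step machine-learning workflow: handcrafted feature
    extraction followed by a trained classifier on the feature vector."""
    feature_vector = extract_features(image)
    return classifier(feature_vector)

# Hypothetical stand-ins: mean pixel intensity as the sole handcrafted
# feature, and a threshold rule in place of a trained model.
extract = lambda img: [sum(img) / len(img)]
model = lambda fv: "bright" if fv[0] > 0.5 else "dark"
print(classify_pipeline([0.9, 0.8, 0.7], extract, model))  # -> 'bright'
```

The point of the separation is that either stage can be swapped independently: a different feature extractor or a different classifier slots into the same flow.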

(iii) Deep learning-based classification

Deep learning is a subfield of machine learning, and it provides an alternative avenue for data classification. Classification based on deep learning extracts the features present in images by applying mathematical operations referred to as layers. A deep learning structure adds more hidden layers between the input layer and the output layer, extending traditional neural networks (NN) to model more complex and nonlinear relationships. The algorithms are constructed much as in machine learning, but they consist of many more layers. The input is fed into the input layer of the deep learning model, and the final prediction, or classified output, is obtained from the output layer.

In deep learning-based classification, the prediction model contains a neural network that carries out both feature extraction and classification together. The neural network extracts deep features from the input data and performs classification. The workflow of classification using deep learning is shown in Figure 1.3.

[Figure: Input image → Feature Extraction + Classification (deep learning model) → Classified Output]

Figure 1.3. Deep learning-based classification

(iv) Fuzzy classification

Fuzzy classification is the process of grouping elements having the same characteristics using fuzzy logic. Fuzzy logic admits multiple logical values: the truth values of a variable lie between 0 and 1. In a Boolean system, only two possibilities (0 and 1) exist, where 1 denotes the absolute truth value and 0 the absolute false value. In a fuzzy system, multiple possibilities exist between 0 and 1 that are partially false and partially true. The fundamental concept of Fuzzy Logic is the membership

function, which defines the degree of membership of an input value to a certain set

or category. The membership function is a mapping from an input value to a

membership degree between 0 and 1, where 0 represents non-membership and 1

represents full membership.
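One common choice of membership function is the triangular form, which rises linearly to full membership at a peak and falls back to zero. A small sketch (the temperature example is illustrative):

```python
def triangular_membership(x, a, b, c):
    """Triangular membership function: the degree of membership of x in
    a fuzzy set that peaks at b and falls to zero at a and c.

    Returns a value in [0, 1]: 0 means non-membership, 1 full membership.
    """
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# Fuzzy set "warm" peaking at 25 degrees, vanishing at 15 and 35.
print(triangular_membership(20, 15, 25, 35))  # -> 0.5
print(triangular_membership(25, 15, 25, 35))  # -> 1.0
print(triangular_membership(40, 15, 25, 35))  # -> 0.0
```

Other shapes (trapezoidal, Gaussian) follow the same pattern: any mapping from an input value to a degree in [0, 1] is a valid membership function.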

Fuzzy Logic is implemented using Fuzzy Rules, which are if-then

statements that express the relationship between input variables and output

variables in a fuzzy way. The output of a Fuzzy Logic system is a fuzzy set, which

is a set of membership degrees for each possible output value. The architecture of

fuzzy logic system is shown in Figure 1.4.

Figure 1.4. Architecture of fuzzy logic system

The steps in the architecture of fuzzy logic system are as follows.

➢ Rule Base: The rule base is the component that stores the set of rules and the if-then conditions given by experts for controlling the decision-making system. Many recent developments in fuzzy theory offer effective methods for designing and tuning fuzzy controllers, and these developments reduce the number of fuzzy rules required.

➢ Fuzzification: Fuzzification is the component that transforms the system inputs, i.e., it converts crisp numbers into fuzzy sets. Crisp numbers are the inputs measured by sensors; fuzzification passes them into the control system for further processing. In a typical Fuzzy Logic system, this component divides the input signals into the following five states: Large Positive (LP), Medium Positive (MP), Small (S), Medium Negative (MN) and Large Negative (LN).

➢ Inference Engine: This is the main component of any Fuzzy Logic System (FLS), because all the information is processed in the inference engine. It finds the matching degree between the current fuzzy input and the rules, and based on that degree it determines which rules to fire for the given input. When all the fired rules are combined, the control actions are developed.

➢ Defuzzification: Defuzzification is the component that takes the fuzzy set inputs generated by the inference engine and transforms them into a crisp value. It is the last step in the process of a fuzzy logic system. The crisp value is the kind of value acceptable to the user. Various techniques exist for this step, and the user has to select the one that best reduces the error.
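One of the most common defuzzification techniques is the centroid (center-of-gravity) method, sketched here over a discrete output universe; the toy values are illustrative:

```python
def centroid_defuzzify(values, memberships):
    """Centroid (center-of-gravity) defuzzification.

    Converts a discrete fuzzy output set -- candidate crisp values and
    their membership degrees -- into a single crisp value by taking the
    membership-weighted average. This is one common technique among
    several (mean-of-maxima, bisector, ...).
    """
    total = sum(memberships)
    if total == 0:
        raise ValueError("empty fuzzy set: no rule fired")
    return sum(v * m for v, m in zip(values, memberships)) / total

# Output universe 6..9 with memberships concentrated around 7-8.
print(round(centroid_defuzzify([6, 7, 8, 9], [0.2, 0.8, 0.8, 0.2]), 3))
# -> 7.5
```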

1.10. MOTIVATION OF THE RESEARCH

Medical image processing has attracted attention recently, as it makes a significant contribution to healthcare. Disease detection has become an important topic in medical image processing and medical imaging research. Prediction or detection of diseases can be achieved by analyzing medical images such as X-ray, CT and MRI scans using computer algorithms. In recent years, infectious respiratory illnesses have become one of the leading causes of death in the world. Pneumonia, Tuberculosis (TB) and COVID-19 are the most severe and prevalent infectious respiratory disorders; caused by bacteria and viruses, they typically affect the lungs and can even lead to death [77]. COVID-19 in particular has ranked among the highest causes of death in recent years, with a death toll crossing 6 million. The most interesting and complicating fact about these respiratory diseases is the similarity of their symptoms. It is therefore necessary to classify all three diseases, which can be accomplished by applying deep learning techniques to Chest X Ray (CXR) images

of patients. Most existing research works utilize machine learning algorithms for disease prediction, and recent research has been carried out using deep learning approaches. It is observed from the literature survey that deep learning algorithms perform better than machine learning algorithms. It is necessary to develop a deep learning model to detect and classify lung diseases with higher accuracy than existing research works. Motivated by this fact, a novel deep learning based multi-level classification is carried out in this research to detect whether the patient is affected by Pneumonia, Tuberculosis or COVID-19.

1.11. OUTLINE OF THE PROPOSED RESEARCH WORK

The proposed research work is developed for multi-level classification of

lung diseases such as pneumonia, tuberculosis and COVID-19 using Chest X Ray

(CXR) images. The outline of the proposed research work is shown in Figure 1.5.

[Figure: Input CXR image → Pre-processing and Data Augmentation → Feature extraction using "Fusion of Handcrafted and Deep features" (FHD) → Feature selection using Modified Moth Flame Optimization (MMFO) → Reduced feature vector → Fuzzy rank-based ensemble deep learning model using Gompertz function → Multi-level classification: Normal, Tuberculosis, Bacterial Pneumonia, COVID-19]

Figure 1.5. Outline of the proposed research work

The input Chest X Ray (CXR) images undergo pre-processing and data augmentation. The pre-processed CXR images are then given as input to the proposed feature extraction technique, namely "Fusion of Handcrafted and Deep features" (FHD), to extract features from the CXR images. After that, the extracted features are fed into the proposed Modified Moth Flame Optimization (MMFO) algorithm to reduce the number of features for classification. The reduced feature vector is given as input to the fuzzy rank-based ensemble deep learning model, in which three deep learning models, namely VGG-16, ResNet50 and Modified XceptionNet, are ensembled using the Gompertz function. At last, multi-level classification of lung diseases such as Pneumonia, Tuberculosis and COVID-19 is performed using the ensemble deep learning model. The performance of classification is evaluated using metrics such as Accuracy, Precision, Recall, Specificity, F1 score and Error rate.
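These evaluation metrics all derive from the confusion-matrix counts, and can be computed as follows (the example counts are purely illustrative, not results from this research):

```python
def metrics(tp, fp, fn, tn):
    """Standard classification metrics from confusion-matrix counts:
    accuracy, precision, recall, specificity, F1 score and error rate."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    error_rate = 1 - accuracy
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "error_rate": error_rate}

# Hypothetical counts for one class in a one-vs-rest evaluation.
m = metrics(tp=40, fp=10, fn=5, tn=45)
print(round(m["accuracy"], 3))  # -> 0.85
print(round(m["f1"], 3))        # -> 0.842
```

In the multi-class setting, these counts are taken per class in a one-vs-rest fashion and the per-class metrics are then averaged.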

1.12. RESEARCH CONTRIBUTIONS

The main objective of this research is to develop a novel fuzzy rank-based ensemble deep learning model for multi-level classification of lung diseases using

Chest X Ray (CXR) images. The significant contributions in this research are as

follows.

➢ An in-depth survey of machine learning and deep learning methods for diagnosing COVID-19 variants was carried out.

➢ Deep learning techniques for classification of COVID-19 were studied and their performance was compared to identify the best performing deep learning model.

➢ A novel feature extraction technique namely “Fusion of Handcrafted and

Deep features” (FHD) was developed to extract features from the Chest

X Ray images.

➢ An optimization technique, namely the "Modified Moth Flame Optimization" (MMFO) algorithm, was developed for the purpose of feature selection.

➢ A novel framework that uses a fuzzy rank-based ensemble of three pre-

trained deep learning models, namely, VGG-16, ResNet50 and Modified

XceptionNet using Gompertz function was developed for multi-level

classification of lung diseases using CXR images.
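The general shape of a fuzzy rank-based ensemble can be sketched as follows. This is only an illustrative outline under stated assumptions: the constants inside `gompertz_rank`, the toy softmax scores and both function names are hypothetical, and the exact re-parameterization of the Gompertz function used in the proposed framework is developed in Chapter 6.

```python
import math

def gompertz_rank(confidence):
    """Fuzzy rank of a class from one model's softmax confidence via a
    complemented Gompertz-type function. The constants 1.0 and 2.0 are
    illustrative only. Higher confidence -> rank closer to 0 (better)."""
    return 1.0 - math.exp(-math.exp(-2.0 * confidence))

def fuzzy_rank_ensemble(model_scores):
    """Combine per-model softmax vectors: sum each class's fuzzy ranks
    across models and predict the class with the lowest fused rank."""
    n_classes = len(model_scores[0])
    fused = [sum(gompertz_rank(scores[c]) for scores in model_scores)
             for c in range(n_classes)]
    return min(range(n_classes), key=lambda c: fused[c])

# Hypothetical softmax outputs of three models over four classes
# (Normal, Tuberculosis, Bacterial Pneumonia, COVID-19).
scores = [[0.10, 0.70, 0.10, 0.10],
          [0.20, 0.60, 0.10, 0.10],
          [0.25, 0.40, 0.20, 0.15]]
print(fuzzy_rank_ensemble(scores))  # -> 1  (Tuberculosis)
```

Because ranks rather than raw confidences are fused, a single over-confident model cannot dominate the ensemble decision.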

In addition to this, all the proposed techniques are compared with several existing techniques, and the improvements in the overall results are also discussed.

1.13. ORGANIZATION OF THE THESIS

The organization of this thesis is as follows:

➢ Chapter 2 presents a detailed survey of existing research works related

to disease prediction using image processing techniques. An in-depth analysis and

review of various techniques used for lung disease detection namely feature

extraction, feature selection and classification are also presented.

➢ Chapter 3 discusses the existing machine learning and deep learning techniques utilized for diagnosing lung diseases. Several techniques are implemented and their performance is compared to identify the best

technique. This chapter also discusses the limitations and advantages of these

implemented methods. This is necessary for the development of an efficient lung

disease classification model. The detailed explanation of each technique is also

described in this chapter.

➢ A novel feature extraction technique, namely "Fusion of Handcrafted and Deep features" (FHD), is proposed and explained in Chapter 4. In the proposed FHD, both handcrafted and deep features are extracted from the

Chest X Ray (CXR) images and both are concatenated for the process of

classification. Experiments for multi-level classification of lung diseases are carried

out using CXR images with four deep learning models namely VGG-16,

MobileNetV2, ResNet50 and Modified XceptionNet. The experimental results

obtained using these methods are presented and analyzed.

➢ Chapter 5 describes a novel optimization algorithm, namely the

“Modified Moth Flame Optimization” (MMFO) algorithm. MMFO is developed

based on Moth Flame Optimization Algorithm (MFO). MMFO selects the optimal

features for multi-level classification of lung diseases. The proposed MMFO is

evaluated using the deep learning classification models namely VGG-16,

ResNet50, MobileNetV2 and Modified XceptionNet.

➢ A novel fuzzy rank-based ensemble of three pre-trained deep

learning models, namely, VGG-16, ResNet50 and Modified XceptionNet using

Gompertz function is developed for multi-level classification of lung diseases and

is presented in Chapter 6.

➢ Finally, Chapter 7 briefly summarizes and concludes the research work. A few suggestions and recommendations to improve the efficiency and reliability of the developed method are presented.

1.14. SUMMARY

A novel multi-level classification of lung diseases using modified moth

flame optimization and fuzzy Gompertz based ensemble deep learning model was

developed in this research work. In order to construct a new model for lung diseases

detection, it is necessary to examine and evaluate the current research in this field.

Basic introduction about image processing along with the definition of image and

image processing techniques are presented in this Chapter. Notable techniques for

disease detection using image processing such as Feature Extraction, Feature

Selection and Classification are described in detail in this chapter. Introduction and

working procedure of Machine Learning and Deep Learning are also discussed.

Basic concepts of fuzzy logic along with the architecture of the Fuzzy Inference System (FIS) are explained in detail. The research gap is identified and the motivation of the
research is described. The outline of the proposed research work with a neat

diagram is presented. Moreover, the research contributions along with the

organization of this thesis are also presented.
