Project Work

CHAPTER ONE
INTRODUCTION
1.1 Background to the study
Studies have shown that Coronavirus Disease (COVID-19) is an infectious disease caused by a
novel coronavirus that originated in Wuhan, China, in December 2019. The disease is known to
affect the respiratory system, and some people eventually recover without special treatment,
especially those who have a strong immune system (World Health Organization, 2021).
Although the course of the disease may differ from person to person, older persons are more
vulnerable, as are those with existing comorbidities such as cardiovascular disease, diabetes,
respiratory disease, and cancer. COVID-19 is not just a respiratory disease; it is multisystemic.
Recent studies determined that the virus affects almost all the organs of the body, driven by
its widespread inflammatory response (Temgoua, Endomba, Nkeck, Kenfack, Tochie, and
Essouma, 2019).
About 10–18% of COVID-19 patients develop severe symptoms; these individuals may experience
long COVID-19, which may cause complications in the heart, lungs, and nervous system
(Ames, 2020).
COVID-19 spreads easily because the virus is transmitted through droplets released into the
air when an infected person speaks, coughs, or sneezes, or through touching contaminated
objects or surfaces. The World Health Organization (WHO) stated that frequent hand-washing,
disinfecting, social distancing, wearing masks, and not touching one's face can protect a
person from being infected.
The WHO listed several symptoms and emphasized that fever, a dry cough, and tiredness were
the most common, while less common symptoms were headaches, sore throat, diarrhea,
conjunctivitis, loss of smell, and rashes, and serious symptoms were breathing problems, chest
pain, and loss of speech and movement. As of 29 June 2021, there were 182,333,954 COVID-19
cases and 3,948,238 deaths worldwide (Worldometer, 2021). The virus had also mutated into
several variants documented in countries such as the United Kingdom, South Africa, the
United States, India, and Brazil, which brought increased severity, quicker transmission, a
higher death rate, and reduced vaccine effectiveness (Centers for Disease Control and
Prevention (CDC), 2021).
As the virus keeps spreading despite the efforts of the community to contain it, an outbreak
can lead to increased demand for hospital resources and to shortages of medical equipment,
healthcare staff, and COVID-19 testing kits (Wynants, Calster, Collins, Riley, Heinze, Schuit,
Bonten, Dahly, Damen, and Debray, 2020).
Machine learning can be categorized as supervised, unsupervised, and reinforcement learning.

Supervised machine learning is an approach that trains the machine using labeled datasets,
wherein the examples are correctly labeled according to the class to which they belong
(Supervised vs. Unsupervised Learning, 2021).
The machine analyzes the given data and eventually predicts new instances based on
information learned from past data. Unlike supervised machine learning, unsupervised
machine learning learns by itself without correctly labeled data.
In unsupervised machine learning, the machine is fed the training samples, and it is the
job of the machine to determine the hidden patterns in the dataset. In reinforcement
learning, the machine acts as an agent that aims to discover the most appropriate actions through
a trial-and-error approach and observation of the environment (Kaelbling, Littman, and
Moore, 1996).
Every time the machine successfully performs a task, it is rewarded by increasing its state;
otherwise, it is punished by decreasing its state. This process is repeated several times
until the machine learns how to perform a specific task correctly. Reinforcement learning
is used in training robots to perform human-like tasks and in personal assistance.
1.2 Statement of Research Problem

Limited access to COVID-19 testing kits can hinder early diagnosis of the disease, and giving
the best possible care to suspected COVID-19 patients can be burdensome. Consequently,
an automatic prediction system that determines the presence of COVID-19 in a person is
essential. Machine learning classification algorithms, datasets, and machine learning software
are the necessary tools for designing a COVID-19 prediction model.
1.3 Research Objectives

1) To examine the extent to which COVID-19 symptoms differ by gender, with a view to
COVID-19 outbreak prediction.
2) To identify the relationship between age and affected-patient status with respect to taking
COVID-19 vaccines.
3) To find out the relationship between test date and affected or non-affected status with
respect to COVID-19 symptoms.
4) To identify the relationship between frequent cough and severe fever with respect to taking
treatment.
5) To identify the relationship between shortness of breath and headache with respect to
COVID-19 confirmation.
1.4 Research questions
The research questions below concern COVID-19 symptom prediction. Most importantly, we
asked respondents a general and subjective question about whether or not they feel
comfortable with the COVID-19 outbreak.
1) To what extent do you think it is ok for gender to be considered in COVID-19 test and
outbreak prediction?
2) To what extent do you feel uncomfortable that aged people are more affected with respect
to taking COVID-19 vaccines?
3) To what extent do you think it is ok to relate test date and affected or non-affected
status to COVID-19 symptoms?
4) To what extent do you think it is ok for a patient with cough and severe fever to take
treatment?
5) To what extent do you think it is ok for a patient with shortness of breath and headache
to undergo COVID-19 confirmation?
Respondents were only asked about one parent; in other words, they were asked about either
their mother or father. The gender was randomized. Asking about only one gender reduces
sample size in each cell but may be important to reduce bias if the answer about one parent is
anchored by the answer about the other parent.
1.5 Scope of the Study
Although the concept of COVID-19 symptom prediction is broad and significant to the
continued survival of social and economic growth in this period of global democratic
phenomenon, this study covers the relationship between demographic variables and behavioral
factors toward COVID-19 in Nigeria. The study focused on Lokoja General Hospital.
TIME: the time given to the researcher was short and could not cover all the possible
areas.
ATTITUDE: the attitude of the staff of Lokoja General Hospital was also a problem because they
were not willing to give out any information for fear of being apprehended.
1.6 Limitations of the Study

This study is limited by the following. Firstly, a dearth of data acted as a constraint on the
researcher. Secondly, finance and domestic demands collectively impinged on the study process.
Thirdly, some staff of Lokoja General Hospital were uncooperative. The researcher visited
Lokoja General Hospital three times but met staff who were not ready to offer help. Most of
the relevant data for the study were classified as “untouchable” by the staff met on duty.
However, in spite of these difficulties, the researcher was able to gather all available
data and overcame the odds to produce this work.
1.7 Definition of Key terms

Attributes: A column of data. Attributes can have different data types, such as real, integer,
nominal, or string. With supervised learning, there are two types of attributes: features and
labels.
Instances: A row of data. Instances are the inputs to a machine learning scheme. CSV files can
express instances as independent lists, while JSON can represent relationships within the data.
Data mining: It is the process of discovering patterns in large datasets involving methods at the
intersection of machine learning, statistics, and database systems.
Data analysis: This is the act of inspecting, cleansing, transforming, and modelling
data with the goal of discovering useful information, informing conclusions, and supporting
decision-making.
Exploratory data analysis: It is an approach to analyzing data sets to summarize their main
features often with visual methods.
Classification: It is the problem of identifying to which of a set of categories a new observation
belongs, on the basis of a training set of data containing observations whose category is known.
CHAPTER TWO
LITERATURE REVIEW
2.1 INTRODUCTION
This section reviews relevant work on COVID-19 prediction using machine learning models.
Here we discuss the various stages of COVID-19 symptoms and approaches to modelling and
analyzing this effect in relation to the survival model of Tinto (2016).
2.1.1 Overview of the machine learning model
Machine learning is a sub-area of artificial intelligence that aims at enabling machines to
perform their jobs skillfully by using intelligent software (Mohssen, 2016). In other words,
machine learning enables information technology systems to recognize patterns on the basis of
existing algorithms and data sets and to develop adequate solution concepts. In machine
learning, therefore, artificial knowledge is generated on the basis of experience.
Machine learning works in a way similar to human learning. For example, if a child is
shown images with specific objects on them, the child can learn to identify and differentiate
between them. Machine learning works in the same way: through data input and certain
commands, the computer is enabled to "learn" to identify certain objects (persons, things, etc.)
and to distinguish between them. For this purpose, the software is supplied with data and trained.
For instance, the programmer can tell the system that a particular object is a human being
(="human") and another object is not a human being (="no human"). The software receives
continuous feedback from the programmer. These feedback signals are used by the algorithm to
adapt and optimize the model. With each new data set fed into the system, the model is further
optimized so that it can clearly distinguish between "humans" and "non-humans" in the end.
2.2 Types of Machine Learning Algorithms
There are three types of machine learning algorithms (Fumo, 2017).
2.2.1 Supervised Learning: This type of algorithm has a target / outcome variable (or
dependent variable) which is to be predicted from a given set of predictors (independent
variables). Using this set of variables, it generates a function that maps inputs to desired
outputs. The training process continues until the model achieves a desired level of accuracy on
the training data. Examples of Supervised Learning: Regression, Decision Tree, Random Forest,
KNN, Logistic Regression, etc.
2.2.2 Unsupervised Learning: This type of algorithm does not have any target or outcome
variable to predict / estimate. It is used for clustering a population into different groups,
which is widely used for segmenting customers into different groups for specific intervention.
Examples of Unsupervised Learning: Apriori algorithm, K-means.
2.2.3 Reinforcement Learning: Here the machine is trained to make specific decisions. It
works this way: the machine is exposed to an environment where it trains itself continually using
trial and error. The machine learns from past experience and tries to capture the best possible
knowledge to make accurate business decisions. Example of Reinforcement Learning: Markov
Decision Process.
2.3 Common Machine Learning Algorithms
According to Sunil (2017), here are commonly used machine learning algorithms. These
algorithms can be applied to almost any data problem:
2.3.1 Linear Regression
It is used to estimate real values (cost of houses, number of calls, total sales, etc.) based on
continuous variable(s). Here, we establish a relationship between independent and dependent
variables by fitting a best line. This best-fit line is known as the regression line. The best way
to understand linear regression is to relive a childhood experience. Suppose a child in fifth
grade is asked to arrange the people in his class in increasing order of weight, without asking
them their weights. What will the child do? He would likely look at (visually analyze) the height
and build of people and arrange them using a combination of these visible parameters. This is
linear regression in real life. The child has actually figured out that height and build are
correlated to weight by a relationship. Linear Regression is mainly of two types:
Simple Linear Regression and Multiple Linear Regression. Simple Linear Regression is
characterized by one independent variable, and Multiple Linear Regression (as the name
suggests) is characterized by multiple (more than 1) independent variables. While finding the
best-fit line, you can also fit a polynomial or a curve; these are known as polynomial or
curvilinear regression.
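The fitting of a best line can be illustrated with a short sketch. The code below is only an
illustration under stated assumptions: the function name fit_simple_linear_regression and the
height and weight values are hypothetical and are not taken from the study data. It computes the
least-squares slope and intercept for Simple Linear Regression with one independent variable.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical heights (cm) and weights (kg), constructed to lie on one line.
heights = [120.0, 130.0, 140.0, 150.0]
weights = [25.0, 30.0, 35.0, 40.0]

slope, intercept = fit_simple_linear_regression(heights, weights)
predicted_weight = slope * 135.0 + intercept  # estimate for a child 135 cm tall
print(slope, intercept, predicted_weight)
```

Multiple Linear Regression follows the same idea with more than one independent variable, but it
needs matrix algebra and is normally left to a statistics library.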
2.3.2 Logistic Regression
It is a classification algorithm, not a regression algorithm. It is used to estimate discrete
values (binary values like 0/1, yes/no, true/false) based on a given set of independent
variable(s). In simple words, it predicts the probability of occurrence of an event by fitting
data to a logit function. Hence, it is also known as logit regression. Since it predicts a
probability, its output values lie between 0 and 1 (as expected).
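A minimal sketch of how the logistic function turns a weighted sum of predictors into a
probability between 0 and 1 is given below. The coefficients, the two binary symptoms, and the
0.5 decision threshold are assumptions chosen for illustration only; in practice the coefficients
are estimated from training data.

```python
import math


def logistic(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


# Hypothetical coefficients for two binary symptoms (fever, cough).
weights = [1.2, 0.8]
bias = -1.0

probability = predict_probability([1, 1], weights, bias)
label = 1 if probability >= 0.5 else 0  # assumed threshold of 0.5
print(probability, label)
```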
2.3.3 Decision Tree
It is a type of supervised learning algorithm that is mostly used for classification problems.
It works for both categorical and continuous dependent variables. In this algorithm, the
population is split into two or more homogeneous sets. This is done based on the most
significant attributes / independent variables, so as to make the groups as distinct as possible.
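How a tree chooses the most significant attribute can be sketched as follows. This sketch assumes
binary (0/1) attributes and uses Gini impurity as the measure of homogeneity, which is one common
choice and is not prescribed by the text above. The records and the names gini, split_impurity,
and best_feature are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure group, larger for a more mixed group."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, labels, feature):
    """Weighted Gini impurity of the two groups made by a binary feature."""
    left = [y for row, y in zip(rows, labels) if row[feature] == 0]
    right = [y for row, y in zip(rows, labels) if row[feature] == 1]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_feature(rows, labels):
    """Pick the feature whose split gives the most homogeneous groups."""
    return min(rows[0].keys(), key=lambda f: split_impurity(rows, labels, f))


# Hypothetical toy records: 1 = symptom present, 0 = symptom absent.
rows = [
    {"fever": 1, "headache": 0},
    {"fever": 1, "headache": 1},
    {"fever": 0, "headache": 1},
    {"fever": 0, "headache": 0},
]
labels = ["positive", "positive", "negative", "negative"]

chosen = best_feature(rows, labels)
print(chosen)
```

A full decision tree repeats this choice inside each resulting group until the groups are pure or
a stopping rule is met.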
2.3.4 SVM (Support Vector Machine)
The Support Vector Machine (SVM) is a supervised learning method used for regression and
classification (Vapnik, 2000). The algorithm tries to find an optimal hyperplane which separates
the d-dimensional training data perfectly into its classes. An optimal hyperplane is the one that
maximizes the distance between the examples on the margin (border) which separates different
classes. These examples on the margin are the so-called support vectors. Since training data is
often not linearly separable, SVM maps data into a high-dimensional feature space through some
nonlinear mapping. In this space, an optimal separating hyperplane is constructed. In order to
reduce computational cost, the mapping is performed by kernel functions, which depend
only on input space variables. The most used kernel functions are: linear, polynomial, radial
basis function (RBF), and sigmoid.
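The four kernel functions named above can be written down directly, as in the sketch below. The
parameter names (degree, gamma, coef0) and their default values are assumptions for illustration;
different SVM packages use different defaults. Each function takes two input vectors and returns
one number without building the high-dimensional feature space explicitly.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    # Depends only on the squared distance between the two points.
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 0.0]  # two hypothetical input vectors
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(u, v))
```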
2.3.5 Naive Bayes
It is a classification technique based on Bayes’ theorem with an assumption of independence
between predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a
particular feature in a class is unrelated to the presence of any other feature. For example, a
fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. Even
if these features depend on each other or upon the existence of the other features, a Naive Bayes
classifier would consider all of these properties to contribute independently to the probability
that this fruit is an apple. A Naive Bayes model is easy to build and particularly useful for very
large data sets. Along with its simplicity, Naive Bayes is known to sometimes outperform even
highly sophisticated classification methods.
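The independence assumption can be made concrete with the fruit example. In the sketch below all
probabilities are hypothetical numbers chosen for illustration, and the function name
naive_bayes_posterior is not from any library. The score of each class is its prior multiplied by
the likelihood of each observed feature, and the scores are then normalized to sum to 1.

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Posterior probability of each class given the observed features."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature in observed:
            # Independence assumption: multiply the per-feature likelihoods.
            score *= likelihoods[cls][feature]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Hypothetical probabilities, for illustration only.
priors = {"apple": 0.5, "orange": 0.5}
likelihoods = {
    "apple": {"red": 0.8, "round": 0.9},
    "orange": {"red": 0.1, "round": 0.9},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["red", "round"])
print(posterior)
```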
2.3.6 KNN (K- Nearest Neighbors)
It can be used for both classification and regression problems. However, it is more widely used
for classification problems in industry. K nearest neighbors is a simple algorithm that stores
all available cases and classifies new cases by a majority vote of its k neighbors. The case is
assigned to the class most common amongst its K nearest neighbors, as measured by a distance
function. These distance functions can be Euclidean, Manhattan, Minkowski, and Hamming
distance. The first three functions are used for continuous variables and the fourth one
(Hamming) for categorical variables. If K = 1, then the case is simply assigned to the class of
its nearest neighbor. At times, choosing K turns out to be a challenge when performing KNN
modeling. KNN can easily be mapped to our real lives. If you want to learn about a person about
whom you have no information, you might find out about his or her close friends and the circles
he or she moves in, and so gain access to information about the person.
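The majority vote among the K nearest neighbors can be sketched in a few lines. The training
points, the class names, and the function name knn_classify are hypothetical; Euclidean distance
is used by default and Manhattan distance is shown as an alternative.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(train_points, train_labels, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training data.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (9.0, 8.5)]
labels = ["mild", "mild", "mild", "severe", "severe"]

result_k3 = knn_classify(points, labels, (1.2, 1.4), k=3)
result_k1 = knn_classify(points, labels, (8.5, 8.2), k=1, distance=manhattan)
print(result_k3, result_k1)
```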
2.3.7 K-Means
It is a type of unsupervised algorithm that solves the clustering problem. Its procedure follows
a simple and easy way to classify a given data set through a certain number of clusters (assume k
clusters). Data points inside a cluster are homogeneous, and heterogeneous to those in peer
groups. Remember figuring out shapes from ink blots? K-means is somewhat similar to this
activity. You look at the shape and spread to decipher how many different clusters/populations
are present.
Fig. 2.1 Illustration of K-means using ink blots (Sunil, 2017).
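The K-means procedure can be sketched as two alternating steps: assign each point to its nearest
cluster center, then move each center to the mean of its points. The sketch below assumes
starting centers supplied by the caller and hypothetical data points; practical implementations
usually choose the starting centers at random and repeat the run several times.

```python
def assign(points, centers):
    """Index of the nearest center for each point (squared Euclidean distance)."""
    def squared_distance(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
            for p in points]


def update(points, assignments, centers):
    """Move each center to the mean of the points assigned to it."""
    new_centers = []
    for i, old_center in enumerate(centers):
        members = [p for p, a in zip(points, assignments) if a == i]
        if not members:
            new_centers.append(old_center)  # leave an empty cluster where it is
            continue
        new_centers.append(tuple(sum(p[d] for p in members) / len(members)
                                 for d in range(len(old_center))))
    return new_centers


def k_means(points, centers, max_iter=100):
    """Alternate the two steps until the centers stop moving."""
    for _ in range(max_iter):
        assignments = assign(points, centers)
        new_centers = update(points, assignments, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assignments


# Hypothetical points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, assignments = k_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, assignments)
```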
2.3.8 Random Forest
Random Forests (RF) are sets of decision trees that vote together in a classification. Each tree
is built on a randomly drawn subset of the data points and considers a randomly selected subset
of features. The tree is trained on these data points (only on the selected features), and the
remaining out-of-bag data points are used to evaluate the tree. Random Forests are known to be
effective in preventing overfitting. As proposed by Breiman (2001), its features are: it is easy
to implement, it has good generalization properties, its algorithm outputs more information than
just the class label, it runs efficiently on large databases, it can handle thousands of input
variables without variable deletion, and it provides estimates of which variables are important
in the classification.
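The three random ingredients of a Random Forest can be sketched without building the trees
themselves. The sketch below assumes sampling with replacement (bootstrap sampling) for the data
points; the feature names, the seed, and the function names are hypothetical. It shows which rows
one tree would train on, which rows are left out of bag, which features the tree may use, and how
the votes of the trees are combined.

```python
import random
from collections import Counter


def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices with replacement; unused rows are out of bag."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


def random_feature_subset(feature_names, n_features, rng):
    """Pick the features that one tree is allowed to use."""
    return rng.sample(feature_names, n_features)


def majority_vote(tree_predictions):
    """Combine the class predicted by each tree into one forest prediction."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(42)  # fixed seed so that the sketch is repeatable
features = ["fever", "cough", "headache", "age"]  # hypothetical feature names

in_bag, out_of_bag = bootstrap_sample(10, rng)
tree_features = random_feature_subset(features, 2, rng)
forest_prediction = majority_vote(["positive", "negative", "positive"])
print(in_bag, out_of_bag, tree_features, forest_prediction)
```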
2.3.9 Dimensionality Reduction Algorithms

In the last 4-5 years, there has been an exponential increase in data capture at every possible
stage. Corporations, government agencies, and research organizations are not only adding new
data sources but are also capturing data in great detail. For example, e-commerce companies are
capturing more details about customers, such as their demographics, web crawling history, what
they like or dislike, purchase history, feedback, and many others, to give them more
personalized attention than the nearest grocery shopkeeper could. For a data scientist, the data
offered also consist of many features. This sounds good for building a robust model, but there
is a challenge: how would you identify the highly significant variable(s) out of 1000 or 2000?
In such cases, dimensionality reduction algorithms help, along with various other approaches
such as Decision Tree, Random Forest, PCA, Factor Analysis, and identification based on the
correlation matrix, missing value ratio, and others.
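Two of the simpler approaches named above, the missing value ratio and the correlation matrix,
can be sketched as filters on the columns of a table. The thresholds (0.5 and 0.95), the column
names, and the values are assumptions for illustration. Missing entries are represented by None,
and every kept column is assumed to vary, since a constant column has no defined correlation.

```python
import math


def missing_value_ratio(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(1 for v in column if v is None) / len(column)


def pearson(xs, ys):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    return cov / (spread_x * spread_y)


def select_columns(table, max_missing=0.5, max_corr=0.95):
    """Drop sparse columns, then drop columns nearly duplicated by a kept one."""
    dense = {name: col for name, col in table.items()
             if missing_value_ratio(col) <= max_missing}
    kept = []
    for name, col in dense.items():
        if all(abs(pearson(col, dense[k])) <= max_corr for k in kept):
            kept.append(name)
    return kept


# Hypothetical table: temp_f repeats temp_c, travel_history is mostly missing.
table = {
    "temp_c": [36.5, 38.0, 39.0, 37.0],
    "temp_f": [97.7, 100.4, 102.2, 98.6],
    "age": [25, 60, 41, 33],
    "travel_history": [None, None, None, 1],
}
selected = select_columns(table)
print(selected)
```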
2.3.10 Gradient Boosting Algorithms
i. GBM
GBM is a boosting algorithm that is used to deal with plenty of data and to make predictions with
high predictive power. Boosting is an ensemble of learning algorithms that combines the
predictions of several base estimators in order to improve robustness over a single estimator.
It combines multiple weak or average predictors to build a strong predictor. These boosting
algorithms often work well in data science competitions like Kaggle, AV Hackathon, and Crowd
Analytics.
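How several weak predictors are combined into a strong one can be sketched for a regression task.
The sketch below is a simplified illustration and not the implementation of any particular
library: it assumes squared-error loss, one input variable, one-split trees (stumps) as the weak
predictors, and hypothetical data. Each round fits a stump to the errors left by the rounds
before it and adds a fraction of that stump (the learning rate) to the model.

```python
def fit_stump(xs, residuals):
    """Best one-split predictor of the residuals (lowest squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest x would leave one side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.5):
    """Return a predictor built from n_rounds stumps fitted to successive residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the targets
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for p, x in zip(predictions, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)
    return predict


def mean_squared_error(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data with at least two distinct x values.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 1.0, 3.0, 5.0]

model = gradient_boost(xs, ys)
baseline_mse = mean_squared_error(lambda x: sum(ys) / len(ys), xs, ys)
boosted_mse = mean_squared_error(model, xs, ys)
print(baseline_mse, boosted_mse)
```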
ii. XGBoost
XGBoost is another classic gradient boosting algorithm that is known to be the decisive choice
between winning and losing in some Kaggle competitions. XGBoost has high predictive power,
which makes it a strong choice for accuracy in such events, as it possesses both a linear
model and a tree learning algorithm, making the algorithm almost 10x faster than existing
gradient boosting techniques.
Its support includes various objective functions, including regression, classification, and
ranking. One of the most interesting things about XGBoost is that it is also called a
regularized boosting technique. This helps to reduce overfitting. XGBoost has extensive
support for a range of languages such as Scala, Java, R, Python, Julia, and C++. It supports
distributed training on many machines, including GCE, AWS, Azure, and Yarn clusters. XGBoost
can also be integrated with Spark, Flink, and other cloud dataflow systems, with built-in
cross-validation at each iteration of the boosting process.
iii. LightGBM
LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is
designed to be distributed and efficient, with the following advantages: faster training speed
and higher efficiency, lower memory usage, better accuracy, support for parallel and GPU
learning, and the capability to handle large-scale data. It is a fast, high-performance gradient
boosting framework based on decision tree algorithms, used for ranking, classification, and many
other machine learning tasks. It was developed under the Distributed Machine Learning Toolkit
Project of Microsoft.
Since LightGBM is based on decision tree algorithms, it splits the tree leaf-wise with the best
fit, whereas other boosting algorithms split the tree depth-wise or level-wise.
So, when growing on the same leaf in LightGBM, the leaf-wise algorithm can reduce more loss
than the level-wise algorithm and hence can result in better accuracy, which can rarely be
achieved by the existing boosting algorithms. It is also very fast, hence the
word ‘Light’.
iv. CatBoost
CatBoost is a recently open-sourced machine learning algorithm from Yandex. It can easily
integrate with deep learning frameworks like Google’s TensorFlow and Apple’s Core ML. The
best part about CatBoost is that it does not require extensive data training like other ML
models and can work on a variety of data formats, without compromising how robust it can be.
It handles missing data before it proceeds with the implementation. CatBoost can also
automatically deal with categorical variables without raising type conversion errors, which
helps the user focus on tuning the model rather than sorting out trivial errors.
2.3.11 Bayesian Networks (BNs)
According to Ben-Gal (2008), Bayesian Networks (BNs) belong to the family of probabilistic
graphical models. These graph structures are used to represent knowledge about an uncertain
domain. In particular, each node in the graph represents a random variable, while the edges
between the nodes represent probabilistic dependencies among the corresponding random
variables. Such conditional dependencies in the graph are often estimated using known
statistical and computational methods. Thus, Bayesian networks combine principles of graph
theory, probability theory, and statistics.
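A very small Bayesian network can illustrate how nodes and edges encode probabilities. The sketch
below assumes a hypothetical two-node network, Infection -> Fever, with made-up probability
values. The edge is represented by a table of P(Fever | Infection), and the probability of
infection given an observed fever is obtained by summing the joint probabilities that agree with
the evidence.

```python
# Hypothetical network with one edge: Infection -> Fever.
P_INFECTION = 0.1                        # P(Infection = True)
P_FEVER_GIVEN = {True: 0.8, False: 0.1}  # P(Fever = True | Infection)


def joint(infection, fever):
    """P(Infection = infection, Fever = fever), factorized along the edge."""
    p_infection = P_INFECTION if infection else 1.0 - P_INFECTION
    p_fever = P_FEVER_GIVEN[infection] if fever else 1.0 - P_FEVER_GIVEN[infection]
    return p_infection * p_fever


def posterior_infection_given_fever():
    """P(Infection = True | Fever = True), by enumerating the joint table."""
    evidence = joint(True, True) + joint(False, True)
    return joint(True, True) / evidence


posterior = posterior_infection_given_fever()
print(posterior)
```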
2.3.12 Artificial neural networks
Artificial Neural Networks (ANN) are composed of several computational elements that interact
through connections with different weights. Inspired by the human brain, neural networks
exhibit features such as the ability to learn complex patterns of data and to generalize learned
information (Zhao, 2012). The simplest form of an ANN is the MultiLayer Perceptron (MLP),
consisting of three layers: the input layer, the hidden layer, and the output layer. Haykin
(2009) states that the learning processes of an artificial neural network are determined by how
parameter changes occur. Different types of neural networks use different principles in
determining their own rules. There are many types of artificial neural networks, each with its
unique strengths. Some of the most important types of neural networks and their applications
are described below (Mehta, 2019).
i. Feedforward Neural Network – Artificial Neuron

This is one of the simplest types of artificial neural networks. In a feedforward neural network,
the data passes through the different input nodes until it reaches the output node. In other words,
data moves in only one direction from the first tier onwards until it reaches the output node. This
is also known as a front propagated wave which is usually achieved by using a classifying
activation function. Unlike in more complex types of neural networks, there is no
backpropagation and data move in one direction only. A feedforward neural network may have a
single layer or it may have hidden layers.
In a feedforward neural network, the sum of the products of the inputs and their weights is
calculated. This is then fed to the output. Here is an example of a single-layer feedforward
neural network.
Fig. 2.2 Feedforward Neural Network-Artificial Neuron (Mehta, 2019).
Feedforward neural networks are used in technologies like face recognition and computer vision.
This is because the target classes in these applications are hard to classify. A simple feedforward
neural network is equipped to deal with data which contains a lot of noise. Feedforward neural
networks are also relatively simple to maintain.
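The calculation performed by a single artificial neuron can be sketched as follows. The bias term,
the step activation, and the weights are assumptions for illustration; the weights are hand-picked
so that the neuron behaves like a logical AND of two binary inputs, whereas in practice they
would be learned from data.

```python
def step(z):
    """Classifying activation: output 1 when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def neuron(inputs, weights, bias, activation=step):
    """Sum of the products of inputs and weights, passed through an activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Hand-picked (hypothetical) parameters that reproduce a logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

outputs = {(a, b): neuron([a, b], AND_WEIGHTS, AND_BIAS)
           for a in (0, 1) for b in (0, 1)}
print(outputs)
```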
ii. Radial Basis Function Neural Network

A radial basis function considers the distance of any point relative to the center. Such neural
networks have two layers. In the inner layer, the features are combined with the radial basis
function. Then the output of these features is taken into account when calculating the same
output in the next time-step.
Fig. 2.3 Radial Basis Function Neural Network (Mehta, 2019).
The radial basis function neural network is applied extensively in power restoration systems. In
recent decades, power systems have become bigger and more complex. This increases the risk of
a blackout. This neural network is used in the power restoration systems in order to restore
power in the shortest possible time.
iii. Multilayer Perceptron

A multilayer perceptron has three or more layers. It is used to classify data that cannot be
separated linearly. It is a type of artificial neural network that is fully connected. This is because
every single node in a layer is connected to each node in the following layer.
A multilayer perceptron uses a nonlinear activation function (mainly hyperbolic tangent or
logistic function). Here’s what a multilayer perceptron looks like.
Fig. 2.4 Multilayer Perceptron (Mehta, 2019).
This type of neural network is applied extensively in speech recognition and machine translation
technologies.
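A multilayer perceptron with one hidden layer can be sketched on the XOR problem, a standard
example of data that cannot be separated linearly. The weights below are hand-picked for
illustration and are not learned; the logistic function is used as the nonlinear activation, and
every node of one layer is connected to every node of the next.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Fully connected layer: every input feeds every node in the layer."""
    return [activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]


# Hand-picked (hypothetical) weights: the hidden nodes act roughly as OR and
# NAND, and the output node acts roughly as AND of the two hidden nodes.
HIDDEN_WEIGHTS = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_BIASES = [-10.0, 30.0]
OUTPUT_WEIGHTS = [[20.0, 20.0]]
OUTPUT_BIASES = [-30.0]


def mlp_xor(x1, x2):
    """Forward pass: input layer -> hidden layer -> output layer."""
    hidden = dense_layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES, logistic)
    output = dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, logistic)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(mlp_xor(a, b)))
```

A single-layer network cannot reproduce this table, which is why the hidden layer is needed.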
iv. Convolutional Neural Network
A convolutional neural network (CNN) uses a variation of the multilayer perceptron. A CNN
contains one or more convolutional layers. These layers can either be completely
interconnected or pooled. Before passing the result to the next layer, the convolutional layer uses
a convolutional operation on the input. Due to this convolutional operation, the network can be
much deeper but with much fewer parameters. Due to this ability, convolutional neural networks
show very effective results in image and video recognition, natural language processing, and
recommender systems. Convolutional neural networks also show great results in semantic
parsing and paraphrase detection. They are also applied in signal processing and image
classification. CNNs are also being used in image analysis and recognition in agriculture where
weather features are extracted from satellites like LSAT to predict the growth and yield of a
piece of land. Here’s an image of what a Convolutional Neural Network looks like.
Fig. 2.5 Convolutional Neural Network (Mehta, 2019).
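The convolutional operation itself can be sketched on a tiny image. The sketch assumes a
single-channel image stored as a list of rows, no padding, and a stride of 1; the image and the
kernel values are hypothetical. As is usual in CNN software, the kernel is slid over the image
without being flipped. The same four kernel weights are reused at every position, which is why a
convolutional layer needs far fewer parameters than a fully connected one.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kernel_h) for dj in range(kernel_w))
             for j in range(out_w)]
            for i in range(out_h)]


# Hypothetical 3x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)
```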
v. Recurrent Neural Network (RNN) – Long Short-Term Memory
A Recurrent Neural Network is a type of artificial neural network in which the output of a
particular layer is saved and fed back to the input. This helps predict the outcome of the layer.
The first layer is formed in the same way as it is in the feedforward network, that is, with the
sum of the products of the weights and features. However, in subsequent layers, the recurrent
neural network process begins. From each time-step to the next, each node will remember some
information that it had in the previous time-step. In other words, each node acts as a memory cell
while computing and carrying out operations. The neural network begins with the front
propagation as usual but remembers the information it may need to use later.
If the prediction is wrong, the system self-learns and works towards making the right prediction
during the backpropagation. This type of neural network is very effective in text-to-speech
conversion technology. Here’s what a recurrent neural network looks like.
Fig. 2.6 Recurrent Neural Network (RNN) – Long Short-Term Memory (Mehta, 2019).
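The way a recurrent node remembers information from the previous time-step can be sketched with a
single node. The sketch shows a plain recurrent cell, not a full Long Short-Term Memory cell, and
the weights, the tanh activation, and the input sequence are assumptions for illustration. The new
state depends on the current input and on the state carried over from the step before.

```python
import math


def rnn_step(x, previous_state, w_input, w_state, bias):
    """New state from the current input and the remembered previous state."""
    return math.tanh(w_input * x + w_state * previous_state + bias)


def run_rnn(sequence, w_input=0.5, w_state=0.8, bias=0.0):
    """Feed a sequence through one recurrent node and record each state."""
    state = 0.0  # nothing is remembered before the first time-step
    states = []
    for x in sequence:
        state = rnn_step(x, state, w_input, w_state, bias)
        states.append(state)
    return states


# Hypothetical sequence: a single non-zero input followed by zeros.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The later states stay above zero although the later inputs are zero, which is the memory effect
described above.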
vi. Modular Neural Network
A modular neural network has a number of different networks that function independently and
perform sub-tasks. The different networks do not really interact with or signal each other during
the computation process. They work independently towards achieving the output. As a result, a
large and complex computational process can be done significantly faster by breaking it down
into independent components. The computation speed increases because the networks are not
interacting with or even connected to each other. Here’s a visual representation of a Modular
Neural Network.
Fig. 2.7 Modular Neural Network (Mehta, 2019).
vii. Sequence-To-Sequence Models
A sequence-to-sequence model consists of two recurrent neural networks. There’s an encoder
that processes the input and a decoder that processes the output. The encoder and decoder can
either use the same or different parameters. This model is particularly applicable in those cases
where the length of the input data is not the same as the length of the output data. Sequence-to-
sequence models are applied mainly in chatbots, machine translation, and question answering
systems.
2.4 Related work
Early detection and diagnosis using AI techniques help to prevent the spread and to combat the
COVID-19 pandemic using different data such as CT scans, X-ray, clinical data, and blood
sample data.
Yan, Zhang, and Xiao (2020) predicted the criticality and survival chances of patients with
severe COVID-19 infection based on different risk factors and demographic information. The
dataset used consists of 375 records from patients admitted to Tongji Hospital from January 10th
to February 18th, 2020, including 201 survivors and 174 deceased within the same period.
They used an XGBoost (XGB) model and identified only three main clinical features as
significant, i.e., lactic dehydrogenase (LDH), lymphocyte, and high-sensitivity C-reactive
protein (Hs-CRP), selected from more than 300 features.
The proposed model was validated using data from 29 patients. The key findings of the research
were the model’s ability to predict the risk of death with 0.95 precision and 0.90 prediction
accuracy. Such models will equip physicians with a tool for identifying critical conditions,
thereby helping to reduce the mortality rate.
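Precision and accuracy, the two measures used to report such results, can be computed from the
counts of correct and incorrect predictions. The sketch below uses hypothetical counts that are
not taken from any of the studies reviewed here. Precision is the share of predicted positive
cases that are truly positive, while accuracy is the share of all predictions that are correct.

```python
def precision(true_positives, false_positives):
    """Share of predicted positives that are truly positive."""
    return true_positives / (true_positives + false_positives)


def accuracy(true_positives, true_negatives, false_positives, false_negatives):
    """Share of all predictions that are correct."""
    total = true_positives + true_negatives + false_positives + false_negatives
    return (true_positives + true_negatives) / total


# Hypothetical confusion-matrix counts for 100 patients.
tp, fp, tn, fn = 40, 10, 45, 5

model_precision = precision(tp, fp)
model_accuracy = accuracy(tp, tn, fp, fn)
print(model_precision, model_accuracy)
```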
Even though these findings are of great importance, the research has some limitations, which
affect the accuracy of the reported results. These limitations were due to the small size of
the dataset, namely, the records of only 29 patients.
Similarly, Wong and So (2020) also used XGB with another dataset to predict severe cases and
deaths and to identify the risk factors associated with COVID-19.
The dataset was retrieved from the United Kingdom Biobank (UKBB) and includes 93 different
variables collected between 16 March 2020 and 19 July 2020.
Two studies were conducted on different sample groups. For the first study, the
data were clinical prediagnostic data of 1,747 COVID-19 infected patient records containing both
severe and death cases. For the severity class, the accuracy achieved was 0.668, and for the
fatality class, the accuracy was 0.712. For the second study, the data were taken from the
negative cases, the general population with no COVID-19 infection, consisting of 489,987
records. The same model was applied, and the accuracy achieved was similar to the first study,
with an accuracy of 0.669 for the severity class and 0.749 for the fatality class.
It is worth mentioning that the researchers identified the five most significant risk factors for
severe cases and death cases, with age being the top factor for both cases. Other factors include
obesity, impaired renal function, multiple comorbidities, and cardiometabolic abnormalities.
Sun, Song, and Shi (2020) developed a prediction model using the support vector machine
(SVM) to predict the severe cases of COVID-19 patients. In the study, they used the clinical and
laboratory features that are significantly associated with these cases. Using 336 cases of COVID-
19 patients, 26 severe/critical cases and 310 noncritical, they found that the main features to
discriminate the mild and severe cases are age, growth hormone secretagogues (GHSs), immune
feature cluster of differentiation 3 (CD3) percentage, and total protein.
They found that the proposed model was effective and robust in predicting patients in severe
conditions with up to 0.775 accuracy.
Another study, conducted by Yao, Zhang, and Zhang (2019), also applied the SVM model to
classify the COVID-19 patients according to the severity of the symptoms. They applied SVM
for the binary class label on a total of 137 records including urine and blood test results and
combining both severely ill patients and patients with mild symptoms. The results showed that
around 32 factors have high correlations with severe COVID-19, with an accuracy of 0.815. It is
worth mentioning that, amongst all factors, age and gender had mostly affected the classification
of cases between severe and mild. Patients aged around 65 had more severe cases than others.
Moreover, male patients were at a higher risk of developing severe COVID-19 symptoms. In
terms of the urine and blood test samples, blood test result features show more significant
differences between severe and mild cases than urine test result features.
Hu, Liu, and Jiang (2020) used the logistic regression (LR) model to identify the COVID-19
patients’ severity. They used a dataset containing demographic and clinical data for 115 COVID-
19 patients under the non-severe condition and 68 COVID-19 patients under the severe
condition. Four features have been selected as the most significant features to discriminate the
mild and severe cases: age, high-sensitivity C-reactive protein level, lymphocyte count, and d-
dimer level. This model was evaluated, and the results showed that the prediction was effective
with area under the receiver operating characteristic (AUROC) of 0.881, sensitivity of 0.839, and
specificity of 0.794, respectively.
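The accuracy, precision, sensitivity, and specificity figures quoted in the studies reviewed here are all derived from a confusion matrix of true and false predictions. As a minimal sketch, the Python snippet below computes these four measures from such counts. The function name `classification_metrics` and the example counts are hypothetical and are used for illustration only; they are not taken from any of the reviewed studies.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts.

    tp/fp = true/false positives, tn/fn = true/false negatives.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "precision": tp / (tp + fp),        # share of predicted positives that are real
        "sensitivity": tp / (tp + fn),      # share of real positives that are detected
        "specificity": tn / (tn + fp),      # share of real negatives that are detected
    }


# Hypothetical counts, for illustration only (not data from the reviewed papers).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity together, as several of the studies do, shows how a model behaves on severe and non-severe patients separately, which a single accuracy figure can hide when the classes are unbalanced.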
Bertsimas, Lukin, and Mingardi (2020) used a sample of 3927 COVID-19 patients to predict
mortality risk using XGB. The study used demographic and clinical features of patients
drawn from the data of 33 hospitals.
The model achieved an accuracy of 0.85 and an AUC of 0.90. Moreover, Sánchez-Montañés,
Rodríguez-Belenguer, Serrano-López, Soria-Olivas, and Alakhdar-Mohmara (2020) developed an
LR-based mortality prediction model using 1969 COVID-19-positive patients. The study found age and
O2 to be the significant features and achieved an AUC of 0.89, sensitivity of 0.82, and specificity of
0.81.
In Zagrouba, Adnan Khan, and ur-Rahman (2021), supervised machine learning techniques were
investigated to predict the COVID-19 outbreak. SVM was used for prediction over a dataset of
303 patients obtained from the WHO. The proposed scheme exhibited an accuracy of 0.967 during
the testing phase. Similarly, An, Lim, Kim, Chang, Choi, and Kim (2020) developed a model to
predict the mortality of COVID-19 patients using several machine learning algorithms such as
LASSO, SVM (linear and RBF), RF, and KNN.
The models were trained to predict three outcomes, i.e., overall mortality versus survival, and
mortality versus survival within 14 days and within 30 days after the initial diagnosis. Linear SVM
achieved the highest performance with an AUC of 0.962, sensitivity of 0.92, and specificity of 0.91.
The study found age, diabetes mellitus, and cancer to be significant factors in the mortality
prediction for COVID-19 patients.
Authors/Year                              Technique      Dataset                   Target class                Result
Yan, Zhang, and Xiao (2020)               XGB            404 patients              Death, survived             0.95 precision; 0.90 accuracy
Wong and So (2020)                        XGB            1747 COVID-19 patients    Fatal, severe               Accuracy 0.668 (fatality); 0.712 (severe)
Sun, Song, and Shi (2020)                 SVM            336 COVID-19 patients     Severe, critical            0.775 accuracy
Yao, Zhang, and Zhang (2019)              SVM            137 COVID-19 patients     Severe, non-severe          0.815 accuracy
Hu, Liu, and Jiang (2020)                 LR             115 COVID-19 patients     Severe, non-severe          0.881 AUROC; 0.839 sensitivity; 0.794 specificity
Zagrouba, Adnan, and ur-Rahman (2021)     SVM            303 patients              Negative, positive cases    0.967 accuracy
Bertsimas, Lukin, and Mingardi (2020)     XGB            3927 COVID-19 patients                                0.85 accuracy; 0.90 AUC
Sánchez-Montañés et al. (2020)            LR             1696 COVID-19 patients    Home, deceased              0.89 AUC; 0.82 sensitivity; 0.81 specificity
An, Lim, Kim, Chang, Choi, and Kim (2020) SVM (linear)   8000 COVID-19 patients    Mortality, recovered        0.962 AUC; 0.92 sensitivity; 0.91 specificity
Table 1.0: Review of related studies on mortality prediction for COVID-19 patients
2.6 Research Gap
In conclusion, several studies have demonstrated the importance of machine learning,
specifically for predictive analysis. Although some studies have addressed prediction
and forecasting, there is still a need for further exploration and for extending the findings
associated with COVID-19 using a real dataset of clinical records of COVID-19 symptoms. The
summary of the related studies is shown in Table 1.0. The proposed model in this study attempts to
predict COVID-19 symptoms using machine learning and to identify the main risk factors
associated with COVID-19. The targeted patients are those isolated at home.
CHAPTER THREE
RESEARCH METHODOLOGY
3.0 Introduction
This chapter describes and discusses the various techniques and
procedures used in the study to collect and analyze the data. The
following areas are treated: research design, study population, sample size and sampling
technique, sources of data, method of data collection, and method of data analysis.
3.1 Research Design
According to Saunders and Thornhill (2003), cited in Adefolarin (2014), research design
means the plan and structure of investigation conceived so as to obtain answers to the questions of a
research study. Babbie and Mouton (2001), cited in Orodho (2009), describe research design as essentially
the overall framework of a research project: the master plan within which various data
gathering tools are used. It constitutes guidelines which direct the researcher toward solving the
research problem. As Sekaran (2000) pointed out, research design constitutes the blueprint
for the collection, measurement and analysis of data. This research work can best be described as
survey research. Survey research is research in which a group of people or items is studied by
collecting and analyzing data from only a few people or from the entire group. More
specifically, this research work takes a comparative standpoint and is treated as
survey research because only a sample of the population is studied.
3.2 Research Population Study
The research work focuses on a participatory approach in organizational management. This mainly
involves a bottom-up approach, in which staff at the lower echelon of the organization are allowed to
be part and parcel of the decision-making process. The population of the study consists of staff of
General Hospital Lokoja. The choice of this population was based on the exploratory nature of
the study, convenience, and the desire to reduce potentially exogenous influences beyond the
scope of the study. A questionnaire was administered to obtain the opinions of staff in the Administration or
Personnel department (30), Finance (25), Education (25), Works (40) and Health (30) departments.
The total population for the study is therefore 30 + 25 + 25 + 40 + 30 = 150 staff employed and
posted to General Hospital Lokoja, Kogi State, and the total sample size of the study is 109.
3.3 Sample Size/ Sampling Determination
The sample size of the study was determined using Slovin's (1960) formula, which is as
follows:
\[ n = \frac{N}{1 + N e^{2}} \]
Where n = sample size
N = population of the study
e = level of significance (allowable margin of error)
1 = unit (a constant)
Note: e = 0.05
\[ n = \frac{150}{1 + 150(0.05)^{2}} = \frac{150}{1 + 150(0.0025)} = \frac{150}{1.375} \approx 109 \]
Therefore the sample size of this study is 109.

\[ \frac{\text{Population of the General Hospital Lokoja}}{\text{Total population of the 5 departments}} \times \frac{109}{1} \]
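The sample size arithmetic above can be checked with a short script. The sketch below is a minimal illustration: the function names `slovin_sample_size` and `proportional_allocation` are hypothetical, and the per-department allocation assumes that the expression above is meant as a proportional share of the 109 questionnaires for each department, which the text does not state explicitly.

```python
def slovin_sample_size(population, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)


def proportional_allocation(group_sizes, sample_size):
    """Share the sample among groups in proportion to their size (assumed scheme).

    Rounded shares are not guaranteed to add up to sample_size in general.
    """
    total = sum(group_sizes.values())
    return {name: round(size / total * sample_size)
            for name, size in group_sizes.items()}


# Department populations as stated in Section 3.2.
departments = {
    "Administration/Personnel": 30,
    "Finance": 25,
    "Education": 25,
    "Works": 40,
    "Health": 30,
}

population = sum(departments.values())          # 150
n = round(slovin_sample_size(population, 0.05))
allocation = proportional_allocation(departments, n)

print("Sample size:", n)
for name, share in allocation.items():
    print(f"{name}: {share}")
```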
The sampling technique used by the researcher is non-probability sampling, applied in selecting the
individuals to whom the structured questionnaire was administered. Purposive sampling is
a non-probability sampling method which is commonly used in qualitative research. According
to Leedy and Ormrod (2010), researchers use the purposive sampling method to select those
individuals who can provide the most important and relevant information.
3.4 Sources /Method of Data Collection
This study is based on the two possible sources of data which are the primary and secondary
source.
a. Primary source of data: the primary data for this study consist of raw data generated
from the respondents' answers to questionnaires and interviews.
b. Secondary source of data: the secondary data include information obtained through
the review of literature, that is, journals, monographs, textbooks, newspapers, the
internet, archives, and periodicals.
According to Epetimehin (2014), the questionnaire method is the most important and systematic
method of collecting primary data, especially when the inquiry is quite extensive. It involves
preparing a list of questions relevant to the inquiry and presenting them in a form
called a questionnaire. Epetimehin maintained that a questionnaire is divided into the two parts
highlighted below:
1. General introductory part which contains questions regarding the identity of the respondent
and it demands personal information such as name, address, telephone number, qualifications,
profession, etc. and,
2. Main question part which contains questions connected with the inquiry. These questions
differ from inquiry to inquiry. Preparation of the questionnaire is a highly specialized job and is
perfected with experience.
Both primary and secondary sources of data collection were used. The questionnaire was
used to obtain information as the primary source, while textbooks, journals, newspapers, periodicals,
archives and the internet constituted the secondary sources of data collection. The questionnaire was
designed with closed-ended questions (Yes or No). The questionnaire was administered to
the sample of the study, which covers respondents drawn from within Lokoja.
3.5 Method of Data Analysis
The techniques of data analysis used in this research are the simple percentage and chi-square
methods, employed in analyzing the responses to the questionnaire as defined below:
The simple percentage formula:
\[ \text{Percentage} = \frac{n}{N} \times \frac{100}{1} \]
Where: n = sample size
N = size of the population
100 = standard (percentage)
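As a minimal sketch of how the simple percentage formula is applied to questionnaire responses, the Python snippet below converts response counts into whole-number percentages. The function name `simple_percentage` is a hypothetical helper, and the counts are the male and female respondent figures reported in Chapter Four.

```python
def simple_percentage(count, total):
    """Simple percentage: (n / N) * 100."""
    return count / total * 100


# Sex of respondents as reported in Chapter Four (90 returned questionnaires).
responses = {"Male": 50, "Female": 40}
total = sum(responses.values())

percentages = {label: round(simple_percentage(count, total))
               for label, count in responses.items()}

for label, count in responses.items():
    print(f"{label}: {count} ({percentages[label]}%)")
```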
The chi-square formula:
\[ \chi^{2} = \sum \frac{(F_O - F_E)^{2}}{F_E} \]
Where F_O = observed frequency
F_E = expected frequency
The χ² value from the formula is compared with the tabulated χ² value for a given
significance level and degree of freedom. The level of significance used for the chi-square test is
0.05 (5%).
3.6 The Decision Rule
If the computed χ² is greater than the critical value at the 0.05 level of significance obtained
from the chi-square table, then the null hypothesis (H0) will be rejected, and vice versa.
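The chi-square computation and the decision rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the observed counts are the Question Five responses reported in Chapter Four, the expected frequency of 10 per category is the value used in this study's tables, and 9.49 is the tabulated critical value the study quotes for df = 4 at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all response categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def reject_null(statistic, critical_value):
    """Decision rule: reject H0 when the computed value exceeds the tabulated one."""
    return statistic > critical_value


# Tabulated chi-square value quoted in this study for df = 4 at the 0.05 level.
CRITICAL_VALUE = 9.49

# Question Five responses: strongly agree, agree, undecided, disagree, strongly disagree.
observed = [16, 30, 15, 10, 19]
expected = [10] * len(observed)          # expected frequency used in the study's tables
degrees_of_freedom = len(observed) - 1   # 5 categories - 1 = 4

statistic = chi_square_statistic(observed, expected)
decision = reject_null(statistic, CRITICAL_VALUE)

print(f"chi-square = {statistic:.1f}, df = {degrees_of_freedom}")
print("Reject H0" if decision else "Do not reject H0")
```

Passing the expected frequencies as a parameter keeps the sketch independent of how they are chosen, so the same function can be reused for each hypothesis table.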
CHAPTER FOUR
DATA PRESENTATION AND ANALYSIS
4.0 DATA ANALYSIS, FINDINGS, AND DISCUSSION
This chapter deals with the analysis and interpretation of the data gathered in the course of the
research. As stated earlier, the statistical tools employed in the analysis are
tables, simple percentages and frequencies, and the chi-square test of the hypotheses.
A total of 109 questionnaires were administered to the respondents chosen for the study.
The administration of the questionnaires lasted for three (3) weeks. It was a
Herculean exercise but very rewarding for the researcher. Of these, 90 were returned duly
completed by the respondents.
4.1 Data Presentation and Analysis
SECTION A
Below are the data obtained in the study.
ANALYSIS OF RESPONDENTS' BIO-DATA
Table 1: Sex of respondents
Sex       Frequency   Percentage (%)
Male      50          56%
Female    40          44%
Total     90          100
Source: field work, 2022
The table above shows that 50 respondents (56%) were male, while 40 respondents (44%) were
female. This indicates that there were more male than female respondents.
TABLE 2: ANALYSIS BASED ON MARITAL STATUS
Marital status 20 22%
Single 60 67%
Married 10 11%
Total 90 100
Source: field work 2022
The table above shows that 60 respondents (67%) are single. This implies that there are more single
than married persons among the respondents.
TABLE 3: ANALYSIS BASED ON AGE OF RESPONDENTS
Ages                 Frequency   Percentage (%)
20-30 years          25          28%
31-40 years          20          22%
41-50 years          25          28%
51 years and above   20          22%
Total                90          100
Table 3 above shows that 25 respondents (28%) are within 20-30 years, 20 respondents (22%) are
within 31-40 years, 25 respondents (28%) are within 41-50 years, and 20
respondents (22%) are 51 years of age and above.
TABLE 4: ANALYSIS BASED ON RESPONDENTS' ACADEMIC QUALIFICATION
Qualification      Frequency   Percentage (%)
Secondary          15          16%
National Diploma   25          28%
HND                25          28%
B.Sc.              25          28%
Total              90          100
Source: field work, 2022
Table 4 shows that 15 respondents (16%) hold a secondary
certificate, 25 respondents (28%) hold a National Diploma, 25 respondents (28%)
hold a Higher National Diploma, and 25 respondents (28%) are B.Sc. holders.
From the above analysis it can be seen that the smallest group of respondents are
holders of secondary school qualifications.
SECTION B
This section presents the respondents' opinions on the questions in the
questionnaire.
QUESTION ONE
Table 4.6: To what extent do you think it is ok for gender to take on COVID-19 test outbreak
prediction?
Options      Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response     20               20      25          15         10                  90
Percentage   22%              22%     28%         17%        11%                 100%
Source: field work, 2022
The above table shows that 20 respondents (22%) strongly agree with the question,
20 respondents (22%) agree, 25 respondents (28%) are undecided,
15 respondents (17%) disagree, and 10 respondents (11%) strongly disagree.
It can be deduced from the table that the largest single group of
respondents (28%) is undecided, while 44% agree or strongly agree with the statement.
QUESTION TWO
Table 4.7: To what extent do you feel uncomfortable that aged people are more affected
toward taken on COVID-19 vaccines?
Options      Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response     20               35      15          10         10                  90
Percentage   22%              39%     17%         11%        11%                 100%
Source: field work, 2022
Table 4.7 shows that 20 respondents (22%) strongly agree, 35 respondents (39%) agree,
15 respondents (17%) are undecided, 10 respondents (11%) disagree,
and 10 respondents (11%) strongly disagree. From the analysis, most respondents (61%)
agree or strongly agree with the statement.
QUESTION THREE
Table 4.8: To what extent do you think it is ok for tested date and affected or non-affected person toward
COVID-19 symptoms?
Options      Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response     10               16      20          30         14                  90
Percentage   11%              18%     22%         33%        16%                 100%
The table indicates that 10 respondents (11%) strongly agree, 16 respondents (18%) agree,
20 respondents (22%) are undecided about the question, 30 respondents (33%) disagree,
and 14 respondents (16%) strongly disagree. Hence, about half of the respondents (49%)
disagree or strongly disagree with the statement.
QUESTION FOUR
Table 4.9: To what extent do you think it is ok for patient with cough and severe fever to taken on
treatment?
Options      Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response     15               25      40          5          5                   90
Percentage   17%              28%     44%         6%         5%                  100%
The above table indicates that 15 respondents (17%) strongly agree, 25 respondents (28%)
agree, 40 respondents (44%) are undecided, 5 respondents (6%) disagree,
and 5 respondents (5%) strongly disagree. The largest single group of respondents (44%) is
undecided, while 45% agree or strongly agree with the statement.
QUESTION FIVE:
Table 4.10: To what extent do you think it is ok for patient with shortness of breath and headache taken on
COVID-19 confirmation?
Options      Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response     16               30      15          10         19                  90
Percentage   18%              33%     17%         11%        21%                 100%
Source: field work, 2022

The analysis shows that 16 respondents (18%) strongly agree, 30
respondents (33%) agree, 15 respondents (17%) are undecided, 10 respondents (11%) disagree,
and 19 respondents (21%) strongly disagree. This indicates that about half of the
respondents (51%) agree or strongly agree with the statement.
TEST OF HYPOTHESIS ONE

H0: There is no significant relationship between the gender variable and taken on COVID-19 test
outbreak prediction
H1: There is a significant association between the gender variable and taken on COVID-19 test
outbreak prediction
To test this hypothesis, the responses to Question One are used.
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      20   10   10    100      10
Agree               20   10   10    100      10
Undecided           25   10   15    225      22.5
Disagree            15   10   5     25       2.5
Strongly disagree   10   10   0     0        0
Total               90                       45.0
From the table, the calculated χ² = 45.0; the tabulated χ² at 0.05 with df = 4 is 9.49.
DECISION RULE
Reject H0 if the calculated χ² is greater than the tabulated χ².
From the above table, the calculated χ² = 45.0 is greater than the tabulated χ² = 9.49.
We therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between the gender variable and COVID-19 test outbreak prediction.
TEST OF HYPOTHESIS TWO
H0: There is no significant relationship between feel uncomfortable that aged people are more
affected toward taken on COVID-19 vaccines
H1: There is a significant association between feel uncomfortable that aged people are more
affected toward taken on COVID-19 vaccines
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      20   10   10    100      10
Agree               35   10   25    625      62.5
Undecided           15   10   5     25       2.5
Disagree            10   10   0     0        0
Strongly disagree   10   10   0     0        0
Total               90                       75.0
DECISION RULE
Reject H0 if the calculated χ² is greater than the tabulated χ².
Since the calculated χ² = 75.0 is greater than the tabulated χ² = 9.49 (0.05 level, df = 4), H0 is
rejected: there is a significant association between feel uncomfortable that aged people are more affected
toward taken on COVID-19 vaccines.
TEST OF HYPOTHESIS FOUR
H0: There is no significant relationship between patient with cough and severe fever to taken on
treatment
H1: There is a significant association between patient with cough and severe fever to taken on
treatment
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      15   10   5     25       2.5
Agree               25   10   15    225      22.5
Undecided           40   10   30    900      90
Disagree            5    10   -5    25       2.5
Strongly disagree   5    10   -5    25       2.5
Total               90                       120.0
DECISION RULE
Reject H0 if the calculated χ² is greater than the tabulated χ².
Since the calculated χ² = 120.0 is greater than the tabulated χ² = 9.49 (0.05 level, df = 4), H0 is
rejected: there is a significant association between patient with cough and severe fever to taken on
treatment.
TEST OF HYPOTHESIS FIVE
H0: There is no significant relationship between patient with shortness of breath and headache
taken on COVID-19 confirmation
H1: There is a significant association between patient with shortness of breath and headache taken
on COVID-19 confirmation
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      16   10   6     36       3.6
Agree               30   10   20    400      40
Undecided           15   10   5     25       2.5
Disagree            10   10   0     0        0
Strongly disagree   19   10   9     81       8.1
Total               90                       54.2
DECISION RULE
Reject H0 if the calculated χ² is greater than the tabulated χ².
Since the calculated χ² = 54.2 is greater than the tabulated χ² = 9.49 (0.05 level, df = 4), H0 is
rejected: there is a significant association between patient with shortness of breath and headache taken on
COVID-19 confirmation.
TEST OF HYPOTHESIS THREE
H0: There is no significant relationship between tested date and affected or non-affected person
toward COVID-19 symptoms
H1: There is a significant association between tested date and affected or non-affected person
toward COVID-19 symptoms
The test uses the responses to Question Three (Table 4.8):
Options      Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response     10               16      20          30         14                  90
Percentage   11%              18%     22%         33%        16%                 100%
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      10   10   0     0        0
Agree               16   10   6     36       3.6
Undecided           20   10   10    100      10
Disagree            30   10   20    400      40
Strongly disagree   14   10   4     16       1.6
Total               90                       55.2
From the table, the calculated χ² = 55.2; the tabulated χ² at 0.05 with df = 4 is 9.49.
DECISION RULE:
Reject H0 if the calculated χ² is greater than the tabulated χ².
From the table, the calculated χ² = 55.2 is greater than the tabulated χ² = 9.49. We
therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between tested date and affected or non-affected person toward
COVID-19 symptoms.
CHAPTER FIVE
SUMMARY, CONCLUSION AND RECOMMENDATION
5.0 SUMMARY OF FINDINGS

The research work has considered the relationship between demographic variables and
COVID-19 symptoms prediction, using General Hospital Lokoja as a case study. The research
focuses on the levels of COVID-19 patient symptoms in Lokoja local government:
• There is a significant association between patient with cough and severe fever to taken
on treatment
• There is a significant association between patient with shortness of breath and
headache taken on COVID-19 confirmation
• There is a significant association between tested date and affected or non-affected person toward
COVID-19 symptoms
5.1 CONCLUSION
This study has x-rayed the local government, demographic variables, and consumer
attitude toward debt, and a closer investigation shows that activity in Lokoja Local Government Area is
still confined to the same narrow functional competence as before the 1976 local government
reform. Participation in some of the more technical and strategic services is rare, despite the
involvement envisaged by the 1976 reform guidelines and the provisions of the 1999 Constitution of Nigeria. Hence,
Lokoja Local Government Area's participation in economic planning is minimal, and its activities are
generally better linked to the development programmes in the Federal Capital Territory (FCT), Abuja. For
instance, Lokoja Local Government Area is not responsible for managing the Universal Primary
Education (UPE) programme even though it makes a substantial contribution to it. It is also left
out of the housing programme, the agricultural revolution, and the economic diversification
programmes in its domain. To administer these programmes in the FCT, Abuja, the Federal
Government and the FCDA have created special agencies such as the FCT Universal Education Board
(FCT UBEC), the FCT Water Board, Abuja Investment Company, etc.
5.2 RECOMMENDATION
Since Idah local government is regarded as confined to a narrow functional competence, as before the
1976 local government reform, the following are recommended for smooth practice:
1. It is recommended to determine other socio-demographic factors which might affect whether
respondents feel uncomfortable that aged people are more affected toward taken on COVID-19
vaccines.
2. It focuses on developing a consistent measure of patient knowledge and attitude toward.
3. It also examines the relationship between personal financial knowledge and a variety of
customer attitudes toward debt
5.3 PROPOSAL FOR FURTHER STUDY
Considering that the Lokoja Local Government staff are very apt in the practice of governance
and the development agenda in Nigeria, we suggest that further research should accommodate
the relevant inputs on the relationship between demographic variables and patient attitude toward
the COVID-19 test in Lokoja local government. It is pertinent that further research should enlarge the
sample size and population, as well as the study area, so as to form a stronger basis on which to judge the
impact in local government in Nigeria beyond the case study adopted by this research.
References
Yan L, Zhang H.-T, and Xiao Y, (2020), “Prediction of criticality in patients with severe Covid-
19 infection using three clinical features: a machine learning-based prognostic model
with clinical data in Wuhan,” medRxiv.
Wong K. C. Y, and So H.-C., (2020), “Uncovering clinical risk factors and prediction of severe
COVID-19: a machine learning approach based on UK biobank data,” medRxiv.
Sun L, Song F, and Shi N, (2020), “Combination of four clinical indicators predicts the
severe/critical symptom of patients infected COVID-19,” Journal of Clinical Virology,
vol. 128, p. 104431.
Yao H, Zhang N, and Zhang R, (2019), “Severity detection for the coronavirus disease 2019
(COVID-19) patients using a machine learning model based on the blood and urine
tests,” Frontiers in Cell and Developmental Biology, vol. 8, pp. 1–10, 2020.
Hu C, Liu Z, and Jiang Y, (2020), “Early prediction of mortality risk among patients with severe
COVID-19, using machine learning,” International Journal of Epidemiology, vol. 49, no.
6, pp. 1918–1929, 2020.
Bertsimas D, Lukin, and Mingardi L, (2020), “COVID-19 mortality risk assessment: an
international multi-center study,” PLoS One, vol. 15, no. 12, p. e0243262, 2020.
Sánchez-Montañés M., Rodríguez-Belenguer P., Serrano-López A. J., Soria-Olivas E, and
Alakhdar-Mohmara Y., (2020), “Machine learning for mortality analysis in patients with
COVID-19,” International Journal of Environmental Research and Public Health, vol.
17, no. 22, pp. 8386–20.
An C., Lim H, Kim D.-W, Chang J. H, Choi Y. J, and Kim S. W, (2020), “Machine learning
prediction for mortality of patients diagnosed with COVID-19: a nationwide Korean
cohort study,” Scientific Reports, vol. 10, p. 18716, 2020.
Zagrouba R, Adnan Khan M, ur-Rahman A, (2021), “Modelling and simulation of COVID-19
outbreak prediction using supervised machine learning,” Computers, Materials &
Continua, vol. 66, no. 3, pp. 2397–2407, 2021.
