Project Work
INTRODUCTION
1.1 Background to the study
Studies have shown that Coronavirus Disease (COVID-19) is an infectious disease caused by a
novel coronavirus that originated in Wuhan, China, in December 2019. The disease is known to
affect the respiratory system, and some people eventually recover without special treatment,
especially those who have a strong immune system (World Health Organization, 2021).
Although the disease may behave differently from person to person, older people are more
vulnerable, as are those with existing comorbidities such as cardiovascular disease, diabetes,
respiratory disease, and cancer. COVID-19 is not just a respiratory disease; it is multisystemic.
Recent studies determined that the virus affects almost all the organs of the body, driven by its
widespread inflammatory response (Temgoua, Endomba, Nkeck, Kenfack, Tochie, and Essouma,
2019).
About 10–18% of COVID-19 patients develop severe symptoms; these individuals may experience
long COVID-19, which may cause complications of the heart, lungs, and nervous system
(Ames, 2020).
COVID-19 spreads easily because the virus is transmitted through droplets released into the air
when an infected person speaks, coughs, or sneezes, or through touching contaminated
objects or surfaces. The World Health Organization (WHO) stated that frequent hand-washing,
disinfecting, social distancing, wearing masks, and not touching one's face can protect a person
from being infected.
The WHO listed several symptoms and emphasized that fever, a dry cough, and tiredness were
the most common, while less common symptoms were headaches, sore throat, diarrhea,
conjunctivitis, loss of smell, and rashes, and serious symptoms were breathing problems, chest
pain, and loss of speech and movement. As of 29 June 2021, there were
182,333,954 COVID-19 cases and 3,948,238 deaths worldwide (Worldometer,
2021), and the disease had mutated into several variants documented in countries such as the
United Kingdom, South Africa, the United States, India, and Brazil, which bring increased
severity of the disease, as well as quicker transmission, a higher death rate, and reduced
effectiveness of vaccines (Centers for Disease Control and Prevention (CDC), 2021).
As the virus keeps spreading despite the efforts of the community to contain it, an
outbreak can lead to increased demand for hospital resources and shortages of medical
equipment, healthcare staff, and COVID-19 testing kits (Wynants, Calster, Collins,
Riley, Heinze, Schuit, Bonten, Dahly, Damen, and Debray, 2020).
The machine will analyze the given data and will eventually predict new instances based on
information learned from the past data. Unlike supervised machine learning, unsupervised
machine learning learns by itself without correctly labeled data.
In unsupervised machine learning, the machine is fed the training samples, and it is the
job of the machine to determine the hidden patterns in the dataset. In reinforcement
learning, the machine acts as an agent that aims to discover the most appropriate actions through
a trial-and-error approach and observation of the environment (Kaelbling, Littman, and
Moore, 1996).
Every time the machine successfully performs a task, it is rewarded by increasing its state;
otherwise, it is punished by decreasing its state, and this approach is repeated several
times until the machine learns how to perform a specific task correctly. Reinforcement learning
is used in training robots to perform human-like tasks and in personal assistance.
The concept of COVID-19 symptom prediction is broad and significant to the
continued survival of social and economic growth in this period of global democratic
phenomenon. The study covers the relationship between demographic variables and behavioral
factors toward COVID-19 in Nigeria. The study focuses on Lokoja General Hospital.
TIME: the time available to the researcher may have been too short to cover all the possible
areas.
ATTITUDE: the attitude of the staff of Lokoja General Hospital was also a problem because they
were not willing to give out any information for fear of being apprehended.
This section reviews relevant work on COVID-19 prediction using machine learning
models. Here we discuss the various stages of COVID-19 symptoms and approaches to model and
analyze this effect to the most survival model of Tinto (2016).
2.1 Overview of machine learning model
Machine Learning is a sub-area of artificial intelligence, whereby the term refers to a branch of
artificial intelligence that aims at enabling machines to perform their jobs skillfully by using
intelligent software (Mohssen, 2016). In other words: Machine Learning enables information
technology systems to recognize patterns on the basis of existing algorithms and data sets and to
develop adequate solution concepts. Therefore, in Machine Learning, artificial knowledge is
generated on the basis of experience.
In a way, machine learning works in a similar way to human learning. For example, if a child is
shown images with specific objects on them, the child can learn to identify and differentiate
between them. Machine learning works in the same way: Through data input and certain
commands, the computer is enabled to "learn" to identify certain objects (persons, objects, etc.)
and to distinguish between them. For this purpose, the software is supplied with data and trained.
For instance, the programmer can tell the system that a particular object is a human being
(="human") and another object is not a human being (="no human"). The software receives
continuous feedback from the programmer. These feedback signals are used by the algorithm to
adapt and optimize the model. With each new data set fed into the system, the model is further
optimized so that it can clearly distinguish between "humans" and "non-humans" in the end.
2.2 There are three types of Machine Learning Algorithms (Fumo, 2017)
2.2.1 Supervised Learning: This algorithm consists of a target / outcome variable (or
dependent variable) which is to be predicted from a given set of predictors (independent
variables). Using this set of variables, a function is generated that maps inputs to desired
outputs. The training process continues until the model achieves a desired level of accuracy on
the training data. Examples of Supervised Learning: Regression, Decision Tree, Random Forest,
KNN, Logistic Regression etc.
2.2.2 Unsupervised Learning: This algorithm does not have any target or outcome variable to
predict / estimate. It is used for clustering a population into different groups, which is widely
used for segmenting customers into different groups for specific interventions. Examples of
Unsupervised Learning: Apriori algorithm, K-means.
2.2.3 Reinforcement Learning: Here the machine is trained to make specific decisions. It
works this way: the machine is exposed to an environment where it trains itself continually using
trial and error. The machine learns from past experience and tries to capture the best possible
knowledge to make accurate business decisions. Example of Reinforcement Learning: Markov
Decision Process.
2.3 Common Machine Learning Algorithms
According to Sunil (2017), here are commonly used machine learning algorithms. These
algorithms can be applied to almost any data problem:
2.3.1 Linear Regression
It is used to estimate real values (cost of houses, number of calls, total sales, etc.) based on
continuous variable(s). Here, we establish a relationship between independent and dependent
variables by fitting a best line. This best-fit line is known as the regression line. The best way to
understand linear regression is to relive an experience of childhood. Let us say a child in fifth
grade is asked to arrange people in his class by increasing order of weight, without asking them
their weights. What will the child do? He would likely look (visually analyze) at the height and
build of people and arrange them using a combination of these visible parameters. This is linear
regression in real life. The child has actually figured out that height and build are
correlated with weight by a relationship. Linear Regression is mainly of two types:
Simple Linear Regression and Multiple Linear Regression. Simple Linear Regression is
characterized by one independent variable, and Multiple Linear Regression (as the name
suggests) is characterized by multiple (more than 1) independent variables. While finding the
best-fit line, you can also fit a polynomial or curved line; these are known as
polynomial or curvilinear regression.
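As a minimal sketch of how a best-fit line can be obtained, the following Python code fits a simple linear regression by ordinary least squares. The function name and the height/weight values are hypothetical and invented purely for illustration; they are not data from this study.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical heights (cm) and weights (kg), invented for illustration only.
heights = [150, 160, 170, 180]
weights = [50, 56, 62, 68]

slope, intercept = fit_simple_linear_regression(heights, weights)
predicted_weight = slope * 165 + intercept  # estimate for a height of 165 cm
print(slope, intercept, predicted_weight)
```

With one independent variable this is simple linear regression; the multiple case follows the same least-squares idea with more than one predictor.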
2.3.2 Logistic Regression
It is a classification algorithm, not a regression algorithm. It is used to estimate discrete
values (binary values like 0/1, yes/no, true/false) based on a given set of independent
variable(s). In simple words, it predicts the probability of occurrence of an event by fitting
data to a logit function. Hence, it is also known as logit regression. Since it predicts a
probability, its output values lie between 0 and 1 (as expected).
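The way a logistic model turns a weighted sum of predictors into a probability between 0 and 1 can be sketched as follows. This is an illustrative sketch only: the weights, bias, and the 0.5 decision threshold are assumed values, not parameters estimated in this study.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


def classify(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label using an assumed threshold."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical weights and bias, chosen only to show the calculation.
p = predict_probability([1.0, 2.0], weights=[0.5, -0.25], bias=0.1)
print(p, classify([1.0, 2.0], [0.5, -0.25], 0.1))
```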
2.3.3 Decision Tree
It is a type of supervised learning algorithm that is mostly used for classification problems.
It works for both categorical and continuous dependent variables. In this algorithm, the
population is split into two or more homogeneous sets. This is done based on the most significant
attributes/independent variables to make the groups as distinct as possible.
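One way to make "as homogeneous as possible" concrete is to score each candidate split with an impurity measure. The sketch below uses Gini impurity, which is an assumption on our part since the text does not name a splitting criterion; the ages and labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a group: 0 means the group is perfectly homogeneous."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini_after_split(values, labels, threshold):
    """Size-weighted impurity of the two groups produced by value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Pick the observed value whose split gives the lowest weighted impurity."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda t: weighted_gini_after_split(values, labels, t))


# Hypothetical ages and severity labels, for illustration only.
ages = [30, 35, 40, 60, 65, 70]
severity = ["mild", "mild", "mild", "severe", "severe", "severe"]
print(best_threshold(ages, severity))
```

A full decision tree would repeat this search recursively inside each resulting group.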
2.3.4 SVM (Support Vector Machine)
Support Vector Machine (SVM) is a supervised learning method used for regression and
classification (Vapnik, 2000). The algorithm tries to find an optimal hyperplane which separates
the d-dimensional training data perfectly into its classes. An optimal hyperplane is the one that
maximizes the distance between examples on the margin (border) which separates different
classes. These examples on the margin are the so-called support vectors. Since training data is
often not linearly separable, SVM maps data into a high-dimensional feature space through some
nonlinear mapping. In this space, an optimal separating hyperplane is constructed. In order to
reduce computational cost, the mapping is performed by kernel functions, which depend
only on input space variables. The most used kernel functions are: linear, polynomial, radial basis
function (RBF), and sigmoid.
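The four kernel functions named above can be written down directly. The following is a sketch using their common textbook forms; the parameter names and default values (gamma, coef0, degree) are illustrative assumptions and are not settings used in this study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    # Depends only on the squared Euclidean distance between the inputs.
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v), polynomial_kernel(u, v, degree=2), rbf_kernel(u, v), sigmoid_kernel(u, v))
```

Each function takes only input-space vectors, which is why the high-dimensional mapping never has to be computed explicitly.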
2.3.5 Naive Bayes
It is a classification technique based on Bayes’ theorem with an assumption of independence
between predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a
particular feature in a class is unrelated to the presence of any other feature. For example, a fruit
may be considered to be an apple if it is red, round, and about 3 inches in diameter. Even if these
features depend on each other or upon the existence of the other features, a naive Bayes classifier
would consider all of these properties to independently contribute to the probability that this fruit
is an apple. A Naive Bayes model is easy to build and particularly useful for very large data sets.
Along with its simplicity, Naive Bayes can in some cases outperform more sophisticated
classification methods.
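The independence assumption can be stated compactly. As a sketch in standard notation (the symbols c for a class and x_1, ..., x_n for the features are introduced here for illustration), Bayes' theorem with independent predictors gives:

```latex
\[
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
\]
```

In the apple example, each of the features red, round, and about 3 inches in diameter contributes its own factor P(x_i | c) to the product, regardless of how the features relate to one another.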
2.3.6 KNN (K- Nearest Neighbors)
It can be used for both classification and regression problems. However, it is more widely used
in classification problems in industry. K-nearest neighbors is a simple algorithm that stores
all available cases and classifies new cases by a majority vote of their k neighbors. A new case is
assigned to the class most common amongst its K nearest neighbors, as measured by a distance
function. These distance functions can be Euclidean, Manhattan, Minkowski, and Hamming
distance. The first three are used for continuous variables and the fourth (Hamming) for
categorical variables. If K = 1, then the case is simply assigned to the class of its nearest
neighbor. At times, choosing K turns out to be a challenge while performing KNN modeling.
KNN can easily be mapped to our real lives. If you want to learn about a person about whom you
have no information, you might find out about their close friends and the circles they move
in, and thereby gain access to information about them.
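The majority-vote procedure described above can be sketched in a few lines of Python. The points, labels, and the choice of k below are hypothetical and serve only to show the mechanics.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(train_points, train_labels, query, k=3, distance=euclidean):
    """Assign the query to the class most common among its k nearest stored cases."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional cases belonging to two classes.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(points, labels, (2, 2), k=3))
print(knn_classify(points, labels, (9, 9), k=1, distance=manhattan))
```

Passing a different `distance` function is all that is needed to switch between the distance measures listed in the text.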
2.3.7 K-Means
It is a type of unsupervised algorithm which solves the clustering problem. Its procedure
partitions a given data set into a certain number of clusters (assume k
clusters). Data points inside a cluster are homogeneous, and heterogeneous with respect to
other clusters. Remember figuring out shapes from ink blots? K-means is somewhat similar to this
activity. You look at the shape and spread to decipher how many different clusters/populations
are present.
Fig. 2.1 Illustration of K-means using ink blots (Sunil, 2017).
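A minimal sketch of the K-means procedure (assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat) is given below. The data points and starting centroids are hypothetical; practical implementations usually choose the starting centroids at random.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and updating centroids until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical points forming two visible groups, with k = 2.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(data, initial_centroids=[(0, 0), (10, 10)])
print(centroids)
```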
2.3.8 Random Forest
Random Forests (RF) are sets of decision trees that vote together in a classification. Each tree is
built on a random subset of the data points and a randomly selected subset of features. The
tree is then trained on these data points (only on the selected features), and the remaining
out-of-bag points are used to evaluate the tree. Random Forests are known to be effective in
reducing overfitting. As proposed by Breiman (2001), their features are: they are easy to
implement, they have good generalization properties, the algorithm outputs more information than
just a class label, it runs efficiently on large databases, it can handle thousands of input
variables without variable deletion, and it provides estimates of which variables are important
in the classification.
According to Ben-Gall (2008), Bayesian Networks (BNs) belong to the family of probabilistic
graphical models. These graph structures are used to represent knowledge about an uncertain
domain. In particular, each node in the graph represents a random variable, while the edges
between the nodes represent probabilistic dependencies among the corresponding random
variables. Such conditional dependencies in the graph are often estimated using known statistical
and computational methods. Thus, Bayesian networks combine principles of graph theory,
probability theory and statistics.
Artificial Neural Networks (ANN) are composed of several computational elements that interact
through connections with different weights. Inspired by the human brain, neural networks
exhibit features such as the ability to learn complex patterns in data and to generalize learned
information (Zhao, 2012). The simplest form of an ANN is the multilayer perceptron (MLP),
consisting of three layers: the input layer, the hidden layer, and the output layer. Haykin (2009)
states that the learning processes of an artificial neural network are determined by how parameter
changes occur. Different types of neural networks use different principles in determining their
own rules. There are many types of artificial neural networks, each with its unique strengths.
Some of the most important types of neural networks and their applications are described below
(Mehta, 2019).
Feedforward neural networks are used in technologies like face recognition and computer vision.
This is because the target classes in these applications are hard to classify. A simple feedforward
neural network is equipped to deal with data which contains a lot of noise. Feedforward neural
networks are also relatively simple to maintain.
The radial basis function neural network is applied extensively in power restoration systems. In
recent decades, power systems have become bigger and more complex. This increases the risk of
a blackout. This neural network is used in the power restoration systems in order to restore
power in the shortest possible time.
This type of neural network is applied extensively in speech recognition and machine translation
technologies.
A convolutional neural network (CNN) uses a variation of the multilayer perceptron. A CNN
contains one or more convolutional layers. These layers can either be completely
interconnected or pooled. Before passing the result to the next layer, the convolutional layer uses
a convolutional operation on the input. Due to this convolutional operation, the network can be
much deeper but with much fewer parameters. Due to this ability, convolutional neural networks
show very effective results in image and video recognition, natural language processing, and
recommender systems. Convolutional neural networks also show great results in semantic
parsing and paraphrase detection. They are also applied in signal processing and image
classification. CNNs are also being used in image analysis and recognition in agriculture where
weather features are extracted from satellites like LSAT to predict the growth and yield of a
piece of land. Here’s an image of what a Convolutional Neural Network looks like.
Fig. 2.5 Convolutional Neural Network (Mehta, 2019).
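The convolutional operation applied by a convolutional layer can be sketched as a small kernel sliding over the input. The code below is a sketch under simplifying assumptions (a single channel, no padding, stride 1, and no kernel flip, as is usual in CNN layers); the image and kernel values are hypothetical.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kernel_h):
                for b in range(kernel_w):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x3 input and 2x2 kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = convolve2d_valid(image, kernel)
print(feature_map)
```

The same four kernel weights are reused at every position, which illustrates why a convolutional layer needs far fewer parameters than a fully connected one.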
A Recurrent Neural Network is a type of artificial neural network in which the output of a
particular layer is saved and fed back to the input. This helps predict the outcome of the layer.
The first layer is formed in the same way as it is in the feedforward network, that is, with the
sum of the products of the weights and features. However, in subsequent layers, the recurrent
neural network process begins. From each time-step to the next, each node remembers some
information that it had in the previous time-step. In other words, each node acts as a memory cell
while computing and carrying out operations. The neural network begins with forward
propagation as usual but remembers the information it may need to use later.
If the prediction is wrong, the system self-learns and works towards making the right prediction
during the backpropagation. This type of neural network is very effective in text-to-speech
conversion technology. Here’s what a recurrent neural network looks like.
Fig. 2.6 Recurrent Neural Network (RNN) – Long Short-Term Memory (Mehta, 2019).
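The idea that each node carries information from one time-step to the next can be sketched with a single recurrent unit. This is a simplified illustration with scalar inputs and assumed weights; it shows only the forward pass, not training by backpropagation, and it is a plain recurrent unit rather than a Long Short-Term Memory cell.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, bias):
    """New hidden state from the current input and the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + bias)


def run_rnn(sequence, w_x=0.5, w_h=0.8, bias=0.0):
    """Process a sequence one time-step at a time, keeping the hidden state."""
    h = 0.0  # the memory starts empty
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, bias)
        states.append(h)
    return states


# Hypothetical input: a single non-zero value followed by zeros.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Although the later inputs are zero, the later hidden states remain non-zero because the unit still remembers the first input.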
vi. Modular Neural Network
A modular neural network has a number of different networks that function independently and
perform sub-tasks. The different networks do not really interact with or signal each other during
the computation process. They work independently towards achieving the output. As a result, a
large and complex computational process can be done significantly faster by breaking it down
into independent components. The computation speed increases because the networks are not
interacting with or even connected to each other. Here’s a visual representation of a Modular
Neural Network.
Fig. 2.7 Modular Neural Network (Mehta, 2019).
vii. Sequence-To-Sequence Models
A sequence-to-sequence model consists of two recurrent neural networks. There’s an encoder
that processes the input and a decoder that processes the output. The encoder and decoder can
either use the same or different parameters. This model is particularly applicable in those cases
where the length of the input data is not the same as the length of the output data. Sequence-to-
sequence models are applied mainly in chatbots, machine translation, and question answering
systems.
2.4 Related work on literature review
Early detection and diagnosis using AI techniques help to prevent the spread and to combat the
COVID-19 pandemic using different data such as CT scans, X-ray, clinical data, and blood
sample data.
Yan, Zhang, and Xiao (2020) predicted the criticality and survival chances of patients with
severe COVID-19 infection based on different risk factors and demographic information. The
dataset used consists of 375 records from patients admitted to Tongji Hospital from January 10th
to February 18th, 2020, including 201 survivors and 174 deceased within the same period.
They used an XGBoost (XGB) model and identified only three main clinical features as
significant, i.e., lactic dehydrogenase (LDH), lymphocyte, and high-sensitivity C-reactive protein
(Hs-CRP), selected from more than 300 features.
The proposed model was validated using data from 29 patients. The key findings of the research
were the model’s ability to predict the risk of death with 0.95 precision and 0.90 prediction
accuracy. Such models will equip physicians with a tool for identifying critical conditions,
thereby helping to reduce the mortality rate.
Even though these findings are of great importance, the research has some limitations, which
affect the accuracy of the reported results.
These limitations were due to the small size of the dataset, namely, only 29 patient
records. Similarly, Wong and So (2020) also used XGB with another dataset to predict the severe
and the death cases and identify the risk factors associated with COVID-19.
The dataset was retrieved from United Kingdom Biobank (UKBB) and includes 93 different
variables collected between 16 March 2020 and 19 July 2020.
Two studies were conducted based on different sample groups. For the first study, the
data were clinical prediagnostic data of 1747 COVID-19 infected patient records containing both
severe and death cases. For the severity class, the accuracy achieved was 0.668, and for the
fatality class, the accuracy was 0.712. For the second study, the data were taken from the
negative cases, the general population with no COVID-19 infection, consisting of 489,987
records. The same model was applied, and the accuracy achieved was similar to the first study:
0.669 for the severity class and 0.749 for the fatality class.
It is worth mentioning that the researchers identified the five most significant risk factors for
severe cases and death cases, with age being the top factor for both cases. Other factors include
obesity, impaired renal function, multiple comorbidities, and cardiometabolic abnormalities.
Sun, Song, and Shi (2020) developed a prediction model using the support vector machine
(SVM) to predict the severe cases of COVID-19 patients. In the study, they used the clinical and
laboratory features that are significantly associated with these cases. Using 336 cases of
COVID-19 patients, 26 severe/critical and 310 noncritical, they found that the main features to
discriminate the mild and severe cases are age, growth hormone secretagogues (GHSs), immune
feature cluster of differentiation 3 (CD3) percentage, and total protein.
They found that the proposed model was effective and robust in predicting patients in severe
conditions with up to 0.775 accuracy.
Another study, conducted by Yao, Zhang, and Zhang (2019), also applied the SVM model to
classify the COVID-19 patients according to the severity of the symptoms. They applied SVM
for the binary class label on a total of 137 records including urine and blood test results and
combining both severely ill patients and patients with mild symptoms. The results showed that
around 32 factors have high correlations with severe COVID-19, with an accuracy of 0.815. It is
worth mentioning that, amongst all factors, age and gender had mostly affected the classification
of cases between severe and mild. Patients aged around 65 had more severe cases than others.
Moreover, male patients were at a higher risk of developing severe COVID-19 symptoms. In
terms of the urine and blood test samples, blood test result features show more significant
differences between severe and mild cases than urine test result features.
Hu, Liu, and Jiang (2020) used the logistic regression (LR) model to identify the COVID-19
patients’ severity. They used a dataset containing demographic and clinical data for 115 COVID-
19 patients under the non-severe condition and 68 COVID-19 patients under the severe
condition. Four features have been selected as the most significant features to discriminate the
mild and severe cases: age, high-sensitivity C-reactive protein level, lymphocyte count, and d-
dimer level. This model was evaluated, and the results showed that the prediction was effective
with area under the receiver operating characteristic (AUROC) of 0.881, sensitivity of 0.839, and
specificity of 0.794, respectively.
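The evaluation measures quoted in these studies (accuracy, precision, sensitivity, and specificity) are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from any of the reviewed studies. AUROC is not covered because it requires the full set of predicted probabilities.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard measures computed from confusion-matrix counts.

    tp/fn: positive cases predicted correctly/incorrectly
    tn/fp: negative cases predicted correctly/incorrectly
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for 100 patients, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(name, round(value, 3))
```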
Bertsimas, Lukin, and Mingardi (2020) used a sample of 3927 COVID-19 patients to predict
the mortality risk using XGB. The study used demographic and clinical features of
patients from 33 hospitals.
The model achieved an accuracy of 0.85 and an AUC of 0.90. Moreover, Sánchez-Montañés,
Rodríguez-Belenguer, Serrano-López, Soria-Olivas, and Alakhdar-Mohmara (2020) developed
an LR-based mortality prediction model using 1969 COVID-19-positive patients. The study found
age and O2 to be the significant features and achieved an AUC of 0.89, sensitivity of 0.82, and
specificity of 0.81.
In Zagrouba, Adnan Khan, and ur-Rahman (2021), supervised machine learning techniques have
been investigated to predict the COVID-19 outbreak. In Zagrouba et al. (2021), SVM was
used for prediction over a dataset obtained from the WHO with 303 patients. The proposed
scheme exhibits an accuracy of 0.967 during the testing phase. Similarly, An, Lim, Kim, Chang,
Choi, and Kim (2020) developed a model to predict the mortality of COVID-19 patients using
several machine learning algorithms such as LASSO, SVM (linear and RBF), RF, and KNN.
The models were trained to identify three cases, i.e., mortality versus survival overall,
within 14 days, and within 30 days after the initial diagnosis. Linear SVM achieved the highest
performance with an AUC of 0.962, sensitivity of 0.92, and specificity of 0.91.
The study found age, diabetes mellitus, and cancer to be significant factors in the mortality
prediction for COVID-19 patients.
Authors/Year                                Technique     Dataset                   Target class              Result
Yan, Zhang, and Xiao (2020)                 XGB           404 patients              Death, survived           0.95 precision, 0.90 accuracy
Wong and So (2020)                          XGB           1747 COVID-19 patients    Fatal, severe             Accuracy 0.668 (severe), 0.712 (fatality)
Sun, Song, and Shi (2020)                   SVM           336 COVID-19 patients     Severe, critical          0.775 accuracy
Yao, Zhang, and Zhang (2019)                SVM           137 COVID-19 patients     Severe, non-severe        0.815 accuracy
Hu, Liu, and Jiang (2020)                   LR            183 COVID-19 patients     Severe, non-severe        0.881 AUROC, 0.839 sensitivity, 0.794 specificity
Zagrouba, Adnan, and ur-Rahman (2021)       SVM           303 patients              Negative, positive cases  0.967 accuracy
Bertsimas, Lukin, and Mingardi (2020)       XGB           3927 COVID-19 patients    Mortality risk            0.85 accuracy, 0.90 AUC
Sánchez-Montañés et al. (2020)              LR            1696 COVID-19 patients    Home, deceased            0.89 AUC, 0.82 sensitivity, 0.81 specificity
An, Lim, Kim, Chang, Choi, and Kim (2020)   SVM (linear)  8000 COVID-19 patients    Mortality, recovered      0.962 AUC, 0.92 sensitivity, 0.91 specificity
Table 1.0: Review of Related Studies on mortality prediction for COVID-19 patients
In conclusion, the importance of machine learning, specifically for predictive analysis, has been
demonstrated in several studies. Some studies have been conducted to perform prediction
and forecasting, yet there is still a need for further exploration and to extend the findings
associated with COVID-19 using a real dataset of COVID-19 symptom clinical records. The
summary of the related studies is shown in Table 1. The proposed model in this study attempts to
predict COVID-19 symptoms using machine learning and to identify the main risk factors
associated with COVID-19. Targeted patients are isolated at home.
CHAPTER THREE
RESEARCH METHODOLOGY
3.0 Introduction
This chapter covers the description and discussion on the various techniques and
procedures used in the study to collect and analyze the data as it is deemed appropriate. Thus, the
following areas will be treated: Research design, study population sample size/sampling
technique, sources of data, method of data collection, and method of data analysis.
According to Saunders and Thornhill (2003), cited in Adefolarin (2014), research design
means the plan and structure of investigation so conceived as to seek answers to questions for a
research study. According to Babbie and Mouton (2001), cited in Orodho (2009), research design
is essentially the overall framework of a research project: the master plan within which various
data-gathering tools are used. It constitutes guidelines which direct the researcher toward
solving the research problem. As Sekaran (2000) pointed out, research design constitutes the
blueprint for the collection, measurement, and analysis of data. The research work can best be
described as survey research. Survey research is research in which a group of people or items is
studied by collecting and analyzing data from only a few people or the entire group. More
specifically, this research work takes a comparative standpoint and is survey research, as only a
sample of the population is studied.
The research work focuses on the participatory approach in organizational management: this mainly
involves a bottom-up approach, where staff at the lower echelon of the organization are allowed to
be part and parcel of the decision-making process. The population of the study consists of staff of
General Hospital Lokoja. The choice of this population was based on the exploratory nature of
the study, convenience, and the desire to reduce potentially exogenous influences beyond the
scope of the study. The questionnaire was administered to obtain the opinions of staff in the
Administration or Personnel (30), Finance (25), Education (25), Works (40), and Health (30)
departments; therefore, the total population for the study is 30+25+25+40+30 = 150 staff employed and
posted to General Hospital Lokoja, Kogi State. The total sample size of the study is 109.
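The sample size of the study was determined using Slovin's (1960) formula, which is as
follows:
\[ n = \frac{N}{1 + N e^{2}} \]
where N is the population size, e is the margin of error (e = 0.05), and 1 is a unit constant.
\[ n = \frac{150}{1 + 150(0.05)^{2}} = \frac{150}{1 + 150(0.0025)} = \frac{150}{1.375} \approx 109 \]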
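The sample-size arithmetic can be checked with a few lines of Python. This is only a verification sketch of the formula n = N / (1 + N e^2) with the study's values N = 150 and e = 0.05; the function name is our own.

```python
def slovin_sample_size(population, margin_of_error):
    """Sample size from the formula n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)


n = slovin_sample_size(population=150, margin_of_error=0.05)
sample_size = round(n)  # 150 / 1.375 is about 109.09
print(sample_size)
```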
The sampling technique the researcher used is non-probability sampling in selecting the
individuals to whom the structured questionnaire was administered. Purposive sampling is
a non-probability sampling method which is commonly used in qualitative research. According
to Leedy and Ormrod (2010), researchers use the purposive sampling method to select those
individuals who can provide the most important and relevant information.
This study is based on two sources of data: primary and secondary.
a. Primary sources of data: the primary data for this study consist of raw data generated
from respondents' responses to questionnaires and interviews.
b. Secondary sources of data: the secondary data include information obtained through
the review of literature, that is, journals, monographs, textbooks, newspapers, the
internet, archives, and periodicals.
According to Epetimehin (2014), the questionnaire method is the most important and systematic
method of collecting primary data, especially when the inquiry is quite extensive. It involves
the preparation of a list of questions relevant to the inquiry and presenting them in a form
called a questionnaire. Epetimehin maintained that a questionnaire is divided into two parts,
highlighted below:
1. General introductory part which contains questions regarding the identity of the respondent
and it demands personal information such as name, address, telephone number, qualifications,
profession, etc. and,
2. Main question part which contains questions connected with the inquiry. These questions
differ from inquiry to inquiry. Preparation of the questionnaire is a highly specialized job and is
perfected with experience.
Both primary and secondary sources of data collection were used. A questionnaire was
used to obtain information as the primary source, while textbooks, journals, newspapers,
periodicals, archives, and the internet constituted the secondary sources of data collection. The
questionnaire was designed with closed-ended questions (Yes or No). The questionnaire was
administered to the sample of the study, which covers respondents drawn from within Lokoja.
The techniques of data analysis used in this research are the simple percentage and chi-square
methods, employed in analyzing the responses to the questionnaire, as defined below:
$$\frac{n}{N} \times \frac{100}{1}$$
$$\chi^2 = \sum \frac{(F_O - F_E)^2}{F_E}$$
where
$F_O$ = observed frequency
$F_E$ = expected frequency
The χ² value from the formula is compared with the tabulated χ² value for a given
significance level and degrees of freedom. The level of significance used for the chi-square test is
0.05 (5%).
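The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used for the study; the function names `simple_percentage` and `chi_square` are assumptions introduced here, and the example figure (50 male respondents out of 90 returned questionnaires) is taken from the gender table in Chapter Four.

```python
def simple_percentage(n, total):
    """Return n as a percentage of total, i.e. (n / N) * 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return n / total * 100


def chi_square(observed, expected):
    """Return the sum over all categories of (FO - FE)^2 / FE."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# 50 male respondents out of 90 returned questionnaires
male_share = simple_percentage(50, 90)
print(round(male_share))  # 56
```

The chi-square function takes the expected frequencies as an explicit argument, so whatever expectation the researcher adopts is visible at the call site.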
3.6 The Decision Rule
If the computed χ² is greater than the critical value at the 0.05 level of significance obtained
from the chi-square table, then the null hypothesis (H0) will be rejected, and vice versa.
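The decision rule can be expressed as a short function. The sketch below assumes the tabulated critical value of 9.49 (0.05 level, four degrees of freedom) that is used in Chapter Four; the names `decide` and `p_value_df4` are illustrative and do not come from the study. For four degrees of freedom the upper-tail probability of the chi-square distribution has a simple closed form, so no statistics library is needed.

```python
import math

# Tabulated chi-square value at the 0.05 level with 4 degrees of freedom
CRITICAL_VALUE_DF4 = 9.49


def p_value_df4(chi2):
    """Upper-tail probability for a chi-square variable with df = 4.

    For df = 4 the survival function is exp(-x/2) * (1 + x/2).
    """
    if chi2 < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.exp(-chi2 / 2) * (1 + chi2 / 2)


def decide(chi2, critical=CRITICAL_VALUE_DF4):
    """Apply the decision rule: reject H0 when the statistic exceeds the critical value."""
    return "reject H0" if chi2 > critical else "do not reject H0"


print(decide(75.0))  # reject H0
print(decide(5.0))   # do not reject H0
```

Comparing the statistic with the critical value is equivalent to checking whether `p_value_df4(chi2)` falls below 0.05.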
CHAPTER FOUR
This chapter deals with the analysis and interpretation of the data gathered in the course of the
research. As stated earlier, the statistical tools employed in the analysis are
tables and simple percentages, together with frequencies and the chi-square test of the hypotheses.
A total of 109 questionnaires were administered to the respondents
chosen for the study. The administration of the questionnaires lasted three (3) weeks. It was a
Herculean exercise but very rewarding for the researcher. Of these, 90 were returned duly completed
by the respondents.
SECTION A
Gender    Frequency   Percentage
Male      50          56%
Female    40          44%
Total     90          100%
The table above shows that 50 respondents (56%) were male, while 40 respondents (44%) were female.
This indicates that male respondents outnumbered female respondents.
TABLE 2: ANALYSIS BASED ON MARITAL STATUS
Marital status   Frequency   Percentage
Single           60          67%
Married          10          11%
Total            90          100%
The table above shows that 60 respondents (67%) are single. This implies that there are more single persons
in these various organizations than married persons.
Total 90 100
Table 3.4 above shows that 25 respondents (28%) are aged 20-30 years, 20 respondents (22%) are
within 31-40 years, 25 respondents (28%) are within 41-50 years, and 20
respondents are aged 51 years and above in these organizations.
TABLE 4.5. ANALYSIS BASED ON ACADEMIC QUALIFICATION
Qualification   Frequency   Percentage
Secondary       15          16%
ND              25          28%
HND             25          28%
BSc             25          28%
Total           90          100%
Table 4.5 shows that 15 respondents (16%) hold a secondary
certificate, 25 respondents (28%) hold a National Diploma, 25 respondents (28%) hold a
Higher National Diploma, and 25 respondents (28%) are BSc holders.
From the above analysis it can be seen that the fewest respondents
hold only a secondary school qualification.
SECTION B
This section presents the respondents' opinions on the questions in the
questionnaire.
QUESTION ONE
Table 4.6: Question 1. To what extent do you think it is ok for gender to take on COVID-19 test outbreak
prediction?
           Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response   20               20      25          15         10                  90
The above table shows that 20 respondents (22%) strongly agree with the question,
20 respondents (22%) agree, 25 respondents (28%) are undecided,
15 respondents (17%) disagree, and 10 respondents
(11%) strongly disagree. It can be deduced from the table that the largest single group of
respondents is undecided, while 44% agree or strongly agree and 28% disagree or strongly disagree.
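The percentages quoted for each response option can be reproduced from the frequency counts. The following is a small sketch, assuming the five options are ordered as in the discussion above; `LIKERT_OPTIONS` and `summarise` are names introduced only for illustration.

```python
LIKERT_OPTIONS = [
    "Strongly agree",
    "Agree",
    "Undecided",
    "Disagree",
    "Strongly disagree",
]


def summarise(counts):
    """Map each response option to its share of all responses, rounded to a whole percent."""
    if len(counts) != len(LIKERT_OPTIONS):
        raise ValueError("expected one count per response option")
    total = sum(counts)
    return {
        option: round(count / total * 100)
        for option, count in zip(LIKERT_OPTIONS, counts)
    }


# Question One frequencies from Table 4.6
for option, percent in summarise([20, 20, 25, 15, 10]).items():
    print(f"{option}: {percent}%")
```

Because each share is rounded separately, the rounded percentages for a question need not add up to exactly 100.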
QUESTION TWO
Table 4.7: Question 2. To what extent do you feel uncomfortable that aged people are more affected
toward taken on COVID-19 vaccines?
           Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response   20               35      15          10         10                  90
As shown in Table 4.7, 20 respondents (22%) strongly agree, 35 respondents (39%) agree,
15 respondents (17%) are undecided, 10 respondents (11%) disagree,
and 10 respondents (11%) strongly disagree. The analysis relates to the difference
between demographic variables and COVID-19 vaccines.
QUESTION THREE
Table 4.8: Question 3. To what extent do you think it is ok for tested date and affected or non-affected person toward
COVID-19 symptoms?
           Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response   10               16      20          30         14                  90
The table indicates that 10 respondents (11%) strongly agree, 16 respondents (18%) agree,
20 respondents (22%) are undecided about the question, 30 respondents (33%) disagree,
and 14 respondents (16%) strongly disagree. The analysis relates to the
difference between demographic variables and COVID-19 symptoms.
QUESTION FOUR
Table 4.9: Question 4. To what extent do you think it is ok for patient with cough and severe fever to taken on
treatment?
           Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response   15               25      40          5          5                   90
The above table indicates that 15 respondents (17%) strongly agree, 25 respondents (28%)
agree, 40 respondents (44%) are undecided, 5 respondents (6%) disagree,
and 5 respondents (6%) strongly disagree. This analysis suggests that it is a motivating factor,
going by the high percentage of respondents who agree.
QUESTION FIVE:
Table 4.10: Question 5. To what extent do you think it is ok for patient with shortness of breath and headache taken on
COVID-19 confirmation?
           Strongly agree   Agree   Undecided   Disagree   Strongly disagree   Total
Response   16               30      15          10         19                  90
TEST OF HYPOTHESIS ONE
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      20   10   10    100      10
Agree               20   10   10    100      10
Undecided           25   10   15    225      22.5
Disagree            15   10   5     25       2.5
Strongly disagree   10   10   0     0        0
Total               90                       45.0
DECISION RULE
From the above table, the calculated χ² (45.0) is greater than the tabulated χ² (9.49).
We therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between the gender variable and COVID-19 test outbreak prediction.
TEST OF HYPOTHESIS TWO
H0: There is no significant association between feeling uncomfortable that aged people are more
affected and taking COVID-19 vaccines
H1: There is a significant association between feeling uncomfortable that aged people are more
affected and taking COVID-19 vaccines
To test hypothesis two, the chi-square (χ²) test is used.
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      20   10   10    100      10
Agree               35   10   25    625      62.5
Undecided           15   10   5     25       2.5
Disagree            10   10   0     0        0
Strongly disagree   10   10   0     0        0
Total               90                       75.0
DECISION RULE
From the above table, the calculated χ² (75.0) is greater than the tabulated χ² (9.49).
We therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between feeling uncomfortable that aged people are more affected
and taking COVID-19 vaccines.
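The working for hypothesis two can be reproduced column by column. The sketch below uses the Question Two frequencies (20, 35, 15, 10, 10) from Table 4.7 and the expected frequency of 10 per option that the working table uses; the helper name `chi_square_rows` is an assumption made for this illustration.

```python
def chi_square_rows(labels, observed, expected):
    """Build one row per category: (label, o, e, o-e, (o-e)^2, (o-e)^2/e)."""
    rows = []
    for label, o in zip(labels, observed):
        diff = o - expected
        rows.append((label, o, expected, diff, diff ** 2, diff ** 2 / expected))
    return rows


labels = ["Strongly agree", "Agree", "Undecided", "Disagree", "Strongly disagree"]
observed = [20, 35, 15, 10, 10]  # Question Two responses

rows = chi_square_rows(labels, observed, expected=10)
total = sum(row[-1] for row in rows)

for label, o, e, diff, squared, contribution in rows:
    print(f"{label:<18} {o:>3} {e:>3} {diff:>4} {squared:>5} {contribution:>6}")
print(f"Calculated chi-square: {total}")  # 75.0
```

The expected frequency is passed in as an argument. The value 10 is the one used in the table above; an equal split of the 90 responses across five options would instead give 18 per option, and the same function can be used to check the result under that assumption.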
TEST OF HYPOTHESIS FOUR
H0: There is no significant association between patients with cough and severe fever and taking
treatment
H1: There is a significant association between patients with cough and severe fever and taking
treatment
To test hypothesis four, the chi-square (χ²) test is used.
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      15   10   5     25       2.5
Agree               25   10   15    225      22.5
Undecided           40   10   30    900      90
Disagree            5    10   -5    25       2.5
Strongly disagree   5    10   -5    25       2.5
Total               90                       120.0
DECISION RULE
From the above table, the calculated χ² (120.0) is greater than the tabulated χ² (9.49).
We therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between patients with cough and severe fever and taking
treatment.
TEST OF HYPOTHESIS FIVE
H0: There is no significant association between patients with shortness of breath and headache
and COVID-19 confirmation
H1: There is a significant association between patients with shortness of breath and headache
and COVID-19 confirmation
To test hypothesis five, the chi-square (χ²) test is used.
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      16   10   6     36       3.6
Agree               30   10   20    400      40
Undecided           15   10   5     25       2.5
Disagree            10   10   0     0        0
Strongly disagree   19   10   9     81       8.1
Total               90                       54.2
DECISION RULE
From the above table, the calculated χ² (54.2) is greater than the tabulated χ² (9.49).
We therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between patients with shortness of breath and headache and
COVID-19 confirmation.
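For reference, the calculated value for hypothesis five can be written out term by term. The working below assumes the Question Five frequencies (16, 30, 15, 10, 19) from Table 4.10 and the expected frequency of 10 per option used in the working table. With five response options, the degrees of freedom are 5 - 1 = 4, which is the value against which the tabulated 9.49 is read.

```latex
\chi^2 = \frac{(16-10)^2}{10} + \frac{(30-10)^2}{10} + \frac{(15-10)^2}{10}
       + \frac{(10-10)^2}{10} + \frac{(19-10)^2}{10}
       = 3.6 + 40 + 2.5 + 0 + 8.1 = 54.2
```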
TEST OF HYPOTHESIS THREE
H0: There is no significant association between tested date and affected or non-affected persons
with respect to COVID-19 symptoms
H1: There is a significant association between tested date and affected or non-affected persons
with respect to COVID-19 symptoms
To test hypothesis three, the chi-square (χ²) test is used.
Variable            o    e    o-e   (o-e)2   (o-e)2/e
Strongly agree      10   10   0     0        0
Agree               16   10   6     36       3.6
Undecided           20   10   10    100      10
Disagree            30   10   20    400      40
Strongly disagree   14   10   4     16       1.6
Total               90                       55.2
The calculated χ² is 55.2, and the tabulated χ² at the 0.05 level with df = 4 is 9.49.
DECISION RULE
From the table, the calculated χ² (55.2) is greater than the tabulated χ² (9.49). We
therefore reject the null hypothesis (H0) and accept the alternative hypothesis (H1), which states that
there is a significant association between tested date and affected or non-affected persons
with respect to COVID-19 symptoms.
CHAPTER FIVE
Covid-19 symptoms prediction in Lokoja Local general hospital as case study. The research
There is significant association between patient with cough and severe fever to taken
on treatment
There is significant association between tested date and affected or non-affected person toward
COVID-19 symptoms
5.1 CONCLUSION
In this study, we have x-rayed the local government and demographic variables and consumer
attitude toward debt, and a closer investigation shows that the activity of Lokoja Local Government Area is
still confined to the same narrow functional competence as before the local government reform of 1976.
Participation in some of the more technical and strategic services is rare, despite the
involvement envisaged by the 1976 reform guidelines and the provisions of the 1999 Constitution of Nigeria. Hence,
Lokoja Local Government Area's participation in economic planning is minimal, and its activities are
generally better linked to the development programmes in the Federal Capital Territory, Abuja. For
instance, Lokoja Local Government Area is not responsible for managing the Universal Primary
Education (UPE) programme even though it makes a substantial contribution to it. Also, it is left
out of the housing programme, the agricultural revolution, and the economic diversification
programmes in its domain. To administer these programmes in the FCT, Abuja, the federal
government and the FCDA have created special agencies such as the FCT Universal Education Board
(FCT UBEC), the FCT Water Board, Abuja Investment Company, etc.
5.2 RECOMMENDATION
Since Idah local government is regarded as confined to a narrow functional competence, as before the
local government reform of 1976, the following are recommended for smooth practice:
1. It is recommended that other socio-demographic factors be determined which might affect
feeling uncomfortable that aged people are more affected toward taking COVID-19
vaccines.
3. It also examines the relationship between personal financial knowledge and a variety of
Considering that the Lokoja Local Government staff are very apt in the practice of governance
and the development agenda in Nigeria, we suggest that further research should accommodate
the relevant inputs on the relationship between demographic variables and patient attitude toward
the COVID-19 test in Lokoja Local Government. It is pertinent that such research should enlarge the
sample size and population as well as the study area so as to form a stronger basis to adjudge the
impact on local government in Nigeria beyond the case study adopted by this research.
References
Yan L, Zhang H.-T, and Xiao Y, (2020), “Prediction of criticality in patients with severe Covid-
19 infection using three clinical features: a machine learning-based prognostic model
with clinical data in Wuhan,” medRxiv.
Wong K. C. Y, and So H.-C., (2020) “Uncovering clinical risk factors and prediction of severe
COVID-19: a machine learning approach based on UK biobank data,” medRxiv
Sun L, Song F, and Shi N, (2020) “Combination of four clinical indicators predicts the
severe/critical symptom of patients infected COVID-19,” Journal of Clinical Virology,
vol. 128, p. 104431
Yao H, Zhang N, and Zhang R, (2019), “Severity detection for the coronavirus disease 2019
(COVID-19) patients using a machine learning model based on the blood and urine
tests,” Frontiers in Cell and Developmental Biology, vol. 8, pp. 1–10, 2020.
Hu C, Liu Z, and Jiang Y, (2020), “Early prediction of mortality risk among patients with severe
COVID-19, using machine learning,” International Journal of Epidemiology, vol. 49, no.
6, pp. 1918–1929, 2020.
An C., Lim H, Kim D.-W, Chang J. H, Choi Y. J, and Kim S. W (2020) “Machine learning
prediction for mortality of patients diagnosed with COVID-19: a nationwide Korean
cohort study,” Scientific Reports, vol. 10, p. 18716, 2020