NCRACIT-2023
ISBN: 978-93-5906-181-8
Organized By
www.presidencyuniversity.in
PROCEEDINGS of National Conference NCRACIT-2023
Committee List
CHIEF PATRON
Dr. Nissar Ahmed, Chancellor
PATRONS
Dr. D. Subhakar, Vice-Chancellor
Dr. Surendra Kumar A M, Pro Vice-Chancellor
Dr. Abdul Sharief, Dean - SOE
Dr. C. S. Ramesh, Dean - Research & Development
Dr. Sivaperumal, Director - International Relations
Dr. Sameena Ahmed, Registrar
GENERAL CHAIR
Dr. Md Sameeruddin Khan Dean - SoCSE & IS
GENERAL Co-CHAIR
Dr. Kalairasan C. Asso. Dean - SoCSE & IS
ADVISORY COMMITTEE
All HoDs, SoCSE & IS
CONFERENCE CHAIRs
Dr. Gopal K. Shyam, Prof. & HoD, COM & CEI
Dr. Manujakshi B C., Asso. Prof., SoCSE & IS
ORGANIZING COMMITTEE
Dr. Mujeer Mulla
Dr. Preethi
Mr. Vetrimani
Mr. Riyaz
PUBLICITY COMMITTEE
Dr. Madhusudan M V
Mr. Rama Krishna
Mr. Sanjeev K.
Mr. Mrutyunjaya M. S.
WEBSITE COMMITTEE
Mr. Amogh P. Kulkarni
Ms. Sreelatha P K
PUBLICATION COMMITTEE
Mr. Muthuraju V.
Mr. Yamanappa
Ms. Shilpa C.N.
SESSION COMMITTEE
Dr. Ila Chandrakar
Ms. Smitha Patil
Ms. Amirtha Preeya
Mr. Shivalingappa
REVIEW COMMITTEE
Dr. Sandeep Albert Mathias
Ms. Galiveeti Poornima
Dr. Harishkumar K S
CERTIFICATE COMMITTEE
Ms. Sneha Bagalkot
Ms. Priyanka V.
I am delighted to know that the School of Computer Science Engineering & Information Science is
hosting a National conference on April 28-29, 2023. This conference is an opportunity for researchers,
industry professionals, and students to come together and share their latest findings and innovations in
the field of computer science and engineering.
As we all know, the field of computer science and engineering is rapidly evolving and it significantly
impacts various sectors of our society. I encourage all of you to participate in this event by attending or
submitting your research work for presentation.
I would like to thank the organizing committee for their hard work and dedication in putting together
this conference. I also extend my gratitude to all the participants for their active engagement and
contribution toward making this event a success.
Chancellor,
Presidency University,
Bengaluru, India
I am delighted to note that the School of Computer Science Engineering & Information Science is
organizing a National Conference on Recent Advancements and Challenges in Information
Technology (NCRACIT – 2023). Certainly, this type of conference not only brings all the researchers
and students onto one platform but also inculcates a research culture among the entire fraternity of
education in the country, thereby contributing to the development of the nation.
I hope that this conference will induce innovative ideas among the participants, paving the
way for new inventions and technologies in Computing and Information Technology. I congratulate the
School and all faculty members for initiating the conduct of such a conference.
Vice-Chancellor
Presidency University,
Bengaluru, India
I am pleased to note that our university will be hosting a National Conference on Recent Advancements
and Challenges in Information Technology (NCRACIT – 2023). The conference proceedings will
cover a wide range of topics including Artificial Intelligence, Big Data, Cyber Security, the Internet of
Things, Cloud Computing, and many others. We hope that this event will provide a unique opportunity
for participants to engage in discussions, network with peers, and gain new perspectives on the latest
trends and challenges in Information Technology.
As the field of IT continues to rapidly evolve, it is crucial for us to stay updated with the latest research
and advancements. This conference will not only serve as a platform to share new ideas and concepts
but also help foster collaborations and partnerships within the academic and industrial communities.
We look forward to your active participation and contributions towards making this conference a
success.
Registrar
Presidency University,
Bengaluru, India
It is my pleasure to extend a warm welcome to all of you attending the National Conference on Recent
Advancements and Challenges in Information Technology (NCRACIT – 2023).
The field of IT is rapidly evolving, and this conference is an excellent opportunity for participants to
learn about the latest advancements, research, and challenges in the field. We are proud to host this
event, which brings together experts, academicians, industry professionals, and students from across
the country to share their knowledge and insights.
The conference proceedings will cover a broad range of topics, including but not limited to Artificial
Intelligence, Cybersecurity, Data Science, and the Internet of Things. We are confident that the
conference will provide a unique platform for participants to learn, share, and network with peers from
various domains of IT.
I would like to thank the organizing committee for their efforts in putting together this conference, and
I would also like to express my appreciation to all the participants for their active participation and
contribution to the event. We hope that this conference will be an enlightening and enriching
experience for all involved and will lead to further advancements in the field of IT.
It gives me great pleasure to welcome you all to Presidency University's National Conference on
Recent Advancements and Challenges in Information Technology (NCRACIT – 2023). As the Dean
of Computer Science Engineering & Information Science, I am thrilled to be a part of this important
event that brings together experts and professionals from different fields of IT.
This conference is a platform for researchers, academicians, industry professionals, and students to
share their knowledge and insights into the latest advancements in IT. The conference proceedings will
cover a wide range of topics, including Artificial Intelligence, Machine Learning, Cyber security, the
Internet of Things, and many others.
We believe that this conference will help us identify the key challenges and opportunities in the field
of IT and develop new strategies for addressing them. We also hope that it will foster collaborations
and partnerships between academia and industry, leading to innovative research and development in
the field.
As the world becomes increasingly reliant on technology, it is more important than ever to stay up-to-
date with the latest advancements and challenges. This conference provides an excellent opportunity
for us to do just that. I would like to express my gratitude to the organizing committee for their hard
work and dedication in making this conference possible. I also extend my warmest thanks to all the
participants for their active engagement and contribution to the conference.
Research at Presidency University is a culture and, to promote this, the university offers various
research programs. Dedicated faculty members and research scholars are undertaking
research in cutting-edge technologies. Research circles mentored by senior researchers
provide guidance to young members and instill research culture in the schools. The university
encourages research and aspires to become one of the best universities known for applied
research, and also encourages the dissemination of research outcomes through forums such
as this, one being organized by the School of Computer Science Engineering and Information
Science. I congratulate the school for organizing the National Conference on Recent
Advancements and Challenges in Information Technology (NCRACIT – 2023). I convey my
best wishes to the organizers and the participants and hope that the conference will open up
new avenues to tackle the latest technological issues.
I hope you all have a productive and enjoyable conference and look forward to seeing the
valuable insights and research that will be presented.
Dr. C. KALAIRASAN
Associate Dean - CSE & IS
Presidency University,
Bengaluru, India
Lung Cancer Detection using YOLO CNN
Algorithm
Presidency University
Bangalore, India
email: siraj.ahmed@presidencyuniversity.in
email: yadagurusai@gmail.com
Abstract— The main objective of this research is to create a computer vision algorithm which uses the YOLO (You Only Look Once) convolutional neural network (CNN) architecture to identify lung cancer in medical photographs. A series of computed tomography (CT) scan pictures will be used as the input for the proposed method, which will then output the likelihood that lung cancer is present in the input image. The input photos will be subjected to object detection using the YOLO CNN architecture, allowing for the location of possible malignant spots. In order to further refine the discovered areas and categorize them as cancerous or non-cancerous, the output of the YOLO network will be processed through further layers of convolutional neural networks. A sizable collection of CT data will be used to train the suggested method.

Keywords—YOLO, CNN, CT

I. INTRODUCTION

Big data in health has grown over the past few years due to the quick advancement of computer technology and medical data. The use of technology in medicine has become more prevalent in recent years. Numerous domains that merge medical, computer science, biology, mathematics, and other sciences are involved in this technique of employing medicine. It is supported by vast biological data and sophisticated computer technologies. It makes use of artificial intelligence to uncover the underlying principles and physiological causes of human illnesses by examining vast volumes of data. Following that, clinical diagnoses are made using this knowledge, and medical services are provided.

Contrary to conventional machine learning methods, deep learning does not require manual feature extraction, which boosts time and resource efficiency. Deep learning is carried out via neural networks, which are composed of neurons. In neural networks, which include many neurons in each layer, the input of the following layer is regarded as the upper layer's output. The neural network may use nonlinear processing and connections between layers to change the input into the output. More importantly, the high-level network automatically learns more abstract and generalized characteristics from the input, overcoming the limitation that machine learning requires explicit feature extraction.

II. PROPOSED METHODOLOGY

The model process contains 4 phases of work:
1. Image Preprocessing
2. VGG16 Implementation
3. Comparison of VGG16 with ML Algorithms
4. Deployment of Model in App
1. Image Preprocessing:
The initial stage of our model is this. Our dataset of CT pictures of lung cancer was taken from Kaggle. The dataset was divided into three categories: adenocarcinoma, squamous cell carcinoma, and normal. Test, Train, and Validate categories are used to categorize each sort of cancer cell picture. We executed picture data augmentation procedures including rescale, horizontal flip, and rotation of the photos after resizing them to 350*350.

2. VGG16 Implementation:
In addition to the input and output layers, a CNN also has a number of hidden layers. An instance of a CNN is VGG16. The model's creators studied the networks and increased the depth using an architecture with extremely tiny (3x3) convolution filters, which demonstrated a considerable advancement over the state-of-the-art setups. The depth was raised to 16–19 weight layers, or around 138 million trainable parameters.

Fig. 1.1. VGG16 Architecture

1. In VGG16, the number 16 denotes 16 weighted layers. VGG16 consists of 21 layers overall—13 convolutional layers, 5 max-pooling layers, and 3 dense layers—but only 16 of them are weight layers, also referred to as learnable-parameter layers.

2. The input tensor for VGG16 has three RGB channels and a size of 224 x 224.

3. The unique characteristic of the VGG16 model is that it constantly uses the same padding, with a 2x2 max-pool filter with stride 2, and uses 3x3 convolution filters with stride 1 rather than a lot of hyper-parameters.

4. Both convolution and max-pool layers are distributed evenly across the design.

5. Conv-1 has 64 filters, Conv-2 has 128 filters, Conv-3 has 256 filters, and Conv-4 and Conv-5 have 512 filters each.

6. A stack of convolutional layers is followed by three fully connected layers: the first two have 4096 channels each, while the third has 1000 channels and performs 1000-way ILSVRC classification. The final layer is a soft-max layer.

VGG16 USED FOR:
VGG16 is an object identification and classification approach that, when used to classify 1000 images into 1000 separate categories, has an accuracy rate of 92.7%. It is a popular method for categorizing photos and is easy to use with transfer learning.

3. Comparison of VGG16 with ML Algorithms:
We have constructed the following models:
1. Support Vector Machine (SVM)
2. K-Nearest Neighbours (KNN)
3. Random Forest Classifier (RFC)
We employed these ML algorithms and compared their accuracies and various parameters with VGG16 (the CNN model).

4. Deployment of Model in App:
To deploy our model in an application using TensorFlow converter function libraries, we transformed it to TensorFlow Lite. The app was developed using Android Studio, and after deploying the TensorFlow model and using a CT picture as input, it can detect the kind of cancer and display some of its symptoms and therapies. The app also includes information on lung cancer and its many kinds.

III. MERITS AND DEMERITS

1. "Deep Learning Predicts Lung Cancer Treatment Response from Serial Medical Imaging": In order to predict lung cancer, this study employed techniques like recurrent neural networks (RNN) and convolutional neural networks (CNN). However, the paper was only able to look into a particular type of scanner from a single CT provider.

2. "Pancreatic Ductal Adenocarcinoma: Machine Learning Based Quantitative Computed Tomography Texture Analysis For Prediction Of Histopathological Grade": The authors of this work employed the Support Vector Machine (SVM) and Logistic Regression Analysis methods. But, due to the small number of enrolled patients, overfitting may result. Here, the CT imaging parameters vary.

3. "Lung Cancer Detection: A Deep Learning Approach": In this study, they presented a method for applying deep residual learning to identify lung cancer from CT images. To determine the possibility that a CT scan contains cancer, they combined the predictions from various classifiers, including XGBoost and Random Forest.

Tab. 1.1 Literature survey

Recurrent Neural Network, Logistic Regression Analysis, Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA) are some of the current methodologies utilized for lung cancer prediction.

1) Recurrent Neural Network
Recurrent neural networks (RNNs), a type of neural network, use the outcome from the previous stages as the input for the following phase. In scenarios where it is important to predict the following word in a phrase, conventional neural networks contain separate inputs and outputs. As a result, the RNN was created, utilizing a hidden layer to discover a resolution to this issue. The basic and most important property of RNNs is that the hidden state preserves part of the sequence's information.

2) Logistic Regression Analysis
Using prior observations from a data set, a statistical analysis method known as logistic regression predicts a binary result, such as yes or no. A logistic regression model predicts a dependent data variable by looking at the association between one or more already existing independent variables.

3) Support Vector Machine
The classification process uses the ML algorithm SVM. It is an algorithm for supervised learning. SVM is mostly employed to create a hyperplane between two classes that may categorize n-dimensional space, so that plotting future data points in the appropriate category is simple. For instance, suppose we need to separate two classes efficiently. If you group them according to only one attribute, there could be some overlap, as the graph below shows. As a result, we continue to add attributes to ensure accurate classification. We obtain improved accuracy using the deep learning methodology in comparison to other ML techniques. Our algorithm performs effectively even with smaller data sets since we enhance the supplied dataset with picture data augmentation. The model cannot accurately predict the kind of cancer if the CT scan images are not acquired properly or if the image is not clear. Only one CT picture viewpoint is presently supported by the model.

IV. CONCLUSION
We used machine learning methods in this study and compared the outcomes to the VGG16 model. To find the most effective algorithm for determining if a CT picture contains cancer or not, we calculated the Accuracy and Precision of every machine learning algorithm as well as the Accuracy of VGG16. This model may be used to forecast real-time CT scans. As a result, we employed Android Studio to create an app for user experience.

REFERENCES
[1] Huang, T. et al. Distinguishing lung adenocarcinoma from lung squamous cell carcinoma by two hypomethylated and three hypermethylated genes: a meta-analysis. PLoS ONE 11, e0149088 (2016).
[2] Davidson, M. R., Gazdar, A. F. & Clarke, B. E. The pivotal role of pathology in the management of lung cancer. J. Thorac. Dis. 5(Suppl 5), S463-S478 (2013).
[3] Aisner, D. L. et al. The impact of smoking and TP53 mutations in lung adenocarcinoma patients with targetable mutations-the lung cancer mutation consortium (LCMC2). Clin. Cancer Res. 24, 1038-1047 (2018).
[4] Hosny, A. et al. Deep learning for lung cancer prognostication: a retrospective multi-cohort radiomics study. PLoS Med. 15, e1002711 (2018).
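The comparison the paper describes between the classical classifiers (SVM, KNN, RFC) and the CNN can be sketched with scikit-learn. The synthetic feature matrix below stands in for flattened CT-image features, since the actual Kaggle dataset and training pipeline are not reproduced here; `compare_models` and all parameter values are illustrative, not taken from the paper.

```python
# Hedged sketch: compare SVM, KNN and Random Forest test accuracies,
# as in the "Comparison of VGG16 with ML Algorithms" phase. Synthetic
# data replaces the real CT features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def compare_models(X, y, seed: int = 0) -> dict:
    """Return held-out accuracy for each classical model."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    models = {
        "SVM": SVC(kernel="rbf"),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "RFC": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    return {name: m.fit(X_tr, y_tr).score(X_te, y_te)
            for name, m in models.items()}

if __name__ == "__main__":
    # 3 classes mirror adenocarcinoma / squamous cell carcinoma / normal.
    X, y = make_classification(n_samples=300, n_features=64,
                               n_informative=16, n_classes=3,
                               random_state=0)
    for name, acc in compare_models(X, y).items():
        print(f"{name}: {acc:.2f}")
```

In the paper's workflow the same accuracy numbers would then be set against the VGG16 result to pick the deployed model.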
Prediction of diseases using machine learning
algorithms
V. OVERVIEW
The dataset we have considered consists of 132
symptoms, the combination or permutations of which
leads to 41 diseases. Based on the 4920 records of
patients, we aim to develop a prediction model that takes
in the symptoms from the user and predicts the disease they are most likely to have.
The considered symptoms are:
The disease prediction system is implemented using three data mining algorithms, i.e. the Decision tree classifier, Random forest classifier and Naïve Bayes classifier. The description and working of the algorithms are given below.

A known issue with the decision tree algorithm is overfitting: it appears as if the tree has memorized the data. Random Forest prevents this problem; it is a version of ensemble learning. Ensemble learning refers to using multiple algorithms, or the same
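The three classifiers named above can be sketched with scikit-learn. The binary symptom vectors and disease labels below are synthetic stand-ins for the 132-symptom / 41-disease dataset, which is not reproduced here; combining the models by majority vote is one possible design, since the paper does not specify how their predictions are merged.

```python
# Hedged sketch of the three-classifier disease predictor described
# above. Synthetic 0/1 symptom vectors replace the real patient records.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

N_SYMPTOMS, N_DISEASES = 132, 41  # sizes stated in the overview

def train_predictors(X, y):
    """Fit the three data-mining algorithms used by the system."""
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "naive_bayes": GaussianNB(),
    }
    for m in models.values():
        m.fit(X, y)
    return models

def predict_disease(models, symptoms):
    """Majority vote over the three models for one symptom vector."""
    votes = [int(m.predict(symptoms.reshape(1, -1))[0]) for m in models.values()]
    return max(set(votes), key=votes.count)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(500, N_SYMPTOMS))   # 0/1 symptom flags
    y = rng.integers(0, N_DISEASES, size=500)        # disease labels
    models = train_predictors(models := None or X, y) if False else train_predictors(X, y)
    print(predict_disease(models, X[0]))
```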
VI. DASHBOARD:

New Relic provides all of the performance information on the dashboard; none of it has to be modified. On the "browser page load time," you

APM, Browser, Synthetics, Infrastructure, and Insights are just a few of the capabilities and technologies that New Relic offers. Try out several features to see whether they can help you optimize your infrastructure or apps.

Learn from community resources: New Relic has a sizable user base that frequently exchanges advice and best practices. Use this community to your advantage by reading the material, watching the tutorials, and participating in user groups or forums.

Fig. 3

You may test out New Relic and discover how to utilize its tools to optimize your apps or infrastructure by following these steps.

We built an e-commerce website called 'The Gadget House' and we are monitoring it using New Relic.
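The dashboard metrics discussed in this section (page load time, error rate, throughput) are aggregations over raw request samples. As a rough illustration of the kind of summary an APM dashboard computes for a monitored site like 'The Gadget House', here is a toy aggregator; this is not New Relic's API or implementation, and the `Request` shape and field names are assumptions.

```python
# Toy illustration of APM-style aggregation (avg load time, error
# rate, throughput). NOT New Relic's API; it only mimics the kind of
# summary an APM dashboard shows for a monitored site.
from dataclasses import dataclass

@dataclass
class Request:
    path: str
    duration_ms: float
    status: int

def summarize(requests, window_seconds: float) -> dict:
    """Aggregate raw request samples into dashboard-style metrics."""
    if not requests:
        return {"avg_load_ms": 0.0, "error_rate": 0.0, "throughput_rpm": 0.0}
    n = len(requests)
    return {
        "avg_load_ms": sum(r.duration_ms for r in requests) / n,
        "error_rate": sum(r.status >= 500 for r in requests) / n,
        "throughput_rpm": n / (window_seconds / 60.0),
    }

if __name__ == "__main__":
    samples = [
        Request("/cart", 120.0, 200),
        Request("/checkout", 480.0, 500),
        Request("/", 90.0, 200),
    ]
    print(summarize(samples, window_seconds=60.0))
```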
Fig. 4

Fig. 9

Fig. 7

3. CLS is 0, which indicates there is no shift during loading.

Fig. 10

6. Throughput

Fig. 11

7. User Centric Page Load Times

Fig. 12

IX. CONCLUSION

A potent application performance monitoring (APM) tool, New Relic offers information on the efficiency of web applications' front-end and back-end elements. Real user monitoring, browser and mobile monitoring, APM, infrastructure monitoring, and serverless monitoring are some of its characteristics. With New Relic, you can keep an eye on crucial performance indicators like page load time, response time, errors, and throughput, and learn what is causing performance problems at the source. In order to monitor performance over time, you may also create reports and set up alerts depending on performance thresholds.

By using New Relic, we can optimize the performance of our web applications, improve the user experience, and ensure that our applications meet our performance requirements. Overall, New Relic is a powerful tool for any organization that wants to deliver high-performing web applications and ensure that they meet their business objectives.

Firms may gain a lot from website research utilizing an APM solution like New Relic. The platform's real-time monitoring and analytics features make it possible to swiftly identify and fix application performance problems. Additionally, New Relic's platform offers enterprises insight into the performance of their applications, which can be utilized to optimize software and make informed decisions.

X. REFERENCES

[1] Alam, A., Muqeem, M., & Ahmad, S. (2021). Comprehensive review on Clustering Techniques and its application on High Dimensional Data. International Journal of Computer Science & Network Security, 21(6), 237-244.
[2] Alam, A., Qazi, S., Iqbal, N., & Raza, K. (2020). Fog, edge and pervasive computing in intelligent internet of things driven applications in healthcare: Challenges, limitations and future use.
[3] Shrestha, R. (2021). Performance Evaluation and Optimization of Web-Based Applications: A Survey. Journal of Network and Systems Management, 29(1), 261-293. doi: 10.1007/s10922-020-09576-1.
[4] Kaushik, A. (2015). Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity. Wiley.
INTRODUCTION:

In recent years, due to technological advancements, the automobile sector is manufacturing a higher number of vehicles; every year almost 253 million vehicles are manufactured and sold to customers. A developed city's road network has only a limited ability to absorb this traffic, so a real-time solution may resolve most of the problem. Providing real-time traffic control based on the density of vehicles could be the best solution so far.
LITERATURE REVIEW:
...suitable for small-scale applications.

...dataset is used to train and evaluate the performance of the YOLOv4 object detection model. YOLOv4 Object Detection Model Training: The YOLOv4 object detection model is trained on the

Title: Real-time Traffic Light Control using Image Processing Techniques
Author: Yi Zhou et al.
Year: 2017
Overview: This paper proposes a real-time traffic
light control system using image processing
techniques to optimize traffic flow at intersections.
The system is designed to detect vehicles and
adjust the traffic light timings based on real-time
traffic conditions. The authors use image
processing techniques to detect and track vehicles
on the road.
Advantages: The proposed system achieves high
accuracy in detecting and tracking vehicles on the
road, leading to efficient traffic flow and reduced
waiting times for vehicles.
Limitations: The
proposed system relies on image processing
techniques which may not be as accurate as deep
learning models in detecting and tracking vehicles.
The system may also be affected by poor lighting
conditions or adverse weather conditions.
METHODOLOGY:
The proposed Automatic Traffic Lights by Image
Classification using Machine Learning system
utilizes computer vision techniques and machine
learning algorithms to automate traffic light control
based on the number of vehicles detected on the
road. The methodology can be described as
follows:
DATA USED/COLLECTED
We have taken a data set which is suitable for our
project from the given website.
In this data set we have five columns
1. Date and Time
2. Junction (1,2,3,4)
3. No. of vehicles crossed at a particular time at a particular junction.
4. Time allotted for green signal.
5. No. of vehicles still left at the signal.
For example:
At 11/1/2015 10:00 AM at junction 1 in total 15
vehicles crossed the signal.
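The dataset example above (15 vehicles at junction 1, with a green-signal time allotted per record) suggests deriving green time from the vehicle count. The sketch below allocates green time proportionally to the detected count; the per-vehicle seconds and the min/max clamps are illustrative assumptions, not values taken from the paper or its dataset.

```python
# Hedged sketch: density-based green-signal allocation. All constants
# are assumptions for illustration only.
MIN_GREEN_S = 10.0          # assumed lower bound so a phase is never skipped
MAX_GREEN_S = 60.0          # assumed upper bound so one junction cannot starve others
SECONDS_PER_VEHICLE = 2.0   # assumed average clearance time per vehicle

def green_time(vehicle_count: int) -> float:
    """Green duration (seconds) proportional to detected vehicles, clamped."""
    raw = vehicle_count * SECONDS_PER_VEHICLE
    return max(MIN_GREEN_S, min(MAX_GREEN_S, raw))

if __name__ == "__main__":
    # 15 vehicles, as in the junction-1 example above.
    print(green_time(15))  # 30.0 seconds under these assumptions
```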
REFERENCES:
[1] "Real-Time Traffic Light Control System Based on
Machine Learning," by Hamid Fardoun, et al. (2019)
Link: https://ieeexplore.ieee.org/document/8916763
[2] "A Machine Learning Based Intelligent Traffic Signal
Control System," by Zhe Zhang, et al. (2020)
Link: https://www.mdpi.com/1999-4893/13/11/267
[3] "Real-Time Traffic Signal Control Based on Machine
Learning," by Wei Wang, et al. (2018)
Link: https://ieeexplore.ieee.org/document/8529264
Link: https://ieeexplore.ieee.org/document/8373008
[5] "A Real-Time Traffic Light Control System using
Convolutional Neural Networks," by Yingjie Li, et al. (2019)
Link: https://ieeexplore.ieee.org/document/8738484
[6] "Intelligent Traffic Light Control System based on
Machine Learning," by Qianwei Yu, et al. (2018)
Link: https://www.sciencedirect.com/science/article/pii/S2212017318305727
[7] "An Intelligent Traffic Light Control System based on Machine Learning and Computer Vision," by Tariqul Islam, et al. (2020)
Link: https://www.mdpi.com/2071-1050/12/15/6151
[8] "A Traffic Light Control System Based on Machine
Learning and Wireless Sensor Networks," by Yichen
Cheng, et al. (2021)
Link: https://www.mdpi.com/2076-3417/11/8/3682
[9] "A Machine Learning-based Traffic Light Control System for Urban Traffic," by Xiaofan Li, et al. (2021)
Link: https://www.mdpi.com/1996-1073/14/1/172
[10] "A Machine Learning-based Traffic Light Control System for Pedestrian Safety," by
Alessandro De Palma, et al. (2021)
Link: https://www.mdpi.com/2076-3417/11/11/5051
Optimizing Garbage Collection: A Smart System For A Smarter City
Abstract— Tons of scrap are dumped in open areas every day. Environmental impurity and unsanitary conditions are brought on by improper scrap operation and transportation. Irrespective of their size, location, or economic status, every metropolitan area spends a significant amount of money on waste collection. Waste disposal is the overall conditioning and conduct required to handle waste, from generation to final disposal. The traditional approaches are relatively tedious and unsanitary. There is no proper tracking system for the garbage-carrying vehicles or the waste cargo, and the procedure requires manual monitoring. Point-to-point collection of waste is already being undertaken by many Chinese cities. The garbage is brought to the dumping yard after being collected at the source. To achieve route and garbage-collection optimization, a new system can be created to track the garbage vehicles in a certain ward of a firm. Proper segregation must be carried out at the disposal site, where the trucks discharge the trash onto the conveyor belt, which should have distinct sections for dry and moist waste.

Keywords—Tracking system, Sensors, Arduino UNO, GPS/GSM.

X. INTRODUCTION

In general, solid trash is categorized as coming from homes, businesses, hospitals, markets, yards, and street sweepings. In the past, waste was hauled outside the town by horse-drawn carts. It is typically challenging to manage the collection and transportation of garbage, as well as the tracking of vehicle locations, without the aid of advanced technology. Urbanization and economic growth, among the most significant activities in developing nations, are increasing. The challenge of garbage disposal arises in India as a result of the country's rising population and rising waste creation. The government is concentrating on waste management as a result of the rise in waste. A survey indicates that Mumbai produced 16,200 tons of waste per day in 2001, which rose to 19,100 tons in 2005. To address this issue, there is a need for timely and effective waste collection. Due to rapid population growth in recent years, the amount of waste requiring disposal has increased, making it crucial to have a proper solid waste management system to prevent the spread of dangerous illnesses. The proposed system monitors the condition of the smart bins and makes decisions based on that information. The mission's goal is to visit every part of the nation, whether urban or rural, in order to promote it as the perfect nation to the rest of the globe. The procedure of collecting trash is laborious, ineffective, and time-consuming. There is no tracking mechanism for the procedure, which involves manual monitoring of waste loads and garbage-carrying vehicles. To achieve route and garbage-collection optimization, a new system can be created to track the garbage vehicles in a certain ward of a firm. The push carts and garbage trucks may be equipped with sensors, and they could be tracked based on their GPS location to cover the entire ward. The garbage-collection push carts are improperly constructed; the majority of the trash falls onto the road while being transported.

Based on the "Consolidated Annual Review Report On Implementation Of Solid Waste Management Rules, 2016," India produces approximately 62 million tons of waste on an annual basis. Due to the difficulties in the collection process and operation of the carts, and the lack of tracking installations on the vehicles, only 43 million tons of the waste are collected. The remaining 19 million tons of waste remain uncollected, leading to displeasure and the spreading of infections. Of the collected waste, 11.9 million tons are treated at the dumpsite while the remainder is used as compost for landfills. The lack of automation in carts leads to inefficient scrap collection and increased complexity, thereby motivating us to come up with a solution for it.

XI. LITERATURE REVIEW

A. Existing Methods

In the past, garbage bins were emptied by cleaners at specified intervals. The person who cleaned the garbage can ran a significant health risk due to the toxic gases. There has never been any automated planning, scheduling, or monitoring of the waste from its source (a residence) to its destination (dump yard) in any previous study publications or projects addressing effective garbage collection and treatment.

B. Research On Few Affiliated Papers

TABLE I. LITERATURE SURVEY ON FEW PAPERS

1. "Smart Garbage Monitoring System Using IoT Technology" by S. G. Han and S. Y. Lee.
Method: Developed a smart garbage monitoring system using IoT technology. Sensors were placed inside the garbage bins to monitor the garbage level, and data was transmitted to a server for analysis.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

2. "Smart Garbage Collection System Using Wireless Sensor Networks and RFID Technology" by M. U. Chowdhury and M. I. Hossain.
Method: Utilized wireless sensor networks and RFID technology to collect and monitor garbage data.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact, increased public awareness of waste management.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

3. "Smart Waste Management System Based on IoT: Towards Urban Sustainable Development" by S. G. Prabhu and P. Selvi.
Method: Developed a smart waste management system using IoT technology that can detect the type of waste and its composition. Sensors were placed inside the garbage bins to monitor the waste, and data was transmitted to a server for analysis.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact, improved recycling rates.
Limitations: More complex and expensive than systems that only monitor the level of garbage; requires additional technology and processing power.

4. "Development of a Smart Garbage Bin System Using IoT Technology" by J. K. Kim, K. H. Kim, and C. W. Chung.
Method: Utilized a smart bin system that includes sensors to detect the level of garbage in bins and optimize collection routes.
Advantages: Increased capacity of garbage bins, improved waste management efficiency, reduced costs, reduced environmental impact.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

5. "A Smart Garbage Bin System Using Sensor Fusion for Waste Segregation and Recycling" by A. M. S. Alam, S. S. Hasan, and S. S. Roy.
Method: Developed a smart garbage bin system that uses sensors and camera images to detect and sort different types of waste. Data is transmitted to a central server for analysis.
Advantages: Improved recycling rates, reduced environmental impact, reduced costs.
Limitations: More complex and expensive than systems that only monitor the level of garbage; may require significant processing power.

6. "Smart Waste Management System Using Machine Learning and IoT" by S. B. Singh and R. Singh.
Method: Utilized a smart waste management system that incorporates sensors and machine learning algorithms to predict the level of garbage.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

7. "Smart Waste Management System for Efficient Garbage Collection and Disposal" by N. K. Khanuja and P. Goyal.
Method: Developed a smart waste management system that uses RFID tags to identify and sort different types of waste, and sensors to monitor the level of garbage in bins. Data is transmitted to a central server for analysis.
Advantages: Improved recycling rates, reduced environmental impact, reduced costs.
Limitations: More complex and expensive than systems that only monitor the level of garbage; may require significant processing power.

• Limiting human involvement.
• Minimizing human effort and time.
• Creating an atmosphere that is free of trash and healthy.

XIV. METHODOLOGY

A. Design Procedure
XII. PROPOSED METHOD a) The proposed model has been divided into
Three separate ultrasonic sensors are used in four parts:
the project, and when each dustbin is filled with i. Garbage Collection: The garbage truck
garbage, the ultrasonic sensor automatically has a robotic arm that helps the colony or
detects the amount of waste in the bin and sends community transport trash from each
information to the Arduino uno. This will direct home to the trash can. The residents have
the garbage collection robot using an RF module, the ability to track the vehicle.
a transmitter, and a receiver module so that it may ii. Monitoring and Overload Detection: The
go collect trash from the chosen trashcan. The GSM module can be used to alert the
robotic arm needs to be manually operated using a appropriate officials about the overload
mobile application to collect trash from the status. Each garbage can has an ultrasonic
dustbin after the robot automatically travels to it sensor, LCD screen, and SMS capability
using a track that we placed for it to follow.
that may be used to determine the amount
The robot will arrive at its destination with the of filling.
aid of an IR sensor, and once there, the IR sensor iii. Tracking Mechanism: The garbage truck's
will send data to Arduino to stop the robot. The location may be determined via the app or
robot subsequently transports the collected by sending an SMS in conjunction with
garbage to the main dumping area, where it is the GPS/GSM module. The location of the
divided into dry, moist, and metallic waste. The garbage truck may be found out by the
rubbish will be placed on a conveyer belt, and we public using the app or through SMS.
will have a blower to remove any dust or other dry iv. Segregation Mechanism: Garbage will be
waste from the belt. Wet waste will remain on the
dumped on the conveyer belt as soon as
belt itself, and we will use a powerful magnet to
the garbage truck pulls up to the landfill.
remove any metallic waste from the belt.
We will separate the scrap through the
The whole procedure, from garbage collection conveyer belt by blowing air through it,
to garbage segregation, would be communicated to which causes wet waste and metallic
the public via SMS. debris to go across the belt, assisting in the
OBJECTIVES
XIII. separation of dry and wet waste. We will
set up a powerful magnet for metallic
Smart waste management is a concept that
debris, which will draw the metal through
allows us to handle many issues that bother
it. So, we may separate the garbage into
society, such as pollution and infections. Waste
management must be completed immediately; dry, moist, metallic, and non-metallic
else, irregular management would result, harming waste by following the respective
the environment. The idea of smart cities and operation.
smart waste management are primarily
compatible. b) Architecture Diagram :
The following are the key goals of our suggested Presented below is a visual depiction of
system: the architectural diagram for the proposed
technique. This diagram provides a clear
• Keeping an eye on garbage disposal. understanding of how waste management
• Offering intelligent waste management processes are carried out and how information
technologies. is effectively conveyed to the public.
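To make the overload-detection step concrete, here is a minimal, hypothetical sketch (not the authors' code) of how an ultrasonic distance reading could be turned into a fill percentage and an SMS-style alert. The function names, bin depth, and 90% threshold are all illustrative assumptions:

```python
# Hypothetical sketch: an ultrasonic sensor reports the distance from the
# bin lid down to the garbage surface; the controller converts that into a
# fill percentage and decides whether to raise an overload alert.

def fill_percentage(distance_cm, bin_depth_cm):
    """Convert an ultrasonic distance reading into a 0-100 fill percentage."""
    distance_cm = min(max(distance_cm, 0.0), bin_depth_cm)  # clamp sensor noise
    return (bin_depth_cm - distance_cm) / bin_depth_cm * 100.0

def overload_message(bin_id, distance_cm, bin_depth_cm=100.0, threshold_pct=90.0):
    """Return an SMS-style alert string when the bin crosses the threshold,
    or None when no alert is needed."""
    pct = fill_percentage(distance_cm, bin_depth_cm)
    if pct >= threshold_pct:
        return f"Bin {bin_id} is {pct:.0f}% full - dispatch collection robot"
    return None

print(overload_message("B1", 5.0))   # 95% full -> alert string
print(overload_message("B2", 50.0))  # 50% full -> None
```

In a real deployment the alert string would be handed to the GSM module rather than printed, and the threshold would be tuned per bin.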
d) Data Flow Diagram:
Abstract—Healthcare is an essential aspect of living a healthy life, although it is not accessible to most rural areas. Nowadays, medical professionals use machine learning as a tool to manage patients and clinical data. Natural Language Processing is a subcategory of machine learning that enables computers to comprehend, analyze, and produce human language. With the help of natural language processing, communication between humans and machines becomes much simpler and possible. To overcome this gap, the project acts as a bridge between healthcare and the people. In this research project, using machine learning algorithms such as Artificial Neural Network, Natural Language Processing, and the Naive Bayes classifier, a medical chatbot is created to make decisions about risks and predict treatment outcomes. The chatbot engages in a conversation with the user; the symptoms are presented in the form of a query, which is processed by the machine learning algorithm to predict the disease, the preventive measures, and the treatment plan. It forwards patient queries to the doctor based on the risk factor of the predicted disease.

Keywords—Artificial Neural Network (ANN), Naive Bayes Classifier, Natural Language Processing (NLP), Machine Learning, Artificial Intelligence, Decision Tree, Medical Chatbot, Healthcare

XVI. INTRODUCTION
Healthcare is of utmost importance in our daily lives. It is essential for the well-being of individuals, families, communities, and nations. Access to quality healthcare services is a fundamental human right, and it plays a crucial role in maintaining and improving people's health and quality of life. One of the significant advantages of healthcare is that it helps in the prevention and treatment of diseases. Regular check-ups, immunizations, and screenings can help detect and prevent illnesses before they become serious. Early diagnosis and treatment of illnesses can also lead to better outcomes and improved health. Healthcare also helps in the management of chronic conditions such as diabetes, heart disease, and cancer, among others. Proper management of these conditions can reduce complications, improve the quality of life, and increase life expectancy. In India the doctor-to-patient ratio in cities is 1:854, but in rural areas it is 1:2000, as most hospitals and doctors are in the district towns, which makes it hard for people, especially the old and disabled, to travel long distances. They reach out for professional help only when their health condition deteriorates drastically, creating irreversible changes to the body that could have been prevented had they sought help earlier. Artificial Intelligence (AI) is widely used in the medical field, from storing patient records to assisting doctors in surgery. Using AI to our advantage, we can provide better healthcare to people in rural areas where it is difficult to access.

By using AI's machine learning application, it is possible to program computers to emulate human thought processes. It enables medical equipment to make predictions and offer insights from vast amounts of data that healthcare professionals might overlook. A constantly changing patient data set is essential when using machine learning in the healthcare industry. These data can be used to identify trends that help medical practitioners identify new diseases, assess risks, and forecast treatment outcomes. Using machine learning, the medical chatbot that will be developed will help eliminate the issue by creating a bridge between the patient and healthcare. It can provide immediate responses to patient queries, eliminating the need for patients to travel the distance to hospitals and the expense. Artificial Neural Network (ANN), Naive Bayes Classifier, Natural Language Processing, and the Natural Language Toolkit are the machine learning tools that are used. ANN is a technology used to produce computer-generated outcomes that are comparable to diagnoses made by human reasoning. Deep learning, the capacity to learn from enormous quantities of data, is based on ANNs. The goal of natural language processing, a subcategory of machine learning, is to enable computers to comprehend, analyse, and produce human language. To
interact with the machine and communicate with it, one employs natural language processing. An algorithm called the Naive Bayes Classifier is utilised to classify data accurately, which in turn enables it to predict outcomes quickly. Here, it interprets user communications and analyses user inquiries.

XVII. LITERATURE REVIEW

A. Diabot: A Predictive Medical Chatbot using Ensemble Learning
Author: Manish Bali, Samahit Mohanty, Subarna Chatterjee, Manash Sarma, Rajesh Puravankara

This paper presents a generic text-to-text 'Diabot' chatbot which engages patients in conversation, using advanced Natural Language Understanding (NLU) techniques to provide personalised prediction on the general health dataset, depending on the many symptoms the patient is asked about. The idea is further developed into a diabetes chatbot for specialised diabetes prediction utilising the Pima Indian diabetes dataset, to suggest proactive preventative actions. The paper presents a Diabot design with a simple front-end user interface for the average person built with React UI, RASA NLU-based text pre-processing, and a quantitative performance comparison of individual machine learning algorithms used as classifiers and integrated into an ensemble with a majority vote. The accuracy of the ensemble model is balanced for general health prediction and highest for diabetes prediction among all the weak learners, which motivates further exploration of ensemble techniques in this domain.

This research uses NLU and advanced ML algorithms to first diagnose a generic disease using a text-to-text conversational Diabot, and then extends the study into deeper-level predictions of diabetes as a specialisation. Diabetes is a non-communicable disease, and early detection can make people aware of its serious consequences and help save lives. It is one of the major healthcare epidemics faced by Indians, with close to 40 million people suffering from diabetes, a number estimated to touch 70 million by 2025. Diabetes also causes blindness, amputation, and kidney failure. To diagnose diabetes, a doctor has to study a person's past history, diagnostic reports, age, weight, etc.

This work combines an ensemble of five classifiers - Multinomial Naïve Bayes (MNB), Decision Tree (DT), Random Forest (RF), Bernoulli Naïve Bayes (BNB), and Support Vector Machine (SVM) - to predict various diseases generically and specifically. The system consists of a front-end User Interface (UI) for the patient to chat with the bot; the chatbot communicates with the NLU engine via API calls, and two models are trained at the backend using the general health and Pima Indian diabetes datasets. The challenges in a real-time implementation are mainly related to accuracy.

B. Artificial Intelligence based Smart Doctor using Decision Tree Algorithm
Author: Rida Sara Khan, Asad Ali Zardar, Zeeshan Bhatti

An AI-based health physician system that could communicate with patients, make diagnoses, and recommend an immediate fix or therapy for their issue was proposed and put into practice in the work of Rida Sara Khan et al. This system asks users questions about their symptoms in order to make a diagnosis and prescribe a course of treatment based on those answers. Using a top-down approach, the decision tree method is used to identify, diagnose, and locate potential solutions. The database could be updated and voice input added in the future to improve the system.

C. An Intelligent Web-Based Voice Chatbot
Author: S. J. du Preez, M. Lall, S. Sinha

This paper presents the design and development of an intelligent voice recognition chatbot. The paper presents a technology demonstrator to verify a proposed framework required to support such a bot (a Web service). While a black box approach is used, by controlling the communication structure to and from the Web service, the Web service allows all types of clients to communicate with the server from any platform. The service is accessible through a generated interface which allows for seamless XML processing; the extensibility improves the lifespan of such a service. By introducing an artificial brain, the Web-based bot generates customised user responses, aligned to the desired character. Questions asked of the bot which are not understood are further processed using a third-party expert system (an online intelligent research assistant), and the response is archived, improving the artificial brain's capabilities for future generation of responses.

D. A Self-Diagnosis Medical Chatbot Using Artificial Intelligence
Author: Divya S, Indumathi V, Ishwarya S, Priyasankari M, Kalpana Devi S

According to the study by Divya S. et al., their suggested system offers a text-to-text conversational agent that can diagnose a user's illness by posing a series of questions to them. By matching the retrieved symptoms to the papers' symptoms and the database's classifications, it classifies the illness as a minor or serious sickness. In addition to a tailored diagnosis, a suitable specialist is recommended. The user-provided information is also kept in a database for later use. The performance of this bot's symptom recognition and diagnosis could be enhanced in the future, and it might also offer more medical features for a more thorough symptom prediction.

E. The Application of Medical Artificial Intelligence Technology in Rural Areas of Developing Countries
Author: Jonathan Guo and Bin Li

Artificial intelligence (AI), a fast-evolving branch of computer science, is now being actively applied in the medical industry to enhance clinical work's professionalism and effectiveness while also reducing the risk of medical errors. The disparity in access to healthcare between urban and rural areas is a critical issue in developing nations, and the lack of skilled healthcare professionals is a major factor in the unavailability and poor quality of healthcare in rural areas. According to several studies, using AI or computer-assisted medical approaches could lead to better healthcare outcomes in developing nations' rural areas. Thus, it is worthwhile to discuss and investigate the creation of medical AI technology that is appropriate for rural locations.
Many MDDS systems, including those for internal, forensic, veterinary, pathology, radiology, psychiatry, and other fields, were created in the 1990s. AI technology is bringing revolutionary changes across the healthcare field, and will play a huge role in electronic health records (EHRs), diagnosis, treatment protocol development, patient monitoring and care, personalized medicine, robotic surgery, and health system management. This article introduces the potential of medical artificial intelligence (AI), reviews healthcare disparities and quality in developing countries' rural areas, and discusses the functions of technologies related to AI in medicine, such as computer-assisted diagnosis and mobile clinical decision support systems (mCDSS). Additionally, it suggests a multilayer medical AI service network with the goal of enhancing the usability and standard of rural healthcare in developing nations.

F. Technical Aspects of Developing Chatbots for Medical Applications
Authors: Zeineb Safi, Alaa Abd-Alrazaq, Mowafa Househ

Applications known as chatbots can have natural language discussions with users. Chatbots have been developed and used in the medical field to serve different purposes. The most notable instance is the usage of chatbots like Apple's Siri and Google Assistant as personal assistants. Chatbots have been created and utilised for a variety of purposes, including marketing and offering various services. Chatbots are being more widely used in the medical industry as a tool to make it easier for patients to get information and to lighten the strain on clinicians. For connecting with patients, many commercial chatbot programmes accessible as online or mobile applications have been developed. It is important to know the current state of the different methods and techniques being employed in developing chatbots in the medical domain, for many reasons. By conducting this survey, researchers will be able to recognise the various approaches that have been utilised and build on them in the future to create chatbots that are more intelligent and provide users a more natural experience.

G. A Literature Survey of Recent Advances in Chatbots
Authors: Guendalina Caldarini, Sardar Jaf, and Kenneth McGarry

Chatbots are intelligent conversational computer systems designed to enable automated online guidance and support. Chatbots are currently applied to a variety of different fields and applications, spanning from education to e-commerce, encompassing healthcare and entertainment. Improvements in their implementation and assessment are significant research subjects since chatbots are so common and used in so many different sectors.

The main contributions of this paper are: (i) a thorough analysis of the research on chatbots in the literature as well as the most recent techniques for implementing them, with a concentration on deep learning algorithms; (ii) the identification of the challenges and limitations of chatbot implementation and application; and (iii) recommendations for future research on chatbots.

H. A Personalized Medical Assistant Chatbot: MediBot
Author: Gajendra Prasad K. C, Assistant Professor

The MediBot is a personalized medical assistant chatbot that can predict disease using the Apriori algorithm and a Recurrent Neural Network. It can be used as a tool of communication and can help people keep track of their health regularly and properly without going anywhere. Machine learning has had a major impact in the field of medical science due to its ability to learn and analyze from the examples provided. In today's fast-paced life, people often don't take proper care of their health and end up ignoring their health conditions. The MediBot can help people find their health problem just by entering symptoms.

Chatbots are highly personalized virtual assistants that mimic human conversation using machine learning algorithms. They are becoming increasingly popular among business groups due to their ability to reduce customer service cost and handle multiple users at a time. Chatbots are an advanced and time-saving technology, but there is a need to make them efficient in the medical field as well. This project provides a platform where a human can interact with a chatbot that is highly trained on datasets. Machine learning algorithms take a more natural approach to computation rather than a purely logical one, and the output depends on the dataset they are trained on. One can apply those methods and gain from them even without knowing the exact rationale behind them.

I. Survey Paper on Medical Chatbot
Author: Dev Vishal Prakash, Prof. Shweta Barshe, Anishaa Karmakar, Vishal Khade

The literature survey discusses the potential of medical chatbots in healthcare services to improve healthcare service quality, reduce the workload of healthcare professionals, and streamline healthcare services. The survey highlights the challenges and negative consequences of medical chatbots, such as negative perceptions about their ability to provide accurate information and secure user privacy, and lack of user interest. The survey aims to identify key factors that can motivate individuals to accept services delivered through medical chatbots by healthcare organizations, and to help formulate appropriate strategies for better designing medical chatbots. The study adopts a two-stage mixed-method approach involving interviews and surveys based on the theory of planned behavior to obtain a deeper understanding of individuals' motivations for using medical chatbots. Emotional preference was shown to be a significant element promoting the adoption of medical chatbots. The study also highlights the need for more research on medical chatbots to ensure their successful implementation.

XVIII. METHODOLOGY
XIX. DESIGN PROCEDURE

The design procedure for the medical chatbot tool for problem identification involves the use of machine learning algorithms and natural language processing (NLP), with Python as the coding language and Jupyter Notebook as the IDE. The minimum operating system requirement is Windows XP Professional, Windows 7, or later.

The front-end of the tool consists of two modules: the registration module and the query module. The registration module enables users to register with their details and log in to the chatbot using a username and password. Even doctors are required to register in the same way as other users.

The query module enables users to ask questions about their symptoms and diseases in the automatic chatbot, which responds according to the user's disease. The disease prediction module uses machine learning logic such as Naive Bayes and decision tree algorithms to recognize and analyze the symptoms described by the patient, predict the disease in a particular area, and even provide an accuracy score for the prediction. The tool can also prescribe accurate medicines and suggest further precautions that the patient needs to take, in an efficient manner.

The input requests from the patient are sent to the chatbot server, which uses the bot controller logic to determine how to respond to the user's request. The data preprocessing model is then used to prepare the raw data based on the user's input and provide accurate responses to the user through the chatbot client.

The data is stored in SQLite as a single row of instance data or a collection of instance data, depending on how it was trained in the dataset. The chatbot server stores both the training and test data, and feeds the appropriate data based on the user's details. If there are no key patterns present in the user's input, the virtual doctor prescribes medicines based on the symptoms and uses machine learning logic, specifically the support vector machine (SVM) algorithm, to identify and predict the disease. The SVM algorithm analyzes the disease and prescribes medicine.

The objective of this work is to predict the diagnosis of a disease from a number of attributes and provide a solution to the patient through the chatbot. The classifier model is used to identify the key patterns or features in the medical data, and classification techniques are then used to predict the diagnosis of the disease after reducing the number of attributes provided by the user.
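The storage-and-lookup flow described above can be sketched with Python's built-in sqlite3 module. This is a hypothetical illustration: the table layout and symptom data are invented, and the paper's SVM step is stood in for by a simple symptom-overlap score, since the point here is the server-side flow, not the classifier.

```python
import sqlite3

# In-memory database standing in for the chatbot server's store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE disease (name TEXT, symptoms TEXT)")
conn.executemany(
    "INSERT INTO disease VALUES (?, ?)",
    [("influenza", "fever,cough,fatigue"),
     ("migraine", "headache,nausea,light sensitivity")],
)

def predict_disease(user_symptoms):
    """Score each stored disease by its symptom overlap with the user's
    query and return the best match (the paper uses an SVM here; overlap
    counting is only a placeholder)."""
    best, best_score = None, 0
    for name, symptoms in conn.execute("SELECT name, symptoms FROM disease"):
        score = len(set(symptoms.split(",")) & set(user_symptoms))
        if score > best_score:
            best, best_score = name, score
    return best

print(predict_disease(["fever", "cough"]))  # prints "influenza"
```

A real system would store per-user instance rows as the text describes, and hand unmatched queries to the trained classifier instead of returning None.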
XX. ALGORITHMS

Neural Net: A neural net is a machine learning algorithm that is based on the structure and function of the human brain. It consists of layers of interconnected nodes, or neurons, which process input data and produce output predictions. Neural nets are particularly useful in pattern recognition tasks such as image or speech recognition.

KNN: KNN, or k-nearest neighbors, is a classification algorithm that assigns new data points to the class of the nearest neighbors in the training data. The value of k determines the number of nearest neighbors considered. KNN is a simple and easy-to-understand algorithm, but can be computationally intensive for large datasets.

SVM: Support vector machine (SVM) is a classification technique that uses a hyperplane in a high-dimensional space to divide input points into classes. Finding the hyperplane that maximises the margin between the two classes is the aim of SVM. SVM is very helpful for managing data that cannot be separated linearly.

Decision Tree: A decision tree is a classification technique that employs a tree-like structure to represent a succession of decisions and their outcomes. Each node represents a judgement based on an attribute, and each branch the decision's result. Decision trees can handle categorical and numerical data and are simple to comprehend.

Logistic Regression: Logistic regression is a classification method that forecasts the likelihood of a binary outcome based on input factors. It uses a logistic function to transform a linear combination of input variables into a probability between 0 and 1. Logistic regression is particularly useful when the outcome variable is dichotomous.

1R: 1R is a classification algorithm that chooses a single attribute as the best predictor of the class label. It calculates the error rate of each attribute and selects the one with the lowest error rate as the final classifier. 1R is a simple and fast algorithm, but can be limited in its accuracy.

Ensemble: Ensemble algorithms combine multiple individual algorithms to create a more accurate and robust prediction. Examples of ensemble algorithms include bagging, boosting, and random forests.

Bag of Words Model: The bag-of-words model is a text classification technique that represents text as a bag of individual words, ignoring their order and context. It counts the frequency of each word in a document and uses these counts as input features for classification.

Cross Validation Function: Cross-validation is a statistical method used to evaluate machine learning models by dividing data into subsets for training and testing. The cross-validation function helps to optimize model performance by identifying potential issues such as overfitting and underfitting. It is particularly useful for determining the best hyperparameters for a given model.
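The bag-of-words representation described above can be shown in a few lines of plain Python: each document becomes a vector of word counts over a shared vocabulary, with word order discarded. This is an illustrative sketch, not the project's code; the tokenizer here is a bare whitespace split.

```python
from collections import Counter

def bag_of_words(documents):
    """Turn each document into a word-count vector over a shared,
    alphabetically sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted(set(word for doc in tokenized for word in doc))
    vectors = [[Counter(doc)[word] for word in vocab] for doc in tokenized]
    return vocab, vectors

docs = ["fever and cough", "cough and cold and cough"]
vocab, vectors = bag_of_words(docs)
# vocab   -> ['and', 'cold', 'cough', 'fever']
# vectors -> [[1, 0, 1, 1], [2, 1, 2, 0]]
```

These count vectors are exactly the input features a Naive Bayes classifier would consume in the pipeline described earlier.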
Real-time Estimation of Heart Rate under lighting using Web Camera

Deepak M D, Assistant Professor, Department of CSE, Presidency University, Bangalore, India, deepak.md@presidencyuniversity.in
Galam Anusha Priya, Department of CSE, Presidency University, Bangalore, India, 201910101473@presidencyuniversity.in
Gonuguntla Goutham Sai, Department of CSE, Presidency University, Bangalore, India, 201910101442@presidencyuniversity.in
Abstract—The early detection of variation in heart rate is essential for effective treatment because cardiovascular disease is one of the leading causes of death worldwide. In the field of medical diagnostics, heartbeat detection is a crucial task, but conventional methods call for specialized tools and qualified personnel. The use of signal processing and computer vision techniques has gained popularity in recent years. This study describes a technique for real-time heart rate monitoring using a webcam and JavaScript. The suggested method extracts the facial region from the webcam's video frames and uses signal processing to estimate the heart rate. In particular, the technique detects the subtle color changes brought on by blood flow in the skin and uses the chrominance information of the facial region to estimate the heart rate. The method also employs motion compensation algorithms to decrease the impact of head motions and facial expressions on heart rate measurement. The suggested approach can offer a low-cost and non-invasive way to detect a person's heart rate and has potential applications in healthcare, fitness monitoring, and wellness tracking.

Keywords—Cardiovascular disease, Computer vision (CV), Signal processing, Heart rate (HR), Motion compensation.

XXI. INTRODUCTION
For many years, physiological signal analysis has been used extensively in the field of medical research. Research has demonstrated that heart rate is a source of data that reveals a person's psychophysiological status. Aside from HR change measures, several medical diagnoses also make use of breathing rate and ECG signals. The progression of novel pulse-measuring methods and machine intelligence algorithms has enabled the identification of tension, sleepiness, and different emotions. The advancement of non-invasive physiological sensing technologies will result in a slew of new applications, since they will be rapid, simple, and attainable in real time. This research presents a real-time face-video heart rate tracking system utilizing a web camera, estimating the variation in skin color produced by the heartbeat.

The idea of monitoring cardiovascular system parameters without contact with the human body has developed. The […] with each beating, and HR may be calculated using this colour variation.

Previously, a few techniques for pulse detection using a camera were developed; however, such procedures have restrictions on the elements impacting colour values, such as variances in ambient illumination during video recording and variations in blood parameters produced by the heartbeat. Most non-touch methods use the RGB colour space to produce face footage that is best suited for lab settings or constant ambient lighting. Because ambient light is not constant, these approaches are not appropriate for real-time software and cannot recover the heart rate.

The proposed technique employs the LAB colour space for non-intrusive heart rate detection, hence eliminating ambient light changes while extracting face pictures. The suggested process starts by locating the Region of Interest (ROI) on the face and identifying the likely skin region, followed by conversion to the LAB colour space. The colour fluctuations of every pixel are then examined over time and amplified to obtain a closer picture of the signal from the chosen ROI. Finally, the captured area is used to extract the signals from which the HR is obtained using peak detection algorithms.

XXII. ASSOCIATED WORK
For the past decade, academics have been engaged in computer vision (CV) technologies. The first proposals used facial observations to measure physiological parameters in human beings. Verkruysse provided an example of how to use PPG to calculate HR from a person's face in natural light. The key concept behind these techniques is to get the pulse based on transient changes in facial colour using blind source separation (BSS). Additionally, researchers applied algorithms for various techniques of optical processing and noise reduction to specifically analyse HR estimations.

Earlier systems detected pulses from collected video by calculating small head motions resulting from the Newtonian reaction to the flow of blood at every heartbeat. Here, compliance with artery and head mechanics, as well as erratic and inadequate illumination circumstances, may have
circulatory system allows blood to flow throughout the body an impact on how well the features are monitored. Medical
due to the heart's continuous blood pumping. The resulting applications for ambient light-based virtual plethysmography
blood supply creates colour change in the skin on the face imaging include vascular skin lesion characterization and
vital signs are remotely monitored for sports or triage. The b) Integral Imaging: Integral pictures analyze
prerequisite in this instance is to quantify heart and breathing rectangular characteristics in a consistent amount of
rates will result in less accurate findings. time. Compared to previous systems with more
attributes, this increases computation time while
With adaptive filtering techniques like Normalised Least improving computation speed. The quantity does not
Mean Square, remote HR observations from face films are affect how quickly features are processed.
conducted under controlled conditions to quantify HR.
Changes in ambient illumination and movements that
enhance subject interference reduce their effectiveness.
Webcam-based face footage in RGB colour space is used for
HR monitoring. The HR was calculated using a
straightforward webcam in an indoor setting with continuous
ambient light from the colour change in the skin caused by
heartbeat. This approach can't determine where HR is
fluctuating due to the surrounding light, making it unsuitable
for real-time applications.
Our goal is to create a non-contact heart rate estimating
device that uses a camera to track the variation in skin tone Fig. 2. Integral Image creation Features
caused by each heartbeat. The gathered footage is converted
from RGB to LAB colour space using signal processing c) OpenCV Algorithm: This method is used to train
techniques like the Fast Fourier Transform (FFT), and a face classifiers as well as facial recognition algorithms to
identification algorithm is used to remove the impact of choose the best features.
ambient light. The HR is then determined from the frequency d) Cascade in OpenCV: There is a potential classifier at
that was obtained using a peak detection method. every level of the cascade. The purpose of each stage
is to establish if a certain plane is unmistakably a
XXIII. PROPOSED METHODOLOGY face. The recognized pane will be automatically
This technique's fundamental premise is that blood discarded if it is not a face.
flowing down the face alters the skin's hue in a way that is
apparent to the camera but invisible to the human eye.
There isn't much information in the sections for the eyes,
lips, and nose. So, to obtain regions with skin probability, we
apply a skin mask. The signals in the skin mask that are
accessible from that place are the next step after obtaining
the skin mask. Then, using this signal's peak detection
method, the heart rate is calculated.
E. Signal Amplification: The three signals used in the LAB XXV. IMPLEMENTATION
colour space are ‘L’, ‘A’, and ‘B’, where ‘L stands for A. Video Recording: The camera is currently linked. The
the brightness of the image and ‘A’ and ‘B together webcam captures video and reads picture metadata to
stand for its fusion of the other two channels. L lacks extract frame capture times.
color information, thus you must separate A and B
channels of color to from it. Blood pulses cause minute B. Face Recognization: Eliminates extraneous data, such
changes in brightness and intensity that are recorded in as the background, since we are only interested in a
the LAB colour space. Independent Component person's face. Establish the face's limiting box. The
Analysis or Principal Component Analysis reduces video is trimmed to the face's bounding box.
dimensionality when utilised.
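The pipeline ends with a peak search in the frequency domain. A minimal sketch of that final step, assuming the face and skin-mask stages have already produced a one-dimensional chrominance trace sampled at the webcam frame rate (the paper's front end is JavaScript; Python is used here purely for illustration):

```python
import math

def estimate_bpm(signal, fps, lo_hz=0.75, hi_hz=4.0):
    """Estimate heart rate (bpm) from a 1-D chrominance trace sampled
    at `fps` frames per second: remove the DC offset, take the DFT,
    and pick the strongest frequency inside the physiologic band
    (0.75-4 Hz, i.e. 45-240 bpm)."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]        # drop the DC component
    best_freq, best_power = None, -1.0
    for k in range(1, n // 2):                  # positive frequency bins
        freq = k * fps / n
        if not (lo_hz <= freq <= hi_hz):
            continue
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(centred))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(centred))
        power = re * re + im * im               # spectral power in this bin
        if power > best_power:
            best_freq, best_power = freq, power
    return 60.0 * best_freq                     # Hz -> beats per minute
```

At 30 fps, ten seconds of the mean A-channel value inside the skin mask could be passed as `signal`; a production version would use an FFT library and the motion-compensated ROI rather than this naive DFT loop.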
JWT Library
Mohammed Nabeel
Department of CSE
Presidency University
Bengaluru, India
201910101660@presidencyuniversity.in
XXIX. BACKGROUND
Traditionally, web applications relied on session-based authentication to authenticate and authorize users. Session-based authentication involves storing a session ID on the server and sending it to the client in a cookie. The client then sends the session ID back to the server with each request, allowing the server to identify the user and their session data. While session-based authentication is still used in many web applications, it has several limitations. For example, it is vulnerable to session hijacking, where an attacker steals the user's session ID and impersonates the user. This can lead to unauthorized access to sensitive data and resources. JWT was introduced as a more secure alternative to session-based authentication, and it has become increasingly popular in recent years as web development has evolved to become more secure and efficient.

JSON Web Token (JWT) was first introduced in 2010 as a standard for securely transmitting information between two parties over the internet. JWT is an open standard designed to be a compact and self-contained way to transmit information between parties as a JSON object. JWTs consist of three parts: a header, a payload, and a signature. The header contains information about the type of token and the cryptographic algorithm used in the processes of information exchange and authentication. Each part is separated by a dot symbol (.); refer to the figure.

The Header section specifies the type of token (in our example, "JWT") and the signature algorithm; the whole section is Base64 encoded. The Payload part includes the token data, such as the username, token production date, and expiration date; all of that is written in JSON and encoded in Base64. The username or the user's application rights can both be included in this second part of the message; however, the JWT specifications make clear which keywords should be used, including "iat" (date and time of token generation) and "exp" (expiration date). Finally, the Signature section is formed by concatenating the Header and Payload sections and signing the result with a private key. The figure shows a JWT with the previous header and payload encoded, signed with a secret.

XXX. EXISTING PLATFORMS AND THEIR LIMITATIONS

There are several existing platforms which provide similar functionality but use different methods such as session IDs, cookies, etc.

Session IDs: Session IDs were commonly used in web applications as a way to maintain a user's authentication state. When a user logs in, the server generates a unique session ID and stores it on the server side. This session ID is then passed back to the client and included in subsequent requests to the server, allowing the server to identify and authenticate the user. While session IDs are still used in some applications, they have some drawbacks compared to token-based authentication methods like JWT. For example, session IDs are stateful, which can make them less scalable in high-traffic environments.

Basic Auth: Basic Auth is a simple authentication scheme that has been around since the early days of the web. It involves sending a user's credentials (i.e., username and password) as a Base64-encoded string in the Authorization header of an HTTP request. While Basic Auth is easy to implement, it has some significant drawbacks, including the fact that credentials are sent in plaintext, which makes them vulnerable to interception and theft.

OAuth: OAuth is a protocol for delegated authorization. It allows users to grant third-party applications access to their resources without sharing their login credentials. OAuth can be used in combination with other authentication methods, such as JWTs. While OAuth can provide a secure and scalable method for authentication and authorization, it can also be complex to implement and may not be necessary for simple web applications.

XXXI. PROPOSED FRAMEWORK

The proposed framework for building a blog web application using the MERN stack and a custom JWT library built with TypeScript provides a powerful and flexible platform for building secure and scalable web applications. The custom JWT library includes functions such as Sign, Verify, and Decode, which can be used to generate, validate, and parse JWTs for user authentication and authorization. With this library, developers can implement a robust and secure authentication and authorization system in their blog web application.

In addition to the custom JWT library, the proposed framework also utilizes the MERN stack for the blog web application. MongoDB is used as the database, Express as the web framework, React as the frontend library, and Node.js as the server-side runtime. Together, these technologies provide a flexible and scalable platform for building blog web applications that can handle large amounts of traffic and user data. With this framework, developers can focus on building the core functionality of their blog web application, while the MERN stack and custom JWT library handle the rest.

A. Architecture

C. Potential benefits

The scalability of stateless apps is the main benefit of Node.js authentication using JWT over the conventional authentication procedure. And given that companies like Facebook and Google are starting to use it, its popularity across the industry is probably only going to increase. The advantages include:

Secure: JWTs are protected from being altered by the client or an attacker thanks to digital signatures that use either a secret (HMAC) or a public/private key pair (RSA or ECDSA).

Effective/Stateless: Since a JWT doesn't call for a database lookup, it can be verified quickly. Particularly in large distributed systems, this is a major advantage.

REFERENCES
[1] Node.js HERE
[2] React.js HERE
[3] Express.js HERE
[4] Material UI HERE
[5] MongoDB HERE
[6] Brute Forcing HS256 is Possible: The Importance of Using Strong Keys in Signing JWTs HERE
[7] https://auth0.com/docs/secure/tokens/json-web-tokens
[8] https://supertokens.com/blog/what-is-jwt
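The paper describes, but does not list, the custom library's Sign, Verify, and Decode functions. A minimal sketch of equivalent HS256 operations is shown below, written in Python rather than the paper's TypeScript; the function names mirror the library's, but the implementation is illustrative only:

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe Base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict, secret: str) -> str:
    """Produce a header.payload.signature token signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    head_b64 = _b64url(json.dumps(header, separators=(",", ":")).encode())
    body_b64 = _b64url(json.dumps(payload, separators=(",", ":")).encode())
    signing_input = f"{head_b64}.{body_b64}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{head_b64}.{body_b64}.{_b64url(sig)}"

def verify(token: str, secret: str) -> bool:
    """Recompute the signature and compare in constant time."""
    head_b64, body_b64, sig_b64 = token.split(".")
    signing_input = f"{head_b64}.{body_b64}".encode()
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(_b64url(expected), sig_b64)

def decode(token: str) -> dict:
    """Parse the payload without verifying; restore Base64 padding first."""
    body_b64 = token.split(".")[1]
    padded = body_b64 + "=" * (-len(body_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Note that the signature is an HMAC, so verification needs no database lookup, which is exactly the stateless property the benefits section describes.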
Votemate – Secured Online Voting Application
Abstract— This research paper explores the use of OAuth 2.0 in an online voting application. OAuth 2.0 is a widely used authorization protocol that enables secure and efficient communication between different web applications. The proposed online voting application aims to improve the current voting system by providing voters with a more accessible, user-friendly and secure platform.

With the OAuth 2.0 protocol, the voting application can securely access user data from various web applications, such as social media platforms, without requiring users to share their login information. This ensures the transparency of the voting process and the confidentiality of user data. The paper discusses the architecture, implementation and security features of the proposed online voting application and evaluates its effectiveness in improving the voting process. The results of this study show that OAuth 2.0 can be effectively used in online voting applications to improve security, user accessibility, and transparency.

Keywords—Authorization, Authentication, OpenID Connect, Login, Registration, Security

I. INTRODUCTION

Online voting applications have become increasingly popular in recent years as more and more organizations seek to increase voter participation and simplify the voting process. However, these applications must also be secure, reliable and user-friendly in order to gain user trust and ensure the integrity of the voting process.

One way to improve the security and usability of online voting applications is to implement OAuth 2.0 authentication. OAuth 2.0 is a widely accepted authorization framework that allows users to grant third-party applications access to their resources without revealing their credentials. Using OAuth 2.0, online voting applications can ensure that users are authenticated and authorized to vote, while protecting their sensitive information from unauthorized access. In this study, we explore the benefits and challenges of implementing OAuth 2.0 authentication in online voting applications. We also discuss best practices for designing and implementing such applications, taking into account the unique requirements and constraints of the voting process. Finally, we evaluate the effectiveness of OAuth 2.0 authentication in improving the security and usability of online voting applications and provide recommendations for further research and development in this area.

OVERVIEW

OAuth 2.0 is an authorization framework that allows users to access resources using a secure token system without the need to share their credentials with the service provider.

In the context of online voting, OAuth 2.0 can be used to ensure that only authorized users can access the voting application. The voting application can be integrated with a third-party authentication service, such as Google, Facebook, or Twitter, to enable users to log in using their existing credentials. The OAuth 2.0 protocol is used to securely exchange authentication and authorization data between the voting application and the authentication service.

One of the benefits of using OAuth 2.0 for online voting applications is that it provides enhanced security. User credentials are not shared with the voting application, reducing the risk of credential theft. Additionally, OAuth 2.0 uses access tokens to grant users access to the application, and these tokens can be revoked at any time. This helps prevent unauthorized access to the voting application.

Another benefit of using OAuth 2.0 for online voting applications is that it simplifies the login process for users. Users can log in using their existing social media or email credentials, reducing the need to remember multiple usernames and passwords.

Overall, online voting applications using OAuth 2.0 provide enhanced security and a simplified login process, making them an attractive option for research focused on improving online voting systems.
II. EXISTING SYSTEM

a) Security vulnerability: One of the most important concerns about electronic voting is the risk of security breaches, hacking and vote manipulation. Malicious actors could potentially alter vote counts or access sensitive voter information.

b) Lack of transparency: Some electronic voting systems are not transparent, making it difficult for voters and election officials to ensure the accuracy and reliability of the voting process. Without a clear audit trail, it can be challenging to detect and correct errors or potential fraud.

c) Accessibility issues: Electronic voting may not be accessible to everyone, especially voters with disabilities or limited technical skills. The user interface of the application may also be challenging to navigate, potentially leading to errors or confusion.

d) Technical problems: Voting applications may experience technical glitches or problems that may cause delays, long queues or other disruptions. In some cases, the application may not be able to handle high traffic volumes, resulting in crashes or other problems.

III. PROPOSED SYSTEM

A. User Registration: The first step would be for the user to register on the online voting platform and provide their personal details such as name, address, date of birth and a valid email address.

B. OAuth 2.0 Authorization: After registration, the user will be asked to allow the online voting platform to use the authentication server of their OAuth 2.0 provider. This is used to confirm their identity and credentials for future logins.

C. Voter Verification: Once a user is logged in, the online voting application verifies their voting status using a database of eligible voters provided by the Election Commission. This verification process ensures that only eligible voters can participate in online voting.

D. Ballot Choice: The user is then presented with the ballot choices. The ballots would be pre-populated based on the user's registered address and the candidates for the various positions.

E. Voting: The user can select suitable candidates and vote. The application would ensure that each user can only cast one vote and prevent multiple votes from the same user.
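The single-vote rule in the Voting step can be sketched as a small in-memory guard; the class and method names are hypothetical, and a real deployment would persist votes durably and tie the voter id to the OAuth-verified identity:

```python
class Ballot:
    """Minimal in-memory sketch of the one-vote-per-voter rule:
    a verified voter id may be recorded at most once."""

    def __init__(self, candidates):
        self.tallies = {c: 0 for c in candidates}
        self._voted = set()          # voter ids that have already cast a vote

    def cast(self, voter_id, candidate):
        """Record a vote; return False if this voter already voted."""
        if voter_id in self._voted:
            return False             # duplicate vote rejected
        if candidate not in self.tallies:
            raise ValueError("unknown candidate")
        self._voted.add(voter_id)
        self.tallies[candidate] += 1
        return True
```

The same tallies dictionary would back the Result feature described later, since the final counts per candidate are just its contents.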
The aim of this study is to propose an online voting architecture that is secure, user-friendly and efficient. The proposed architecture consists of a login page, a dashboard, voter registration, polls, results and a helpline.

i. Login page: The login page is the first point of entry for users. It provides users with a secure interface that allows them to authenticate themselves using credentials such as username and password. The authentication process is supported by a series of APIs that communicate with the database to verify the user's identity. Using APIs ensures that the authentication process is simplified, efficient and secure.

ii. Dashboard: The dashboard provides users with a personalized view of their account information, including profile information and other related details. The panel also provides access to the various electoral functions such as voter registration, polls, results and the helpline.

iii. Voter registration: The voter registration feature allows users to register to vote in elections. For registration, the user receives a unique 10-digit Aadhaar duplicate number, which they need to link with their mobile number via OTP. This process ensures that each user's identity is verified and that they are allowed to vote.

iv. Polls: The voting feature allows users to view the political parties and candidates participating in the current election. Each political party has a brief description along with its candidates. Users can vote by selecting the desired candidate from the list. To ensure the security and integrity of the voting process, users must enter their duplicate Aadhaar number and authenticate using OAuth.

v. Result: The results feature shows the election results, including the votes received by each party and candidate. The system automatically calculates and displays the final results, indicating the winning party and candidate.

Factors such as user experience, security and accessibility are important to consider when designing a voting application. The program must be simple and easy to use as well as provide secure and private voting that prevents fraud or malicious activity.

User authentication and authorization using OAuth 2.0 and OpenID Connect can provide additional security and customization capabilities to a voting application, allowing users to log in with their existing credentials and ensuring that only authorized users can access application functions. However, it is important to note that while OAuth 2.0 may improve the security of an online voting system, it is not a panacea. Other security measures such as two-factor authentication and end-to-end encryption should also be implemented to strengthen system security. In addition, the system must be rigorously tested and verified to identify and fix potential vulnerabilities.

Overall, a well-designed and secure voting program can be a valuable tool for organizations and communities that want to collect input and feedback from voters. It can help promote openness and inclusiveness while providing valuable information and insights that can be useful in decision-making processes. An online voting application using OAuth 2.0 as the authentication mechanism can provide a robust and reliable platform for conducting secure and transparent elections.

REFERENCES
[1] Arora, S., Singh & M. Aggarwal (2019). A secure online voting system using blockchain technology. Journal of Information Security and Applications.
[2] Kumar, R. & Wadhwani (2019). An efficient and secure electronic voting system based on blockchain technology and business intelligence.
[3] Kshetri, N. (2020). Blockchain's roles in meeting key supply chain management objectives. International Journal of Information Management.
[4] Mercuri, R. & Neff, C.A. (2019). Defending digital democracy: A multidisciplinary approach. Springer.
[5] Grewal, R. (2019). A review of online voting systems. International Journal of Advanced Research in Computer Science.
[6] Popovic, M. & Bojanic (2018). Security issues of electronic voting systems. Journal of Applied Engineering Science.
[7] Chaum, D. (2018). Scantegrity: End-to-end verifiability for optical scan voting systems using invisible ink confirmation codes. In Towards Trustworthy Elections.
[8] Haldar & Saha (2017). Secure remote electronic voting system using hybrid cryptosystem. In Intelligent Computing and Control Systems.
[9] Jin, Lu & Yang (2016). Security analysis of a recent online voting protocol. Security and Communication Networks.
[10] Sun, R. & Zhang, L. A hybrid voting system for high-integrity and verifiability.
Patient Case Similarity
The doctors can use this technique to help them choose wisely.

Drawbacks:
• High complexity.
• Highly inefficient.
• Requires skilled persons.

II. PROPOSED METHOD

This approach is employed to forecast disease based on symptoms. The system evaluates the model using a decision tree classifier and is used directly by end users. The technology forecasts disease based on symptoms and makes use of machine learning capabilities. We call this system "AI Therapist". It is intended for individuals who are always concerned about their health; as a result, we have included several elements that acknowledge this concern and also work to improve the user's mood. The "Disease Predictor" function for health awareness can thus identify diseases based on their symptoms.

ALGORITHMS AND METHODS:

Random Forest: Ensemble learning is the act of using multiple models that have all been trained on the same data and averaging their results to get more precise predictions or categorizations. The underlying premise of ensemble learning is that each model's flaws (in this case, a decision tree's) are unique and unrelated to one another.

Naive Bayes: The Naive Bayes method determines the probability that an object with specific characteristics belongs to a given group or class. Using an orange-coloured, globular, and pungent fruit as an example, you would probably infer that it was an orange if you were attempting to identify a fruit solely by its shade, shape, and flavour. Since all of these features combined increase the chances that the fruit is an orange, but are treated as independent, the method is referred to as being "naive". The name "Bayes" comes from the mathematician Thomas Bayes, whose theorem serves as the basis of the Naive Bayes algorithm.

SVM: Both classification and regression problems can be solved using the support vector machine (SVM), a supervised machine learning method. It is a crucial tool in fields such as speech recognition, machine learning, and bioinformatics. SVM uses a strategy based on a two-class, linearly separable setting and a hyperplane that maximises the geometric margin and minimises the classification error.

INTERFACES:

1. System:

1.1 Create Dataset: The dataset, including the symptoms to be categorised, is separated into training and testing sets, with the test size set at 20–30%.

1.2 Pre-processing: The data is scaled and rearranged into the right format for training our model.

1.3 Training: CNN deep learning, machine learning, and SVM techniques are utilised to train our model using the pre-processed training dataset.

1.4 Classification: The results of our model are displayed.

2. Patient:

2.1 Upload Symptoms: The user has to upload symptoms.

2.2 View Results: The predicted disease is displayed.
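The Naive Bayes description above can be made concrete with a toy predictor over binary symptom sets; the diseases, symptoms, and Laplace smoothing choice here are illustrative, not the paper's trained model:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSymptoms:
    """Toy Naive Bayes disease predictor over binary symptom vectors,
    with Laplace smoothing so unseen symptoms never zero out a class."""

    def __init__(self):
        self.class_counts = Counter()                # disease -> #cases
        self.feature_counts = defaultdict(Counter)   # disease -> symptom -> count
        self.symptoms = set()

    def fit(self, rows):
        """rows: iterable of (set_of_symptoms, disease) training cases."""
        for symptoms, disease in rows:
            self.class_counts[disease] += 1
            self.symptoms |= set(symptoms)
            for s in symptoms:
                self.feature_counts[disease][s] += 1

    def predict(self, symptoms):
        """Return the disease with the highest log-posterior."""
        total = sum(self.class_counts.values())
        best, best_score = None, float("-inf")
        for disease, n in self.class_counts.items():
            score = math.log(n / total)              # class prior
            for s in self.symptoms:
                p = (self.feature_counts[disease][s] + 1) / (n + 2)  # Laplace
                score += math.log(p if s in symptoms else 1 - p)
            if score > best_score:
                best, best_score = disease, score
        return best
```

The naive part is visible in the inner loop: each symptom contributes its probability independently, exactly as in the orange example.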
ARCHITECTURE DIAGRAM

We have therefore come to the conclusion that machine learning can be utilised to track our health in an efficient manner. We can maintain our health by periodically getting a free health check. When the machine learning technique is built and deployed using a Python web framework and later turned into a website on that domain, it will be freely accessible to everyone. In order for our model to forecast the optimal outcome, the user only needs to visit the relevant page and choose 5 to 8 symptoms. After receiving the prediction, the user will gain insight into their health and, if necessary, contact the appropriate doctor.

REFERENCES
[5] Balasubramanian, Satyabhama, and Balaji Subramanian. "Symptom based disease prediction in medical system by using K-means algorithm." International Journal of Advances in Computer Science and Technology 3.
[6] Dhenakaran, K. Rajalakshmi, Dr. S. S. "Analysis of data mining prediction techniques in healthcare management system." International Journal of Advanced Research in Computer Science and Software Engineering 5.4 (2015).
[7] Maniruzzaman, M., Rahman, M., Ahammed, B. and Abedin, M., 2020. Classification and prediction of diabetes disease using machine learning paradigm. Health Information Science and Systems, 8(1), pp. 1-14.
Networking Platform for E-Sport Players
Abstract— This research paper aims to investigate the need for a networking platform for e-sport players. With the rapid growth of the e-sport industry and the increasing number of players and enthusiasts, there is a pressing need for a dedicated networking platform to connect e-sport players. It can be challenging for players to find the right team, and for clubs to find the best players to recruit. A centralized platform would serve as a social hub for players to interact and collaborate with each other, and for clubs and organizations to identify and connect with talented players. This paper will explore the potential benefits of such a platform and analyze the current state of e-sport player networking. The findings of this research will provide insights into the feasibility and viability of a dedicated networking platform for e-sport players, and the potential impact it could have on the e-sport industry as a whole.

Keywords—E-sports, Online Gaming Community, Networking Platform

VI. INTRODUCTION

The term "e-sports" pertains to the world of competitive video gaming, wherein players or teams engage in head-to-head battles or tournaments featuring a range of video games. These events often draw large audiences and can be seen as the digital equivalent of traditional sports competitions. E-sports has grown rapidly in recent years, with the industry estimated to be worth over $1.38 billion in 2022 [1]. This growth can be attributed to several factors, including the increasing popularity of video games, the rise of streaming platforms like Twitch and YouTube, and the growing acceptance of e-sports as a legitimate form of entertainment.

Online gaming has become a widespread phenomenon in recent years, with millions of people across the world participating in various online games. Along with this growth in popularity, the emergence of online gaming communities has become a significant aspect of the online gaming experience. These communities can range in size from small groups of gamers who enjoy playing together to massive groups of thousands of people, and they can exist both within and outside of the game.

Communities play a crucial role for e-sport players and teams, as they can provide numerous benefits that enhance their career prospects and overall success in the industry. Networking allows players and teams to build relationships with other professionals in the industry. These connections can provide opportunities for collaboration, teaming up for tournaments, or even securing sponsorship deals and endorsements. Networking also provides players and teams with access to valuable industry knowledge and insights. By connecting with others in the industry, players can learn about new strategies, tactics, and gameplay techniques that can help them improve their performance and increase their visibility within the industry.
VII. BACKGROUND

The e-sports industry has experienced significant growth in recent years and is expected to continue growing. The COVID-19 pandemic further accelerated this growth as more people turned to online entertainment. E-sport market revenue is anticipated to reach 1.87 billion US dollars in 2025 [1]. According to a survey [2] in 2022, global e-sport viewership reached 532 million, and by 2025 the viewership count is expected to be over 640 million. Asia and North America are currently the biggest markets for e-sports.

One of the main reasons is the accessibility of online gaming. With the widespread availability of high-speed internet, more and more people can now play video games online and connect with other players from around the world. This has led to the formation of online communities, which provide players with a platform to socialize, compete and improve their skills [3].

Fig. 1. eSports market revenue worldwide from 2020 to 2025 (in million U.S. dollars) [1]

Networking plays a crucial role in e-sports, both for individual players and for teams. E-sports is a highly competitive field, and the competition for the best players is fierce. Networking can help players and teams to connect with each other and form partnerships that can lead to success. Teams are constantly looking for talented players to join their rosters, and networking can help them find those players.

According to Hsiao, C. C., & Chiou, J. S. [4], social networks and relationships can have value and benefits for individuals. The study finds that players who have a higher position in the online community, such as those with more connections and influence, are more likely to have higher levels of community trust and perceive more social value. Players who have built a strong network within the e-sports community are more likely to be noticed by teams looking to recruit new talent. Additionally, players who have a strong reputation within the community are more likely to be recommended to teams.

Networking can also help players to build their personal brands and establish themselves as valuable members of the e-sports community. By networking with others, players can gain exposure and build relationships that can help them advance their careers. This can lead to opportunities for sponsorship deals, endorsements, and other forms of income that can support their gaming careers.

However, finding and connecting with each other can be a daunting task for players and clubs, as they may encounter various challenges that hinder the process. Geographic barriers can make it difficult for them to physically meet, and there is a lack of networking opportunities, particularly for players who may not be part of established leagues or organizations. Barriers that obstruct effective communication, such as differences in language and cultural norms, can also hinder connections between people. Clubs may have limited resources to find players or to host try-outs. Legal and regulatory challenges related to contracts, work permits, and eligibility can further hinder the process of connecting players and clubs.
In the study [5] author C. Won Jung examines
the relationship between game playing activities
and community involvement and self-
Fig. 2. eSports audience size worldwide from 2020 to 2025, by type
identification as a gamer. It found that game
of viewers(in millions) [2] communities serve as public spheres and that game
playing encourages social consciousness and
behaviour such as engaging in public discourse
55
and community activities. The study extends the and offer players and clubs a more comprehensive
subject of game studies beyond the notion of solution for their networking needs.
addiction vs. education and fitness, and suggests There is a need for more community
that games are a social simulator that allows for engagement features. While many existing
social experience that may be transferred to platforms offer basic communication and social
positive real-life consequences. networking features, there is a need for more
VIII. EXISITNG PLATFORMS AND THEIR LIMITATIONS robust community engagement features, such as
forums, mentorship programs, and collaboration
There are several existing e-sport player
tools. A dedicated networking platform that offers
networking platforms, each with its own set of
more community engagement features could help
features. We explored few of these platforms and
players and clubs build stronger relationships and
found great features however felt that there are
better support each other in meaningful ways.
some gaps that needs to be addressed.
There is no central platform where both clubs
Most of the platforms offer a range of basic
and players can interact with each other. Most of
features to their users, such as chat, cross platform
the networking platforms are focused on LFG and
support, and the option to filter profiles based on
are only limited to player-to-player networking.
one's individual needs.
No platform hosts both the players and clubs under
However, a significant number of platforms are the same roof, which can make it difficult to find
still missing some crucial features that would the right player for a club.
enhance user experience even further. For
E-sport players and clubs often use a variety of
instance, feed on the platform is a vital feature for
different tools and platforms to manage their
users as it allows them to share their experiences,
teams, track performance, and communicate with
opinions, and interests with others. Another
each other. A dedicated networking platform that
feature that is often missing in many platforms is
offers better integration with other e-sport tools,
the content upload feature. Users may not be able
such as team management software and
to showcase their creativity, which can result in a
communication apps, could help streamline the e-
lack of engagement and activity on the platform. It
sport experience and make it easier for players and
is concerning that many platforms lack adequate
clubs to manage their teams.
privacy controls, which is an essential aspect of
any online platform. After exploring several networking platforms
and finding them to be lacking in various ways, we
One of the networking platforms, GameTree [6]
realized that there was a need for a more
has an innovative approach for suggesting user
comprehensive and user-friendly platform. With
profiles. They designed different kinds personality
this realization, we set out to build a new platform
assessment which would improve the
that addressed the gaps we experienced with the
recommendations showed to the user. Apart from
existing platforms. The proposed platform aims to
that, the platform has option to filter profiles based
provide a seamless and intuitive user experience,
on game, gender, age, geographical location and
with features that cater to the needs of e-sport
language. This platform offers feed feature, chat
players. We believe that proposed platform will
rooms, personalized game recommendations based
bridge the gaps that we encountered, and provide a
on a user's preferences and play history.
one-stop solution for e-sport players to connect,
Despite the existence of several e-sport player collaborate and grow.
networking platforms, there are still gaps in the
market for a dedicated networking platform that IX. PROPOSED NETWORKING PLATFORM
could better meet the needs of players and clubs. The proposed e-sport networking platform is a
One of the gaps is the lack of support for a wider comprehensive online platform designed to
range of games. Many existing platforms focus on connect e-sport players, teams, clubs and
popular games like CS:GO, Dota 2, and League of organizations with each other, creating a space for
Legends, leaving players and clubs of less popular collaboration, competition, and career
or niche games struggling to find adequate advancement. The platform is designed to provide
support. A dedicated networking platform that users with the tools they need to showcase their
supports a wider range of games could fill this gap skills, connect with others, and potentially advance
their careers in the gaming industry. The proposed
56
platform would serve as a centralized hub for the e-sport community to connect and engage with each other, helping to drive the growth and development of the e-sports industry.
... that users are aware of their rights and obligations when participating in competitions and events.
Additionally, the platform uses content matching and collaborative filtering to recommend profiles to players and clubs. This feature enables users to discover and connect with other players who share their interests or skill levels, creating a space for collaboration and competition.
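The profile-recommendation idea described above can be illustrated with a small collaborative-filtering sketch. This is a minimal illustration, not the platform's actual implementation: the profile names, the game list, and the hour-based interest vectors are all hypothetical, and a real system would read stored user data rather than a hard-coded dictionary.

```python
from math import sqrt

# Hypothetical interest vectors: weekly hours spent on
# [CS:GO, Dota 2, League of Legends, Valorant].
profiles = {
    "alice":   [10, 0, 2, 8],
    "bob":     [9, 1, 0, 7],
    "charlie": [0, 12, 6, 0],
}

def cosine(u, v):
    """Cosine similarity between two preference vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(user, k=1):
    """Return the k profiles most similar to `user`."""
    ranked = sorted(
        ((other, cosine(profiles[user], vec))
         for other, vec in profiles.items() if other != user),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

print(recommend("alice"))  # alice and bob favour the same games
```

Content matching works the same way, except that the vectors describe profile attributes (games played, role, region) rather than observed interest overlap.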
A. Architecture

The proposed platform is built with the MERN stack. The frontend layer is built using React.js. It also communicates with Cloudinary via API calls to store media assets; the URLs of the media assets are stored in the database. The server-side logic layer is built using Node.js and Express.js, and MongoDB Atlas is used for storing and managing data.
C. Potential Benefits
... factor in facilitating second language (L2) learning through gaming, with extended online gaming communities providing support for language learning through paratexts and advice. The study suggests that organizing L2 gaming practices can reflect a gamer's L2 learning trajectory and that game-related paratexts in both L1 and L2 form the funds of knowledge for many L2 gamers. Additionally, the study emphasizes the importance of providing structures and guidance for young L2 learners on how to use L2 games to learn autonomously. While the study has certain limitations, its findings have important research and pedagogical implications.
According to L. Connor [3], online gaming communities have a significant impact on gamers and are just as essential as any real-world community. Online gaming has grown in popularity, with millions of individuals across the world actively participating in online games and building interactions and connections with other gamers. These communities can range in size from a few people to thousands, and can exist both within and outside of the game. Some gaming communities emerge around gamers who like playing together, and these communities can outlive particular games. The paper argues that these communities provide opportunities for gamers to connect with others who share their interests and can lead to lifelong friendships. It provides examples of popular online games and their associated communities, demonstrating the impact of these communities on the gaming world.

D. Feasibility and Viability

The proposed e-sport networking platform appears to be feasible and viable given the growing popularity of e-sports and the increasing demand for a platform that provides a centralized location for gamers to showcase their skills and connect with others in the gaming community. There is a clear need for a platform that helps gamers easily manage and showcase their content. By offering a platform for organizations to find and hire talented players, the platform can provide a valuable service for both players and organizations.
However, the success of the platform will depend on several factors, such as effective marketing, user adoption, and the ability to attract and retain users.

X. CONCLUSION

In conclusion, the proposed e-sport networking platform offers a comprehensive solution to the growing demand for a centralized platform that caters to the needs of gamers, teams, and organizations. The platform can benefit gamers by providing them with a place to showcase their skills and achievements and to potentially connect with gaming organizations and sponsors. The ability to upload, edit, and manage content in one place, as well as the inclusion of legal documents such as contracts, can promote transparency, fairness, and cooperation within the community. Additionally, the platform's matchmaking system can help connect players with similar skill levels and provide opportunities for advancement. While there may be some challenges in implementing the platform, the potential benefits and demand for such a platform make it a feasible and viable project. Overall, the e-sport networking platform has the potential to revolutionize the e-sport industry and provide a valuable resource for gamers, teams, and organizations alike.

REFERENCES

[1] C. Gough, "Global Esports Market Revenue 2025," Statista, 22-Sep-2022. [Online]. Available: https://www.statista.com/statistics/490522/global-esports-market-revenue/
[2] C. Gough, "Global eSports audience size by Viewer Type 2025," Statista, 27-Jul-2022. [Online]. Available: https://www.statista.com/statistics/490480/global-esports-audience-size-viewer-type/
[3] L. Connor, "Online gaming and the communities that it creates," Debating Communities and Networks 11, 01-May-2020. [Online]. Available: https://networkconference.netstudies.org/2020OUA/2020/05/01/online-gaming-and-the-communities-that-it-creates/
[4] C.-C. Hsiao and J.-S. Chiou, "The impact of online community position on online game continuance intention: Do game knowledge and community size matter?," Information & Management, vol. 49, no. 6, pp. 292–300, 2012.
[5] C. Won Jung, "Role of gamers' communicative ecology on game community involvement and self-identification of gamer," Computers in Human Behavior, vol. 104, p. 106164, 2020.
[6] "GameTree – LFG Find Gamer Friends," GameTree. [Online]. Available: https://gametree.me/
[7] A. Chik, "Digital gaming and language learning: Autonomy and community," 01-Jun-2014. [Online]. Available: https://scholarspace.manoa.hawaii.edu/bitstream/10125/44371/1/18_02_chik.pdf
Optimizing Website Performance: How Google Analytics Can Improve User
Experience
This study looked at the effect of website quality on prospective internet buyers. The findings showed that website quality significantly increased the likelihood of making an online purchase, which suggests that website optimization and analysis can help companies enhance the quality of their websites to boost online sales. [7]
In the context of Saudi Arabian online shopping behavior, this study looked at the effect of website quality on customer loyalty. The results showed that consumer loyalty was significantly positively impacted by website quality, underlining how crucial website analysis and optimization are for firms looking to increase client retention and loyalty. [8]
This essay evaluates the literature on many facets of digital marketing and provides a framework for it. It emphasizes the significance of website optimization and analysis as a crucial element of digital marketing to boost website performance and customer engagement. [9]
In the context of the online retail industry, this study looked into the effect of website quality on client loyalty. The findings showed that consumer loyalty was significantly positively impacted by website quality, underlining the significance of website optimization and analysis for online retailers looking to increase client retention. [10]
This study looked at how user satisfaction relates to website quality in the context of online ticketing systems. The findings demonstrated that user satisfaction was significantly positively impacted by website quality, emphasizing how crucial website optimization and analysis are for online ticketing systems to improve customer satisfaction and sales. [11]
An overview of APM (application performance monitoring) methods for microservice-based applications is provided in this paper. The authors discuss distributed tracing, monitoring, and analytic problems associated with APM in the context of microservices, and emphasize the value of APM in assuring service quality and user experience in applications that use microservices. The authors also give an overview of several APM frameworks and tools that can be applied to the microservices environment. [12]

IV. METHODOLOGIES

Real-Time Analytics is one of Google Analytics' most important tools for tracking and improving website performance. With its help, website owners can monitor and assess visitor activity in real time. By providing data on user interactions such as page load times, bounce rates, and user behavior, real-time analytics helps website administrators identify potential performance issues and improve the user experience.
Google Analytics' Real-Time Analytics offers insightful data on how website visitors behave. By tracking user behavior in real time, website owners can determine which pages are doing well and which ones are causing problems for users. This data can guide data-driven decisions about the functionality and design of websites, which can ultimately result in increased user engagement and conversion rates.

1) Real-Time Analytics: Real-Time Analytics offers in-the-moment data on user activity on a website. This makes it possible for website administrators to keep an eye on visitor behavior, track conversions, and spot any performance problems that might be degrading the user experience.

2) Behavior Flow Analysis: This method follows a user's journey through a website, from the first landing page to the successful conversion. By analyzing the user behavior flow, website owners can discover portions of the site that may be generating user drop-offs and optimize those pages to enhance the user experience.

3) Conversion Tracking: Conversion tracking enables website owners to monitor particular user behaviors, like making a purchase or submitting a form. This data can be used to find the website's most effective conversion-driving locations, as well as any areas that might benefit from optimization.

4) Funnel Visualization: This technique shows the steps a website's users must follow to finish a certain action, such as placing an order. Website owners can enhance
conversion rates by identifying the points in the funnel where users lose interest and optimizing those processes.

5) Site Speed Analysis: Site speed analysis gauges how long a website takes to load and offers information on how website speed affects user behavior. This information can be utilized to enhance the user experience and increase website speed.

Google Analytics provides a full range of tools for monitoring and enhancing website performance to guarantee peak speed and an excellent user experience. With the help of these strategies, website owners can learn a great deal about user behavior, spot performance problems, and take action to enhance the user experience and improve commercial results.

V. GOOGLE ANALYTICS INSIGHTS

Summarization: In this technique, results from various dimensions or metrics are added together. For instance, website owners can compute the total number of pageviews, sessions, or clicks on a certain website component.
Average: This method takes the average value of a metric across multiple dimensions. For instance, website owners can determine the average time spent on a page, the average session length, or the average number of pages per session.
Count: In this approach, the quantity of a specific dimension or metric is counted. For instance, website owners might track the number of sessions, unique visitors, or clicks on a particular component of their websites.
Min Max: Finding the lowest or highest value of a measure over many dimensions is the goal of the min and max procedures. Website owners can use these techniques, for instance, to determine the minimum and maximum time spent on a page, or the minimum and maximum number of pages per session.
Percentiles: Using this technique, data is divided into equal portions based on a predetermined metric. Website owners can use it, for instance, to figure out the 90th percentile of page load times, which is the time within which 90% of users can load a specific page.

VI. DASHBOARD

A user interface known as a "Google Analytics dashboard" offers a summary of important performance indicators and analytics pertaining to website traffic and user behavior. It is a configurable platform that enables website owners to track and examine data in real time to learn more about the effectiveness of their website. The following terms can be found in the Google Analytics dashboard:

1) Audience Overview: This section gives an overview of the website's audience, including the number of visitors, sessions, and pageviews. It also includes demographic information on the website's visitors, such as their age, gender, and location.

2) Acquisition Overview: This part contains details about how visitors arrived at the website, such as whether the traffic came via direct traffic, social media, paid search, or organic search.

3) User Behavior: This section gives information on user behavior, such as pageviews, average session length, bounce rate, and the most popular web pages.

4) Conversion Overview: This section provides information about the website's conversion objectives, including the number of transactions, revenue, and conversion rate.

5) Real-Time: This area gives real-time information about the number of users who are actively using the website, where they are located, and the pages they are currently viewing.

6) Custom Reports: This feature enables website owners to produce tailored reports that offer specific information and performance measures.
7) Goals: This tool lets website owners specify and track particular conversion targets, such as form submissions or sales, and measure how well they are doing over time.

8) Events: This function enables website administrators to monitor individual user actions, such as button clicks or file downloads.

Fig 4: Overview of page views.
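The aggregation techniques listed under Google Analytics Insights (summarization, average, count, min/max, and percentiles) can also be reproduced offline over exported data. The sketch below applies them to a hypothetical set of per-session records; the sample values are invented for illustration, and the nearest-rank percentile shown is only one of several common percentile definitions.

```python
# Hypothetical exported records: page load times (seconds) and pageviews.
load_times = [0.8, 1.2, 0.9, 0.9, 1.0, 1.1, 0.7, 1.4, 2.5, 3.0]
pageviews = [3, 5, 2, 8, 4, 1, 6, 3, 2, 4]

total_pageviews = sum(pageviews)                     # summarization
avg_pages = total_pageviews / len(pageviews)         # average
session_count = len(pageviews)                       # count
fastest, slowest = min(load_times), max(load_times)  # min / max

def percentile(data, p):
    """Nearest-rank percentile: the value below which p% of samples fall."""
    ordered = sorted(data)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

p90 = percentile(load_times, 90)  # 90% of pages load within this time
```

The same handful of reductions underlies most of the dashboard widgets described above; the dashboard simply recomputes them per dimension (page, channel, device) and renders the results.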
... website and their behavior in real time, thanks to real-time monitoring. With goal tracking, you can keep track of particular actions that site users take, such as submitting forms or making purchases. Google Analytics also provides custom reports and dashboards, which let you design unique views of the data from your website. Businesses that want to measure particular KPIs (key performance indicators) pertinent to their objectives may find this especially helpful.
In general, Google Analytics is a crucial tool for monitoring and optimizing websites. It helps guide decision-making about how to improve the user experience and website performance by giving you useful insight into traffic and user behavior.
Along with the interface, Google Analytics features a sizable vocabulary exclusive to its platform. The following are some of the key terms to comprehend:
Sessions: A session is a collection of interactions that happen on your website over the course of a specific period of time. Multiple pageviews, interactions, and events can occur during a single session.
Pageviews: Each time a person accesses a page on your website, a pageview is logged.
Bounce rate: The bounce rate is the percentage of visitors to your website who leave after viewing only one page.
Conversion rate: The conversion rate is the proportion of visitors to your website who carry out a desired action, like making a purchase or completing a form.
We have created an e-commerce website using HTML and CSS, with PHP for the backend. Mentioned below are the results and insights generated from our website.

VIII. OUR RESULTS

Fig 6: The Gadget House Website for monitoring.

9. Acquisition overview shows the number of users and new users.

Fig 7: Acquisition overview.

10. Shows direct and indirect user count. (We do not have indirect traffic.)

Fig 8: Traffic overview.

11. First user analytics: shows how many new users visited the site.
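The bounce-rate and conversion-rate definitions above reduce to simple ratios over session records. The sketch below computes both from a hypothetical session log; the field names and values are invented, and a real site would derive them from its analytics export rather than a literal list.

```python
# Hypothetical session log: pages viewed and whether the session converted.
sessions = [
    {"pages": 1, "converted": False},
    {"pages": 4, "converted": True},
    {"pages": 1, "converted": False},
    {"pages": 6, "converted": True},
    {"pages": 2, "converted": False},
]

# Bounce rate: share of sessions that viewed only one page.
bounce_rate = 100 * sum(s["pages"] == 1 for s in sessions) / len(sessions)

# Conversion rate: share of sessions that completed the desired action.
conversion_rate = 100 * sum(s["converted"] for s in sessions) / len(sessions)
```

With this toy log, two of the five sessions bounce and two convert, so both rates come out to 40%; on a real site the two numbers are of course independent.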
Fig 9: New user data.

16. Detailed analysis of the pages.

Increased use of machine learning and AI: Machine learning and AI can be used to automate the analysis of website data, making it easier for
website owners to identify performance issues and optimize their website's performance.
Integration with other tools and technologies: APM and Google Analytics can be integrated with other tools and technologies such as containerization, serverless computing, and edge computing to provide more comprehensive insights into website performance.
Real-time monitoring: Real-time monitoring capabilities can be enhanced to provide instant alerts and notifications for any performance issues.
Deeper insights into user behavior: Google Analytics can be enhanced to provide deeper insights into user behavior, such as clickstreams, session replay, and heat maps, which can help website owners identify areas for improvement.
Mobile app monitoring: With the increasing popularity of mobile apps, APM and Google Analytics can be enhanced to provide more comprehensive monitoring and analytics capabilities for mobile applications.

X. CONCLUSION

Google Analytics is a potent web analytics tool that gives businesses the ability to track and improve the functionality of their websites. It provides a wide range of capabilities, such as website traffic tracking, user behavior monitoring, and data analysis, to support organizations in making defensible judgements about their website strategy.
Organizations can use Google Analytics to monitor crucial performance indicators like page views, bounce rates, and conversion rates. This data can be utilized to enhance the user experience, boost website traffic, and optimize website design. Additionally, the tool offers information on user behavior, such as how visitors use the website, which pages they visit most frequently, and how much time they spend on each page.
Creating custom reports and dashboards is one of the main advantages of utilizing Google Analytics. This enables businesses to monitor particular indicators and learn more about how consumers engage with their website. Additionally, Google Analytics delivers real-time monitoring, enabling businesses to recognize and handle any issues as they emerge.
To provide a thorough perspective of website performance and advertising campaigns, Google Analytics also connects with a wide number of other programs and platforms, such as Google Ads. Based on information about user behavior and performance, this integration helps firms optimize their online advertising strategy.
In general, Google Analytics is an effective tool for businesses trying to track and improve the performance of their websites. Businesses can use it to make data-driven decisions and enhance their online presence because it offers insightful data on user behavior and performance indicators.

REFERENCES

[1] Gao, H., Wang, Y., & Chen, Y. (2017). Application performance monitoring: A review and taxonomy. Journal of Network and Computer Applications, 83, 73-89. doi: 10.1016/j.jnca.2017.01.003
[2] Salsbury, C., & Beck, J. (2018). Using Google Analytics as a web application performance monitoring tool. International Journal of Web Information Systems, 14(3), 280-289. doi: 10.1108/IJWIS-02-2018-0012
[3] Wang, Y., Gao, H., & Chen, Y. (2018). Performance evaluation and prediction of web applications: A review. Journal of Systems and Software, 140, 10-26. doi: 10.1016/j.jss.2018.02.024
[4] Cheng, Y., Liu, X., & Li, J. (2019). Real-time APM in microservices architectures: A review. Journal of Systems and Software, 157, 110392. doi: 10.1016/j.jss.2019.110392
[5] Fang, J., Wang, W., & Li, X. (2020). Web-based application performance monitoring with Google Analytics. International Journal of Grid and Utility Computing, 11(2), 111-119. doi: 10.1504/IJGUC.2020.107168
[6] Kim, J., & Koo, C. (2014). The impact of website quality on customer satisfaction and purchase intentions: Evidence from B2C e-commerce in Korea. International Journal of Electronic Commerce, 18(1), 69-97. doi: 10.2753/JEC1086-4415180103
[7] Hu, X., Lin, Z., & Huang, L. (2016). A study on the impact of website quality on online purchase intention. Journal of Electronic Commerce Research, 17(1), 1-13.
[8] Ali, H., & Alkibsi, A. (2017). The impact of website quality on customer loyalty: A study of online shopping behavior in Saudi Arabia. Journal of Theoretical and Applied Electronic Commerce Research, 12(3), 37-52.
[9] Kannan, P. K., & Li, H. (2017). Digital marketing: A framework, review and research agenda. International Journal of Research in Marketing, 34(1), 22-45. doi: 10.1016/j.ijresmar.2016.11.006
[10] Chen, X., & Chen, Y. (2018). Research on the impact of website quality on customer loyalty in the online retail industry. Journal of Theoretical and Applied Electronic Commerce Research, 13(1), 1-16.
[11] Huang, J. T., & Chou, H. Y. (2019). The impact of website quality on user satisfaction in the context of online ticketing services. Journal of Business Research, 100, 169-179. doi: 10.1016/j.jbusres.2019.02.012
[12] Zhai, Y., Dong, C., Zhang, X., & Yang, W. (2020). A survey of microservice-based application performance monitoring. Journal of Internet Technology, 21(5), 1435-1448.
[13] Alam, A., Rashid, I., & Raza, K. (2021). Data mining techniques' use, functionality, and security issues in healthcare informatics. In Translational Bioinformatics in Healthcare and Medicine (pp. 149-156). Academic Press.
[14] Alam, A., & Muqeem, M. (2022, March). k-means clustering and a nature-inspired optimization technique combined for disease prediction on large-scale data. In 2022 International Conference on Electronics and Renewable Systems (ICEARS) (pp. 1556-1561). IEEE.
[15] Alam, A., & Muqeem, M. (2022, October). K-Means Integrated with Enhanced Firefly Algorithms for Automatic Clustering to Select the Optimal Number of Clusters. In 2022 2nd International Conference on Technological Advancements in Computational Sciences (ICTACS) (pp. 343-347). IEEE.
Predicting The Onset Of Lifestyle Diseases

Madhura H C S, Information Science and Engineering, Presidency University, Bangalore, India, 201910101781@presidencyuniversity.in
Manvita M, Information Science and Engineering, Presidency University, Bangalore, India, 201910101629@presidencyuniversity.in
Parikshith N, Information Science and Engineering, Presidency University, Bangalore, India, 201910100312@presidencyuniversity.in
... nearest neighbour, XGBoost, random forest, logistic regression, and a 1D convolutional neural network. All analyses were carried out by successively entering the characteristics in three stages according to their properties. After using the synthetic minority oversampling technique (SMOTE) to address the data imbalance, the models' results were compared. The study concluded that tree-based machine learning models could accurately identify MetS in middle-aged Koreans. Early MetS diagnosis is crucial and necessitates a multifaceted strategy that includes self-administered questionnaires, anthropometric measurements, and metabolic tests.
In a different study [3], the authors created a model that examines the information given by the user and forecasts the diseases that he or she may be likely to suffer from. In addition to providing forecasts, the model also teaches users how to avoid common lifestyle diseases and offers management strategies in the event that they experience moderate symptoms. This project educates individuals about their health so that, if necessary, they can receive treatment promptly and thus save countless lives. The processes involved in this study include identifying lifestyle diseases at an early stage, preventing these diseases, and managing them. The diseases focused on in this study are heart disease, breast cancer, diabetes, and hypertension. Different algorithms are implemented to identify the respective diseases: clustering is used to detect heart disease and diabetes, while Naïve Bayes, a backpropagation neural network, and a decision tree are used to predict the survivability rate of breast cancer patients. By using a typical machine learning technique that evaluates on a portion of the whole dataset entirely distinct from the training set, the model can be tested and confirmed; this helps predict how accurate the model will be. If the algorithm is unable to reach the desired accuracy level, the techniques employed must be changed, the algorithms must be trained on a new dataset, and the entire procedure must be redone.

XIII. METHODS USED

A. Support Vector Machine

SVM, a commonly employed supervised machine learning method for classification and regression analysis, works by determining the optimal hyperplane that maximizes the separation between classes. This algorithm partitions data points into distinct classes. The data points closest to the hyperplane, which determine where it lies, are the support vectors.
Working: Input data - SVM uses numerical values as its input data. For instance, the input data for a binary classification problem (two classes) would consist of a set of numerical features and labels designating the two classes. Determine the separating hyperplane - SVM seeks the ideal hyperplane that divides the classes with the greatest margin. The width of the margin is the separation between the nearest points in both classes and the hyperplane; the hyperplane that maximizes this distance is the ideal one. Mapping to higher dimensions - In some cases, the data's original feature space cannot be divided by a straight line. In these situations, SVM transforms the data into a higher dimension that allows for linear separation. Calculation of support vectors - The data points that lie closest to the hyperplane, or those on the margin, are the support vectors; the hyperplane's location is determined by the position of these support vectors. Classification of new data - Once the hyperplane has been identified, SVM can categorize new data points by mapping them into the same feature space and determining which side of the hyperplane they lie on.
SVM is used frequently in applications like text analysis and image classification because it can handle both linear and non-linear data. Due to its strength in handling high-dimensional data, robustness to outliers, adaptability in kernel functions, good generalization efficiency, and capability to deal with both binary and multi-class classification problems, SVM is an effective machine learning model that is appropriate for the prediction of lifestyle diseases.
A decision tree refers to a supervised learning
technique employed for the purposes of
classification or regression analysis. To make
predictions or decisions, the process involves
splitting the data into subsets according to the
input feature values and iteratively making decisions based on these subsets. Each internal node represents a feature, and each leaf node represents a class label or regression value. It is visualized as a tree-like structure.

Working: The decision tree algorithm divides the data into subsets recursively according to the values of the input features. In this procedure, the optimal feature to divide the data into classes at each stage is chosen based on a criterion that maximises class separation or reduces variation within each subset. The algorithm chooses the optimal characteristic to divide the data into two or more subsets, starting with the complete dataset at the root node. The process is repeated for each subset until a stopping condition is met, such as reaching a maximum depth or a minimum number of occurrences in each leaf node. A decision is made based on the value of the chosen feature at each internal node of the tree, and the data is partitioned as a result. The procedure is repeated down to the leaf nodes, which represent the ultimate predictions or judgements. By navigating the tree from the root node to the appropriate leaf node based on the values of the input characteristics, the resulting decision tree can be used to predict the class label or regression value of future instances. Decision trees are used to forecast lifestyle disorders for a variety of reasons: they are non-parametric, meaning they make no assumptions regarding the data's underlying distribution, and they are resistant to noise and missing data values.

C. Gaussian Naïve Bayes

A probabilistic classification approach called Gaussian Naive Bayes is grounded in the Bayes theorem and assumes that the characteristics are independent and have a normal distribution. By evaluating the product of the probabilities of all the feature values given the class, it assesses the probability of a new observation being assigned to each class. The forecast is then given to the class with the highest likelihood.

Working: Training- The algorithm calculates the mean and variance of each feature for each class during the training phase using the training data. Probability Calculation- When a new observation is given to it, the algorithm first determines the prior likelihood of each class based on the proportion of instances of each class in the training data. The estimated mean and variance of each feature for that class are then used to compute the probability of each feature value given the class. Prediction- The procedure multiplies the probabilities for all features and all classes after calculating the likelihood of each feature value given the class. The projected class for the new observation is given to the class with the highest probability. Model tuning- The algorithm may employ many methods, including regularisation, feature selection, and hyperparameter tuning, to increase the model's accuracy. Test and evaluation- The model's effectiveness is assessed using a different test dataset as the last stage. The correctness of the model is assessed using evaluation metrics such as precision, recall, and F1-score on the test data set. It is adaptable in various application domains since it can handle both continuous and discrete data.

D. Random Forest

A group of decision trees is used in the supervised learning technique known as random forest to produce predictions. By selecting random subsets of attributes and instances from the training data, it builds numerous decision trees, combining the predictions of these trees to provide a final prediction. This method aids in lowering overfitting and enhancing the model's precision and generalizability.

Working- By using bootstrapping, which involves selecting instances at random with replacement, random subsets of the training data are produced. Using these bootstrapped subsets and a random subset of features at each node, several decision trees are built. A greedy technique is used to build the trees, which recursively splits the data based on the chosen features. The target variable for new instances is predicted using each decision tree. The forecast is based on either the average (for regression) or the majority vote (for classification) of all the predictions made by the forest's trees. A different validation set is used to assess the model's performance. To increase the precision and robustness of the model, the process of building the trees and making predictions is iterated many times with various subsets of the data. The random forest approach uses methods like feature subsampling and bagging to avoid overfitting and increase the model's generalizability. The ensemble of decision trees that is produced can be utilised for classification and regression applications and can handle
categorical and numerical input. Its high reliability and precision thanks to the ensemble of decision trees, its capability of handling datasets with many features and large dimensions, and its resistance to overfitting brought on by bagging and feature subsampling make it a suitable algorithm for the prediction of lifestyle diseases.
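The Gaussian Naïve Bayes procedure described in section C (per-class priors, feature means and variances, and a product of Gaussian likelihoods) can be sketched in plain Python; the toy data, feature values, and class labels below are illustrative only, not the paper's dataset:

```python
import math

def train_gnb(X, y):
    """Training step: per-class prior, and mean/variance of each feature."""
    stats = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows)
                     for col, m in zip(zip(*rows), means)]
        stats[c] = (len(rows) / len(y), means, variances)
    return stats

def gaussian(x, mean, var):
    """Normal-distribution likelihood of a single feature value."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def predict_gnb(stats, x):
    """Prediction step: prior times the product of per-feature likelihoods."""
    best, best_p = None, -1.0
    for c, (prior, means, variances) in stats.items():
        p = prior
        for xi, m, v in zip(x, means, variances):
            p *= gaussian(xi, m, v)
        if p > best_p:
            best, best_p = c, p
    return best

# Illustrative two-feature toy data with two well-separated classes
X = [[1.0, 2.0], [1.2, 1.8], [4.0, 5.0], [4.2, 5.1]]
y = ["healthy", "healthy", "at-risk", "at-risk"]
model = train_gnb(X, y)
print(predict_gnb(model, [1.1, 1.9]))  # healthy
```

The same fit/predict split applies to the other classifiers discussed above; only the per-class statistics change.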
values are taken from the confusion matrix in Figure.2.

Stage 6: The model chooses the disease that has the highest probability among all the classifiers and displays it.

XV. RESULT

After implementing our methodology, we have successfully trained our models to give accurate predictions of diseases caused by lifestyle activities. In Fig.3 we have mentioned the accuracy score of each algorithm in particular.

REFERENCES

[1]. ... of Chronic Diseases Using Machine Learning Approach. J Health Eng. 2022 Feb 15. doi: 10.1155/2022/2826127.

[2]. Junho Kim, Sujeong Mun, Siwoo Lee, Kyoungsik Jeong and Younghwa Baek. Prediction of metabolic and premetabolic syndromes using machine learning models with anthropometric, lifestyle, and biochemical factors from a middle-aged population in Korea. BMC Public Health 2022; 22:664. doi: 10.1186/s12889-022-13131-x.

[3]. Sakshi Gaur, Sarvesh Sharma, Ayush Tripathi. Easy Prediction of Lifestyle Diseases. 4 June 2021. EasyChair Preprint no. 5702.
[9]. Chih-Hung Jen, Chien-Chih Wang, Bernard C. Jiang, Yan-Hua Chu, Ming-Shu Chen. Application of classification techniques on development of an early warning system for chronic illnesses. Expert Systems with Applications, Volume 39, Issue 10, August 2012. doi.org/10.1016/j.eswa.2012.02.004.

[10]. Patekari S.A. and Parveen A., 2012. Prediction system for heart disease using Naïve Bayes. International Journal of Advanced Computer and Mathematical Sciences, pp. 290–294.

[11]. Suzuki A, Lindor K, St Saver J, Lymp J, Mendes F, Muto A, Okada T and Angulo P, 2005. Effect of changes in body weight and lifestyle in nonalcoholic fatty liver disease. Journal of Hepatology, 43(6), pp. 1060–1066.

[12]. Anand A and Shakti D, 2015. Prediction of diabetes based on personal lifestyle indicators. In Next Generation Computing Technologies (NGCT), 2015 1st International Conference on (pp. 673–676). IEEE.

[13]. Sharma M and Majumdar P.K, 2009. Occupational lifestyle diseases: An emerging issue. Indian Journal of Occupational and Environmental Medicine, 13(3), pp. 109–112.

[14]. P. Prabhu, S. Selvabharathi. Deep Belief Neural Network Model for Prediction of Diabetes Mellitus. In 2019 3rd International Conference on Imaging, Signal Processing and Communication, 2019 (pp. 138–142). Institute of Electrical and Electronics Engineers Inc. ISBN: 9781728136639. 2019.
Predictive Analysis of Heart Rate Using OpenCV

III. METHODOLOGY
The research paper mentioned aims to predict heart rate using OpenCV and the Haar cascade classifier for facial detection. The Fourier transform is used to analyze the frequency components of the heart rate signal.

The researchers used a video camera to capture facial images of participants while they performed a series of activities that increased their heart rate, such as jogging or jumping. They then applied the Haar cascade classifier to detect the face in each frame of the video, and extracted the region of interest around the forehead, where the pulsation of the blood vessels can be measured.

Next, the researchers used the Fourier transform to analyze the frequency components of the pulsation signal, which allows them to identify the heart rate. They compared their results with a reference heart rate obtained from a pulse oximeter, a non-invasive medical device that measures the oxygen saturation in the blood.
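The frequency-analysis step can be sketched as follows. This is a minimal stdlib-only illustration that assumes the forehead-intensity signal has already been extracted (the Haar-cascade face detection is omitted); the sampling rate, band limits, and synthetic signal are illustrative, not the paper's actual values:

```python
import math

def estimate_heart_rate(signal, fps, lo=0.7, hi=4.0):
    """Estimate heart rate (bpm) from a forehead-intensity signal.

    A naive discrete Fourier transform scans the plausible heart-rate
    band (lo..hi Hz) and returns the dominant frequency in beats/min.
    """
    n = len(signal)
    mean = sum(signal) / n
    xs = [s - mean for s in signal]  # remove the DC component
    best_k, best_mag = None, -1.0
    for k in range(1, n // 2):
        freq = k * fps / n
        if not (lo <= freq <= hi):
            continue
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(xs))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(xs))
        mag = re * re + im * im
        if mag > best_mag:
            best_mag, best_k = mag, k
    return best_k * fps / n * 60.0

# Synthetic pulsation: a 1.2 Hz sine sampled at 30 fps for 10 s -> 72 bpm
fps = 30.0
pulse = [100.0 + 0.5 * math.sin(2 * math.pi * 1.2 * (i / fps)) for i in range(300)]
print(round(estimate_heart_rate(pulse, fps)))  # 72
```

In practice an FFT (e.g. `numpy.fft.rfft`) would replace the naive DFT loop for speed; the band restriction filters out illumination drift and camera noise outside physiological heart rates.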
Overall, this research shows that it is possible to
use computer vision techniques to predict heart rate
from facial images, which has potential applications
in remote health monitoring and wellness tracking.
We would like to express our sincere gratitude to all the individuals who have contributed to the completion of our research paper titled "Predictive Analysis of Heart Rate Using OpenCV".

Firstly, we would like to extend our thanks to our supervisor for providing us with invaluable knowledge and insights throughout the entire research process.

We would also like to acknowledge everyone who helped, for their significant role in the data collection and analysis process. Their expertise has been instrumental in making this project a success.
REFERENCES
Md. Shahjalal, Md. Morshed Alam, Yeong Min Jang, 2020.
10. D. J. McDuff, J. R. Estepp, A. M. Piasecki, and E. B. Blackford, "A survey of remote optical photoplethysmographic imaging methods," in Proc. 37th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Aug. 2015.
11. P. V. Rouast, M. T. P. Adam, R. Chiong, D. Cornforth, and E. Lux, "Remote heart rate measurement using low-cost RGB face video: A technical literature review," Front. Comput. Sci., pp. 1–15, Dec. 2017.
12. M.-Z. Poh, D. J. McDuff, and R. W. Picard, "Non-contact, automated cardiac pulse measurements using video imaging and blind source separation," Opt. Exp., vol. 18, no. 10, pp. 10762–10774, 2010.
Yashaswini M
Department of Computer Science and Engineering
Presidency University
Bangalore, India
YASHASWINI.20201LCS0005@presidencyuniversity.in

Swapnadeep Banik
Department of Computer Science and Engineering
Presidency University
Bangalore, India
202011100016@presidencyuniversity.in
Abstract— For user authentication and security, personal identification numbers (PINs) are frequently employed. Users must enter a physical PIN for password verification, which makes PINs susceptible to password cracking or hacking by thermal tracking or shoulder surfing. On the other hand, eye blink-based PIN entry methods leave no physical traces and provide a secure password entry choice. Eye blink-based authentication is the process of establishing a PIN by identifying eye blinks in a series of picture frames. To prevent shoulder surfing and thermal tracking assaults, this project offers a real-time application that combines face detection and eye blink-based PIN entering.

Keywords— Webcam, Authentication, Password, Real-time Systems

I. INTRODUCTION

People encounter authentication mechanisms every day and must verify themselves using knowledge-based methods like passwords, which need to be simple, quick, and safe. It is important to maintain a delicate balance between simplicity and security in terminal authentication systems, ensuring that they are user-friendly while also providing robust protection against potential threats and vulnerabilities. However, these methods are not secure, since nefarious observers can use surveillance methods like shoulder surfing (watching the user type the password while using the keyboard) to record user authentication information.

Security problems are sometimes brought on by insufficient communication between people and systems. The authors suggested a security framework consisting of three layers to protect PIN digits. To mitigate the risk of shoulder surfing, individuals can utilize eye blinking as a means of inputting their password by selecting the appropriate symbols in the
correct order. Eye blinks are a common form of communication, and security systems that track blinks present a promising option to increase system security and usability. This paper will examine various approaches and remedies for handling eye blinking in security systems.

Personal identification numbers (PINs) are frequently used as a form of user verification for a variety of purposes, including managing cash at ATMs, approving electronic transactions, unlocking mobile devices, and accessing doors. Even with PIN authentication, such as in financial systems and gateway management, authentication remains a constant challenge.

European ATM Security claims that, compared to 2015, ATM fraud attacks rose by 26% in 2016. Because the code must be entered by an authorised user in a public or open location, PIN entry is susceptible to password assaults such as shoulder surfing and thermal monitoring.

Purpose

The main objectives of this project are to detect eye blinks in consecutive image frames and generate a PIN based on the selected symbols. To prevent shoulder surfing and thermal tracking assaults, this research shows a real-time application that combines eye blink-based PIN entering and facial identification.

Scope Of The Project

This project's sole objective is to generate the PIN and identify eye blinks in a series of image frames. This project offers a real-time application that integrates facial identification, eye blink-based PIN entry, and shoulder surfing prevention to prevent attacks using thermal imaging and shoulder surfing.

II. ALGORITHM SPECIFICATION

A. Haar cascade classifier

Object detection using Haar features is an effective object discovery technique proposed by Paul Viola and Michael Jones in their paper "Rapid Object Detection using a Boosted Cascade of Simple Features". In this machine-learning technique, the cascade is trained using a large number of both positive and negative photos.
After that, it is used to identify objects in other images. To effectively train the classifier, the approach initially requires a significant amount of positive images (images containing faces) and negative images (images not containing faces). Extraction of features is the next stage, in particular using the Haar features shown in the image beneath.

The current method involves computing multiple features for each kernel using various sizes and positions. To accurately calculate each element, it is necessary to determine the total number of pixels under both the white and black rectangles. To address this, the researchers introduced the integral image in their approach: however large the image is, it reduces the sum over any rectangle to an operation involving only four values. However, many of the computed features are not useful. The feature initially selected appears to emphasize the unique attributes of the eye region, which is typically darker in comparison to other facial areas such as the nose and cheeks. For instance, the image below showcases two prominent features on the top row that exhibit this distinction.

If a window passes one stage of the cascade, the next stage is applied; a window that passes every stage is reported as a face. The detector developed by the authors included 38 stages and more than 6000 features. The first five stages of the detector included 1, 10, 25, 25, and 50 features, respectively. The top two attributes from Adaboost were really the two features depicted in the aforementioned graphic. According to the authors, 10 features, on average, are evaluated for each sub-window out of the total of 6000. This is a simple overview of how the Viola-Jones algorithm for face detection functions. For more in-depth information, it is recommended to read the original paper or investigate the sources referenced within it.
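The integral image mentioned above can be sketched in a few lines: each table entry holds the sum of all pixels above and to the left, so any rectangle sum needs at most four lookups. The array values below are illustrative only:

```python
def integral_image(img):
    """ii[r][c] = sum of img[0..r][0..c] (the integral image)."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r][c] = row_sum + (ii[r - 1][c] if r > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of any rectangle using at most four lookups into the table."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total

img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 9 + 10 = 30
```

A Haar feature is then just the difference of two or three such rectangle sums, which is why the cascade can evaluate thousands of features cheaply.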
B. CNN Model
The most crucial phase in the entire process is the stage where we construct the CNN to which we will feed our features. The CNN is created by combining multiple distinct functions, each of which we will discuss individually. This step is crucial to train and test the model using the appropriate features.
V. IMPLEMENTATION
Localization Of Face
VII. CONCLUSION
REFERENCES
QR as Key implemented in an App using OAuth 2.0
Rahul Antony J
Computer Science and Engineering
Presidency University
Bengaluru, India
rahulantony003@gmail.com

Mohsin Khansab
Computer Science and Engineering
Presidency University
Bengaluru, India
khansabmohsin@gmail.com
Abstract— The recent advancements in QR codes allow every business field to adapt its functionalities. The integration of OAuth 2.0 in applications is now seen widely because of its ease of use and its security feature that does not allow third-party applications to access resources which are out of scope. This article discusses a use case of QR codes and OAuth 2.0: an ease-of-access service provided to users. When a user reserves a room in a hotel, he is given access to an application service that uses OAuth 2.0 authentication; after that, a QR code is generated using a simple logic (the first name of the user and the room number), and this QR code can be used at the door to unlock it. This use case can be implemented in hotels, corporate check-ins, etc. The approach provides a hassle-free check-in to a hotel, or any check-in for that matter.

Keywords: OAuth 2.0, QR codes, IoT door locks, Authentication, Authorization, Mobile applications, User privacy, Security, Access control, User experience, OpenID Connect

I. INTRODUCTION

In the current era, individuals are utilizing modern technologies to perform their daily tasks. One such task that has seen significant growth with advancements in technology is the process of authentication and access control. In recent years, OAuth 2.0 has emerged as the de facto standard for enabling authorization and access control in web applications. Now, there is a growing interest in utilizing OAuth 2.0 for authentication and access control in IoT systems as well. Several research studies have explored the use of OAuth 2.0 in IoT systems for access control and authentication.

However, one challenge in implementing OAuth 2.0 in IoT systems is dealing with constrained devices that have limited computing power and memory capacity. We have developed a solution to this challenge in the form of the QR-Key App with OAuth 2.0.

a) Background and motivation

Modern technologies such as IoT have become an integral part of our lives, and with their integration come challenges related to authentication and access control. To address this challenge, several solutions have been proposed to enable secure authentication and access control in IoT systems. We chose OAuth 2.0 as the base standard for our solution due to its widespread usage in web applications and its ability to enable decoupling between authentication and authorization.

A major motivation for our research is to enable authentication and access control in IoT systems using OAuth 2.0 while keeping the login process simple.

b) Research objectives

The main objective of our research is to develop the QR-Key App with OAuth 2.0 for secure authentication and access control in IoT systems. Specifically, our research objectives are as follows:

● Create a mobile app that uses OAuth for login; the mobile app in turn uses credentials provided by the resource server to create a QR key.
● Implement an OAuth 2.0 protocol server for authentication and access control.
● Build an IoT lock that can be unlocked using QR keys generated by the mobile app.
● Provide a front-desk program to manage and monitor access permissions for users and devices in the IoT system.

c) Proposed Solution

Our proposed solution, the QR-Key App with OAuth 2.0, aims to simplify the process of authentication
and access control in IoT systems while also enhancing their security. The solution involves the use of a mobile app that uses OAuth 2.0 for authentication with the resource server, which in turn generates a QR key that can unlock IoT locks. The OAuth 2.0 protocol server acts as the central gateway for managing access control and authentication requests from the system, while the front-desk program provides an interface for managing permissions and monitoring the IoT lock.

d) Research questions

The research questions for our proposed solution are as follows:
1. How can OAuth 2.0 be effectively integrated into an IoT authentication and access control system?
2. How can a mobile app using OAuth 2.0 be utilized to generate secure QR keys for unlocking IoT locks?
3. How effective is the QR-Key App with OAuth 2.0 in simplifying the process of authentication and access control while enhancing security in IoT systems?

II. LITERATURE REVIEW

● Introduction to OAuth 2.0
OAuth 2.0 is an open standard for authorization that allows users to grant access to their resources on one site (called the "resource server") to another site (called the "client") without sharing their credentials. OAuth 2.0 has gained popularity due to its simplicity, scalability, and support for various authentication protocols. Several studies have evaluated the security and usability of OAuth 2.0 in various contexts, including mobile devices and social media.

● Authentication and Access Control
Authentication and access control are essential components of security in any system. Traditional authentication methods such as passwords, PINs, and tokens are vulnerable to various attacks, including brute-force attacks, phishing, and man-in-the-middle attacks. Several studies have proposed alternative authentication methods, such as biometrics, behavioral authentication, and context-based authentication. However, these methods are not foolproof and have their limitations.

● QR Codes as a Means of Access Control
QR codes are two-dimensional barcodes that can be scanned using a smartphone camera. QR codes are increasingly used as a means of access control in various settings, including event tickets, payments, and loyalty programs. Several studies have evaluated the security and usability of QR codes as a means of access control, and proposed various enhancements, such as encryption and dynamic QR codes.

● Related Works
Several studies have explored the use of OAuth 2.0 and QR codes for access control in various settings, including mobile devices, smart homes, and industrial IoT. For example, a study by NIT Tiruchirappalli (Department of CSE) proposed an OAuth 2.0-based authentication mechanism for IoT devices that uses a unique device identifier and a shared secret. Another study by Lee et al. (2019) proposed a QR code-based access control system for smart homes that uses a dynamic QR code and a one-time password.

III. METHODOLOGY

A) Webpage Registration
1) User Registration: The user registers for the service by providing their email, first name, and last name. The service generates a unique user ID and stores it in the database.
2) OAuth 2.0 Authorization: The user logs in to the service using their OAuth 2.0 credentials. The service verifies the user's identity with the OAuth 2.0 provider and retrieves the user's profile information, including the user ID.
3) Room ID Generation: The user selects a room ID from a list of available rooms. The room ID is stored in the database along with the user ID.

B) QR Code Generation
1) QR Code Creation: The service generates a unique QR code for the user, which is a combination of the user's first name and room ID. The QR code is generated using a QR code generator library.
2) QR Code Display: The QR code is displayed on the user's smartphone screen.

C) Door Lock Control
1) QR Code Scanning: The user positions their smartphone in front of the QR code reader on the door lock. The door lock reads the QR code using its built-in camera.
2) QR Code Verification: The door lock verifies the QR code by checking if it matches the room ID stored in the database for the user. If the QR code is valid, the door lock unlocks.

D) Server for Resources
1) Database Management: The back-end server manages the user and room databases, including user registration, OAuth 2.0 authorization, and room ID generation.
2) API: The back-end server provides an API for managing the resources and controlling access to them. The API is secured using OAuth 2.0 authentication.

E) Python Program for Lock Monitoring
1. Lock Status Monitoring: The front-end program monitors the status of the door lock, including whether it is locked or unlocked.
2. Combination Generation: When the door lock is unlocked, the front-end program generates the
combination of the user's first name and room ID and sends it to the door lock for display on the lock screen.

... easy management of users and room IDs, simplifying the process of adding and removing users from the system.
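The QR-key logic described in sections B and C (payload = first name + room ID, verified by the lock against the reservation database) can be sketched in Python. This is a hypothetical illustration only: the names, payload format, and in-memory "database" are not taken from the paper, whose actual app uses Flutter and a QR generator library.

```python
# Hypothetical sketch of the QR-key logic: the payload combines the user's
# first name and room ID, and the door lock checks it against the
# reservation database. All names and data here are illustrative.

RESERVATIONS = {"101": "Rahul"}  # room ID -> first name of the current guest

def make_qr_payload(first_name, room_id):
    """String that would be encoded into the QR code by the mobile app."""
    return f"{first_name}:{room_id}"

def verify_qr_payload(payload):
    """Door-lock side: unlock only if the scanned payload matches the database."""
    try:
        first_name, room_id = payload.split(":", 1)
    except ValueError:
        return False
    return RESERVATIONS.get(room_id) == first_name

key = make_qr_payload("Rahul", "101")
print(verify_qr_payload(key))        # True  -> unlock
print(verify_qr_payload("Eve:101"))  # False -> stay locked
```

In the real system the payload would be rendered as a QR image and the lookup would hit the back-end database over the OAuth-protected API rather than a local dictionary.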
Model Details

A. Description of the App: The proposed access control system is implemented as a mobile app that allows users to access a door lock in an IoT environment. The app utilizes OAuth 2.0 for authentication and generates a unique QR code for each user that can be scanned by the door lock's camera to grant access. The app also includes a back-end server for resource management and an authentication server for OAuth.

B. Implementation details: The app was implemented using Flutter. The OAuth 2.0 authentication was implemented using the JWT library, and the QR code generation was implemented using the flutter_qr library. The back-end server was implemented using Node.js and Express, and MongoDB is used for the database. To test the app, a local instance of the back-end server and authentication server were set up, and a mock door lock was created using a Raspberry Pi and a camera module.

VII. CONCLUSION

A) Summary of findings: In this research, we proposed an access control system for IoT door locks that utilizes OAuth 2.0 for authentication and generates a unique QR code for each user to grant access. The system includes a mobile app, a back-end server for resource management, and an authentication server for OAuth. Through our implementation and testing, we found that the proposed access control system is a viable solution for accessing a door lock in an IoT environment. The system is user-friendly, efficient, and secure, providing a valuable solution for small to medium-sized organizations that require a cost-effective and easy-to-use access control system. However, further user testing is required to fully evaluate the effectiveness and usability of the system.

B) Contributions and significance of the research: The proposed access control system using OAuth 2.0 and QR codes makes a significant contribution to the field of access control systems for IoT devices. The use of OAuth 2.0 provides a secure authentication process that is widely adopted in the industry, while the use of QR codes provides an easy-to-use and efficient method for granting access to the door lock. The back-end server and authentication server further enhance the system's security, making it a reliable and robust solution for organizations that require a secure access control system.

C) Future work: Future work for this research includes conducting user testing to fully evaluate the effectiveness and usability of the proposed access control system. The user testing results can provide valuable feedback for improving the app's design and functionality, ensuring that the system is user-friendly and effective. Additionally, future work could focus on integrating additional security measures such as biometric authentication or multi-factor authentication to further enhance the system's security. Finally, the proposed access control system can be extended to other IoT devices such as smart homes or industrial machinery, providing a scalable solution for secure access control in various environments.

REFERENCES
1. N. Nasurudeen Ahamed, Karthikeyan P, S. P. Anandaraj, Vignesh R. "Sea Food Supply Chain Management Using Blockchain", 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), 2020.
2. Takamichi Saito. "A privacy-enhanced access control", Systems and Computers in Japan, 05/2006.
3. st.fbk.eu (Internet Source)
4. Submitted to Roehampton University (Student Paper)
5. Submitted to Nanyang Technological University (Student Paper)
6. Submitted to Coventry University (Student Paper)
7. Submitted to The University of the West of Scotland (Student Paper)
8. theses.hal.science (Internet Source)
9. Se-Ra Oh, Jahoon Koo, Young-Gab Kim. "Security interoperability in heterogeneous IoT platforms", Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, 2022.
10. apps.dtic.mil (Internet Source)
11. The Electronic Library, Volume 30, Issue 5 (2012-09-29).
12. Xing Liu, Jiqiang Liu, Wei Wang, Sencun Zhu. "Android single sign-on security: Issues, taxonomy and directions", Future Generation Computer Systems, 2018.
Leaf Disease Detection & Classification Using ML Algorithms
[5] A different approach was taken by Singh et al. (2019), who combined machine learning with Internet of Things (IoT) technology for plant disease detection. They developed a sensor-based system that monitored environmental parameters and used machine learning algorithms to analyze the data for disease identification.

[6] Ghosal et al. (2020) introduced a deep learning-based approach for detecting multiple plant diseases. They employed transfer learning techniques with pre-trained CNN models to leverage large-scale image datasets and achieve high accuracy in disease classification.

[7] Wang et al. (2021) proposed a comprehensive framework for plant disease detection and classification using a combination of image processing, feature extraction, and machine learning algorithms. Their system achieved reliable disease identification results and demonstrated the potential for practical implementation.

A. Objectives
• To investigate the interactions between disease-causing agents and host plants, considering their overall relationship.
• To identify different diseases affecting plants in various environments.
• To develop a methodology for disease prevention and management, aiming to reduce losses and damages caused by diseases.

Scope:
• Prevention of diseases in plants for farmers, assisting them in maintaining healthy crops.
• Collaboration with pesticide companies to predict and provide new pesticide solutions for effective disease control.

B. EXISTING METHODS - DRAWBACKS
1. They require long training times.
2. The learned function is difficult to interpret.
3. A large number of support vectors are used for training in the classification task.
XIX. PROPOSED METHOD
In this proposed algorithm, the main objective is to detect plant diseases by analyzing leaf images. The methodology involves identifying the specific disease affecting the leaf and highlighting the affected region using image processing techniques. The algorithm aims to provide fast and accurate results, indicating the percentage of the affected area. A dataset of leaf images containing different plant diseases such as Alternaria alternata, Bacterial Blight, and Cercospora leaf spot has been collected for evaluation.
Architecture Diagram
Fig. 1. Flow Chart of Proposed Work

A. Contrast Enhancement:
The image's contrast and brightness are adjusted to improve its visibility and distinguishability. This involves scaling the intensity values of the image by a constant factor.

B. Image Segmentation:

The SVM classifier, a supervised learning technique, is used for categorizing data. It works by finding a hyperplane that separates the data points based on the distances between support vectors. SVM is commonly used in applications such as facial expression recognition, speech recognition, and texture classification. It can handle both binary and multiclass classification problems and offers robustness in various scenarios.

By implementing these components in the algorithm, plant diseases can be detected and classified accurately from leaf images. The SVM classifier provides efficient and effective classification, contributing to the overall success of the proposed methodology.
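The contrast-scaling step described above can be sketched as follows. This is a minimal illustration using NumPy; the scaling factor, offset, and pixel values are hypothetical, not taken from the paper:

```python
import numpy as np

def enhance_contrast(image, factor=1.5, offset=0.0):
    """Scale pixel intensities by a constant factor (plus an optional
    brightness offset), clipping back to the valid 8-bit range."""
    out = image.astype(np.float32) * factor + offset
    return np.clip(out, 0, 255).astype(np.uint8)

# Example: a tiny 2x2 grayscale "image"
img = np.array([[50, 100], [150, 200]], dtype=np.uint8)
print(enhance_contrast(img, factor=1.5))   # [[ 75 150] [225 255]]
```

Note that clipping after scaling is what keeps the brightened image inside the displayable 0-255 range.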
Accuracy (%) = 100 × (No. of correctly classified leaves / Total no. of leaves in the dataset)
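As a hedged sketch of the classification-plus-accuracy step, the snippet below trains scikit-learn's SVC on synthetic feature vectors (the paper's actual leaf features and dataset are not reproduced here) and computes accuracy with the formula above:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for leaf feature vectors: two well-separated classes
X = np.vstack([rng.normal(0, 0.3, (40, 4)), rng.normal(2, 0.3, (40, 4))])
y = np.array([0] * 40 + [1] * 40)       # 0 = healthy, 1 = diseased

clf = SVC(kernel="rbf").fit(X, y)       # hyperplane-based supervised classifier
pred = clf.predict(X)

# Accuracy (%) = 100 * correctly classified / total leaves in dataset
accuracy = 100.0 * (pred == y).sum() / len(y)
print(f"Accuracy: {accuracy:.2f}%")
```

In practice the accuracy would be reported on a held-out test split rather than the training data shown here.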
Fig. 4. Flow chart of GLCM
Fig. 6. Accuracy on Proposed Work (per-class accuracy chart; classes: healthy leaf, cercospora, bacterial spot, curl virus, late blight, scab spot; reported accuracies range from 73.98% to 100%, including 100, 100, 99.55, 93.77, 90.31 and 73.98%)

V. CONCLUSION

This work gives an efficient and accurate plant disease detection and classification technique using image processing. CNN and image processing techniques are used for plant leaf disease detection. This automated system reduces detection time and labour cost. It can help farmers to diagnose the disease and take remedial action accordingly. In future work, we will extend our database for more leaf disease identification.

ACKNOWLEDGMENT

REFERENCES

… "using Image Processing and Genetic Algorithm", 205, ICACEA, India.
[3]. Sujatha R., Y. Sravan Kumar and Garine Uma Akhil, "Leaf Disease Detection using Image Processing", Journal of Chemical and Pharmaceutical Sciences, March 2017, pp. 670–672.
[4]. Gautam Kaushal, Rajni Bala, "GLCM and KNN Based Algorithm for Plant Disease Detection", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 6, Issue 7, July 2017, pp. 5845–5852.
[5]. Mrunalani R. Badnakhe, Prashant R. Deshmukh, "Infected Leaf Analysis and Comparison by OTSU Threshold and K-Means Clustering", International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 2, Issue 3, March 2012.
[6]. Abdolvahab Ehsanirad, Sharath Kumar Y.H., "Leaf Recognition for Plant Classification Using GLCM and PCA Methods", Oriental Journal of Computer Science & Technology, Vol. 3(1), 2010, pp. 31–36.
[7]. Namrata K.P., Nikitha S., Saira Banu B., Wajiha Khanum, Prasanna Kulkarni, "Leaf Based Disease Detection using GLCM and SVM", International Journal of Science, Engineering and Technology, 2017.
[8]. Vijai Singh, A.K. Misra, "Detection of Plant Leaf Diseases using Image Segmentation and Soft Computing Techniques", Information Processing in Agriculture 4 (2017), pp. 41–49.
Precision Agriculture And Crop Suggestion System Using AI And ML
Ms. Chandrakala H L, School of Computer Science and Engineering, Presidency University, Bangalore, India, chandrakala.hl@presidencyuniversity.in
Leander Nathan, School of Computer Science and Engineering, Presidency University, Bangalore, India, 201910101286@presidencyuniversity.in
Kavya Sharma, School of Computer Science and Engineering, Presidency University, Bangalore, India, 201910100777@presidencyuniversity.in
Abstract: Farming and agriculture have been the backbone of our country and a major source of livelihood for a huge chunk of the population, especially in the rural sector. However, a tremendous problem exists due to the unorganized ways of farmers, who do not make calculative decisions based on climate, soil, demand, and supply requirements. We therefore propose an interactive solution using precision agriculture: the use of modern techniques built on Artificial Intelligence (AI) and Machine Learning (ML) models. Using machine learning algorithms like Random Forest, KNN or SVM, we can choose the most profitable crop list.

Keywords: Precision agriculture, profitable crop.

I. INTRODUCTION

In India, agriculture is the primary source of income for 70% of rural households and plays a significant role in the nation's economy. It is one of the major industries that contributes significantly to the country's GDP. GDP from agriculture in India increased to 6934.75 billion INR in the fourth quarter of 2022 from 4297.55 billion INR in the third quarter of 2022. It is estimated that India's agriculture sector accounts for only around 14 percent of the country's economy but for 42 percent of total employment. As the technology curve starts to peak in the 21st century, the necessity to revolutionize the agriculture industry in India with the use of AI and ML arises. With the fourth industrial revolution, technology has drastically evolved, offering a wide variety of methods and tools to increase crop productivity and improve weather prediction and recommendation systems. AI/ML can be used to correctly predict the weather at a local level, create guidance modules for farmers to use sustainable techniques to help manage pests through ecology, and design AI for demand prediction based on available stocks, exports, and local needs.

However, building solutions that are affordable, locally viable, and easily accessible
is necessary, since the majority of farmers are dependent on others for the produce of their land and lack skilled labour. Although AI-powered harvesting robots, driverless tractors, and crop monitoring using image processing exist [1], they are far from affordable for farmers with small landholdings. Nearly 65-70% of Indian farmers have small to marginal landholdings [2], and due to a lack of skilled labour, these tools may turn out to be hard to use.

However, using available parameters such as soil requirements, temperature, rainfall, and available data, it is possible to build a crop recommendation system that can accurately predict what crop will be feasible for profitable growth [3]. To achieve a good harvest, certain soil parameters, such as humidity, temperature, soil pH, sunlight, and soil moisture levels, must be satisfied. These are fed into the model as datasets collected from verified statistical surveys and government domains. The initial datasets can be used to train the crop recommendation model to achieve better accuracy. KNN, Random Forest, Decision Trees, Logistic Regression, Naïve Bayes and Support Vector Machine are some of the algorithms that can be used to select the best crop type.

II. RELATED WORK

Agriculture has been an integral part of India as a source of livelihood and dependency for most rural communities, which makes it important for our farmers to have access to the technology provided; this will not only help them increase their profit but also accurately guide them on what crop should be grown.

The current objective is to use a crop recommendation system that uses multiple parameters. As observed, the research papers surveyed use different parameters and draw conclusions using an algorithm best suited to those particular parameters. But, due to the lack of multiple parameters that have not been included, we have, through thorough research and compilation, generated a dataset that satisfies the objectives and has been run through multiple algorithms to draw an accurate conclusion. An important factor is to include market and consumer trends in the recommendation, apart from also taking in soil parameters [4].

Rohit Kumar Rajak et al. [5] use pH, depth, water-holding capacity, drainage and erosion as their set of parameters to derive their desired results. Thus, we see that including multiple classifiers helps increase the accuracy and robustness of the model.

Deepti Dighe et al. [6] have explored the use of multiple algorithms, including KNN, K-means, LAD, CHAID, Neural Networks and Naive Bayes, that were used to generate rules for the crop recommendation. Apart from the general parameters, they also made sure to include temperature, regional weather and month of cultivation.

Abhinav Sharma et al. [7] emphasize how ML and IoT are used in each cycle of smart agriculture, as well as their benefits, drawbacks and potential future developments. Their paper focuses on the inclusion of soil parameters such as organic carbon and moisture content, disease and weed detection on crops, and species detection. Methods included were Artificial Neural Networks (MLP NN), ELM-based regression, KNN and Random Forest.

Elumalai Kannan et al. project the growth performance of major crops at the national level. They present data on the compound annual growth rates of area, production and yield of major crops in India. The study includes trends and patterns in the development of the nation's crop sector and a projected agricultural output growth model across India. These parameters help us better train the model and assist farmers to practice efficient farming and stay flexible with market prices.

III. METHODOLOGY

After gathering the data, the next step is to preprocess it before training the model. Data preprocessing may be done in a variety of ways, beginning with reading the collected dataset and progressing through data purification [8]. Some dataset properties are redundant and are not taken into consideration for cropping; as a result, undesirable attributes and records with incomplete data must be removed. For greater accuracy, we must either drop these missing values or fill them.

B. Feature Selection

The features are only weakly correlated with each other; therefore it makes sense not to eliminate any of them, and we will use all of them when predicting the sort of crop to produce.

Fig. 1 Correlation Matrix for crop recommendation dataset
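The drop-or-fill handling of missing values described in the methodology might look like this in pandas. The column names and records here are illustrative, based on the parameters the paper lists, not the actual dataset:

```python
import numpy as np
import pandas as pd

# Illustrative raw records with gaps, mimicking a crop dataset
df = pd.DataFrame({
    "temperature": [25.1, np.nan, 31.4, 28.0],
    "humidity":    [80.0, 65.0, np.nan, 72.0],
    "soil_ph":     [6.5, 7.0, 6.8, np.nan],
    "label":       ["rice", "maize", "rice", None],
})

# Rows with no crop label are useless for supervised training: drop them
df = df.dropna(subset=["label"])

# Numeric gaps can instead be filled, e.g. with the column mean
num_cols = ["temperature", "humidity", "soil_ph"]
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

print(df.isna().sum().sum())   # 0 missing values remain
```

Whether to drop or impute depends on how much data would be lost; dropping whole rows is safest for missing labels, while mean-imputation preserves rows with only sparse numeric gaps.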
C. Machine Learning Algorithm

Prediction methods based on machine learning require exceptionally precise estimation based on previously learned data. The application of historical data, statistical methodologies, and machine learning technologies to estimate future results is known as predictive analytics. The objective is to provide the greatest possible solution and a prediction of what will happen next, rather than merely knowing what happened.

Naive Bayes, Decision Tree, Logistic Regression, KNN and Random Forest are used in the crop recommendation models.

1) K-Nearest Neighbor: KNN is a sort of supervised machine learning that may be used for a variety of problems; classification and regression are two instances of problems that it can solve. The symbol K represents the number of nearest neighbors to a newly forecasted unknown variable. The Euclidean distance formula is used to compute the distance between the data points [9]:

Euclidean distance between A and B = √((x₂ − x₁)² + (y₂ − y₁)²)

2) Random Forest: Random Forest is a method of ensemble learning that generates a large number of different models to tackle classification, regression, and other problems. Decision trees are utilized during training: the random forest algorithm generates decision trees based on numerous data samples, predicts data from each subset, and then votes on it to provide the system with a better option. For data training, RF employs the bagging approach, which increases the outcome's accuracy [10].

Gini Index = 1 − Σᵢ (Pᵢ)² = 1 − [(P₊)² + (P₋)²]

3) Naive Bayes: The theorem used to develop a basic probabilistic classifier is known as Naive Bayes. Naive Bayes classifiers assume that the value of one feature is independent of the value of any other feature given the class variable [11].

P(A|B) = (P(B|A) × P(A)) / P(B)

4) Decision Tree: For classification and regression, Decision Trees (DTs) are part of supervised learning. A tree representation is utilized, with each leaf node representing a class label and the interior nodes representing attributes.

Entropy: H(S) = −Σ Pᵢ(S) log₂ Pᵢ(S)
Information Gain: IG(S, A) = H(S) − Σ_{v ∈ Values(A)} (|Sᵥ| / |S|) H(Sᵥ)

5) Logistic Regression: This is one of the most basic machine learning algorithms and is employed in the solution of classification problems. It uses a sigmoid function to determine the likelihood of an observation, and the observation is then assigned to the appropriate class. A threshold value is chosen; observations with probabilities above the threshold are assigned the value 1, while those with probabilities below the threshold are assigned the value 0.
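The formulas above can be checked numerically. A small pure-Python sketch of the Gini index, entropy, and sigmoid used by the respective classifiers:

```python
import math

def gini_index(probs):
    """Gini Index = 1 - sum(p_i^2) over the class probabilities."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """H(S) = -sum(p_i * log2(p_i)); zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def sigmoid(x):
    """Logistic function P = 1 / (1 + e^-x), as in logistic regression."""
    return 1.0 / (1.0 + math.exp(-x))

# A perfectly balanced binary split is maximally impure:
print(gini_index([0.5, 0.5]))   # 0.5
print(entropy([0.5, 0.5]))      # 1.0
# A pure node has zero impurity, and the sigmoid is 0.5 at x = 0:
print(gini_index([1.0, 0.0]))   # 0.0
print(sigmoid(0.0))             # 0.5
```

Decision trees pick the split that most reduces impurity (entropy or Gini), which is exactly what the information-gain formula above measures.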
P = 1 / (1 + e^(−(a + bX)))

D. Crop Recommendation:
The model will propose the best crop to grow on the given soil based on the N, P, K values, temperature, humidity, and pH.

E. Performance Analysis:

IV. RESULT

This concludes that, to provide farmers with a simple, portable solution, a model produced by machine learning using the random forest classifier, with 99.32% accuracy, is the best option. It calculates the optimal crop to plant based on many factors. Through the crop recommendation system, individuals will be able to make better decisions while sustaining crop and soil quality.
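A sketch of the crop-recommendation step on the parameters named above (N, P, K, temperature, humidity, pH), using scikit-learn's random forest. The feature distributions and crop labels below are synthetic stand-ins, not the paper's actual survey dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic (N, P, K, temperature, humidity, pH) rows for two crops
rice  = rng.normal([80, 45, 40, 27, 82, 6.5], 2.0, (50, 6))
maize = rng.normal([70, 50, 20, 22, 65, 6.0], 2.0, (50, 6))
X = np.vstack([rice, maize])
y = ["rice"] * 50 + ["maize"] * 50

# Bagged ensemble of decision trees, voting on the predicted crop
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Recommend a crop for one new soil/weather reading
sample = [[79, 44, 41, 26, 80, 6.4]]
print(model.predict(sample)[0])   # rice
```

On a real dataset, the reported 99.32% accuracy would be estimated on a held-out test split rather than on handcrafted clusters like these.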
[5] Rohit Kumar Rajak, Ankit Pawar, Mitalee Pendke, Pooja Shinde, Suresh Rathod, Avinash Devare, "Crop Recommendation System to Maximize Crop Yield using Machine Learning Technique", Vol. 4, Issue 12, 12/2017.
[10] Ali, Jehad; Khan, Rehanullah; Ahmad, Nasir; Maqsood, Imran (2012), "Random Forests and Decision Trees", International Journal of Computer Science Issues (IJCSI), 9.
Rental Battery Management System
Abstract - The many advantages of electric bicycles (e-bikes), such as shortened commute times, environmental sustainability, and health benefits, make them increasingly popular. However, managing battery life is one of the major difficulties experienced by e-bike users. In this research article, we look into the possibility of using e-bike rental batteries to address this problem. We assess the current status of e-bike rental battery services and battery technology, weigh the benefits and drawbacks of rental batteries, and investigate whether rental battery systems can be implemented in various locales. According to our research, renting out e-bike batteries can help promote the use of e-bikes as an affordable and environmentally friendly form of transportation while also addressing the issue of battery management.

Keywords—environmental sustainability, E-bike, rental battery, battery technology, affordable

I. INTRODUCTION

Electric bicycles, or e-bikes, have arisen as a popular alternative to traditional bicycles and vehicles for commuting and mobility. E-bikes provide several benefits, such as shorter commute times, increased environmental sustainability, and positive health effects. However, since e-bikes run on batteries, maintaining battery life can be difficult for e-bike users. The usage of rental batteries for e-bikes is one remedy for this problem, and in this research article we look into that possibility.

A. Abbreviations and Acronyms
• E-bikes – Electric Bikes
• LIBs – Lithium-ion Batteries

II. CURRENT STATE OF E-BIKE RENTAL SERVICE

E-bike rental services are becoming increasingly well liked as a convenient and economical mode of mobility. They are frequently found in metropolitan areas and provide riders with a practical method to travel across cities. E-bike rental services may be split between docked and dockless systems: docked systems require customers to pick up and return their bikes at specific docking stations, whereas dockless systems enable users to rent and return bikes from any location within a specified service area.
B. Battery Technology

Battery technology for e-bikes has substantially advanced with the invention of lithium-ion batteries (LIBs). LIBs are portable, have a long lifespan, and are easily rechargeable. Even though e-bikes are now more dependable and effective thanks to battery technology developments, battery management is still a problem for e-bike users.

C. Advantages and Disadvantages of Rental Batteries

E-bike rental batteries have several benefits over conventional batteries. First, renting batteries removes the inconvenient and time-consuming necessity for e-bike owners to maintain and charge their batteries. Second, renting batteries can be an affordable option for e-bike users who cannot afford to buy a brand-new battery or replace an old one. Third, since rental batteries are simply exchangeable, customers do not have to worry about running out of battery life when traveling for an extended period.

However, renting batteries has certain drawbacks as well. First, rental batteries might not be easily accessible in all locations, which would prevent certain users from using them. Second, renting a battery can be more expensive in the long run than buying and maintaining one. Finally, temporary batteries might not be appropriate for customers who need a consistent and dependable power supply or for long-distance journeys.

III. FEASIBILITY OF RENTAL BATTERY SYSTEMS

The viability of adopting rental batteries for e-bikes depends on several variables, including the cost of renting batteries, the availability of rental services, and the demand for e-bikes in a certain location. In metropolitan regions with large population densities and a strong demand for e-bike transportation, rental battery systems are more likely to be successful. Additionally, consumers who only make short-distance journeys, or who do not need a consistent and dependable power supply, may find rental battery systems more practical.

A. Architecture
1) The system's main component, the battery management system, controls the batteries using both hardware and software. It performs tasks including keeping track of the battery's condition, managing the frequency of charging and discharging, and controlling voltage and current levels.
2) Batteries are needed for the rental battery system so that clients may hire out a set of batteries. Depending on what the clients need, the batteries come in a variety of sizes and capacities.
3) Infrastructure for charging batteries: The batteries' ability to be charged and made ready for rental depends on the charging infrastructure. This can include charging devices like chargers, cables, and adapters.
4) Program for rental management: A program for rental management is needed to oversee the renting of batteries. Battery inventory management, rental history tracking, and the creation of invoices and receipts are all possible with this program.
5) User interface: To allow users to rent and return batteries, a user interface is necessary. Customers may use this, which can be a mobile or web-based application, to look for available batteries, make a reservation, and start the renting process.
6) Security and monitoring: To avoid theft, damage, or improper usage of the batteries, the rental battery system has to be safe and under constant observation. This can include GPS tracking devices, alerts, and security cameras.
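The inventory-tracking and rental-management components sketched in the architecture (items 1, 2 and 4) could look like the following. The class and method names are our own illustration, not a specification from the paper; the 80% readiness threshold is an assumed policy:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Battery:
    battery_id: str
    capacity_wh: int                    # rental batteries come in several capacities
    charge_pct: float = 100.0
    rented_to: Optional[str] = None

@dataclass
class RentalStation:
    """Tracks inventory, rental history, and readiness of batteries."""
    batteries: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def add(self, b: Battery) -> None:
        self.batteries[b.battery_id] = b

    def available(self, min_charge: float = 80.0) -> list:
        # Only sufficiently charged, un-rented batteries are offered
        return [b for b in self.batteries.values()
                if b.rented_to is None and b.charge_pct >= min_charge]

    def rent(self, battery_id: str, customer: str) -> None:
        b = self.batteries[battery_id]
        if b.rented_to is not None:
            raise ValueError("battery already rented")
        b.rented_to = customer
        self.history.append(("rent", battery_id, customer))

    def return_battery(self, battery_id: str, charge_pct: float) -> None:
        b = self.batteries[battery_id]
        self.history.append(("return", battery_id, b.rented_to))
        b.rented_to, b.charge_pct = None, charge_pct

station = RentalStation()
station.add(Battery("B1", 500))
station.add(Battery("B2", 750, charge_pct=40.0))  # still charging
station.rent("B1", "alice")
print(len(station.available()))   # 0: B1 is rented, B2 is not charged enough
```

A production system would back this state with a database and tie it to the invoicing, user-interface, and monitoring components listed above.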
Fig 1. Architecture Diagram

CONCLUSION

Rental batteries for e-bikes may be able to help e-bike owners handle one of their main problems: managing battery life. Battery technology for e-bikes has substantially improved with the invention of lithium-ion batteries, yet battery management is still a problem for e-bike users.

Renting batteries can be a viable, affordable method of managing batteries and can promote the use of e-bikes as a transportation alternative.

ACKNOWLEDGMENT

This study report on Rental Batteries for E-bikes has been completed, and we would like to thank everyone who helped. First and foremost, we would like to express our gratitude to our academic advisor, whose advice and assistance were crucial during the whole study process.

We also want to express our gratitude to the subject-matter experts who shared their knowledge and comments with us, helping us to refine our study. Their expertise and knowledge significantly improved our comprehension of the subject and gave us fresh viewpoints.

REFERENCES

[1]. Supriya M, Sangeetha V S, Subhasini A and Vaishnav M, "Retraction: Mobile Application in Rental Batteries for Electronic Vehicles", ICCCEBS 2021, Journal of Physics: Conference Series.
[2]. Molla Shahadat Hossain Lipu, Md. Sazal Miah, Shaheer Ansari, Safat B. Wali, Taskin Jamal, Rajvikram Madurai Elavarasan, Sachin Kumar, M. M. Naushad Ali, Mahidur R. Sarker, A. Aljanad and Nadia M. L. Tan, "Smart Battery Management Technology in Electric Vehicle Applications: Analytical and Technical Assessment toward Emerging Future Directions", Batteries 2022, 8, 219. https://doi.org/10.3390/batteries8110219
[3]. Hayder Ali, Hassan Abbas Khan and Michael G. Pecht, "Evaluation of Li-Based Battery Current, Voltage, and Temperature Profiles for In-Service Mobile Phones", IEEE, 2020.
[4]. Luiz Eduardo Cotta Monteiro, Hugo Miguel Varela Repolho, Rodrigo Flora Calili, Daniel Ramos Louzada, Rafael Saadi Dantas Teixeira and Rodrigo Santos Vieira, "Optimization of a Mobile Energy Storage Network", Energies 2022, 15, 186. https://doi.org/10.3390/en15010186
[5]. Kevin Hendersen, Novando Santosa, Sally Septia Halim, Aswin Wibisurya, "Mobile-Based Application Development For Car And Motor Rentals", Journal of Critical Reviews, ISSN 2394-5125, Vol. 7, Issue 8, 2020.
[6]. Lagadec M F, Zahn R, Wood V, "Characterization and performance evaluation of Li-ion battery separators", Nat. Energy 2019.
[7]. Lipu, M.H.; Hannan, M.; Karim, T.F.; Hussain, A.; Saad, M.H.M.; Ayob, A.; Miah, S.; Mahlia, T.I., "Intelligent algorithms and control strategies for battery management system in electric vehicles: Progress, challenges and future outlook", J. Clean. Prod. 2021, 292, 126044.
Index-Based Search Enabler For Movie NoSQL DBs Using MongoDB
Sumanth Kumar Mohapatra, Department of Information Science and Engineering, Presidency University, Bengaluru, India, 201910100549@presidencyuniversity.in
Yenimetla Venkata Krishna Chaitanya, Department of Computer Engineering, Presidency University, Bengaluru, India, 201910101153@presidencyuniversity.in
B Bharath Kumar Reddy, Department of Computer Science and Engineering, Presidency University, Bengaluru, India, 201910100917@presidencyuniversity.in
Ajith Kumar M, Department of Computer Science and Engineering, Presidency University, Bengaluru, India, 201910100285@presidencyuniversity.in
search functionality. To enhance query performance, this can entail adjusting the indexes or reassessing the data model.

• Adapt And Overcome: Last but not least, it is critical to keep an eye on and maintain the search functionality to make sure it keeps up with consumer demands. As the requirements change over time, this can entail expanding the data model with new indexes or fields.

IV. IMPLEMENTATION

• Data Preparation And Gathering: Start by importing the database into the MongoDB shell. Compile information on movies and the details related to them, such as cast, crew, release dates, genres, ratings, and reviews.

• Prior Testing: Try using the "find()" method without implementing any index, and use "explain("executionStats")" to evaluate its efficiency. Let us use the find query shown in Figure 1. Based on the report, we can see that it used a collection scan, which tells us that the search operation scanned documents that were not related to the filter; from this we can also conclude that time will be lost, especially while processing a large corpus of data, as shown in Figure 2.

• Design Of Indexes: Create indexes for the movie database's fields that are frequently searched for and retrieved, like movie title, director, and actor names. Create indexes on these fields using MongoDB's "createIndex()" method. (Note: we can also merge fields and create a compound index.) Let us make an index based on the previous filter query, as shown in Figure 3. As people commonly prefer well-rated movies, we can sort the rating in decreasing order ("-1").

Figure 3: Creating an index on a field (here it is based on the average votes)
Figure 4: Running the previous search filter, but with the presence of an index (compare the "executionTimeMillis" with Figure 2)

• Implementing Search Functionality As Well As Performing A Performance Evaluation: Based upon the previous search query, we now again try the
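The collection-scan versus index trade-off measured above with explain("executionStats") can be mimicked in miniature: a linear scan touches every document, while a field index (here just a Python dict, as a rough analogy for MongoDB's B-tree index built by createIndex()) jumps straight to the matching documents. The sample documents are invented for illustration:

```python
from collections import defaultdict

movies = [
    {"title": "Movie A", "director": "Lee", "avg_votes": 8.1},
    {"title": "Movie B", "director": "Rao", "avg_votes": 6.4},
    {"title": "Movie C", "director": "Lee", "avg_votes": 7.7},
]

# Collection scan: every document is examined for each query
def find_scan(coll, field, value):
    return [d for d in coll if d[field] == value]

# Build a simple index on "director" (analogous to createIndex())
index = defaultdict(list)
for doc in movies:
    index[doc["director"]].append(doc)

# Indexed lookup examines only the matching bucket
def find_indexed(idx, value):
    return idx.get(value, [])

assert find_scan(movies, "director", "Lee") == find_indexed(index, "Lee")
print(len(find_indexed(index, "Lee")))   # 2
```

The scan costs time proportional to the whole collection on every query, while the indexed lookup pays the build cost once; that is the difference the "executionTimeMillis" comparison between Figures 2 and 4 exposes.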
MACHINE LEARNING ALGORITHM FOR STROKE DISEASE CLASSIFICATION AND ALERT SYSTEM
DEEKSHITHA L, School of Information Science and Engineering, Presidency University, Bengaluru, Karnataka, India, deekshithal2002@gmail.com
Dr. SAMPATH A K, Associate Professor, Presidency University, Bengaluru, Karnataka, India, sampath.ak@presidencyuniversity.in
For both classification and regression issues, decision trees, a popular machine learning method, are used. They work by splitting the data into subsets based on the values of independent variables and then building a decision tree from the resulting subsets.

In Random Forest, an ensemble learning method, many decision trees are joined to improve the robustness and precision of the model. It is very useful when dealing with noisy or complex datasets.

Machine learning algorithms show promise in diagnosing and classifying strokes, enabling faster and more accurate decision-making. Using various well-known machine learning techniques, including Logistic Regression, Random Forest, SVM, and Decision Tree, we provide a novel method for classifying stroke illness in this study. The proposed model can efficiently and accurately classify stroke disease, enabling faster and more accurate decision-making for medical professionals.

Furthermore, our model includes a user-friendly graphical user interface (GUI) that can be used to alert patients about their stroke status via email. The GUI presents the results of the stroke classification to patients in a clear and concise manner, allowing them to take appropriate actions and seek medical attention if necessary.

By using machine learning algorithms to classify stroke disease, our proposed system offers several advantages over traditional methods. These include increased accuracy, faster diagnosis, and reduced costs associated with stroke treatment and care.

Overall, this research presents a significant contribution to the field of stroke diagnosis and classification, providing a powerful tool for medical professionals to make informed decisions and improve patient outcomes.

I. EXISTING WORK

The application of machine learning algorithms for stroke disease classification and early detection has been studied in a number of published papers. In one such study, D. D. Kim et al., with encouraging findings, used a machine learning model to predict the occurrence of stroke, classifying stroke risk factors using the Support Vector Machine (SVM) technique. Similarly, S. K. Roy et al. created an automated method for diagnosing strokes utilizing a variety of machine learning algorithms, such as SVM, Decision Tree, and Random Forest, with the Random Forest approach achieving the highest accuracy. A machine learning-based alert system for early stroke detection was also created by Y. Han et al., and it was successful in detecting stroke at an early stage.

The study titled "Classification of stroke disease using machine learning algorithms" is one of the other studies currently available on the topic; it proposes a prototype for classifying stroke using text mining tools and machine learning techniques. The work titled "Using machine learning models to improve stroke risk level classification" uses data from the 2017 National Stroke Screening Program to create models for stroke risk classification using machine learning techniques. In addition, a stroke prediction system that employs artificial intelligence to detect stroke using real-time bio-signals is proposed in "AI-Based Stroke Disease Prediction System Utilizing Real-Time Bio-Signals." These pieces can be incorporated into the research paper to provide a thorough analysis of the body of work on utilizing machine learning algorithms to classify stroke diseases.

Machine learning techniques are increasingly being used in research on stroke illness classification and early diagnosis. A machine learning model was created in one such study by M. Asadi et al. to predict the occurrence of stroke using various imaging biomarkers, attaining an accuracy of 86.5%. A Convolutional Neural Network (CNN) was used in a different study by G. Lee et al. to detect ischemic stroke in computed tomography (CT) images with an accuracy of 96.7%. In terms of early detection, D. Kim et al.'s study created an early warning system for stroke using machine learning algorithms, which was able to predict the onset of stroke with an accuracy of 96.5%. Similarly, A. T. M. Faisal et al. created a smart system for early stroke.

II. DATASET PREPARATION

The dataset includes 12 stroke prediction-related factors and 23,036 observations. This dataset was taken from Kaggle, and the variables it consists of are:

1. id: unique identifier for each observation
2. gender: gender of the patient (male or female)
3. age: age of the patient (in years)
4. hypertension: binary variable indicating whether the patient has hypertension (1 = yes, 0 = no)
5. heart_disease: binary variable indicating whether the patient has heart disease (1 = yes, 0 = no)
6. ever_married: binary variable indicating whether the patient has ever been married (Yes or No)
7. work_type: type of work of the patient (Private, Self-employed, Govt_job, Never_worked)
8. Residence_type: type of residence of the patient (Urban or Rural)
9. avg_glucose_level: average glucose level in the patient's blood (in mg/dL)
10. bmi: body mass index of the patient (in kg/m^2)
11. smoking_status: smoking status of the patient (formerly smoked, never smoked, smokes, Unknown)
12. stroke: binary variable indicating whether the patient had a stroke (1 = yes, 0 = no)

Several procedures would be involved in preparing this dataset for the study, including:
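The cleaning and encoding steps for the fields listed above might look like this in pandas. A tiny hand-made sample stands in for the actual Kaggle dataset, and mean-imputation plus simple categorical encoding are just one of the options the paper mentions:

```python
import numpy as np
import pandas as pd

# Miniature stand-in using the stroke dataset's column names
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "gender": ["Male", "Female", "Female", "Male"],
    "age": [67, 54, 80, 49],
    "hypertension": [0, 1, 0, 0],
    "heart_disease": [1, 0, 0, 0],
    "avg_glucose_level": [228.7, 105.9, np.nan, 171.2],
    "bmi": [36.6, np.nan, 32.5, 34.4],
    "smoking_status": ["formerly smoked", "never smoked", "Unknown", "smokes"],
    "stroke": [1, 0, 1, 0],
})

# Impute numeric gaps with the column mean (one cleaning option)
for col in ["avg_glucose_level", "bmi"]:
    df[col] = df[col].fillna(df[col].mean())

# Encode categorical fields numerically for the classifiers
df["gender"] = df["gender"].map({"Male": 0, "Female": 1})
df["smoking_status"] = df["smoking_status"].astype("category").cat.codes

print(df.isna().sum().sum())   # 0
```

The alternative to imputation, dropping the affected rows, trades dataset size for avoiding the bias a filled-in value can introduce.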
1) Data cleaning entails identifying any incorrect or missing data and determining how to deal with it (for example, impute missing values or remove observations with missing data). It could also entail looking for outliers and making a decision about how to deal with them.

2) Data transformation could entail scaling, normalizing, or establishing new variables based on existing ones in order to make the data more analytically useful.

5) Model training and evaluation: Using the training set as the basis, several machine learning models might be developed, assessed, and their performance compared with that of the testing set.

6) Reporting the findings: The study paper would present the analysis' findings, along with any conclusions and suggestions based on them. To guarantee that the right credit is given to the source of the data, the dataset would also need to be correctly referenced in the study.

In conclusion, there are several critical steps in the dataset preparation process for this dataset on stroke prediction, including data cleaning, transformation, feature selection, data splitting, model training and evaluation, and reporting of the results. The dataset must be prepared correctly in order to produce accurate and trustworthy results and to guarantee the validity of any conclusions or suggestions made as a result of the study.

III. ALGORITHM DETAILS

The algorithm details of the project are as follows:

1) Data Gathering and Preparation:
Gather data on the prevalence of stroke and the factors that may contribute to it, such as age, gender, smoking status, blood pressure, cholesterol level, etc. As part of the preprocessing, missing values are removed from the data, it is scaled and normalized, and categorical variables are transformed into numerical values.

2) Selection and Extraction of Features:
Determine which features are most crucial for the purpose of classifying strokes by using feature selection algorithms. Extract characteristics like age, gender, smoking status, blood pressure, cholesterol level, etc. from the preprocessed data.

3) Model Training:
Separate the training and test sets from the preprocessed data. Train the Decision Tree, Support Vector Machine (SVM), Random Forest, and Logistic Regression machine learning models. Each model's hyper-parameters should be tuned to increase performance.

4) Model Assessment:
Utilize criteria like accuracy, precision, recall, F1-score, and AUC-ROC to assess each model's performance. Choose the model with the best performance after comparing the four.

5) Validation and Testing:
Utilizing both simulated and actual stroke data, test the alert system. Use measurements like sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) to verify the alert system's performance.

IV. PROPOSED WORK

Medical emergencies like strokes can cause instantaneous death. Machine learning methods can be very helpful in preventing or minimizing the damage caused by this condition by detecting stroke early. In the proposed work, we attempt to forecast the incidence of stroke based on the existing causal factors using four different machine learning techniques: Decision Tree, Support Vector Machine (SVM), Random Forest, and Logistic Regression. The data is first cleaned and preprocessed, and then it is visualized using several graphs to reveal information about the dataset. We next use the prepared data to train the machine learning models, and we employ a graphical user interface (GUI) program to predict stroke for fresh input values.

The benefits of this planned effort include the potential for early stroke identification, which can stop a stroke from happening or lessen its severity, thereby improving patient outcomes. Additionally, once a patient is determined to be at risk for a stroke, drugs can be recommended and promptly given to limit the possibility of harm taking place.

The proposed technique, including the dataset preparation, machine learning algorithms employed, and the GUI application for prediction, will be thoroughly described in the study article. We will also explore the possible ramifications of our findings and show the study's outcomes, including accuracy and precision measures. Furthermore, we will contrast our strategy with other efforts on stroke detection, and discuss the drawbacks and potential future directions of our proposed system. Overall, the proposed effort has the potential to help create a system for early stroke identification that is more precise and successful, which would improve patient outcomes and lessen the burden of stroke on healthcare systems.

The proposed effort may also aid in the creation of a stroke detection system that is both affordable and effective. We can lessen the reliance on expensive diagnostic techniques, like MRI scans or CT scans, which are frequently unavailable to people in low-resource settings, by utilizing machine learning algorithms to identify stroke risk factors.

2. Training Dataset:
- Split the preprocessed data into a training dataset and a test dataset.
- The training dataset will be used to train the machine learning model.

3. Test Dataset:
- The test dataset will be used to evaluate the performance of the trained model.
REFERENCES
Network Intrusion Detection System with PCA
Using Machine Learning Classifiers
1st Peddi Prashanth Kumar, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101035@presidencyuniversity.in
2nd Priyanka N, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101360@presidencyuniversity.in
3rd Pathakamuri Bharath Kumar, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910100969@presidencyuniversity.in
Abstract—Utilising machine learning techniques, the major goal of this research is to find any network intrusions in any network system. In order to automatically identify attacks on computer networks and systems, we create a Network Intrusion Detection System (NIDS). This system makes use of a variety of machine learning techniques. Principal component analysis (PCA) is used in conjunction with several classification methods, including Support Vector Machines, Random Forest, and XgBoost, to construct an effective NIDS. An intrusion detection system's job is to find attacks. However, in order to lessen the severity of attacks, it is also crucial to identify them quickly.

Index Terms—Intrusion Detection System, Network Anomaly Detection, Features Selection, Dimensionality Reduction, NSL-KDD, Swarm Intelligence

I. INTRODUCTION

The evolution of telecommunications networks in the twenty-first century has moved swiftly away from circuit and packet switched networks and towards all-IP based networks. This progress has produced a unified environment where IP-based voice and data connectivity across apps and services is possible. Although communication network expansion has improved the sustainability of technologies, it has also opened up new unwelcome possibilities. The radio access networks are now susceptible to threats that were previously only applicable to fixed networks. The need for more intelligent security systems arises from the fact that threats are evolving to become more sophisticated.

Basic security measures like firewalls and antivirus scanners are reaching their capacity in dealing with the exponential increase in sophisticated Internet threats. Adding intrusion detection systems to the security layers can help raise the networks' overall security. Various attacks are observed against the network or system. The network system is subject to attacks like wormholes, black holes, and grey holes, among others. The purpose of these attacks is to steal data from the system. The intrusion detection system was thus introduced to protect the system from such attempts. IDS monitor system attacks and work to protect the system from them.

II. LITERATURE REVIEW

We conducted a literature review on network intrusion detection, and the following publications came out as being particularly important for understanding the previous research on the issue we were seeking to address as well as for understanding different solutions.

A. S. Waskle, L. Parashar, and U. Singh, "Intrusion Detection System Using PCA with Random Forest Approach," 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), 2020

Due to the advancement of wireless communication, there are several online security risks. The intrusion detection system (IDS) assists in identifying system attacks and identifies attackers. In the past, several machine learning (ML) techniques have been applied to IDS in an effort to improve intruder detection outcomes and boost IDS accuracy. In this paper, a method for creating an effective IDS that makes use of the random forest classification algorithm and principal component analysis (PCA) is proposed: the random forest aids in classification, while PCA helps organise the dataset by lowering its dimensionality. According to the results, the suggested strategy performs more accurately and efficiently than other methods like SVM, Naive Bayes, and
Decision Tree. The performance time for the suggested approach is 3.24 minutes, and the accuracy rate (…)

B. K. Park, Y. Song, and Y. Cheong, "Classification of Attack Types for Intrusion Detection Systems Using a Machine Learning Algorithm," 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService), 2018

In this article, we show the findings from our studies to assess the effectiveness of identifying various attack types, such as IDS, Malware, and Shellcode. We apply the Random Forest method to the numerous datasets created from the Kyoto 2006+ dataset, the most recent network packet data gathered for creating intrusion detection systems, in order to analyse the recognition performance. We conclude with discussions and plans for additional research.

C. A. Tesfahun and D. L. Bhaskari, "Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction," 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies, 2013

The importance of intrusion detection systems (IDS) in computer and network security cannot be overstated. The experiment dataset in this research was the NSL-KDD intrusion detection dataset, an improved version of the KDDCUP'99 dataset. Due to intrusion detection's fundamental properties, there is still a significant imbalance between the classes in the NSL-KDD dataset, which makes it more difficult to apply machine learning to intrusion detection efficiently. Synthetic Minority Over-sampling Technique (SMOTE) is used in this study to address class imbalance by applying it to the training dataset. A reduced feature subset of the NSL-KDD dataset is created using a feature selection method based on Information Gain. The suggested intrusion detection framework employs Random Forests as a classifier. According to empirical findings, building an IDS that is efficient and effective for network intrusion detection performs better when using the Random Forests classifier with SMOTE and information gain-based feature selection.

III. PROPOSED SYSTEM AND ADVANTAGES

We suggest this system, which detects intrusions using machine learning algorithms like SVM, Random Forests, and XgBoost. These methods provide a quicker reaction to the threat since they can identify the probability of an assault more quickly than the current methods. The high cardinality in this system is reduced via principal component analysis.

A. Advantages
• High accuracy
• Time saving
• Low complexity
• Easy to scale

IV. METHODOLOGY

A. Requirements

1) Functional Requirements: The fundamental prerequisites to operate the programme are the same as those needed to run the PyCharm IDE because the entire application is network-based.
• Python v3.6+
• PyCharm IDE
• RAM: 4 GB minimum
• Hard Disk: 128 GB+
• OS: Windows
• Libraries: Pandas, NumPy, scikit-learn, XGBoost

2) Applications:
• Employed by banks and other financial institutions to stop unauthorised access. Such a mechanism is necessary for governments and intelligence agencies to protect their sensitive information. Such a mechanism is necessary for B2C companies and tech firms to increase the security of their users' personal data.
• Usability: The importance of intrusion detection systems (IDS) in computer and network security cannot be overstated. This study's experiment dataset was the NSL-KDD intrusion detection dataset, an improved version of the KDDCUP'99 dataset.
• Purpose: A detective tool called an intrusion detection system (IDS) is used to find hostile (including policy-violating) activities. A preventive tool, an intrusion prevention system (IPS), is primarily made to both identify and prevent hostile activity. IDS and IPS can be divided into two categories, network-based and host-based, depending on where they are physically located in the infrastructure and the level of security needed. The precise type used depends on strategic considerations, although both serve the same purpose.
• Efficiency: The effectiveness and quality of a NIDS, particularly its classification accuracy, detection speed, and processing complexity, are adversely affected by redundant and irrelevant network properties. In order to maximise the effectiveness of the NIDS, numerous feature selection strategies are used in this paper. The filter, wrapper, and hybrid feature selection approach categories are used. As a detection model, Support Vector Machine (SVM) is used to categorise the behaviour of network connections into normal and abnormal traffic.
• Reliability: The users of this application must all identify and detect the network system, which increases its dependability. Utilising an intrusion detection system makes it very simple to identify unauthorised networks. The network data source is the NSL-KDD dataset, and the network traffic classification is done using the SVM method.

B. System Design

1) UML DIAGRAM: Unified Modelling Language is known as UML. A general-purpose modelling language with standards, UML is used in the field of object-oriented software
engineering. The Object Management Group oversees and developed the standard. The objective is for UML to establish itself as a standard language for modelling object-oriented computer programmes. UML now consists of a meta-model and a notation as its two main parts. In the future, UML might also be coupled with or added to in the form of a method or process. The Unified Modelling Language is a standard language for business modelling, non-software systems, and describing, visualising, building, and documenting the artefacts of software systems. The UML is an amalgamation of best engineering practises that have been effective in simulating huge, complicated systems. The UML is a crucial component of the software development process and the creation of object-oriented software. The UML primarily employs graphical notations to convey software project design.

GOALS: The primary goals in the design of the UML are as follows:
1. To enable users to create and exchange meaningful models, offer them a ready-to-use, expressive visual modelling language.
2. Provide tools for specialisation and extendibility of the key principles.
3. Not depend on a certain development process or programming language.
4. Establish a formal framework for comprehending the modelling language.
5. Promote the market expansion of OO tools.
6. Support more advanced development ideas including partnerships, frameworks, patterns, and components.
7. Embrace the best practises.

2) USE CASE DIAGRAM: In the Unified Modelling Language (UML), a use case diagram is a specific kind of behavioural diagram that results from and is defined by a use-case analysis. Its objective is to provide a graphical picture of a system's functionality in terms of actors, their objectives (expressed as use cases), and any dependencies among those use cases. A use case diagram's primary objective is to identify which system functions are carried out for which actor. The system's actors can be represented by their roles.

C. SEQUENCE DIAGRAM: In the Unified Modelling Language (UML), a sequence diagram is a type of interaction diagram that demonstrates how and in what order processes interact with one another. It is a Message Sequence Chart construct. Event diagrams, event scenarios, and timing diagrams are other names for sequence diagrams.

1) COLLABORATION DIAGRAM: The following collaboration diagram uses a numbering scheme to show the order in which the methods are called. The collaboration diagram is described using the same order management system, and the method calls are comparable to those in a sequence diagram. Nevertheless, the collaboration diagram illustrates the object organisation, whereas the sequence diagram only describes it.
3) ACTIVITY DIAGRAM: Activity diagrams are visual depictions of workflows with choice, iteration, and concurrency supported by activities and actions. Activity diagrams can be used to depict the operational and business workflows of system components in the Unified Modelling Language. An activity diagram demonstrates the total control flow.

tables and their attributes. Let's look at a straightforward ER diagram.

6) DFD DIAGRAM: A Data Flow Diagram (DFD) is a common tool for illustrating how information moves through a system. A good deal of the system requirements can be graphically represented by a tidy and understandable DFD. It can be done manually, automatically, or both. It demonstrates how data enters and exits the system, what modifies data, and where it is stored. A DFD is used to illustrate the scope and bounds of a system as a whole. It can be applied as a method for communication between a systems analyst and any participant in the system that serves as the foundation for system redesign.

The machine learning techniques used in this system are:
• Support Vector Machines (SVM)
• Principal Component Analysis
• Decision Tree
• K-Nearest Neighbour

Random Forest Regression: as an ensemble learning technique for classification, regression, and other tasks, random forests or random decision forests build a large number of
decision trees during the training phase and output the class that represents the mode of the classes (classification) or the mean/average prediction (regression) of the individual trees. The tendency of decision trees to overfit their training set is corrected by random decision forests. Although random forests frequently outperform decision trees, gradient boosted trees can be more accurate than random forests; however, their effectiveness may be impacted by data peculiarities.
• Every decision tree has a big variance, but the final variance is modest when we combine them all in parallel.
• When a classification problem arises, the majority voting classifier is used to determine the output. The final output in a regression problem is the mean of every output. An ensemble method that can handle both classification and regression problems is a random forest.

Support Vector Machines (SVMs): Finding a hyperplane in an N-dimensional space (where N is the number of features) that clearly classifies the data points is the goal of the support vector machine algorithm.
• Support vector machines, often known as SVMs, are useful for both classification and regression applications. However, they are frequently employed for classification goals.
• Decision boundaries known as hyperplanes assist in categorising the data points. Different classes can be given to the data points that fall on each side of the hyperplane.
• Support vectors are data points that are closer to the hyperplane and have an impact on the hyperplane's position and orientation. By utilising these support vectors, we increase the classifier's margin. The hyperplane's location will vary if the support vectors are deleted. These are the ideas that guide the development of our SVM.

A. XgBoost

The gradient boosting framework is used by the ensemble machine learning method XgBoost, which is decision-tree based. With new enhancements like regularisation, the model's implementation provides the features of the scikit-learn and R implementations. Three primary types of gradient boosting are supported:
• The gradient boosting algorithm, commonly known as the gradient boosting machine, which includes the learning rate.
• Stochastic gradient boosting with sub-sampling at the row, column, and column-per-split levels.
• Gradient boosting with regularisation at both the L1 and L2 levels.

On the whole, XgBoost is quick: incredibly quick compared to other gradient boosting solutions. Structured or tabular datasets for classification and regression predictive modelling problems are dominated by XGBoost.

B. Principal Component Analysis (PCA)

• In order to reduce the dimensionality of huge data sets, a technique known as principal component analysis, or PCA, is frequently utilised. PCA works by condensing a large collection of variables into a smaller set that still retains the majority of the information in the larger set.
• Accuracy naturally suffers as a data set's number of variables is reduced, but the secret to dimensionality reduction is to sacrifice some accuracy for simplicity, because smaller data sets are easier to examine and visualise, and machine learning algorithms can analyse the data much more quickly and easily.
• In conclusion, the principle of PCA is straightforward: minimise the number of variables in a data set while maintaining as much information as possible.

VI. FLOW CHART

ACKNOWLEDGMENT

We owe a great deal to our mentor Ms. Bhavya B, Assistant Professor, School of Computer Science Engineering, Presidency University, for her motivational leadership, insightful suggestions, and giving us the opportunity to fully express our technical prowess for the completion of the project work. We express our gratitude to our family and friends for their great support and inspiration in helping us complete this project.

REFERENCES

1 Jafar Abo Nada; Mohammad Rasmi Al-Mosa, "A Proposed Wireless Intrusion Detection Prevention and Attack System," 2018 International Arab Conference on Information Technology (ACIT)
2 Kinam Park; Youngrok Song; Yun-Gyung Cheong, "Classification of Attack Types for Intrusion Detection Systems Using a Machine Learning Algorithm," 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService)
3 S. Bernard, L. Heutte and S. Adam, "On the Selection of Decision Trees in Random Forests," Proceedings of International Joint Conference on Neural Networks, Atlanta, Georgia, USA, June 14-19, 2009, © 2009 IEEE
4 A. Tesfahun, D. Lalitha Bhaskari, "Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction," 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies, © 2013 IEEE
Implementing a CMS-Enabled Question and Answer Site for Departmental Use
stores the content in its online real-time datastore called Content Lake, so we do not need to worry about managing a separate database for the CMS. The data for a Sanity project can be queried from a frontend with Sanity's open-source query language called GROQ through their HTTP API.

Sanity also provides Sanity Studio, a React application that allows developers and editors to manage content. Schemas for the different types of content can be created quickly with plain JavaScript objects and Sanity Studio will recognize that and create an editing environment for managing those types of content. Since it's a React application it's also very easy to integrate it with a Next.js application with the help of some packages.

XXII. ARCHITECTURE

A. Site map

updated/regenerated with Incremental Static Regeneration (ISR).

• New question page at /questions/new: This page just contains a simple form for posting a new question and does not have any other dynamic content. Thus, this page can be statically generated using SSG.

• API route for posting a question at /api/question: The new question form makes a POST request to this route with the question details as the request body to create a new question in the Content Lake. Having the logic to create the question in the API route instead of client side ensures that no sensitive API keys that will give full read-write access to the Content Lake are exposed to the client.

• Sanity Studio at /admin: Requests to this route get handled by the embedded Sanity Studio which shows a content editing interface for the admin users or editors (verified experts).

XXIII. SYSTEM ARCHITECTURE
that sensitive configuration information and API tokens are never exposed to the client.

XXIV. APPLICATION WALKTHROUGH

At build time the Next.js application fetches data from Sanity and generates pages using SSG for all the questions available at that point of time. The questions list page gets rendered with SSR on each request. Thus, the questions list page contains links for all questions, even for those questions which have been created after the last build.

The question details page for a particular question does not get generated immediately after it has been posted by a user. It gets generated on demand when a user tries to visit the details page for such a question by clicking its link from the always up-to-date questions list page.

Similarly, the question details for a particular question page do not update immediately after an answer is posted for it. The first request to the question details page after an answer has been posted results in the same old generated static page loading quickly without the answer. But, in the background Next.js checks if the data on the page is out-of-date and rebuilds the page in the background. Then the next user that visits the same details page sees the updated page with the answer. This way question details pages can get updated when data changes while still having the performance benefits of static generation. This technique is called ISR, where the entire application does not need to be rebuilt to update all the pages generated through SSG.

XXV. IMPLEMENTATION DETAILS

This section will give a high-level view of how we built a Q&A platform called Questo using the architecture described in the previous section. This section will contain excerpts of code from the public repository hosted on GitHub [5]. Since the focus of this paper is not on the user interface of the site, it will not show excerpts about code relating to layout and styling of the user interface.

A. Creating Next.js and Sanity Projects

First, we created a Sanity project and noted down the project ID and dataset name. Next, we created a Next.js 13 application with the pages directory configuration, where routing is decided by the file structure inside the pages directory. E.g., the pages/index.tsx file determines what content shows up at the path /, pages/questions/index.tsx determines what content shows up at /questions and pages/questions/new.tsx determines what content shows up at /questions/new.

Note the .tsx extension, which is the TypeScript equivalent for .jsx files usually used in React applications, as this Next.js project was configured to use TypeScript during initialization for improving developer experience and ensuring type safety.

B. Integrating Sanity Studio with the Next.js Application

We used the official next-sanity package from Sanity.io to integrate Sanity tools like the client SDK as well as Sanity Studio with the Next.js application. First, we configured public runtime and server runtime configurations using environment variables in next.config.js as seen in Figure 5.

Figure 5: next.config.js

The SANITY_TOKEN refers to the API token we generated in the Sanity project so that our application can have write access to the Content Lake, which is necessary for creating new questions. Since the API token gives full read-write access to the Content Lake, it should never be exposed to the public and instead should only be accessed server side, and hence it is defined inside serverRuntimeConfig.

After configuring the environment variables we set up the configuration for Sanity Studio in sanity.config.ts as seen in Figure 6.

Figure 6: sanity.config.ts
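The configuration figure does not survive in this text-only version, so the following is a minimal, self-contained sketch of the kind of settings such a studio configuration holds. The name, title, projectId, and dataset values here are placeholder assumptions rather than the real project's settings, and a local interface stands in for the defineConfig helper that the actual file would import from the sanity package.

```typescript
// Hypothetical sketch of a Sanity Studio configuration object.
// All concrete values are placeholders; the real sanity.config.ts
// wraps a similar object with defineConfig from the "sanity" package
// and pulls projectId/dataset from the public runtime configuration.
interface StudioConfig {
  name: string;      // internal name of the studio workspace
  title: string;     // title shown in the Sanity Studio UI
  projectId: string; // Sanity project ID (placeholder here)
  dataset: string;   // dataset the studio reads from and writes to
  basePath: string;  // route where the embedded studio is mounted
}

export const studioConfig: StudioConfig = {
  name: "default",
  title: "Questo Studio",          // assumed title
  projectId: "your-project-id",    // assumption: injected from env vars
  dataset: "production",           // assumed dataset name
  basePath: "/admin",              // matches the /admin route described above
};
```

In the real project these values come from environment variables via the runtime configuration rather than being hard-coded, so the same source works for local development and the deployed site.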
This configuration is used in pages/admin/[[...index]].tsx where the Sanity Studio is set up. The [[...index]] part of the file name indicates that this is a catch-all route, meaning this file is responsible for responding to any requests to paths starting with /admin. Figure 7 shows an excerpt from the file showing a simple component that just configures the Sanity Studio using components from the next-sanity package and returns it to handle all requests to the admin routes.
Figure 9: schemas/question.ts
Figure 10 shows an excerpt from schemas/answer.ts that shows how the schema was defined for the answer content type.
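Since the schema figures are not reproduced in this text-only version, the following is a rough, self-contained sketch of how such question and answer content types could be declared as plain JavaScript objects. The field names and the reference from answer to question are illustrative assumptions; the exact fields in the repository may differ.

```typescript
// Hypothetical sketch of Sanity document schemas for the question and
// answer content types; field names are assumptions, not the repo's code.
interface SchemaField {
  name: string;
  title: string;
  type: string;            // e.g. "string", "text", "datetime", "reference"
  to?: { type: string }[]; // target document types for reference fields
}

interface DocumentSchema {
  name: string;
  title: string;
  type: "document";
  fields: SchemaField[];
}

export const question: DocumentSchema = {
  name: "question",
  title: "Question",
  type: "document",
  fields: [
    { name: "title", title: "Title", type: "string" },
    { name: "body", title: "Body", type: "text" },
    { name: "createdAt", title: "Created at", type: "datetime" },
  ],
};

export const answer: DocumentSchema = {
  name: "answer",
  title: "Answer",
  type: "document",
  fields: [
    { name: "body", title: "Body", type: "text" },
    // assumption: each answer holds a reference to the question it answers
    { name: "question", title: "Question", type: "reference", to: [{ type: "question" }] },
  ],
};

// Exported together, mirroring the schemas/index.ts export described below.
export const schemaTypes = [question, answer];
```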
These schemas are exported as schemaTypes
from schemas/index.ts for use in Sanity
related configurations like in
sanity.config.ts.
D. Configuring the Sanity client

Next, we created a Sanity client which the application uses to fetch and create data in the Content Lake. Figure 12 shows an excerpt from sanity-client.ts that shows the configuration for the created client.

Figure 12: sanity-client.ts

Note the useCdn option which has been set to false. Sanity is able to provide data through Content Delivery Networks (CDNs) quickly by means of caching and other techniques, because of which the data may not always be up to date. But our questions list page needs to be always up to date; that's why we are not using the CDN with this client. Since this client has the API token in it, we use it only in server-side contexts.

E. Fetching data for the questions list page

Figure 13 shows an excerpt from pages/questions/index.tsx that shows how the data is fetched for the page at /questions by defining a function called getServerSideProps, which fetches the data on each request to that page and passes it as props for the React component that will render the user interface for the questions list page.

Figure 14: Questions list page

F. Fetching data and generating question details pages

During each build of the Next.js application we want to statically generate HTML pages for the questions that already exist in the Content Lake. Since /questions/[id] is a dynamic route, Next.js needs to know for which possible values of [id] it needs to generate pages. Figure 15 shows an excerpt from pages/questions/[id].tsx that shows how to do this by defining a function called getStaticPaths.

Figure 15: getStaticPaths in pages/questions/[id].tsx

All this piece of code is doing is fetching the list of all questions and telling Next.js the possible values for the dynamic route parameter [id] in the route /questions/[id]. The fallback option in the returned object ensures that when a question is created through the application after build time and a user visits the details page for that question, Next.js will not show a 404 Not Found page, and will instead show the user some fallback content like a loading status till the page is built by fetching data. For all possible routes determined by the paths variable in the returned object, Next.js statically generates the pages by fetching data for each route by calling the getStaticProps function defined in the same file as seen in Figure 16.
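Because the code figures are not reproduced in this text-only version, here is a simplified, self-contained sketch of the getStaticPaths/getStaticProps pattern described above. The Content Lake query is replaced by a hard-coded stub, and the question shape and helper names are assumptions; the real implementation queries Sanity with GROQ and exports async functions for Next.js to call.

```typescript
// Self-contained sketch of the getStaticPaths / getStaticProps pattern
// for the dynamic route /questions/[id]. The Sanity client is replaced
// by an in-memory stub; shapes and helper names are assumptions.
interface Question {
  id: string;
  title: string;
}

// Stub standing in for a GROQ query against the Content Lake.
function fetchAllQuestions(): Question[] {
  return [
    { id: "q1", title: "What is ISR?" },
    { id: "q2", title: "How does SSG work?" },
  ];
}

// Tell Next.js which [id] values to pre-render at build time.
export function buildStaticPaths(questions: Question[]) {
  return {
    paths: questions.map((q) => ({ params: { id: q.id } })),
    // A non-false fallback lets pages for questions created after build
    // time be generated on demand instead of returning a 404.
    fallback: true,
  };
}

// Fetch the data for one question page; a revalidate interval enables
// ISR, so the page is rebuilt in the background when its data changes.
export function buildStaticProps(id: string) {
  const question = fetchAllQuestions().find((q) => q.id === id) ?? null;
  return { props: { question }, revalidate: 60 };
}

export const staticPaths = buildStaticPaths(fetchAllQuestions());
```

The split into small pure helpers is only for illustration; in the actual file the same logic lives directly inside the exported getStaticPaths and getStaticProps functions.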
make a POST request to the /api/questions API route with the question data as the request body. The handler for this API route is defined in pages/api/questions.ts as seen in Figure 18.
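The handler figure itself is not reproduced here, so the sketch below captures the idea of the route: accept only POST requests, validate the body, and perform the write server side so that the token with read-write access never reaches the client. The request/response shapes and helper names are simplified assumptions, not the repository's actual code.

```typescript
// Simplified sketch of a handler for POST /api/questions. The Sanity
// write is replaced by an in-memory store; the types are local
// stand-ins for Next.js's NextApiRequest/NextApiResponse.
interface ApiRequest {
  method: string;
  body: { title?: string; body?: string };
}

interface ApiResult {
  status: number;
  json: unknown;
}

// In-memory stand-in for creating a document in the Content Lake via a
// server-side client holding the write token.
const store: { title: string; body: string }[] = [];

export function handleCreateQuestion(req: ApiRequest): ApiResult {
  if (req.method !== "POST") {
    return { status: 405, json: { error: "Method not allowed" } };
  }
  const { title, body } = req.body;
  if (!title || !body) {
    return { status: 400, json: { error: "Missing title or body" } };
  }
  store.push({ title, body });
  return { status: 201, json: { ok: true } };
}
```

Keeping this logic in an API route means the browser only ever sends the question fields; the token defined in serverRuntimeConfig stays on the server.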
Figure 20: New question page

I. Configuring Cross-Origin Resource Sharing (CORS) and deployment

The last thing to configure was the CORS origins in the Sanity project. Since the Next.js project runs locally on http://localhost:3000, we added it as a CORS origin with credentials allowed in the Sanity project, so that Sanity responds to queries from the application running locally. We also deployed the application with Vercel at https://questo-coral.vercel.app, so we added that URL as a CORS origin as well.

XXVI. PERFORMANCE AND SCALABILITY

Since the question details pages are built and rebuilt with SSG and ISR, users will face very small loading times for those pages. But the questions list page fetches data on each request, so it will take a bit more time to load compared to the other pages in this application. One solution to this is to show only questions that have been answered; since the list would then not have to be up to date on every request, it could be generated with SSG and updated with ISR. The downside of this approach is that a user would not be able to tell whether a particular question has already been asked if it has not been answered yet. Hence, this approach was not used in our implementation.

Regarding the execution of serverless functions such as API routes and SSR pages, those depend on the infrastructure of the platform on which the application is deployed. In our case, we deployed the application on Vercel, which is dynamically scalable. Its edge network ensures that serverless logic is executed as close as possible to the user, which improves the performance of the website.

XXVII. CONCLUSION

In this paper, we discussed the issues with regard to participation in large-scale Q&A platforms, such as fear of judgement for people asking questions and the uncertain quality of answers given by users, and how those issues can be avoided if the platform is designed and intended to be used at the departmental level. We discussed the architecture of a CMS-enabled Q&A platform for departmental use. Then we showed a high-level view of the implementation of a Q&A platform called Questo that we created, built with Next.js and the Sanity headless CMS. Finally, we discussed the performance and scalability of the application. Thus, a Q&A platform that allows anyone to ask questions without signing in, while letting only verified experts answer them, is a good solution for small-scale departmental use.

REFERENCES

[29] Ma, Haiwei, Hao-Fei Cheng, Bowen Yu, and Haiyi Zhu. "Effects of Anonymity, Ephemerality, and System Routing on Cost in Social Question Asking." Proceedings of the ACM on Human-Computer Interaction 3, no. GROUP (2019): 1-21.
[30] "Next.js by Vercel - The React Framework for the Web", nextjs.org, https://nextjs.org/ (accessed Mar. 3, 2023)
[31] "Tailwind CSS - Rapidly build modern websites without ever leaving your HTML.", tailwindcss.com, https://tailwindcss.com/ (accessed Mar. 10, 2023)
[32] "The Composable Content Cloud – Sanity.io", sanity.io, https://www.sanity.io/ (accessed Mar. 17, 2023)
[33] "srijan-nayak/Questo: A Q&A site built with Next.js and Sanity", github.com, https://github.com/srijan-nayak/Questo (accessed Apr. 20, 2023)
Logging Library for APM on a Microservice-Based Web Application
1st Mr. Sunil Kumar Sahoo, Department of CSE, Presidency University, Bengaluru, Karnataka. sunilkumarsahoo@presidencyuniversity.in
2nd Nihal G, Department of CCE, Presidency University, Bengaluru, Karnataka. 201910100683@presidencyuniversity.in
3rd Rohan Muthanna MM, Department of IST, Presidency University, Bengaluru, Karnataka. 201910100079@presidencyuniversity.in
Abstract - This project aims to create a logging library for Application Performance Monitoring (APM) within a microservice architecture. The four primary components of the application are User-Management, Policy-Management, Claims-Management, and Billing-Management. Additional components have been implemented to support the microservices, such as the Service Registry microservice, the API Gateway microservice, and the Config Server microservice. Logging is an important element of this project, and the SLF4J logging library is used to generate logs within the microservices. These logs are exported to a separate file for easier separation, monitoring, and analysis.

This project demonstrates a well-structured microservice-based architecture that makes use of industry-standard tools and technologies. The logging library, in conjunction with the interface to APM tools, offers thorough monitoring and analysis of the performance of the insurance online application. The system achieves scalability, resilience, and maintainability by utilizing efficient microservice communication and configuration management. APM tools such as Dynatrace and New Relic provide useful insights into the behavior of the application, performance indicators, and potential bottlenecks. Postman is used for API testing and managing the application's frontend.

Keywords – Logging Library, APM, Microservice Architecture, Spring Boot, Insurance Domain, MySQL, User-Management, Policy-Management, Claims-Management, Billing-Management, Service Registry, API Gateway, Config Server, SLF4J, Dynatrace, New Relic.

XXVIII. INTRODUCTION

Microservice architectures have grown in favor in today's digital landscape due to their scalability, flexibility, and ease of maintenance. The goal of this project is to provide a logging library for Application Performance Monitoring (APM) within a microservice-based insurance online application. The logging library is essential for tracking and analyzing application behavior, performance, and potential problems.

Because of its complexity and the necessity for effective management of user information, policy details, claims, and billing, the insurance domain was chosen as the environment for this project. The application's backend is created with the Spring Boot framework, which is noted for its simplicity and broad ecosystem, while MySQL acts as the database for storing and retrieving data.

User-Management, Policy-Management, Claims-Management, and Billing-Management are the four primary components, or microservices, of the project. Each microservice focuses on a certain function and manages the associated data and procedures. This modular strategy improves the system's overall agility by allowing for greater organization, scalability, and independent deployment of microservices.

Several additional components have been added to support the microservices. The Netflix Eureka Server-powered Service Registry microservice provides a centralized mechanism for registering and discovering microservices. This enables efficient communication between multiple microservices, allowing them to locate and communicate with one another in real time.

The API Gateway microservice serves as a single point of entry for the various microservices. It streamlines system interaction by providing a consistent base URL and directing requests to the appropriate microservice based on established rules. This abstraction layer improves the overall user experience and simplifies the management of the various endpoints.

The Config Server microservice retrieves configuration information from a Git repository, providing central management and quick configuration modifications for the microservices. This decouples configuration details from the code, making it easier to change and maintain configuration settings without requiring microservices to be redeployed.

The logging library, which makes use of the SLF4J logging framework, is a critical component of this project. The library collects log data from the microservices and outputs it to a separate file for easier monitoring and analysis. This logging technique allows developers and system administrators to effectively follow application behavior, diagnose errors, and optimize performance.

The project is coupled with APM technologies such as Dynatrace and New Relic to improve monitoring capabilities even further. These tools provide detailed insights into the performance indicators of the application, such as response times, error rates, and resource utilization. This connectivity makes it easy to discover bottlenecks, solve difficulties, and optimize the overall performance of the system.

Postman is used on the front end for API testing and management. This sophisticated tool enables developers to test and evaluate the microservices' APIs, ensuring their functionality and correctness. Postman also promotes seamless integration and speedier development cycles by facilitating efficient cooperation between the frontend and backend teams.

Overall, this project provides a complete solution for creating a microservice-based insurance online application. The system provides scalability, performance, and maintainability by exploiting the benefits of microservice design, Spring Boot, logging libraries, APM tools, and other supporting components. The logging library, in particular, is critical to enabling effective monitoring and analysis, assuring the application's seamless operation in the dynamic insurance environment.

XXIX. CURRENT STATE OF THE MICROSERVICE-BASED INSURANCE WEB APPLICATION

Insurance websites provide customers with tools to compare insurance options, track and submit claims online, and use customer support chatbots to get assistance. Customers may need to speak with an agent to finalize their policy.

Insurance-based microservices are gaining popularity in the industry. Insurance companies can use microservices to create smaller, more focused applications that can be easily integrated with other systems. For example, an insurance company may create a microservice that handles claims processing, while another microservice manages policy administration. These services can then be combined to create a larger insurance application. Microservices also enable insurance companies to quickly develop new products and services, test them in a controlled environment, and scale them up as needed.

A. Abbreviations and Acronyms
• APM – Application Performance Monitoring
• SLF4J – Simple Logging Facade for Java

B. Logging Library
A logging library is a framework for creating, structuring, formatting, and publishing log events.
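As a rough analogy to this setup (the project itself uses SLF4J in Java, not Python), the export-to-a-separate-file idea can be sketched with Python's standard logging module; the service and file names here are illustrative:

```python
import logging

# One named logger per service, mirroring how each microservice
# obtains its own SLF4J logger in the project.
logger = logging.getLogger("policy-management")
logger.setLevel(logging.INFO)

# Export log events to a separate file for later monitoring and analysis.
handler = logging.FileHandler("policy-management.log", mode="w")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(handler)

logger.info("policy 42 created")
logger.error("billing lookup failed")
handler.flush()
```

An APM agent or log shipper can then tail `policy-management.log` without the services themselves knowing anything about the monitoring backend.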
It offers APIs to send events from applications to destinations, similar to agents.

C. SLF4J [Logging Technology]
SLF4J is a logging framework that offers a straightforward and adaptable layer over several logging frameworks. It acts as a front, or abstraction, for several logging systems, making it easy for developers to move between them without having to change the code. It offers a consistent and user-friendly interface that can be adjusted for different logging levels, message formats, and output locations. SLF4J is widely used in Java projects and is regarded as a standard logging API.

D. Advantages and Disadvantages of Microservices
Scalability, resilience, and agility are benefits, while complexity, operational overhead, and distributed-system difficulties are drawbacks.

E. Advantages and Disadvantages of APM
By locating and fixing bottlenecks and problems in real time, application performance monitoring (APM) can help enhance the overall performance and user experience of an application. APM tools may need specialized knowledge to be used properly, and they can be difficult and expensive to adopt. Additionally, they might increase the monitored application's overhead.

F. Advantages and Disadvantages of Logs
Logs can offer useful information for performance analysis, security audits, and troubleshooting. Logs have a few drawbacks, including the fact that they can be challenging to interpret and analyze, that they can take up a lot of storage, and that they might contain sensitive data that needs to be secured.

Dynatrace APM collects data such as log files, network traffic, and infrastructure measurements, and analyzes it to find trends, abnormalities, and performance problems. It also offers root cause analysis, proactive alerts, and deep code-level diagnostics. Additionally, Kubernetes, containers, and cloud infrastructure services can be monitored by Dynatrace APM to get a holistic picture of application performance.

XXX. OVERVIEW OF INSURANCE MICROSERVICE WEBSITES

A. Architecture
1) The architecture of the insurance website in this project is based on a microservice approach, in which various application components are developed and deployed independently as small, decoupled services.
2) The microservices' primary data storage solution is the H2 database, which enables effective and scalable data management.
3) The service registry is used to manage the complexity of the overall application by keeping track of all the accessible microservices and their instances.
4) To provide accurate logging and debugging of the microservices, SLF4J is used as the logging library.
5) Docker is used to containerize the microservices, which streamlines deployment and makes it simpler to handle many microservice iterations.
6) To track and analyze the performance of the microservices and the entire application, the log details are finally forwarded to an APM (Application Performance Monitoring) tool like Dynatrace or New Relic.

B. Figures
Fig. 1 Project architecture.
Fig. 2 Microservice communication design.
Fig. 3 Database architecture.

CONCLUSION

The use of microservices in the building of an insurance website offers advantages over a traditional monolithic architecture, such as independent development and deployment of the various application components, the H2 database solution, a service registry, Docker containerization, and APM tools. These tools help to manage the complexity of the microservices, simplify the deployment process, and improve optimization and performance. H2, a service registry, and Docker containerization are all essential for creating intricate online applications like insurance websites.

ACKNOWLEDGMENT

We thank everyone who contributed to this study on Logging Library for APM on Insurance - A Microservice Based Web Application. We would first like to express our gratitude to our academic advisor, whose suggestions and help were essential during the entire study period. We also wish to thank the subject-matter experts who shared their insights and criticism with us, allowing us to improve our research. Their knowledge and skill considerably increased our understanding of the subject and provided us with new perspectives.

REFERENCES

[34] Doshi, P., & Kulkarni, P. (2018, May). A review on microservices architecture. In 2018 International Conference on Communication, Computing, and Internet of Things (IC3IoT) (pp. 1-6). IEEE.
[35] Thangavel, P., & Kalaiselvi, M. (2021). Application Performance Management Tools: An Overview. International Journal of Engineering and Advanced Technology, 10(4), 1844-1849.
[36] Amorim, E. R., Leão, R. S., da Silva, J. S., & de Oliveira, J. P. (2020). An Exploratory Study on the Use of Logging and Monitoring Techniques for Microservices-Based Applications. IEEE Latin America Transactions, 18(3), 520-527.
[37] Zhou, X., Li, Z., Wang, H., & Zhao, S. (2020, December). Microservice-Based Performance Monitoring Framework for Web Applications. In 2020 20th International Conference on Computational Science and Its Applications (ICCSA) (pp. 229-242). IEEE.
[38] Logback - https://logback.qos.ch/
[39] SLF4J - https://www.slf4j.org/
[40] ELK Stack (Elasticsearch, Logstash, and Kibana) - https://www.elastic.co/what-is/elk-stack
[41] Dynatrace APM documentation: https://www.dynatrace.com/support/help/
[42] Gartner Magic Quadrant for APM: https://www.dynatrace.com/gartner-magic-quadrant-apm/
[43] New Relic APM documentation: https://docs.newrelic.com/docs/apm/
Augmented Reality-based 3D Build and Assembly Instructions
App for Cardboard Model
Muniswamy A, School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India. 201910100012@presidencyuniversity.in
Niranjan G, School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India. 201910101587@presidencyuniversity.in
Nikhil U Shet, School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India. 201910100042@presidencyuniversity.in
In 2017, Gabriel Evans, Jack Miller, Mariangely Iglesias Pena, Anastacia MacAllister, and Eliot Winer created a prototype using the Unity game engine for AR HMDs such as the Microsoft HoloLens. The application included features like a user interface, interactive 3D assembly instructions, and the ability to place content in a spatially registered manner. The study demonstrated that although the HoloLens shows potential, areas such as tracking accuracy still need improvement before it can be used in an assembly setting in a factory.

T. Haritos and N. D. Macchiarella developed a mobile application in 2005 to train Aircraft Maintenance Technicians (AMTs). The app was designed to help AMTs with task training and job tasks. When they analyzed the outcomes, they discovered that technicians' training and retention costs were reduced. The app eliminated the need to retrieve information from maintenance manuals for inspection and repair procedures, which could otherwise require leaving the aircraft.

In 2017, Jonas Blattgerste, Benjamin Strenge, Patrick Renner, Thies Pfeiffer, and Kai Essig conducted a study comparing traditional paper-based assembly instructions with augmented reality (AR) instructions for manual assembly tasks. The results of the study showed that AR instructions could greatly enhance performance in manual assembly tasks when compared to traditional paper-based instructions. Participants who used AR instructions completed the task more quickly and with fewer errors than those who used paper-based instructions. Additionally, the study revealed that participants found AR instructions to be more user-friendly and helpful than paper-based instructions.

Our AR application, ModelAR, is designed for widespread accessibility, as it does not require expensive AR glasses and can be downloaded on any Android smartphone. While certain smartphones may have better AR capabilities than others, our application offers a convenient and cost-effective way for anyone to experience AR technology without the need for specialized hardware such as AR glasses. So, we used the Unity game engine to develop our application. Each cardboard model comes with a marker card that must be scanned to load and superimpose a 3D version of that cardboard model on top of it. To achieve this, we used the Vuforia SDK in Unity, which handles image recognition, tracking, and model superimposition over tracked images. In addition, Blender 3D software was used for the 3D modeling of a physical cardboard box. The user interface of ModelAR includes forward, backward, and loop buttons to traverse the assembly steps. Also, users can scale up or down and rotate the superimposed model on the marker at their convenience using slider buttons.

Process of the proposed method:
A. Model Design
The initial phase in model design entails obtaining precise measurements of the physical cardboard prototype, particularly for a packaging box. Subsequently, a 3D rendition is crafted using software specifically designed for this purpose. In our study, Blender was the preferred tool due to its user-friendly interface, its ability to deliver precise modeling outcomes, and its being open source.

C. Generating AR Marker
Our research employed fiducial (AR) markers to accurately position our model within a given space. By utilizing these markers, we were able to establish a reliable and robust tracking system that facilitated the seamless integration of our model into a real-world environment. This was implemented through the use of the Python programming language, wherein we leveraged an algorithm to generate AR markers that were easily recognizable by our application. The resultant system demonstrated superior accuracy and precision, thereby underscoring the efficacy of our approach in the context of AR applications.
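The paper does not reproduce the marker-generation algorithm itself; as a hedged illustration of the underlying idea, a fiducial-style marker can be produced in Python by encoding an integer ID into a binary grid surrounded by a solid black border (production systems such as ArUco add error-correcting bit layouts on top of this):

```python
def make_marker(marker_id: int, grid: int = 4) -> list:
    """Encode marker_id into a (grid+2) x (grid+2) binary image:
    1 = black cell, 0 = white cell, with a solid black border ring
    that lets the tracker find and orient the marker."""
    size = grid + 2
    marker = [[1] * size for _ in range(size)]  # all black: the border ring
    for i in range(grid * grid):                # interior cells from the ID's bits
        row, col = divmod(i, grid)
        marker[row + 1][col + 1] = (marker_id >> i) & 1
    return marker

def render(marker) -> str:
    """ASCII preview: '##' for a black cell, '..' for a white cell."""
    return "\n".join("".join("##" if cell else ".." for cell in row)
                     for row in marker)

print(render(make_marker(0b1010011001011001)))  # arbitrary 16-bit ID, 4x4 interior
```

In practice the rendered grid would be exported as an image and registered with the tracking SDK as a target.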
Fig. 3.2.1. An image of an animation controller for assembly steps.

D. Scripting
In the context of scripting in Unity for an AR app, it is imperative to incorporate user interface (UI) elements such as buttons that enable seamless navigation through various animation states. This can be achieved by creating an animation controller
with an integer parameter that corresponds to specific animation states. To achieve this, a script is written that binds the UI buttons to their respective functions. The script includes the logic that increments or decrements the integer parameter based on the button pressed, which in turn plays the linked animation state. Additionally, the script contains a function that enables the playing of the current animation state.

IV. Implementation
For implementing our ModelAR, a minimum API level of 10 was considered for Android devices when determining the necessary specifications for the augmented reality (AR) system. To facilitate model tracking and recognition, the Vuforia SDK framework was utilized, and the resulting packages were subsequently integrated into the Unity Game Engine, which offers a built-in XR foundation for cross-platform compatibility with Android devices. The 3D virtual model, complete with assembly instructions, was constructed using Blender 3D software before being imported into Unity. User interface (UI) buttons were employed to enable users to navigate through the model. Programming in C# was employed for both Unity software development and Android app development.

V. Results Achieved
✓ Generated a marker that is trackable with 60% accuracy.
✓ High-precision 3D modelling and assembly step animation.
✓ Able to track and superimpose the virtual model on the AR marker.
✓ Created UI buttons to navigate the model.

Conclusion and Future Work
From this research work, we developed an augmented reality-based assembly instructions application for cardboard models called ModelAR. We designed a 3D model of a physical packaging box in Blender software and used the Unity game engine for AR development and deployment onto Android platforms with a minimum API level of 10. Furthermore, we developed an algorithm to generate AR markers in Python, for our app to track and superimpose a 3D model of a packaging box over them, which was made possible by employing the Vuforia SDK. The results from Vuforia's rating system showed that our AR marker could be tracked with 60% accuracy. 3D modelling and assembly instruction steps were built in Blender 3D software. Both the virtual model and the AR marker were imported into Unity for development. UI buttons are used to traverse the model, with forward and backward steps to assemble the model. Scripts are written in Unity, in C#, for virtual button events to traverse between the assembly steps of a cardboard model.

Future Work: Our ModelAR version 2.0 can provide improved features by adding a slider button to scale up or down and rotate the model for the user's convenience, and improved UI quality for the navigation buttons. This ModelAR app can also be built for iOS devices, smart glasses, and tablets.

References
[1] M. Dalle Mura and G. Dini, Department of Civil and Industrial Engineering, University of Pisa, 56122 Pisa, Italy, "Augmented Reality in Assembly Systems: State of the Art and Future Perspectives". Published at Springer.
[3] Jeff K.T. Tang, Tin-Yung Au Duong, Yui-Wang Ng, Hoi-Kit Luk, Hong Kong – "Learning to Create 3D Models via an Augmented Reality Smartphone Interface". Published at ResearchGate.
[4] Jonas Blattgerste, Benjamin Strenge, Patrick Renner, Thies Pfeiffer, Kai Essig – "Comparing
Conventional and Augmented Reality Instructions for Manual Assembly Tasks". Published at PETRA '17: Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments.
[5] Gabriel Evans, Jack Miller, Mariangely Iglesias Pena, Anastacia MacAllister, and Eliot Winer, "Evaluating the Microsoft HoloLens through an augmented reality assembly application", Proc. SPIE 10197, Degraded Environments: Sensing, Processing, and Display 2017, 101970V (5 May 2017).
[6] T. Haritos and N. D. Macchiarella, "A mobile application of augmented reality for aerospace maintenance training," 24th Digital Avionics Systems Conference, Washington, DC, USA, 2005, pp. 5.B.3-5.1, doi: 10.1109/DASC.2005.1563376.
[7] Kolla, Sri Sudha Vijay Keshav and Sanchez, Andre and Plapper, Peter, Comparing Effectiveness of Paper Based and Augmented Reality Instructions for Manual Assembly and Training Tasks (June 4, 2021). Proceedings of the Conference on Learning Factories (CLF) 2021, Available at SSRN: http://dx.doi.org/10.2139/ssrn.38599
[8] Gabriella Sosa, Faculty of Computing, Blekinge Institute of Technology, Sweden, "Enhance user experience when displaying 3D models and animations on mobile platforms: an augmented reality approach".
[9] Wei Yan, Texas A&M University, College Station, Texas 77843, USA. Published at Springer – "Augmented reality instructions for construction toys enabled by accurate model registration and realistic object/hand occlusions".
[10] Dieter Schmalstieg, Tobias Höllerer, "Augmented Reality – Principles and Practice (usability)".
My City Info: A Comprehensive Guide To Finding What You Need

Mr. Jerrin Joe Francis, Asst. Prof., Computer Science Engineering, Presidency University, Bengaluru, India. jerrin.francis@presidencyuniversity.in
Sanika Saigaonker, Computer Science Engineering, Presidency University, Bengaluru, India. 201910100399@presidencyuniversity.in
Ramya C, Information Science Engineering, Presidency University, Bengaluru, India. 201910101579@presidencyuniversity.in
Abstract— Visiting a city without an acquaintance or a proper plan can be a daunting task. The effort put into planning such visits also takes a tedious amount of time, as there is a lot of information on the internet. The proposed system helps the user reduce these efforts and make an efficient plan to visit any place. In the era of the internet, multiple websites help with booking a room and suggesting a place; i.e., the user must go through various websites before making an informed decision about the place. The project focuses on making this process easier for the user. The website will be a one-stop destination for users to gain complete knowledge of the city. The website will have features that will help them look for a place to stay and suggest places to visit. With the help of the review system, the user can easily decide if they want to visit a particular place or not. The users will also be informed about the weather and presented with various facts about the city, which will help them understand the place even better. They will also be provided emergency assistance, which aids in times of distress.

Keywords—internet, destination, distress, knowledge, efforts.

XXXI. INTRODUCTION

Although technology is advancing quickly, people's lives are becoming considerably more limited. Everyone is caught between their obligations and their employment. People like to take breaks to give themselves the much-needed rest from this cycle. For this same reason, tourism is increasing. Technology has a huge potential to make this procedure easier.

There are several facets to tourism. Planning takes up a lot of people's time in order to have a pleasant experience. People must decide where they will stay and must sift through a lot of information to discover the tourist attraction of their choice. Even with all of this, there may still be missing or irrelevant details. Much time is spent on this procedure. There may still be locations that are known solely to the locals. Locals are far more knowledgeable about the food and activities that are unique to their area.

The article places focus on a website that promises to streamline and improve this procedure. The website incorporates a number of elements, including choosing an appropriate place to stay, proposing tourist attraction areas based on a review system, exploring local food and events, and also giving users access to information and news about the city that will help them discover relevance to the location.
XXXII. RELATED WORKS

In [1], places are recommended to the user based on the user's current GPS location. The system works on category and time-span fields for showing the places.

In [2], the system should find a path that fulfills given criteria, show it on screen, and show names of objects, some short descriptions and photos of them, and possible entrance costs. It should also be able to estimate the time needed to travel from one object to the next and, if possible, advise which bus line or other public means of transport may be used.

In [3], collaborative filtering is used to compare other users' reviews of a place with the user's review. Cosine similarity is used to measure how many similar keywords are present in the descriptions and reviews.

The website also provides information about nearby police stations, NGOs, hospitals, and fire brigades. The website uses GIS to provide an affected place for the user. Admins have access to add or remove places and can also change the details of the information provided on the website. Users can contact the admin in case they know of a new attraction (hotel/restaurant/historic place) which the website does not have. Users can also contact us through the mail ID and mobile number that are available on the website, or can send a message to the admin by providing their name, mail ID, and subject on the website.

The architecture diagram of 'My City Info' is shown in the figure.
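The keyword comparison described in [3] can be sketched as follows — a generic illustration of the technique, not the cited system's code — by reducing each review to a word-count vector and taking the cosine of the angle between the two vectors:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts:
    1.0 for identical keyword profiles, 0.0 for no words in common."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in set(a) & set(b))
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

print(cosine_similarity("great food and friendly staff",
                        "friendly staff and great prices"))
```

A real system would normally drop stop words and weight terms (e.g., TF-IDF) before comparing, but the similarity computation itself is the same.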
XXXV. IMPLEMENTATION

This website typically requires users to log in to access the system's features and services. Here is a brief overview of how a user can log in by entering their name and password:
• The user navigates to the login page of the tourist guide system.
• On the login screen, the user enters their username or email address in the relevant field.
• On the login screen, the user enters their password in the corresponding field.
• The user's username and password are checked by the system to make sure they match an already-existing user account.
• Phone number: Some city information systems may use a phone number as the primary login credential. Users will need to enter their phone number to log in. This method is commonly used for mobile applications that require authentication through SMS verification.
• Address: In some cases, a city information system may use a user's address as the primary login credential. Users will need to enter their street address, city, and state to log in. This method is commonly used for systems that provide hyper-local information, such as property tax or utility billing portals.
• If the username and password combination is correct, the system grants the user access to the system's features and services. If the credentials are incorrect, the user is notified and prompted to try again or recover their account through the password recovery process.

To ensure the security of the user's credentials, the system should use encryption to protect the login credentials during transmission and storage. Additionally, the system should include measures to protect against brute-force attacks, where an attacker attempts to gain access to an account by repeatedly guessing username and password combinations.

It is important for users to keep their login credentials secure and not share them with anyone. Additionally, they should choose a strong password that is not easily guessable and use different passwords for different accounts to prevent unauthorized access.

XXXVI. EXPERIMENTAL RESULT

After executing this website on Windows, a new user signing up for the website and using the website is shown below.

Fig. 2 This is the signup page for a new user of the website.
Fig. 3 This is the login page for the admin.
Fig. 4 This is the login page for the user.

After logging in to the website, you will find the homepage, where you can select cities based on where you want to travel.

The website also shows the climate at the current location of the travelers/users, and the users can suggest places/upcoming events to add by sending the information of that place as a request to the admin. Thus, this website makes it easier for travelers to find what they need in Bangalore city by presenting information on city attractions, nearby places to visit, local events, a news feed, and the weather report in a single website.
REFERENCES
[51] "A Web-Based Tourist Guide System for Promoting Local Tourism" by R. H. Goudar and K. V. Hunagund. This paper presents a web-based tourist guide system that utilizes multimedia and interactive features to promote local tourism.
[52] Jian Meng, Neng Xu, "A Mobile Tourist Guide System Based on Mashup Technology", ISBN 978-1-4244-7618-3/10, ©2010 IEEE.
[53] Xiaoyun Shi, "Tour-Guide: Providing Location-Based Tourist Information on Mobile Phones", ISBN 978-1-4244-7547-6/10, ©2010 IEEE.
[54] A. Suryawanshi, V. C. Patil, G. Dudhane, D. Joshi, and P. Ganapule, "Smart Tourist Guide For Pune City," Journal of Emerging Technologies and Innovative Research, 2018, Accessed: Apr. 04, 2023. [Online]. Available:
[55] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.
46
UNCOVERING INCOME TAX FRAUD: A LOGISTIC REGRESSION APPROACH FOR DETECTION AND PREVENTION

Rafeeda Fatima1, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, Rafeeda28@gmail.com
Fancy Angeline U2, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, angelfancy890@gmail.com
Santosh Kumar3, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, kumarsantosh25056@gmail.com
Ravuludiki Hire Matam Prathibha4, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, prathibharhm1@gmail.com
Shivam Narayan5, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, sshivam6495@gmail.com
Dr. P Sudha6, (Assistant Professor - SG) Department of Computer Science and Engineering, Presidency University, Bangalore, India, sudha.p@presidencyuniversity.in
Abstract— The compulsory tax levied by the government on individuals and businesses based on their income is known as income tax. Tax fraud involves the intentional manipulation of information on a tax return to reduce tax liability. Our project focuses on developing a machine learning model to identify income tax fraud by analyzing taxpayers' financial data. Six machine learning algorithms, namely Logistic Regression, Decision Tree, Random Forest, Naive Bayes, k-Nearest Neighbors, and Feed-Forward Neural Network, were compared, and logistic regression was found to be the most effective in detecting tax fraud. Compared to existing methods, the proposed model captures both linear and non-linear relationships among variables, making it more accurate in detecting complex patterns. The model was developed by training it on an OpenML dataset and evaluated on a test dataset. The research aim is to develop a model that can accurately detect tax fraud, and the objectives include comparing the effectiveness of various machine learning algorithms, identifying significant factors contributing to tax fraud, and providing insights for policymakers. The proposed model has significant potential in detecting tax fraud, which can reduce revenue losses and promote fairness in the tax system while remaining an affordable solution.

Keywords— Income tax fraud detection, Logistic Regression, Decision Tree, Random Forest, Naive Bayes, k-Nearest Neighbors, Feed-Forward Neural Network.

XXXVIII. INTRODUCTION

Income tax is crucial for the functioning of our society, as it provides countries with the necessary revenue to make vital investments in infrastructure, health, and education. However, despite its importance, many people are averse to paying taxes, making governments lose millions of dollars every year. There are various strategies to evade taxes, such as underreporting income, which reduces the tax liability. Criminals who commit fraud are becoming increasingly sophisticated in their methods, making it difficult to identify them. In many cases, they try to blend in with their environment, much like military units that use camouflage or chameleons that use their coloring to hide from predators. These tactics are not random, but rather carefully planned and executed. As a result, new techniques are needed to detect and address patterns that appear to be normal but are actually part of fraudulent activities. Tax authorities are given the task of finding these fraudsters and usually rely on experts' intuition. Random auditing is a way of discouraging tax fraud. Unfortunately, this approach is not cost-effective, and auditing some types of taxes can take up to six months or even a year, which puts a significant burden on the already overloaded tax auditors. Traditional methods, such as manual audits or statistical analysis, are time-consuming, expensive, and often ineffective. Therefore, the use of artificial intelligence or machine learning techniques has gained popularity in recent years, as they can analyze large datasets and detect patterns that humans may miss.

Machine learning (ML) is a branch of artificial intelligence (AI) that utilizes statistical models and algorithms to enable computer systems to learn from data and improve their performance on specific tasks. Essentially, the algorithms used in machine learning enable computers to learn and make decisions based on patterns and trends discovered in large datasets. There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves providing the machine with labeled data, which it can then use to make predictions or classifications based on that data. Unsupervised learning, on the other hand, involves feeding the machine with unlabeled data, allowing it to identify patterns and relationships on its own. Reinforcement learning is a type of machine learning where the machine learns by taking actions in an environment and receiving feedback in the form of rewards or penalties. Machine learning has proven to be a powerful tool for a wide range of applications, including healthcare, finance, marketing, and cybersecurity. Its ability to learn from data and adapt to new circumstances without being explicitly programmed has made it an indispensable tool in modern data analysis.

Fig-1 Classification of machine learning

In this article, we propose a comprehensive method for detecting and preventing income tax fraud by utilizing multiple machine learning algorithms, namely decision tree, random forest, naive Bayes, k-nearest neighbors, feed-forward neural network, and logistic regression. To accurately identify fraudulent tax returns, we have trained and tested these models on the OpenML Income Dataset. Our approach emphasizes the use of behavioral and demographic factors in logistic regression, and our results demonstrate its effectiveness in improving fraud detection and prevention. This article presents a unique solution to address the issue of income tax fraud.

XXXIX. LITERATURE REVIEW

The article uses neural networks for fraud detection, which is a popular and effective technique in machine learning. It provides a case study of income tax fraud detection and claims to achieve a high level of accuracy. However, the problem statement provides limited data, and the proposed method relies on accessing sensitive personal data, raising concerns about data privacy [1]. Neural networks are a powerful machine learning tool that can identify patterns in large and complex data sets, achieving 95%+ accuracy. However, it is difficult to identify tax evasion due to lack of transparency, potential false positives, and lack of information [2]. Data mining techniques can be used to automate the process of analysing large amounts of data to identify high-risk taxpayers. This is a time-saving and efficient approach compared to manual analysis and can lead to improved accuracy and flexibility. Additionally, the techniques can be applied to a variety of data sources, such as financial transactions, tax returns, and other sources of information, and can be scaled to large datasets. But data quality, algorithm selection, model interpretation, and privacy concerns all affect the accuracy of data mining techniques for tax fraud detection [3]. The Improved Particle Swarm Optimization Algorithm has been applied to better detect tax evasion, with an accuracy rate of 95%. It is time- and cost-effective but must be validated on a larger dataset to ensure it is robust and accurate [4].

The proposed method uses a clustering technique to identify groups of taxpayers who have similar income profiles and then identifies those who are reporting significantly lower incomes. It relies heavily on a specific set of features and requires manual investigation to confirm whether tax fraud has occurred [5]. Milos Savić developed a novel method for detecting tax evasion risks called Hybrid Unsupervised Outlier Detection. This approach combines the strengths of both unsupervised and supervised techniques to enhance the accuracy of the detection process. Although the method can identify tax evasion in a particular case involving a grocery shop owner, its applicability is limited, and it fails to address the ethical and legal issues associated with detecting tax evasion. Therefore, it is essential to test its reliability and effectiveness by applying it to different datasets and scenarios [6]. González and Velásquez's article "Characterization and detection of taxpayers with false invoices using data mining techniques" focuses on identifying and characterizing taxpayers who use false invoices to evade taxes in Colombia. The authors used decision trees, neural networks, and logistic regression to identify patterns in tax data that can be used to identify fraudulent behaviour. The results of the study have important implications for tax authorities and policymakers seeking to improve tax compliance and reduce tax evasion [7]. This paper uses a neural network to detect credit card fraud. Neural networks are difficult to understand and require a lot of information to train, making them less effective on smaller datasets. Additionally, they are expensive to train and deploy and do not address issues related to data privacy and security. Appropriate measures need to be taken to ensure data is protected and used ethically [8].

AI and ML algorithms are useful tools for detecting fraudulent tax returns during income tax audits. In Taiwan, examples of successful use of these algorithms for both profit-seeking enterprise income tax and individual income tax have been demonstrated in this study. This research provides valuable insights into the factors contributing to tax fraud, which can aid in the development of effective tax policies and regulations. However, it is important to note that the findings are specific to Taiwan's tax system and may not be applicable to India's tax system. Additionally, further research is necessary to address how the system can handle missing or unreliable data [9]. The book covers various techniques for fraud detection, including descriptive, predictive, and social network analysis. It provides practical examples and case studies of fraud detection in various industries and emphasizes the use of data mining tools for fraud detection. It provides a general approach to fraud detection, with limited focus on tax fraud, lack of emphasis on regulatory compliance, and dependence on data
availability. The effectiveness of the techniques may depend on the availability and quality of data, which may be a limitation in the context of the given problem statement [10]. The research article provides a comparative analysis of supervised and unsupervised neural networks. It uses a large sample size of 1,700 Korean firms over a 10-year period and uses various financial ratios and non-financial factors as input variables. The study found that both supervised and unsupervised neural networks can effectively predict bankruptcy, which highlights the usefulness of machine learning techniques in financial analysis. Legal and ethical considerations should be taken into account when using such a system [11].

The main challenge in detecting financial fraud is the use of repeated and unlawful techniques. To tackle this problem, researchers analysed 32 documents discussing the growth of neural network algorithms for fraud detection from 2015 to 2020. The study focused on deep neural network algorithms (DNN), convolutional neural networks (CNN), and neural networks with SMOTE, as well as other ANN complementing methodologies. The experiments aimed to identify credit card fraud and facilitate online transactions. The comparative analysis revealed that the convolutional ANN based on functional sequencing, the ANN with Gradient Boosting Decision Tree (XGBoost), and the ANN with automatic ontology learning all met the requirements for theoretical background, mathematical development, experimental study, and accuracy of the results. However, future research should consider the time, cost, and data characterization required for neural network training, as these factors significantly impact algorithm effectiveness [12]. Neural networks are an affordable and straightforward way to simplify analysis by avoiding the need to consider many statistical assumptions, such as matrix homogeneity, normality, and data processing. These models can automatically adjust connection weights and are fault-tolerant. They can also include all accessible variables in model estimation and enable quick revisions. A study found that the Multilayer Perceptron is effective for identifying fraudulent taxpayers and determining the likelihood of tax evasion, with an efficacy of 84.3%. The ROC curve-based sensitivity analysis demonstrated the model's excellent ability to distinguish between fraudulent and non-fraudulent taxpayers. The Multilayer Perceptron network appears to be a highly effective way to classify taxpayers, and this study's results offer opportunities for improving tax fraud detection by predicting fraud tendencies through sensitivity analysis. It would be interesting to explore the use of this concept in other taxes in the future [13].

XL. PROPOSED METHOD

A. Logistic Regression (LR)

Logistic regression is a popular approach in machine learning that is used to solve binary classification problems. To determine the likelihood of a specific outcome, the logistic function is utilized to express the relationship between the input features and the output variable.

The logistic function can be expressed as follows:

P(y = 1 | x) = 1 / (1 + e^-(b0 + b1·x1 + b2·x2 + … + bn·xn))

Where:
• x1, x2, …, xn are the input features.
• b0, b1, b2, …, bn are the coefficients of the input features.
• e is the base of the natural logarithm.

An optimization procedure, such as gradient descent or Newton-Raphson, is applied to calculate the coefficients. Once the coefficients are determined, the logistic function can be used to predict the probability of the outcome for new observations.

Fig-2 Logistic curve

The threshold for identifying observations as belonging to one of the binary outcomes can be set to 0.5 (i.e., if P(y = 1 | x) is larger than 0.5, the observation belongs to the positive outcome; otherwise, it belongs to the negative outcome).

B. Decision Tree (DT)

The DT algorithm is a popular technique used in machine learning for classification and regression tasks. It represents each internal node as a feature, each branch as a decision based on that feature, and each leaf node as a class or value. The approach recursively divides the data into subsets based on the most informative features until it meets a stopping requirement, such as maximum depth or minimum number of samples per leaf.
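As a minimal illustration of the logistic model in Section A (not code from the paper; the coefficients below are made up), the probability and the 0.5 decision threshold can be computed as:

```python
import math


def sigmoid(z):
    # Logistic function: maps any real z to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, b0, b):
    # P(y=1|x) = sigmoid(b0 + b1*x1 + ... + bn*xn)
    z = b0 + sum(bi * xi for bi, xi in zip(b, x))
    return sigmoid(z)


def predict(x, b0, b, threshold=0.5):
    # Label the return as fraudulent (1) when the probability exceeds the threshold.
    return 1 if predict_proba(x, b0, b) > threshold else 0
```

In practice the coefficients b0, …, bn would be fitted by gradient descent or Newton-Raphson, as described above, rather than chosen by hand.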
The prediction of a decision tree can be written as:

f(x) = Σ_i y_i · I(x_i ∈ R_i)

Where:
• f(x) denotes the predicted output for a new input x.

C. Random Forest (RF)

d) To make a prediction for a new data point x, the algorithm passes it through all T trees and calculates the average prediction as:

Y(x) = (1/T) · Σ_{t=1}^{T} w_t · h_t(x)
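The averaging rule above can be sketched with the trees represented as plain callables standing in for the trained decision trees h_t (an illustration, not the paper's implementation):

```python
def forest_predict(x, trees, weights=None):
    # Y(x) = (1/T) * sum_t w_t * h_t(x): pass x through every tree and average.
    weights = weights or [1.0] * len(trees)
    total = sum(w * tree(x) for w, tree in zip(weights, trees))
    return total / len(trees)
```

With all weights equal to 1, this reduces to the ordinary unweighted average of the individual tree predictions.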
P(x1, x2, …, xn) = Σ P(y) · P(x1 | y) · P(x2 | y) · … · P(xn | y) for all classes y.

E. k-Nearest Neighbors (k-NN)

The k-NN algorithm is a supervised learning method that can be used for both classification and regression tasks. The main objective of this algorithm is to predict the label or value of a test data point by identifying the k data points in the training set that are closest to it.
To determine the distance between two data points, various metrics like the Manhattan distance, cosine similarity, or Euclidean distance can be employed. Based on the values or labels of the k nearest neighbors, the algorithm can then predict the label or value of the test data point.

The equation for the k-NN algorithm can be expressed as follows:

For classification:
• Let D represent the training dataset.
• Let x be the test data point.
• The number of neighbors to take into consideration is k.
• The distance metric between the test point x and any point y in the dataset D is dist(x, y).
• Let neighbors(x) be the set of k nearest neighbors to x in D.
• Let class(y) be the class label of y.

The following is the predicted class label for x:

predicted_class(x) = argmax(class(y)) for y in neighbors(x)

In this equation, argmax returns the class label that occurs most frequently among the k nearest neighbors.

For regression:
• Let D be the training dataset.
• Let x be the test data point.
• The number of neighbors to take into consideration is k.
• The distance metric between the test point x and any point y in the dataset D is dist(x, y).
• Let neighbors(x) be the set of k nearest neighbors to x in D.
• Let value(y) be the value associated with point y.

F. Feed-Forward Neural Network (FFNN)

A defining characteristic of the FFNN is the absence of feedback loops during the propagation of input across the network until it reaches the output layer. The FFNN is composed of basic units called neurons or perceptrons, which linearly transform input signals, apply an activation function, and transmit the output to the next layer. The layers of neurons in the FFNN are connected to the preceding and succeeding layers.
The first layer of the FFNN is called the input layer, which receives the input data and sends it to the first hidden layer. Each hidden layer receives inputs from the previous layer, performs a linear transformation, applies an activation function, and sends the output to the next layer. Finally, the output layer of the network generates the final output of the FFNN.

The equation for the output of a single neuron in an FFNN is as follows:

y = f(Σ_{i=1}^{n} w_i x_i + b)

Where:
• x_i is the input to the neuron from the previous layer or the input layer.
• w_i is the weight of the connection between the input x_i and the neuron.
• b is the bias term, which is added to shift the output of the neuron.
• Σ_{i=1}^{n} w_i x_i + b represents a weighted sum of the inputs and bias.
• f is the activation function, which introduces non-linearity into the output of the neuron.

In a feed-forward neural network, the activation function f is used to introduce non-linearity. The most commonly used activation functions are the sigmoid, ReLU (Rectified Linear Unit), and softmax functions. The selection of the activation function depends on the problem and the desired output.
The output of the FFNN can be computed by combining the equations for each neuron in the network. During the training phase, the weights and biases of the neurons are learned by applying an optimization algorithm such as backpropagation.
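The single-neuron equation and the layer structure described above can be sketched as follows (a minimal pure-Python illustration using ReLU as the activation f):

```python
def relu(z):
    # ReLU activation: passes positive values through, clamps negatives to zero.
    return max(0.0, z)


def neuron(x, w, b, f=relu):
    # y = f(sum_i w_i * x_i + b): weighted sum, bias shift, then activation.
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)


def layer(x, W, bs, f=relu):
    # A layer is one neuron per output unit, all fed the same input vector x.
    return [neuron(x, w, b, f) for w, b in zip(W, bs)]
```

Stacking several such layers, with each layer's output serving as the next layer's input, yields the full forward pass of the FFNN.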
XLI. METHODOLOGY

6. Evaluation: The trained models were evaluated using metrics like accuracy, precision, recall, and F1 score to measure how well each model was performing in detecting income tax fraud.
7. Visualization: Visualizations were created to compare the performance of the different algorithms. Bar plots were used to compare the accuracy scores of the models, ROC curves to see which model performs better by its area under the curve (AUC), and precision-recall curves to compare precision and recall.
8. Model Comparison: The performance of each algorithm was compared to determine which algorithm performs the best.
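The evaluation metrics listed in the steps above can be computed directly from a confusion matrix; this is a generic sketch, not the paper's code:

```python
def confusion(y_true, y_pred):
    # Counts for the positive class (1 = fraudulent, 0 = legitimate).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Computing this dictionary for each of the six models gives the numbers behind the bar plots and the model-comparison step.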
Fig-6 ROC curve
The performance of the models is consistent across folds, with
the logistic regression model achieving the highest mean
cross-validation score. In conclusion, our project provides a
framework for building and comparing various machine
learning models for detecting income tax fraud. The logistic
regression model with optimized hyperparameters was
found to be the best-performing model.
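The cross-validation referred to above can be sketched as a plain k-fold split; `model_score` here is a placeholder callable that trains on the train indices and scores on the test indices:

```python
def kfold_indices(n, k):
    # Split range(n) into k contiguous folds; each fold serves once as the test set.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return [(sorted(set(range(n)) - set(test)), test) for test in folds]


def cross_val_mean(model_score, n, k=5):
    # Average the per-fold score of a model over all k train/test splits.
    splits = kfold_indices(n, k)
    return sum(model_score(train, test) for train, test in splits) / k
```

In a real pipeline the folds would typically be shuffled (and stratified by the fraud label) before splitting, and `model_score` would fit the classifier on the training fold and return its test-fold accuracy.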
REFERENCES
Online Bus Booking System With User Authentication and Authorization
Abstract—The use of buses for traveling across the country is increasing day by day, so the traditional method of booking tickets leads to long waiting hours. So, an easier method is proposed in this paper by creating an online bus booking system using the MERN stack technology, which includes MongoDB, Express, React, and Node.js, with user authentication and authorization features. The system provides a user-friendly interface for customers to search for available bus routes, view schedules, and book their tickets securely. The user authentication and authorization features ensure that only authorized users can access the system's functionalities, providing an added layer of security. The system's architecture was designed with a focus on scalability, allowing it to handle a high volume of users while maintaining its performance. This system is suitable for implementation by bus companies to improve their customer experience and streamline their operations while ensuring data privacy and security.

Keywords—Booking System, User Authentication, Authorization.

Customers can securely book their tickets using this system's user-friendly interface. To further assure data security and privacy, the system includes user login and authorization mechanisms. The system can support large numbers of users while retaining performance because of the scalability built into its architecture. This study offers a useful technique for bus operators to enhance customer satisfaction and streamline their operations. Online bus reservation systems are becoming more and more widespread, raising serious questions about data security and customer privacy. This research study focuses on the creation of a bus reservation website with a strong emphasis on user identification and authorization to allay these worries. The MERN stack and other cutting-edge web technologies were used in the design of the website to give users a simple and safe booking experience. To ensure that only authorized users can use the system's capabilities and that their data is safeguarded from unauthorized access, the system uses user authentication and authorization mechanisms. The technical components of website creation, including the usage of secure authentication protocols and the application of access control policies, will be covered in detail in this article.
The client would then display the confirmation to the user after receiving a confirmation from the server, which would reserve the chosen seats in the MongoDB database. After that, the user would log off.

• The JWT token is sent in the request header for each future request that the client submits to the server. The JWT token is checked by the server, and if it is found to be valid, the request is approved.
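The system itself is built on the MERN stack; purely as a language-neutral illustration of the token flow described above, the following Python sketch implements HS256 JWT signing and verification with only the standard library (a production Node.js system would use a maintained library such as jsonwebtoken):

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    # JWTs use unpadded base64url encoding for each segment.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_jwt(payload, secret):
    # Token layout: header.payload.signature, signed with HMAC-SHA256.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"


def verify_jwt(token, secret):
    # Recompute the signature over header.payload and compare in constant time.
    try:
        header, body, sig = token.split(".")
    except ValueError:
        return False
    expected = b64url(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

Because only the server knows the signing secret, any tampering with the header or payload invalidates the signature and the request is rejected, which is exactly the check described in the bullet above.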
MONITORING WATER QUALITY SYSTEM USING SMART SENSORS
ABSTRACT: In recent times, Internet of Things (IoT) and Remote Sensing (RS) techniques have been utilized for tracking, gathering, and evaluating records from remote locations. Because of the significant growth in worldwide industrial output, rural-to-urban migration, and the over-utilization of land and sea resources, the quality of water available to human beings has deteriorated substantially. The excessive use of fertilizers on farms and of other chemicals in sectors including mining and construction has contributed immensely to the decline of water quality globally. Water is a vital need for human survival, and consequently there need to be mechanisms put in place to vigorously check the quality of water that is made available for consumption in town and city supplies, as well as in the rivers, creeks, and coastline that surround our towns and cities. The supply of good-quality water is paramount in preventing outbreaks of water-borne diseases as well as in enhancing the quality of life. The development of a surface water tracking network is a crucial element in the assessment and protection of water quality. In this paper, we developed a prototype system for measuring different parameters of water bodies such as rivers, lakes, etc. The proposed smart water quality tracking system consists of smart sensors and a Wi-Fi module. The Wi-Fi module connects the device to the cloud and to an Android application. Reading data from the pH sensor and the TDS sensor helps to analyze the pH value of the water and its dissolved minerals. These values help in determining the quality of the water.

Keywords – pH sensor, TDS sensor, ESP32 module, piezo buzzer, LED display, jumper wires.

INTRODUCTION

There were numerous inventions in the twenty-first century, but at the same time, pollution, global warming, and other issues arose, and as a result, there is no safe drinking water for much of the world's population. Real-time water quality monitoring currently faces challenges due to global warming, limited water resources, a growing population, and so on. As a result, better methodologies for monitoring water quality parameters in real time are much needed [1]. The pH level of hydrogen ions is used as a parameter to measure the quality of water. It indicates whether the water is acidic or alkaline. Pure water has a pH of 7; lower than 7 is acidic, and higher than 7 is alkaline. The pH scale runs from 0 to 14. It should be between 6.5 and 8.5 for drinking purposes. Turbidity is a measurement of the large number of invisible suspended particles in the water. The lower the turbidity, the lower the threat of diarrhea and cholera. When the turbidity is low, the water is clean. The temperature sensor determines whether the water is hot or cold. A flow sensor is a device that measures the flow of water. Traditional water quality monitoring methods involve the manual collection of water samples from local places. Monitoring water quality in an efficient manner plays a vital role in every part of the earth.

LITERATURE SURVEY

Water quality monitoring is a vital aspect of ensuring the safety and health of communities and ecosystems [2]. Many studies have been conducted to evaluate the effectiveness of water quality monitoring programs and identify key challenges and requirements. The literature highlights the importance of consistent and accurate sampling and testing methods to ensure reliable data. Standardization of monitoring programs is also crucial for comparing data across different regions and over time [3]. Funding is a significant challenge, particularly for smaller communities and organizations, which may struggle to secure the resources needed to conduct monitoring programs. Emerging contaminants, such as microplastics, pharmaceuticals, and personal care products, are also a growing concern, and monitoring programs must be adaptable to address these issues. Collaboration between various stakeholders, including government agencies, non-governmental organizations, and community members, is necessary for effective water quality monitoring [4].

SYSTEM MODEL

Fig.1: Block Diagram

The sensors that have been used are pH and TDS sensors. The IoT platform used is ThingSpeak, and we also used the Blynk Android application. The sensors need to be configured and connected to the ESP32 microcontroller. This involves setting up the sensors' input and output pins and programming the ESP32 to read data from the sensors. The sensors need to be calibrated to ensure accurate readings. The calibration process involves comparing the sensor readings with known reference values and adjusting the sensor output to match the reference values. The ESP32 microcontroller continuously reads data from the sensors and stores the readings in its memory. The collected data needs to be processed to extract useful information. This involves converting the raw sensor readings into meaningful values such as the pH level and TDS value. The processed data can be transmitted wirelessly to a central server or a cloud-based platform using Wi-Fi or Bluetooth connectivity. The data can also be shown on a local display for real-time monitoring. The collected data can be analyzed using data analytics tools to identify values and patterns. This can help in identifying water quality issues
and taking corrective actions. This helps us to know the quality of the drinking water.
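The calibration step described in the System Model can be illustrated with a simple two-point linear fit; the raw ADC readings and buffer values below are hypothetical:

```python
def two_point_calibration(raw_low, raw_high, ref_low, ref_high):
    # Fit a line mapping raw sensor readings to reference values, e.g.
    # readings taken in standard pH 4.0 and pH 7.0 buffer solutions.
    slope = (ref_high - ref_low) / (raw_high - raw_low)
    offset = ref_low - slope * raw_low
    return lambda raw: slope * raw + offset
```

The returned function is then applied to every raw ADC sample on the ESP32 to convert it into a meaningful pH (or TDS) value before display and upload.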
HARDWARE
A. ESP32 microcontroller
RESULTS

OUTCOMES

The exact outcomes of a water quality monitoring system using IoT technology will depend on the specific goals and objectives of the system, as well as the methodology and technologies used. However, some potential outcomes could include:

• By monitoring key indicators of water quality in real-time, such as pH, temperature, and TDS, water treatment facilities can identify and address issues that may impact the safety and quality of the water supply. This can lead to improved water quality and reduced risk of waterborne illnesses.
• IoT-based water quality monitoring systems can help water treatment facilities improve their operational efficiency by providing real-time data on key indicators of water quality [9]. This can help facilities optimize their treatment processes and reduce waste, leading to cost savings and improved sustainability.
• The insights gained from real-time monitoring and data analysis can inform better decision-making around water treatment processes and resource allocation. This can help water treatment facilities prioritize their efforts and resources, leading to more effective and efficient water management.
• By monitoring water quality in real-time, IoT-based systems can provide early warning of potential contamination events, enabling water treatment facilities to respond quickly and effectively to mitigate the impact of the event [10].

Fig.5: TDS and pH values in Blynk

The TDS (Total Dissolved Solids) value of 240 indicates the amount of dissolved substances in the water, while the pH value of 7 signifies a neutral pH level. This information can help monitor water quality and ensure it is within acceptable parameters.

Fig.6: pH values in ThingSpeak

Fig.7: TDS values in ThingSpeak
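Assuming the standard ThingSpeak HTTP update API is used to push the readings shown in Figs. 6 and 7 (the write API key below is a placeholder, and mapping pH to field1 and TDS to field2 is our own choice), the update request URL can be built as:

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE = "https://api.thingspeak.com/update"


def update_url(api_key, ph, tds):
    # ThingSpeak's update endpoint takes the channel's write API key plus
    # numbered field parameters; here field1 carries pH and field2 carries TDS.
    query = urlencode({"api_key": api_key, "field1": ph, "field2": tds})
    return f"{THINGSPEAK_UPDATE}?{query}"
```

The gateway or device code can then fetch this URL (e.g. with `urllib.request.urlopen`) each time a new calibrated reading is available.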
CONCLUSION

Implementing water quality monitoring using IoT can be a very useful and efficient way to continuously monitor and improve the quality of water in various settings.

With the use of IoT sensors and devices, water quality parameters such as pH, temperature, dissolved oxygen, turbidity, and conductivity can be easily measured and transmitted to a central system for analysis and interpretation. This can help detect any abnormalities or changes in water quality, which can then be addressed promptly to ensure the safety and health of both humans and aquatic life.

Furthermore, the use of IoT in water quality monitoring can also lead to cost savings and increased efficiency, as it eliminates the need for manual measurements and reduces the risk of errors or discrepancies. Real-time data can be accessed and analyzed remotely, allowing for faster decision-making and timely interventions.

Water quality monitoring using IoT can play a significant role in ensuring the sustainability of our water resources and protecting the environment. It is a promising technology that can benefit various sectors such as agriculture, industry, and public health.

REFERENCES

[1] R. S. Pandey and N. N. Pandey, Water Quality Monitoring and Management: Basis, Technology and Case Studies, Springer Nature, Feb. 2020.
[2] A. Prasad and P. Singh, Monitoring and Modeling of Global Environmental Change, Springer Nature, Nov. 2019.
[3] A. M. Abdel-Shafy and S. M. Mansour, Water Quality – Monitoring and Assessment, Elsevier, Oct. 2019.
[4] J. Zhu, Y. Zhang, and H. Liu, Environmental Monitoring and Modeling with Remote Sensing and GIS, Elsevier, Jan. 2019.
[5] R. S. S. Dubey, S. Gupta, A. Tripathi, and P. Pandey, "Wireless sensor network-based smart water quality monitoring system," Wireless Personal Communications, vol. 111, no. 1, pp. 225-242, Jun. 2020.
[6] R. Prasad and N. K. Agrawal, "Smart water quality monitoring system using internet of things and cloud computing," in 2019 International Conference on Electrical, Electronics and Computer Engineering (UPCON), Gorakhpur, Mar. 2019, pp. 1-6.
[7] M. U. Rehman, M. U. Farooq, S. U. Haq, and S. Q. Hasan, "Smart Water Quality Monitoring System using IoT-based Sensors," IEEE Access, vol. 8, pp. 199067-199082, Jan. 2020.
[8] S. K. Kar, S. K. Sahu, and S. S. Padhi, "Smart water quality monitoring system using wireless sensor network and IoT," in 2019 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), Kannur, Feb. 2019, pp. 680-685.
[9] A. R. Abdullah, S. A. Mahdi, and S. A. Aziz, "Smart water quality monitoring system based on internet of things," in 2018 IEEE 14th International Colloquium on Signal Processing & Its Applications (CSPA), Penang, Apr. 2018, pp. 167-171.
[10] N. Zidan, M. Mohammed, and S. Subhi, "An IoT based monitoring and controlling system for water chlorination treatment," Proc. Int. Conf. Future Networks and Distributed Systems, p. 31, Jun. 2018.
[11] R. Ramakala, S. Thayammal, A. Ramprakash, and V. Muneeswaran, "Impact of ICT and IOT Strategies for Water Sustainability: A Case study in Rajapalayam-India," Int. Conf. Computational Intelligence and Computing Research (ICCIC 18), pp. 1-4, Dec. 2017.
Hastenure : Recruitment Management System
The process is initiated when candidates submit applications for open positions on the e-recruitment portal. E-recruitment makes it possible for job seekers to connect with more opportunities and gather more information. During e-recruitment, candidates upload their resumes to the system, which are then reviewed by an organization's HR specialist. The candidate can also access information from the employer regarding the application process via a web-based recruiting portal or the company website.

Fig.1 Block Diagram For Automated Hiring
The cost of hiring employees varies by region and industry, but the actual cost can be high. Salary is not the only cost of hiring: it also includes recruiter salaries, time and effort, and training and onboarding costs, all of which increase the overall cost of new hires. According to Deloitte, the average cost per hire is $4,000, and finding a direct replacement requires 50-60% of an employee's salary.
V. Recruitment Process
IV. Conclusion
BRAIN TUMOR DETECTION USING DEEP LEARNING
ABSTRACT: A brain tumor is an abnormal growth of cells inside the brain or skull. Some brain tumors are cancerous (malignant), while others are not (non-malignant). A primary tumor grows from the brain tissue itself; a secondary tumor arises when cancer cells from other parts of the body spread to the brain. Studies in developed countries show that many people with brain tumors have died because of inaccurate detection. Generally, a CT scan or MRI directed into the intracranial cavity produces a complete image of the brain, which is visually examined by the physician for the detection and diagnosis of a brain tumor. However, this method of detection resists accurate determination of the type and size of the tumor. In recent times, the introduction of information technology and e-health care systems in the medical field has helped clinical experts provide better health care to patients. This study addresses the segmentation of abnormal brain tissues and of normal tissues such as gray matter, white matter, and cerebrospinal fluid from Magnetic Resonance Imaging (MRI) images. Previously proposed models have high computational time and segment with a low-complexity network. In this paper, a Convolutional Neural Network (CNN) has been used to detect tumors in MRI images. The proposed system detects brain tumors with good computational time.

Keywords – Brain Tumor, Magnetic Resonance Imaging, Convolutional Neural Network, Deep Learning
I. INTRODUCTION

A brain tumor is a type of abnormal growth or mass that occurs in the brain tissue, and it can be benign (non-cancerous) or malignant (cancerous). The early and accurate detection of brain tumors is crucial for timely medical intervention and improved patient outcomes. Conventional methods for brain tumor detection, such as magnetic resonance imaging (MRI) and computed tomography (CT) scans, are widely used, but they often require skilled radiologists for interpretation and may have limitations in terms of accuracy and efficiency.

In recent years, deep learning, a subfield of machine learning, has shown promising results in various medical imaging tasks, including brain tumor detection. Deep learning models, such as convolutional neural networks (CNNs), have demonstrated the ability to automatically learn complex patterns and features from medical images, leading to improved accuracy in detecting brain tumors. These models can analyze large amounts of data and extract relevant features, enabling them to detect brain tumors with high precision and recall.

In this paper, we propose a deep learning-based approach for brain tumor detection using CNNs. We aim to leverage the capabilities of CNNs to automatically detect brain tumors from medical images with high accuracy. We present our methodology, including the architecture of the CNN model, the dataset used for training and evaluation, and the evaluation metrics used for performance assessment. We also discuss the experimental results and analyze the performance of our proposed approach in terms of precision, recall, and other relevant metrics. Finally, we conclude with the potential applications and future directions of deep learning-based brain tumor detection.

II. Literature Survey

• Gumaei et al. introduced an automated approach to assist radiologists and physicians in identifying different types of brain tumors. The study was conducted in three steps: brain image preprocessing, brain feature extraction, and brain tumor classification. In the preprocessing step, brain images were converted into intensity images in the range [0, 1] using a min-max normalization rule. In the next step, the PCA-NGIST method (a combination of the normalized GIST descriptor with PCA) was adopted to extract features from the MRI images.

• Y. Bhanothu, A. Kamalakannan, and G. Rajamanickam: This paper discusses automatic brain tumor detection and classification of MR images using a deep learning algorithm. The Faster R-CNN algorithm was chosen for detecting the tumor regions and classifying them into three categories, namely glioma, meningioma, and pituitary tumor. For the Faster R-CNN implementation, a deep convolutional network architecture called VGG-16 was used as the base network. The proposed algorithm efficiently identifies the brain tumor regions by choosing the optimal bounding box generated by the RPN.

• P. K. Ramtekkar, A. Pandey, and M. K. Pawar: This paper studies various image classification methods, compares them, and concludes that each method has its own advantages and disadvantages. DBN is more accurate than the other methods but time-consuming, whereas SVM, DT, and K-NN are simple to implement but not always accurate. Since the accuracy of the result is what matters most, a deep neural network is preferable over other methods for brain tumor detection and classification.

• Z. Jia and D. Chen: This paper presents a Fully Automatic Heterogeneous Segmentation using Support Vector Machine (FAHS-SVM) for brain tumor identification and segmentation. The accuracy of the automated approach is similar to the inter-observer variability of manual segmentation. Tumor regions are identified by combining intrinsic image structure hierarchy with statistical classification information. The tumor areas obtained are spatially small and consistent with the image content, providing an appropriate and robust guide for the subsequent segmentation.

• K. Venu, P. Natesan, N. Sasipriyaa, and S. Poorani: Segmenting brain tumors automatically for cancer diagnosis is a challenging task. This paper provides a review of state-of-the-art methods based on deep learning. A Convolutional Neural Network can automatically extract complex features from the images; further improvements and modifications in CNN architectures, and the addition of complementary information, can improve the efficiency of segmentation.

• N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, and M. Shoaib: This paper discusses the application of deep learning models to the identification of brain tumors. Two different scenarios were assessed. First, a pre-trained DenseNet201 deep learning model was used, and features were extracted from various DenseNet blocks; these features were then concatenated and passed to a softmax classifier to classify the brain tumor. Second, features from different Inception modules were extracted from the pre-trained InceptionV3 model, concatenated, and then passed to the softmax for the classification of brain tumors.

In order to evaluate the performance of the CNN, other classifiers, such as the RBF classifier and the decision tree classifier, have also been used in the CNN architecture.

III. Proposed Work
Max Pooling Layer (MaxPooling2D): This layer is used
to reduce the spatial dimensions (width and height) of the
feature map, while retaining important features. Max
pooling takes the maximum value from a group of values in
a local region of the feature map, which helps in reducing
the computational complexity and the risk of overfitting.
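As an illustration of this operation, here is a minimal NumPy sketch of 2×2 max pooling. It is a simplified stand-in for Keras's MaxPooling2D, handling only a single-channel feature map with stride equal to the pool size:

```python
import numpy as np

def max_pool_2d(feature_map, pool_size=2):
    """Non-overlapping max pooling on a 2D single-channel feature map."""
    h, w = feature_map.shape
    ph, pw = h // pool_size, w // pool_size
    # Reshape each pooling window into its own axes, then take the max.
    trimmed = feature_map[:ph * pool_size, :pw * pool_size]
    return trimmed.reshape(ph, pool_size, pw, pool_size).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [7, 8, 0, 1],
               [2, 6, 3, 4]])
print(max_pool_2d(fm))  # → [[4 5] [8 4]]
```

Each 2×2 window of the 4×4 input is collapsed to its maximum, halving both spatial dimensions while keeping the strongest activations.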
Dropout Layer (Dropout): This is a regularization technique used in CNNs to prevent overfitting. During training, dropout randomly sets a fraction of input units to 0 at each update, which helps prevent the model from relying too heavily on any particular feature or neuron.

Flatten Layer (Flatten): This layer is used to convert the 3D feature map into a 1D vector, which can be fed into a fully connected (dense) layer for further processing.

Fully Connected Layer (Dense): This is a traditional neural network layer that connects each neuron to every neuron in the previous and subsequent layers. It performs the final classification or regression task based on the extracted features from the convolutional layers.

These are some of the common CNN layers used in deep learning models for image processing tasks.

Fig.2: CNN architecture

The accuracy graphs and loss graphs are shown in Figure 3 and Figure 4.
IV. Implementation Details

Model Definition: A sequential CNN model is defined using Keras as a linear stack of layers. The model consists of multiple convolutional layers (Conv2D) with different filter sizes, ReLU activation functions, and dropout regularization (Dropout) to prevent overfitting. The model also includes max pooling layers (MaxPooling2D) to downsample the feature maps and a flatten layer (Flatten) to convert the 2D feature maps into 1D feature vectors. Finally, fully connected layers (Dense) with ReLU activation are added, followed by an output layer with softmax activation for multi-class classification.

Model Compilation: The model is compiled with a categorical cross-entropy loss function (categorical_crossentropy), the Adam optimizer, and accuracy as the evaluation metric.

Model Training: The model is trained on the training data (X_train and y_train) using the fit function with a specified number of epochs (20) and a validation split of 0.1 (10% of the training data used for validation). The training progress is stored in history for later analysis.

Fig.3: Accuracy graph

Fig.4: Loss graph

Evaluation Parameters

Accuracy: Accuracy is the ratio of correctly predicted samples to the total number of samples. It provides an overall measure of how well the model is performing; higher accuracy indicates better performance.

accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision: Precision is the ratio of correctly predicted positive samples to the total number of predicted positive samples. It measures the accuracy of positive predictions; higher precision indicates fewer false positive predictions.

precision = TP / (TP + FP)
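The accuracy and precision formulas above, together with recall and F1 (defined in the next section), can be collected into one small helper. The counts used below are made-up illustrative numbers, not results from the paper:

```python
def metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for a binary tumor / no-tumor split.
acc, prec, rec, f1 = metrics(tp=90, tn=85, fp=10, fn=15)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

For a multi-class problem such as the four tumor classes, these quantities would be computed per class from the full confusion matrix and then averaged.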
Recall (also known as Sensitivity or True Positive Rate): Recall is the ratio of correctly predicted positive samples to the total number of actual positive samples. It measures the ability of the model to identify positive samples. A higher recall indicates fewer false negative predictions.

recall = TP / (TP + FN)

F1 score: The F1 score is the harmonic mean of precision and recall and provides a balanced measure of model performance. It takes both precision and recall into account and provides a single value that balances the two. A higher F1 score indicates a better balance between precision and recall.

F1 score = 2 * (precision * recall) / (precision + recall)

Confusion Matrix: A confusion matrix is a table that displays the true positive, true negative, false positive, and false negative predictions of a model. It provides a detailed breakdown of model performance, allowing for a deeper analysis of performance in different categories.

V. CONCLUSIONS

The proposed algorithm is performed on the collected dataset, which has four classes: glioma, meningioma, pituitary, and no tumor. We divided the dataset into training and testing sets, and real patient data is used for evaluating the model. A CNN (Convolutional Neural Network) can automatically extract complex features from the images, which makes it very useful for automatic feature selection in medical images. Images collected at the centers were labeled by clinicians, and tumor screenings were then categorized into three classes. A total of 1000 images were selected as training data and 1% of the images were taken as test data. The model gives an accuracy of 95.04%.

REFERENCES

Science, Engineering and Applications (ICCSEA), Gunupur, India, 2020, pp. 1-4, doi: 10.1109/ICCSEA49143.2020.9132874.

[4] Madhupriya, N. M. Guru, S. Praveen, and B. Nivetha, "Brain Tumor Segmentation with Deep Learning Technique," 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2019, pp. 758-763, doi: 10.1109/ICOEI.2019.8862575.

[5] G. Hemanth, M. Janardhan and L. Sujihelen, "Design and Implementing Brain Tumor Detection Using Machine Learning Approach," 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2019, pp. 1289-1294, doi: 10.1109/ICOEI.2019.8862553.

[6] M. Siar and M. Teshnehlab, "Brain Tumor Detection Using Deep Neural Network and Machine Learning Algorithm," 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 2019, pp. 363-368, doi: 10.1109/ICCKE48569.2019.8964846.

[7] Y. Bhanothu, A. Kamalakannan, and G. Rajamanickam, "Detection and Classification of Brain Tumor in MRI Images using Deep Convolutional Network," 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2020, pp. 248-252, doi: 10.1109/ICACCS48705.2020.9074375.

[8] N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, and M. Shoaib, "A Deep Learning Model Based on Concatenation Approach for the Diagnosis of Brain Tumor," IEEE Access, vol. 8, pp. 55135-55144, 2020, doi: 10.1109/ACCESS.2020.2978629.
2020, pp. 1-4, doi:
10.1109/INCET49848.2020.9154030.
[13] Zhe Xiao et al., "A deep learning-based
segmentation method for brain tumor in MR images,"
2016 IEEE 6th International Conference on
Computational Advances in Bio and Medical Sciences
(ICCABS), Atlanta, GA, 2016, pp. 1-6, doi:
10.1109/ICCABS.2016.7802771
AI CHATBOT FOR DIAGNOSING ACUTE DISEASES
METHODS AND TECHNOLOGIES

Decision Tree classifier

A decision tree is a supervised learning technique that can be used for both classification and regression problems, though it is more useful for classification problems. It has a tree structure: the internal nodes represent features of the dataset, and the branches represent the decision rules. In a decision tree there are two types of nodes, decision nodes and leaf nodes.

Scikit-learn

• Preprocessing the data using LabelEncoder and train_test_split.
• Creating a Decision Tree Classifier for the prediction of the disease based on the symptoms.
• Creating a Support Vector Machine Classifier (SVC), which is not used in the code and can be removed.
• Calculating cross-validation scores for the created Decision Tree Classifier.
• Implementing the predict function to predict the disease based on the input symptoms.

Flask Framework

Proposed Method

The AI chatbot is purpose-built for diagnosing acute diseases. It has been trained using standard datasets and a decision tree algorithm. The chatbot is hosted using Flask and HTML, making it accessible as a web chat. Upon initiating a conversation with the user, the chatbot first asks for the user's name and stores it for reference. It then proceeds to inquire about the primary symptom of the patient. Following this, the chatbot asks about related symptoms, to which the user can respond with "Yes/No" answers. Based on the collected symptoms, the chatbot predicts the disease and provides suitable precautions and measures to be followed by the patient. The chatbot's user-friendly interface and interactive conversation flow make it a valuable tool for diagnosing acute diseases and providing relevant guidance to users.

System Architecture
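The scikit-learn steps listed above can be sketched as follows. The symptom vectors and disease labels here are a made-up toy dataset, not the paper's training data, and the unused SVC mentioned in the bullets is intentionally omitted:

```python
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy data: each row marks which of four symptoms are present
# (fever, cough, headache, nausea); labels are hypothetical diseases.
X = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 1, 0],
     [1, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]] * 5
y = ["flu", "flu", "cold", "cold", "migraine", "migraine"] * 5

# Preprocessing: encode the string labels and hold out a test split.
le = LabelEncoder()
y_enc = le.fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(
    X, y_enc, test_size=0.3, random_state=42)

# Train the decision tree and check it with cross-validation.
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
scores = cross_val_score(clf, X_train, y_train, cv=3)

def predict(symptoms):
    """Map a symptom vector back to a disease name."""
    return le.inverse_transform(clf.predict([symptoms]))[0]

print(predict([1, 1, 0, 0]), round(scores.mean(), 2))
```

In the actual chatbot, the symptom vector would be built from the user's "Yes/No" answers before being passed to predict.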
Work Flow Of Chatbot
Result/Outputs:
CONCLUSION

This paper presents a medical chatbot that can diagnose and provide information about a disease before consulting a doctor. The chatbot uses natural language processing techniques and a third-party expert program to handle questions it doesn't understand. The system aims to reduce healthcare costs and improve access to medical knowledge. The authors conducted experiments and obtained promising results, demonstrating the potential of the chatbot in assisting patients with self-diagnosis and symptom checking.

References

on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 2020, pp. 619-622, doi: 10.1109/ICRITO48877.2020.9197833.

A. S, N. R. Rajalakshmi, V. P. P and J. L, "Dynamic NLP Enabled Chatbot for Rural Health Care in India," 2022 Second International Conference on Computer Science, Engineering and Applications (ICCSEA), Gunupur, India, 2022, pp. 1-6, doi: 10.1109/ICCSEA54677.2022.9936389.

E. Amer, A. Hazem, O. Farouk, A. Louca, Y. Mohamed and M. Ashraf, "A Proposed Chatbot Framework for COVID-19," 2021 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 2021, pp. 263-268, doi: 10.1109/MIUCC52538.2021.9447652.
U. Bharti, D. Bajaj, H. Batra, S. Lalit, S. Lalit and
A. Gangwani, "Medbot: Conversational Artificial
Intelligence Powered Chatbot for Delivering Tele-
Health after COVID-19," 2020 5th International
Conference on Communication and Electronics
Systems (ICCES), Coimbatore, India, 2020, pp.
870-875, doi:
10.1109/ICCES48766.2020.9137944.
M. M. Rahman, R. Amin, M. N. Khan Liton and N. Hossain, "Disha: An Implementation of Machine Learning Based Bangla Healthcare Chatbot," 2019 22nd International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, 2019, pp. 1-6, doi: 10.1109/ICCIT48885.2019.9038579.
HOME AUTOMATION USING ALEXA & GOOGLE HOME
3. LITERATURE REVIEW

Review Of Related Literature:

When people think about home automation, most of them may imagine living in a smart home: one remote controller for every household appliance, cooking rice automatically, starting the air conditioner automatically, heating water for the bath automatically, and shading the windows automatically when night comes. To some extent, home automation equals a smart home. Both bring about smart living conditions and make our lives more convenient and fast.

4. PROPOSED METHOD

The goal of this project is to develop a home automation system that gives the user complete control over all remotely controllable aspects of his or her home. Making houses easier, better, or more accessible is the goal of home automation. If you can think of it, it may be possible to automate just about every part of the home. Home automation is the combination of several technologies into a single system.

5. METHODOLOGY

A) To operate the appliances, you request Google Assistant's help.
B) Google Assistant sends the signal to the Sinric server.
C) The ESP-01 receives the signal from the Sinric server.
D) Through serial communication, the ESP-01 sends the same signal to the Arduino. The Arduino UNO then processes that signal and operates the relays accordingly.
E) The Arduino then sends the feedback back to the ESP-01, again through serial communication.
F) The ESP-01 then sends the feedback to the Sinric server once again, so that we can track the real-time feedback in the Google Home and Amazon Alexa apps.

6. ADVANTAGE OF THE PROPOSED SYSTEM

I. Safety
II. Convenience
III. Energy-saving potential
IV. Remote Access
V. Customization

7. FUTURE ENHANCEMENT

The work may be expanded to new heights because IoT has already taken the market and adoption rates are rising quickly. The creation and use of a far more sophisticated system was made possible by the confluence of these two factors.

These technologies limit human involvement, since the majority of activities can be carried out successfully and efficiently only by these sophisticated machines, and real-time statistics save considerable human time.

8. RESULTS ACHIEVED

Home automation makes life more convenient and can even save you money on heating, cooling and electricity bills. Home automation can also lead to greater safety with Internet of Things devices like security cameras and systems.

9. CONCLUSION

As a result, an IoT-based system was developed that makes use of the author's IoT platform to control hardware appliances through Alexa for home automation purposes. The voice-enabled smart board system is extremely responsive in accepting commands and taking the appropriate actions.

Future versions of the system could incorporate other AI principles to improve usability and boost automation. Support for languages other than English may be introduced as a further feature.

In conclusion, our solution offers a method for IoT home security that takes future changes of
the Bluetooth protocol into careful consideration. The Alexa application was integrated into a larger home automation system.

10. REFERENCES

[1] Mr. Sunil S. Khatal, Mr. B. S. Chundhire, Mr. K. S. Kahate, "Survey on Key Aggregation System for Secure Sharing of Cloud Data."

[2] A. Karmen, Crime Victims: An Introduction to Victimology, Cengage Learning, 2012.

[3] https://www.safewise.com/faq/home-automation/home-automation-benefits/

[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8198920/

[5] https://www.security.org/home-automation/

[6] https://www.cornerstoneprotection.com/blog/home-automation-benefits/

[7] https://rcciit.org/students_projects/projects/ece/2018/GR30.pdf
House Price Prediction Using Machine learning
C. Lasso Regression

Lasso regression is a regularization technique. It is used over plain regression methods for more accurate prediction. This model uses shrinkage: shrinkage is where data values are shrunk towards a central point, such as the mean. The lasso procedure encourages simple, sparse models.

IMPLEMENTATION

Here we use a Python Jupyter notebook for the implementation of house price prediction. First we import the libraries. Then we load the Bangalore home prices into a data frame. We do the data cleaning and remove any null values from the data. Then we apply feature engineering: we add a new feature for BHK and a new feature for price per square foot. After applying the algorithms on the data, we found that linear regression is the best algorithm for the data.

D. Figures and Tables

TABLE-I

MODEL | BEST_SCORE | BEST_PARAMS
LINEAR_REGRESSION | -4.478168E+15 | {'normalize': False}
LASSO | 7.508086E-01 | {'alpha': …, 'selection': 'cyclic'}

Here we first import the data, then perform data analysis and feature engineering, and then apply the ML algorithms.

IV. Advantages of the proposed system

Here we intend to base our evaluation on every criterion that is taken into account when establishing the pricing. We choose our best model only after applying multiple machine learning algorithms and keeping the one that gives better accuracy. Data is the heart of machine learning: without data we cannot train our models. Here the data is thoroughly examined, cleaned and preprocessed, since any missing values in the data produce erroneous results.

VI. CONCLUSION

From the historical development of machine learning and its applications in the real estate sector, it can be seen that systems and methodologies have emerged that enable sophisticated data analysis through simple and straightforward use of machine learning algorithms. The suggested approach forecasts the price of real estate in Bangalore based on a number of characteristics. To find the best model, we propose to test a variety of machine learning algorithms.

Flask Integration: We will deploy our machine learning model into a Flask web app. Flask provides tools, libraries, and technologies that allow you to build a web application; this framework is used for integrating the Python models.
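The model-selection step summarized in Table I can be sketched as follows. The data here is synthetic (a single hypothetical square-footage feature), not the Bangalore dataset, and the parameter grids are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in: price grows roughly linearly with area.
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(200, 1))           # sqft (hypothetical feature)
y = 50 * X[:, 0] + rng.normal(0, 1000, size=200)    # price with noise

candidates = {
    "linear_regression": (LinearRegression(), {"fit_intercept": [True, False]}),
    "lasso": (Lasso(max_iter=10000), {"alpha": [0.1, 1.0],
                                      "selection": ["cyclic", "random"]}),
}

# Grid-search each candidate and record its best cross-validated R^2 score.
best = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5).fit(X, y)
    best[name] = (search.best_score_, search.best_params_)

winner = max(best, key=lambda name: best[name][0])
print(winner, best[winner])
```

The resulting best score and best parameters per model correspond to the BEST_SCORE and BEST_PARAMS columns of Table I.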
VII. REFERENCES

[1] A. Varma et al., "House Price Prediction Using Machine Learning and Neural Networks," 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), 2018, pp. 1936-1939.

[2] T. D. Phan, "Housing Price Prediction Using Machine Learning Algorithms: The Case of Melbourne City, Australia," 2018 International Conference on Machine Learning and Data Engineering (iCMLDE), Sydney, NSW, Australia, 2018, pp. 35-42, doi: 10.1109/iCMLDE.2018.00017.

[3] P. Wang, C. Chen, J. Su, T. Wang, and S. Huang, "Deep Learning Model for House Price Prediction Using Heterogeneous Data Analysis Along With Joint Self-Attention Mechanism," IEEE Access, vol. 9, pp. 55244-55259, 2021.
Building a Secure JSON Web Token (JWT) Library
LII. RELATED WORK

A. Formal Definitions
JSON Web Token (JWT) is an openly available standard that describes a method for securely exchanging information between two parties as a JSON object. This information is trustworthy because it is digitally signed[1].
JWTs are compact and can be easily shared through various channels such as URLs, POST parameters, or HTTP headers. Their compactness makes transmission quick, and their self-contained nature means that all the necessary information about the user is carried in the payload, avoiding multiple database queries.
JWTs are well suited to authentication[3]: a user logs in once and presents the JWT with each subsequent request, gaining access to the routes, services, and resources permitted for that token. JWTs are widely used in Single Sign-On (SSO) solutions because of their minimal overhead and ease of use across various domains.
The second part of the token, the payload, contains the claims, which include reserved, public, and private claims. The payload is also Base64Url encoded to form the second part of the JWT.

Fig 2. JWT payload

The final part is the signature, which is created by signing the encoded header and payload with the secret, using the algorithm specified in the header. The signature is used to verify the sender of the JWT and to ensure that the message has not been altered.
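The three-part structure described above can be made concrete with a short sketch. This is an illustrative standard-library example of HS256 token construction, not the paper's TypeScript implementation; the secret and claim values are invented for the example.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # Base64Url: URL-safe alphabet with the '=' padding stripped
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict, secret: str) -> str:
    # First part: the Base64Url-encoded header naming the algorithm
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    # Second part: the Base64Url-encoded payload carrying the claims
    body = b64url(json.dumps(payload).encode())
    # Final part: HMAC-SHA256 over "header.payload" keyed with the secret
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(),
                   hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

token = make_jwt({"sub": "user-42", "admin": False}, "example-secret")
print(token)  # three dot-separated Base64Url segments
```

Because the payload is only encoded, anyone holding the token can decode and read the claims; only the third segment ties them to the secret.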
Fig 3. JWT Signature

The result is a compact and easily transferable token that can be used in HTML and HTTP environments.

Fig 4. Resulting JWT

C. Security challenges associated with JWTs
While JSON Web Tokens (JWTs) offer many benefits, their use also carries security risks. Some of the potential risks include:
1) Replay attacks: Attackers can intercept and replay JWTs, causing the server to believe that the request is coming from a legitimate source.
2) Tampering: If an attacker can gain access to the JWT, they could modify the claims within it, potentially gaining access to resources they should not have.
3) No built-in revocation: Once a JWT has been issued, there is no way to revoke it. If an attacker gains access to a JWT, they could use it indefinitely.
4) Disclosure of sensitive information: Because the claims within a JWT are encoded, not encrypted, an attacker who intercepts a JWT can decode and view the information within it.
5) Weak signing algorithms: If the algorithm used to sign the JWT is weak, an attacker could potentially recover the signing key and modify the claims within the JWT.
6) Weak signatures and insufficient signature validation: Several attacks are possible due to design flaws in some libraries and applications, including changing the algorithm to "none," modifying the RSA parameter value, and using weak symmetric keys.
7) Plaintext leakage through analysis of ciphertext length: Some encryption algorithms leak information about the length of the plaintext, and compression attacks are powerful when attacker-controlled data shares a compression space with secret data.
To mitigate these risks, it is crucial to design and implement the JWT-based authentication and authorization system carefully, to validate and verify JWTs properly, to use strong signing algorithms, and to limit the sensitive information included in the JWTs. Additional security measures, such as using refresh tokens, limiting the JWT's lifespan, and implementing rate limiting to counter replay attacks, should also be considered.

D. Analysis of existing literature
Although research papers on JSON Web Tokens (JWTs) offer valuable insights into different aspects of JWTs, these papers have certain limitations that should be acknowledged.
One limitation is that many of these papers concentrate on specific programming languages or libraries for JWTs. While this could be beneficial
for developers working with those languages, it may not be applicable to those working with other languages or libraries; the research findings and recommendations may therefore not be broadly applicable.
Another limitation is that some papers only address particular use cases for JWTs, such as web or mobile applications. Although these use cases are significant, JWTs are also employed in other contexts, such as IoT devices, and research on these applications is scarce. This restricts the scope of the research and may not provide a comprehensive understanding of the advantages and limitations of JWTs across different contexts.
Moreover, many papers concentrate on the technical implementation of JWTs and overlook broader issues such as security considerations or best practices for handling token expiration and revocation. Although technical implementation is important, these broader issues are equally critical for ensuring the security and reliability of JWTs in practice[2].
In conclusion, while research papers on JWTs provide valuable insights into various aspects of JWTs, readers should be mindful of their limitations and carefully evaluate the applicability of the findings and recommendations to their specific use case.

LIII. DESIGN AND IMPLEMENTATION

Determining the architecture and modules necessary to construct a secure library is the first step in building the JWT library. Building a JWT library requires the following modules:
JWT Generation module: creates a JWT from a payload and a secret key.
JWT Validation module: verifies a JWT by examining its contents and signature.
SHA256 module: provides SHA256-based cryptographic hashing.
Base64URL module: encodes and decodes data in Base64URL format.

Implementation:
First, we used npm to install the TSdx and Jest packages, enabling us to build and test the library. The "tsdx create" command was then used to start a fresh TSdx project.
To develop the JWT Generation and Validation modules, we then installed the relevant packages, including "jsonwebtoken" and "crypto-js."
The JWT Generation module produces a JWT from a payload and a secret key. This module used the Base64URL module to encode the JWT and the SHA256 module to create the signature.
The JWT Validation module validates a JWT by checking the signature and payload. The Base64URL module was used to decode the JWT, and the SHA256 module was used to verify the signature.
To enable cryptographic hashing with SHA256, we developed a SHA256 module. Both the JWT Generation and Validation modules used it to create and verify the signature.
Finally, to support encoding and decoding of data in Base64URL format, we constructed a Base64URL module. The JWT Generation and Validation modules used this module to encode and decode the JWT.
We used Jest to write tests for the JWT Generation and Validation modules in order to guarantee the dependability and security of the library. These tests verified the validity of the produced JWT and the accuracy with which the validation function verified a valid JWT.
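The division of labour among the four modules can be sketched as follows. This is an illustrative Python sketch rather than the paper's TypeScript code; the function names are invented, and the validator also shows rejection of the "none" algorithm listed among the security risks.

```python
import base64
import hashlib
import hmac
import json

# Base64URL module: URL-safe encoding without padding
def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

# SHA256 module: HMAC-SHA256 over the signing input
def sign_sha256(signing_input: bytes, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()

# JWT Generation module: payload + secret -> token
def generate(payload: dict, secret: str) -> str:
    head = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url_encode(json.dumps(payload).encode())
    sig = sign_sha256(f"{head}.{body}".encode(), secret)
    return f"{head}.{body}.{b64url_encode(sig)}"

# JWT Validation module: check algorithm and signature before trusting claims
def validate(token: str, secret: str):
    head, body, sig = token.split(".")
    header = json.loads(b64url_decode(head))
    if header.get("alg") != "HS256":   # reject "none" and unexpected algorithms
        return None
    expected = sign_sha256(f"{head}.{body}".encode(), secret)
    if not hmac.compare_digest(expected, b64url_decode(sig)):
        return None
    return json.loads(b64url_decode(body))

token = generate({"sub": "alice"}, "demo-secret")
print(validate(token, "demo-secret"))   # round-trip succeeds
print(validate(token, "wrong-secret"))  # signature mismatch -> None
```

A constant-time comparison (`hmac.compare_digest`) is used for the signature check, which is the usual guard against timing side channels.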
Fig 5. JWT Process

Tools and Technology used:
TSdx: TSdx is a development environment for creating and testing TypeScript libraries. The JWT library's initial project structure was built using it.
Jest: Jest is a popular testing framework for JavaScript applications. Unit tests for the JWT Generation and Validation modules were written with it.
SHA256: The SHA256 algorithm is a well-known cryptographic hashing method used when creating digital signatures for secure communication. It was used to produce and validate the JWT's digital signature in the JWT Generation and Validation modules[4].
Base64URL: Base64URL encoding, a variant of Base64 encoding, encodes data in a URL-friendly manner. It was used to encode and decode the JWT in the JWT Generation and Validation modules.
Node.js: Node.js is a JavaScript runtime frequently used to build server-side applications. It was used to run the Jest tests and the JWT library.
npm: npm is a popular package manager for Node.js, used to publish packages and manage dependencies. It was used to install the JWT library's required packages, including "jsonwebtoken" and "crypto-js."

LIV. RESULTS AND ANALYSIS

Our work on developing our own JWT library has resulted in a secure and flexible solution for generating, signing, and verifying JWTs in web applications.
We have conducted extensive testing of the library to ensure its security and performance, and have provided thorough documentation and support for developers who wish to use the library in their own projects. Overall, our work is a valuable contribution to the field of web application security, providing developers with a practical and reliable solution for implementing JWTs in their applications[5].
Furthermore, our work has demonstrated the importance of open-source libraries and community-driven development in the field of web application security. By sharing our library with the wider community, we hope to contribute to the ongoing development of secure and efficient web applications.
As we worked through the project, we recognized several strengths and limitations in currently existing JWTs.
Firstly, one of the strengths of a secure JWT library is its strong authentication feature: JWT tokens provide a secure and efficient way to authenticate users and devices.
Another strength is token management. JWT libraries can help prevent unauthorized access or misuse by managing token expiration, revocation, and other security features.
Additionally, JWT tokens are platform-independent, meaning they can be used across different platforms and technologies.
However, there are limitations to a secure JWT library. While JWT tokens offer a secure way to authenticate and transmit data, they can still be vulnerable to certain security risks if not correctly implemented or secured[6]. In particular, unauthorized access becomes possible if the cryptographic keys used to sign the tokens are compromised.
Finally, JWT tokens have limited scalability. While they can improve the efficiency and scalability of applications, they may not be suitable for extremely large or high-traffic applications that require more complex security measures.
In conclusion, a secure JWT library can offer many benefits for authentication, token management, and efficiency, but it is important to consider the security risks and limitations of JWT tokens. Proper implementation and management of the library can ensure that JWT tokens are used in a secure and well-tested manner[7].
Through a series of experiments and evaluations, we demonstrated that the library is robust against common attacks and provides high performance for generating and verifying tokens. Furthermore, we compared our library to existing JWT libraries and found that it offers comparable performance and security.
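One mitigation named earlier, limiting a token's lifespan, comes down to checking an expiry claim at validation time. A minimal sketch follows; the `exp` claim name is the registered one from RFC 7519, while the helper function is invented for illustration.

```python
import time

def is_expired(claims: dict, now=None) -> bool:
    # RFC 7519 "exp": seconds since the Unix epoch after which the
    # token must be rejected. A claim set without "exp" never expires here.
    if "exp" not in claims:
        return False
    current = now if now is not None else time.time()
    return current >= claims["exp"]

claims = {"sub": "alice", "exp": 1_700_000_000}
print(is_expired(claims, now=1_600_000_000))  # False: before expiry
print(is_expired(claims, now=1_800_000_000))  # True: after expiry
```

A validator would run this check after verifying the signature, never before, since unverified claims cannot be trusted.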
LV. DISCUSSION AND IMPLICATIONS

Our analysis of JWTs has highlighted both their strengths and weaknesses, and has identified several areas for future research and development. One of the key implications of our work is the importance of careful consideration when choosing and implementing a JWT solution in web applications.
Moving forward, we believe that there is still significant potential for further research and development in the area of JWTs, particularly in areas such as token revocation and support for multiple signature algorithms. By continuing to explore and refine JWTs, we can ensure that they remain a valuable tool for web application security for years to come[8].
Furthermore, our work has demonstrated the importance of continual evaluation and refinement of security solutions. As new vulnerabilities and threats emerge, it is crucial to continually evaluate and improve existing security solutions, including JWTs. By conducting ongoing testing and evaluation of our JWT library and other security solutions, we can ensure that they remain effective and up-to-date with the latest security practices and technologies. Ultimately, we believe that ongoing evaluation and refinement are essential for ensuring that web applications remain secure and reliable in the face of evolving security threats.

LVI. CONCLUSION

In conclusion, this research paper presents a new JSON Web Token (JWT) library that provides a secure and efficient method for authentication and authorization in web and mobile applications. The library offers a straightforward and flexible API for generating, parsing, and verifying JWTs, and includes support for a variety of signing and encryption algorithms.
Through the process of designing and implementing the library, we gained valuable insights into the workings of JWTs[9] and their applications in modern web development. The library we have created is designed with a focus on security, efficiency, and ease of use.
Overall, this paper contributes to the existing body of research on JWTs by providing a new library that can be used by developers to implement secure authentication and authorization mechanisms in their applications. We hope that this work will inspire further research and development in the field of web security and contribute to the creation of more secure and reliable systems.

REFERENCES

[74] Jones, M., Bradley, J., and Sakimura, N. (2015). "JSON Web Token (JWT)." Published as an Internet Engineering Task Force (IETF) RFC 7519. Retrieved from https://tools.ietf.org/html/rfc7519
[75] Akanksha and A. Chaturvedi, "Comparison of Different Authentication Techniques and Steps to Implement Robust JWT Authentication," 2022 7th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 2022, pp. 772-779, doi: 10.1109/ICCES54183.2022.9835796.
[76] S. Ahmed and Q. Mahmood, "An authentication based scheme for applications using JSON web token," 2019 22nd International Multitopic Conference (INMIC), Islamabad, Pakistan, 2019, pp. 1-6, doi: 10.1109/INMIC48123.2019.9022766.
[77] Ficry Cahya Ramdani, Alam Rahmatulloh (2023). "Implementation of JSON Web Token on Authentication with HMAC SHA-256 Algorithm." Retrieved from https://www.semanticscholar.org/paper/Implementation-of-JSON-Web-Token-on-Authentication-Ramdani-Alam-Rahmatulloh/b3d611ca6b7b7e9b2f6f22f1bb4bde0211dc7f51 Published as a conference paper in the 2018 International Conference on Informatics, Multimedia, Cyber, and Information System (ICIM-CIS).
[78] Salman Ahmed, Qamar Mahmood. (2018). "An authentication based scheme for applications using JSON web token." Retrieved from https://ieeexplore.ieee.org/document/9022766/ Published as a conference paper in the 2018 3rd International Conference on Computer and Communication Systems (ICCCS).
[79] A Rahmatulloh, R Gunawan, and F M S Nursuwars. (2020). "Performance comparison of signed algorithms on JSON Web Token." Retrieved from https://iopscience.iop.org/article/10.1088/1757-899X/550/1/012023/meta Published in the proceedings of the 2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS) on October 2020-21.
[80] Abhishek Aadi (2018). "Secure JWT in a Nutshell." Published on January 22, 2018. Retrieved from https://medium.com/swlh/secure-jwt-in-a-nutshell-e59a0139096d
[81] The App Solutions. (2020). "How to Build a Secure JWT Authentication: Best Practices." Retrieved from https://medium.com/@abhishekaadi/jwt-in-a-nutshell-part-1-84bf7c7018d
[82] Teniola Fatunmbi (2022). "JSON Web Tokens (JWT) vs. Session Cookies: Authentication Comparison." Okta. Retrieved from https://developer.okta.com/blog/2022/02/08/cookies-vs-tokens
Home Automation Using WhatsApp Chatbot

1st Pavan Penugonda, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101463@presidencyuniversity.in
2nd Patan Ashraf Ali Khan, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101027@presidencyuniversity.in
3rd P V Praneeth Reddy, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101091@presidencyuniversity.in
4th M Nithin Kumar Reddy, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101115@presidencyuniversity.in
5th P Subhash Reddy, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101116@presidencyuniversity.in
B. System Design
System design refers to the process that was followed while designing the entire system, including all of its components, flows, tasks, and the affordances available to a user of the application. It is a conceptual plan that is followed while actually building the system, ensuring that all of the parts are successfully finished. The system design provides a complete overview of how the system is implemented, even in the technical sense; it is used as a guideline and a road map for all aspects of the build. System design helps us understand the complexities and the processes of how to build and what to build first, and helps us manage our time and resources so that the maximum is spent on the essential parts of development. System design is the phase where ideation ends and the actual development of the product begins.
1) Application Flow: The following is how the system we designed works. The primary form of communication with the automation system is through WhatsApp; this is where users send commands to operate the appliances in the house. These appliances are connected to an ESP32 module, an IoT device connected to the internet.
• The user sends a message from WhatsApp to turn the bulb on or off.
• This message is relayed to the ESP module via a Twilio connection.
• The module interprets the command and acts accordingly.
• The appliance is switched on or off based on the command issued.
• A response is sent back to the user, showing either confirmation of the action or an error message that needs to be corrected.

Fig. 2. UML Sequence Diagram

C. Technical Requirements
The technical requirements of the project signify the various technologies, tools, and procedures used in the development of the application.
1) WhatsApp Messenger: WhatsApp is a popular messaging application that allows users to send messages, make voice and video calls, share media, and conduct group chats. It was founded in 2009 by two former Yahoo employees, Brian Acton and Jan Koum. Initially, WhatsApp was designed as an alternative to traditional SMS messaging, and it quickly gained popularity due to its ease of use and low cost. In our application, users use WhatsApp to send the messages that control the automation, turning the bulb on or off and receiving the respective responses.
2) Twilio: Twilio is a cloud communications platform that enables businesses to communicate with their customers through various channels such as voice, SMS, email, and messaging applications like WhatsApp. Founded in 2008, Twilio has become a leading player in the cloud communications industry, providing reliable and scalable communication solutions to businesses of all sizes. In our application, we have used Twilio to relay messages between WhatsApp and the ESP32 module. The Twilio integration receives messages from WhatsApp and relays them to the endpoint it has established with the ESP32 module; similarly, it picks up messages sent by the ESP32 module and relays them back to WhatsApp for the users to see. It acts as an intermediary communicator here.
3) ESP32 Module: The ESP32 module is a powerful and versatile microcontroller unit (MCU) that is widely used in a range of embedded systems and Internet of Things (IoT) applications. It is based on the Espressif ESP32 system-on-chip (SoC), which combines a dual-core processor, WiFi and Bluetooth connectivity, and a range of peripheral interfaces into a single chip. The ESP32 module is also designed with power efficiency in mind, with a range of power-saving modes and low-power operation options to maximize battery life in battery-powered devices. It also includes a range of security features, including secure boot and flash encryption, to protect against unauthorized access and tampering. In our application, the ESP32 module acts as the controller of the electrical appliances: all the bulbs are connected to this module, which is connected to the internet and a power source, and it communicates with the user with Twilio as an intermediary.
4) ThingESP Library: The ThingESP library is an open-source software library designed to simplify the development of Internet of Things (IoT) applications using the ESP8266 and ESP32 microcontroller units (MCUs). It provides a range of functions and utilities for connecting to WiFi networks, interfacing with sensors and other peripherals, and sending and receiving data over the internet. The library includes a built-in MQTT client that enables easy and efficient communication between IoT devices and servers or other devices. This allows developers to easily create scalable and flexible IoT applications that can communicate with a wide range of other devices and systems. In addition to its support for MQTT and sensor interfacing, the ThingESP library also includes a range of functions for managing WiFi connections, including automatic reconnection and error handling.
5) Arduino IDE: The Arduino software is an open-source Integrated Development Environment (IDE) that provides a user-friendly platform for programming and developing microcontroller-based systems. It is designed to be simple and easy to use, making it accessible to beginners and experts alike. The Arduino software is based on the Wiring language, a simplified version of C and C++. This makes it easy for developers to write and understand code, even if they are not experienced programmers. In addition to its support for the Wiring language, the Arduino software also includes a range of libraries and examples to help developers get started with common tasks and functions. In our project, we used Arduino to write the code that handles the ESP32 module and to program it into the module.
6) Relay: A relay is an electronic switch that is used to control high-power devices, such as lights, motors, or heaters, using a low-power signal from a microcontroller or other IoT device. Relays are commonly used in IoT applications where the devices being controlled require more power than the microcontroller or other low-power device can provide. A relay consists of an electromagnetic coil and a set of contacts. When a current is applied to the coil, it generates a magnetic field that causes the contacts to close or open, depending on the type of relay. This allows the relay to switch power on or off to a connected device or circuit.

V. IMPLEMENTATION

Implementation is the stage where we carry out all the planning and designs achieved in the previous stages. Although development was present in all those stages too, here we give a comprehensive overview of how the development procedure occurred and how the application was developed so that its objectives were achieved properly. Every application has a few crucial components that make up its bulk along with several other standard conventional features; it is these crucial components that make up the application and show its functionality. Let us look at the crucial components that make our application into the solution to the problem we have been discussing from the beginning. As we have built two different applications, the construction of each application and the development of each individual feature are described below.
• Setting up a project on ThingESP
• WhatsApp Integration with Twilio
• Programming the ESP32 Module
• Circuit Design
• Final Integration

A. Setting up a project on ThingESP
As we have seen above, ThingESP is an open-source library that allows us to communicate with and program ESP32 modules with ease. The first thing we need to do is create a project in the library and get the communication endpoint.
• Create an account or log in to an existing account on ThingESP.
• Add a new project and provide the name and credentials. The credentials can be anything, but we must note them down for further usage.
• We will then be taken to the project page, where on the right side we will see a URL, which is our communication endpoint.
This communication endpoint is where all our messages from Twilio will go, and the ThingESP library ensures that the messages sent to this endpoint reach the ESP32 module we intend to communicate with.

B. WhatsApp Integration with Twilio
WhatsApp integration with Twilio is used to send messages from WhatsApp to the ESP32 module via the ThingESP library. Twilio acts as the secure intermediary between the user and the ESP module.
• The first step is to create an account on Twilio.
• Then we need to create a new project and verify our credentials.
• Once the project is created, go to Messaging, then choose Settings, and then choose WhatsApp sandbox settings.
• Agree to everything; this will activate the WhatsApp sandbox developer environment.
Now that we have activated the WhatsApp sandbox, we are given two fields; in the top one, place the URL we received from ThingESP. This is the endpoint: it denotes that all messages received by this Twilio sandbox must be relayed to this endpoint. When we look below these fields, we also see a phone number
with a 3-word unique identifier separated by hyphens. This is the number we must message to control the light bulbs.
• The first thing to do is save the number in our mobile phone.
• Then we need to send the three-word identifier as a message to the number. This verifies the number and lets us communicate further.
This brings us to the end of the Twilio integration; so far we have successfully set up all the message senders and relays to ensure the message properly reaches its destination. Now we move on to configuring the hardware.

C. Programming the ESP32 Module
Programming the ESP32 module means writing the code that decides what to do when a message is received, and pushing that code onto the microcontroller.
• We first write the code in the Arduino IDE.
• Then, to upload it, we select the board from Tools: ESP32, then ESP32 Dev Module.
• Then we select the port and click the arrow to upload it to the module.
This uploads the program to the ESP32 module.

D. Circuit Design
Circuit design covers the way we connect the relay, the ESP module, and the bulb to the power supply so that the system functions as intended. The following are the instructions to properly configure the circuit.
• First, connect one end of the two-pin plug to the first pin of the relay.
• Then connect one end of the bulb to the second pin of the relay.
• Now connect the negative pin of the relay to the ground pin of the ESP32.
• Then connect the positive pin of the relay to the Vin pin of the ESP32.
• Connect the signal pin to pin D23 of the ESP32.
• Finally, connect the remaining pin of the two-pin plug to the remaining pin of the bulb.
This finishes the circuit design, which is also depicted in the following image.
• Now, the messages are handed over to ThingESP which, based on the identifiers, identifies the module and sends the messages there.
• The microcontroller interprets the messages, accordingly turns the light on or off, and sends back an appropriate response.
This is how the system was implemented.

VI. TESTING
The process of testing is as crucial as the development itself, and some say it is far more crucial. It is in this phase that the application we have developed is tested to ensure that it runs as we intend it to. Here we check against the various documents we have prepared along the way, such as the preliminary designs, and ensure that the final outcome is what we have been aiming at. As this application is a combination of hardware and software components, it is essential that testing focuses on both aspects and ensures that everything works in synchronization.

A. Unit Testing
Any application is made up of numerous small units, and only when they are combined properly do they form a fully functional application. These small units must be tested before moving to the next stage so that errors do not accumulate; if an erroneous component is integrated with a non-erroneous one, it could lead to the malfunctioning of both components. Our process of unit testing the application involved testing the following units:
• Testing that the circuits were connected correctly and fitted into their slots.
• Testing that the connections worked.
• Testing that messages from WhatsApp were reaching the Twilio interface.
• Testing that the URL endpoint given to Twilio was right.
• Testing that messages from Twilio were being relayed to the ESP32 module.
• Testing that ThingESP was communicating with the ESP32 module.
• Testing that the ESP32 module was properly programmed.
• Testing that the right keywords were used for verification in the ESP module.
• Testing that the ESP module behaved according to the inputs being sent.
• Testing that the responses sent back to the user were appropriate.
This is how we performed the unit testing of our application, ensuring that its basic building units work correctly.
ACKNOWLEDGEMENT
We are greatly indebted to our guide Dr. Medikonda Swapna,
Associate Professor, School of Computer Science & Engineering,
Presidency University for her inspirational guidance, valuable
suggestions and for providing us a chance to express our technical
capabilities in every respect for the completion of the project work.
REFERENCES

[1] Soni, "Design and Implementation of Home Automation System using Raspberry Pi," California State Polytechnic University, Pomona, 2021.
[2] P. Mathivanan, G. Anbarasan, A. Sakthivel and G. Selvam, "Home Automation Using Smart Mirror," 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), Pondicherry, India, 2019, pp. 1-4, doi: 10.1109/ICSCAN.2019.8878799.
[3] T. Parthornratt, D. Kitsawat, P. Putthapipat and P. Koronjaruwat, "A Smart Home Automation Via Facebook Chatbot and Raspberry Pi," 2018 2nd International Conference on Engineering Innovation (ICEI), Bangkok, Thailand, 2018, pp. 52-56, doi: 10.1109/ICEI18.2018.8448761.
[4] K. L. Raju, V. Chandrani, S. S. Begum and M. P. Devi, "Home Automation and Security System with Node MCU using Internet of Things," 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), Vellore, India, 2019, pp. 1-5, doi: 10.1109/ViTECoN.2019.8899540.
[5] C. J. Baby, F. A. Khan and J. N. Swathi, "Home automation using IoT and a chatbot using natural language processing," 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), Vellore, India, 2017, pp. 1-6, doi: 10.1109/IPACT.2017.8245185.
[6] Pavithra, D., Balakrishnan, R. (2015, April). IoT based monitoring and control system for home automation. In 2015 Global Conference on Communication Technologies (GCCT) (pp. 169-173).
[7] Mandula, Kumar, et al. "Mobile based home automation using Internet of Things (IoT)." 2015 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT). IEEE, 2015.
[8] Abdulraheem, A. S., Salih, A. A., Abdulla, A. I., Sadeeq, M. A., Salim, N. O., Abdullah, H., Saeed, R. A. (2020). Home automation system based on IoT. Technology Reports of Kansai University, 62(5), 2453-64.
[9] Patchava, V., Kandala, H. B., Babu, P. R. (2015, December). A smart home automation technique with Raspberry Pi using IoT. In 2015 International Conference on Smart Sensors and Systems (IC-SSS) (pp. 1-4). IEEE.
Fitpulse – A Fitness and Gym Website

CONCLUSION
FitPulse is a platform that offers comprehensive, secure, and customized fitness solutions, revolutionizing the way consumers approach fitness. It provides users with a user-friendly login and signup process, a gym booking feature, exercise challenges, and information on their fitness development. Users' private information is protected, and trustworthy user authentication is provided for the login and signup processes.

ACKNOWLEDGMENT
We thank everyone who contributed to this study on Fitpulse - A gym booking and fitness
Flavour Fetch - An Authenticated and Authorized Food Delivery Website

REFERENCES

[83] A. Shersingh Chauhan, S. Bhardwaj, R. Shaikh, A. Mishra and S. Nandgave, "Food Ordering website "Cooked with care" developed using MERN stack," 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2022, pp. 1690-1695, doi: 10.1109/ICICCS53718.2022.9788224.
[84] Joshi, Umesh, Shubham Mathur, Priyal Soni, Vikas Sharma, Ayushi Ghill, and Yogesh Suthar. "Online Food Ordering System." International Journal of Advanced Research in Computer Science 13 (2022).
[85] Serhat Murat Alagoza, Haluk Hekimoglub, "A study on TAM: analysis of customer attitudes in online food ordering system," Elsevier Ltd., 2012.
[86] Resham Shinde, Priyanka Thakare, Neha Dhomne, Sushmita Sarkar, "Design and Implementation of Digital dining in Restaurants using Android," International Journal of Advance Research in Computer Science and Management Studies, 2014.
[87] Suthar, Pradeep, Amrita Agrawal, Kinal Kukda, and Kajal Joshi. "FOOD MAGIC: ONLINE FOOD ORDERING AND DELIVERING SYSTEM." (2020).
[88] Bhargave, Ashutosh, Niranjan Jadhav, Apurva Joshi, Prachi Oke, and S. R. Lahane. "Digital ordering system for restaurant using Android." International Journal of Scientific and Research Publications 3, no. 4 (2013): 1-7.
[89] Chavan, Varsha, Priya Jadhav, Snehal Korade, and Priyanka Teli. "Implementing customizable online food ordering system using web based application." International Journal of Innovative Science, Engineering & Technology 2, no. 4 (2015): 722-727.
[90] Cheong, Soon Nyean, Wei Wing Chiew, and Wen Jiun Yap. "Design and development of
Fetal Distress Classification Based on Cardiotocography
Abstract: The classification of fetal distress is a critical task in obstetrics, as it allows clinicians to intervene and prevent adverse outcomes for both the mother and the baby. Support Vector Machines (SVM) is a machine learning algorithm that has shown promising results in the classification of fetal distress. In this study, SVM was utilized to develop a model for the classification of fetal distress based on fetal heart rate (FHR) and uterine contractions (UC).

Keywords: Cardiotocography, Fetal distress, Support Vector Machines, Uterine Contractions, Fetal Heart Rate.

I. INTRODUCTION

Delivering a baby poses several challenges to doctors, and one of the most significant is ensuring the well-being of the unborn child during delivery. One indication of fetal distress, which can lead to hypoxia [1], a condition in which there is an insufficient oxygen supply to the body or a specific body part, is a lack of oxygen reaching the fetus before and during delivery. To monitor the fetus's condition continuously, doctors rely on a tool called a cardiotocograph (CTG) that produces continuous time-series signals. CTG measures two metrics: uterine contractions (UC) and fetal heart rate (FHR). Healthcare professionals analyze these signals in graphical form to identify any instances of fetal distress. This process is called cardiotocography.

Fetal Heart Rate (FHR) refers to the rate at which the fetal heart beats per minute. It is an essential metric monitored during pregnancy and childbirth, as it provides insight into the fetus's overall health and well-being. Uterine contractions (UC) are the rhythmic and involuntary tightening of the uterine muscles during pregnancy, which helps to prepare the body for labor and delivery.

In summary, this study employed machine learning algorithms to address the challenges associated with pregnancy and reduce potential complications.
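As a hedged illustration of the SVM classification step the abstract describes, the sketch below uses scikit-learn (an assumption; this excerpt does not name the study's tooling) on synthetic FHR/UC values rather than the study's CTG data:

```python
# Hypothetical sketch of SVM-based fetal-distress classification.
# The feature values are synthetic toy data, not the paper's dataset.
from sklearn.svm import SVC

# Toy training data: [FHR (beats per minute), UC (contractions per 10 min)]
X_train = [[140, 2], [135, 3], [150, 2],   # label 0: normal
           [100, 6], [95, 7], [105, 6]]    # label 1: distress
y_train = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="rbf", gamma="scale")  # RBF-kernel support vector classifier
clf.fit(X_train, y_train)

# Low FHR with frequent contractions falls near the "distress" cluster.
print(clf.predict([[98, 7]]))
```

In practice the CTG features would first be standardized (see the pre-processing steps below), which matters for distance-based kernels like the RBF.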
1) Creation of Dataframe: This step is carried out to convert the data from the CSV file to a usable format, i.e., Pandas dataframes. We can perform multiple preprocessing steps over these dataframes.

2) Dropping the null values: Firstly, the dataset needs to be checked for any NaN or missing values, and these values should be eliminated from the dataset. Subsequently, the feature selection process should begin, where the relevant features are identified for use in training the model.

3) Standardizing the data: Standardization refers to the process of transforming input data so that it has zero mean and unit variance, improving the effectiveness and precision of a machine learning model.
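The three pre-processing steps above can be sketched with pandas; the column names and values here are hypothetical stand-ins, not the study's dataset:

```python
# 1) Creation of Dataframe: toy CTG-style table (hypothetical values).
import pandas as pd

df = pd.DataFrame({
    "FHR": [120.0, 135.0, None, 150.0],
    "UC":  [2.0, 4.0, 3.0, None],
})

# 2) Dropping the null values: remove rows containing NaN.
df = df.dropna()

# 3) Standardizing the data: zero mean, unit variance per column.
df = (df - df.mean()) / df.std(ddof=0)

print(round(df["FHR"].mean(), 6))  # mean is 0 after standardization
```

Scikit-learn's `StandardScaler` performs the same transformation; the manual form is shown to make the zero-mean/unit-variance definition explicit.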
V. CONCLUSION
Abstract— This paper presents the development and implementation of a customer support chatbot for airlines using Dialogflow, a powerful conversational AI platform. With the rapid growth of the airline industry, efficient customer support systems have become imperative to ensure customer satisfaction and streamline operations. Leveraging natural language processing and machine learning techniques, this chatbot aims to provide personalized and timely assistance to airline customers, enhancing their experience and reducing the workload of human customer support agents. This paper outlines the design, architecture, and key features of the chatbot, highlighting its potential to revolutionize the airline industry's customer service domain.

Keywords— Natural language processing, Machine learning, Chatbots, Dialogflow, Agents.

I. INTRODUCTION

In today's highly competitive airline industry, providing exceptional customer service is crucial for maintaining a loyal customer base. Prompt and accurate responses to customer queries, issues, and requests play a pivotal role in shaping the overall customer experience. However, the traditional customer support systems employed by

To address this challenge, the use of conversational AI technologies has gained significant attention. Chatbots, powered by natural language processing and machine learning algorithms, have emerged as viable solutions to augment customer support operations in various industries. In the airline domain, chatbots offer a scalable and efficient means of engaging with customers, providing instant assistance, and facilitating self-service options.

This paper focuses on the development and implementation of a customer support chatbot specifically designed for airlines, utilizing Dialogflow, a leading platform for building conversational agents. The chatbot acts as a virtual assistant, capable of understanding and responding to a wide range of customer inquiries and requests, including flight information, ticket booking, baggage policies, flight status, and more. The motivation behind this research lies in addressing the increasing demand for personalized, efficient, and accessible customer support in the airline industry. By deploying an intelligent chatbot, airlines can enhance their customer service capabilities, reduce response times, and provide 24/7 assistance to passengers across various communication channels, such as websites, mobile apps, and messaging platforms.
The main objectives of this study are as follows:
1. To design and develop a robust and scalable customer support chatbot using Dialogflow.
2. To employ natural language understanding and processing techniques to enable the chatbot to accurately comprehend user queries.
3. To integrate the chatbot with airline databases and systems, allowing it to fetch real-time flight information and provide personalized responses.
4. To evaluate the performance and effectiveness of the chatbot through user tests and analysis of user feedback.
5. To demonstrate the potential of the chatbot in improving customer satisfaction, reducing operational costs, and optimizing human customer support agent workflows.

By achieving these objectives, this research aims to contribute to the advancement of customer support systems in the airline industry, providing valuable insights into the capabilities and potential impact of chatbots developed with Dialogflow.

II. LITERATURE REVIEW

check-in procedures, freeing up human agents to focus on more complex issues. They found that chatbots were particularly valuable in reducing response times and increasing customer satisfaction.

B. Natural Language Processing Techniques:

To enable chatbots to understand and respond to user queries, natural language processing (NLP) techniques are employed. Chen et al. (2019) examined the application of NLP algorithms in airline chatbots and emphasized the importance of accurate intent recognition, entity extraction, and context awareness. They found that advanced NLP techniques, including deep learning models, improved the accuracy and efficiency of chatbot interactions.

C. Frameworks and Platforms for Chatbot Development:

Various frameworks and platforms have been used to develop chatbots in the airline industry. Dialogflow, a widely adopted conversational AI platform, offers pre-built NLP models, intuitive interfaces, and integration capabilities. Research by Brown et al. (2020) compared different chatbot development platforms, including Dialogflow, and highlighted the ease of use and flexibility it provides for building sophisticated conversational agents.
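As a sketch of how such a Dialogflow agent could fetch real-time data, the handler below follows the Dialogflow ES webhook request/response shape. The intent name ("FlightStatus"), the parameter name ("flight_number"), and the in-memory flight table are hypothetical stand-ins for a real airline database:

```python
# Minimal webhook-fulfillment sketch for a Dialogflow ES agent.
# Intent/parameter names and the flight table are hypothetical.
import json

FLIGHT_DB = {"AI101": "On time, departs 14:30 from Gate 12"}  # stand-in DB

def handle_webhook(request_body: str) -> str:
    req = json.loads(request_body)
    query = req["queryResult"]                      # Dialogflow ES field
    intent = query["intent"]["displayName"]
    if intent == "FlightStatus":
        number = query["parameters"].get("flight_number", "")
        status = FLIGHT_DB.get(number, "not found")
        text = f"Flight {number}: {status}"
    else:
        text = "Sorry, I can't help with that yet."
    return json.dumps({"fulfillmentText": text})    # ES response field

# Example request, as Dialogflow would POST it to the webhook:
body = json.dumps({"queryResult": {
    "intent": {"displayName": "FlightStatus"},
    "parameters": {"flight_number": "AI101"},
}})
print(handle_webhook(body))
```

In a deployment this function would sit behind an HTTPS endpoint registered as the agent's fulfillment URL; the JSON shapes above are the part Dialogflow fixes, while everything else is application code.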
REFERENCES
addressing the problem of deteriorating soil fertility. Additionally, it can aid farmers in more efficient pest and disease detection and management, lowering the need for hazardous pesticides and enhancing agriculture's environmental sustainability. Precision agriculture has great promise, and the Indian government has started a number of measures to encourage its use.

By giving farmers access to real-time information on soil moisture, weather patterns, and crop health, machine learning and the Internet of Things have the potential to completely change the Indian agriculture industry. Utilizing this information will improve crop productivity and boost profitability. Sensors can track soil moisture levels, for instance, and give farmers real-time information on when to water their crops. Following this data analysis, machine learning algorithms can offer ideas on how to optimize irrigation schedules to increase agricultural yields. Utilizing technology in farming, precision agriculture maximizes [8] crop yield while minimizing waste. To track crop health, soil moisture, and weather patterns, sensors and machine learning algorithms are used. The application of fertilizer and pesticides is then optimized using this data, which also helps to save water and increase crop yields. For smallholder farmers in India, who frequently lack access to the most recent farming technologies and encounter substantial difficulties in producing high crop yields, precision agriculture might be very helpful. Farmers may enhance their livelihoods, boost output, and save expenses by utilizing precision agricultural technology.

II. LITERATURE SURVEY

Plenty of research has gone into finding the problems in Indian agriculture, and much more research continues over time to predict solutions for these issues.

A technique for determining which crop is most suited for harvesting is suggested in the study [1]. The authors utilized many algorithms, including Decision Tree, Random Forest, KNN, and neural networks, on the Indian Agricultural and Climate Data set in order to obtain the highest level of accuracy.

Using data acquired from the Madurai area, the Crop Suggestions System for Precision Agriculture [2] was created to assist farmers in planting the proper seed according to soil conditions. The main goal is to find a solution for the classifier selection issue in ensemble learning and to achieve the greatest accuracy possible.

The authors of [3] have suggested a model that uses data from the Government of India's repository website, data.govt.in. The dataset primarily includes 4 crops, totalling 9000 samples, of which 6750 are utilized for training and the remaining 2250 for testing. Following pre-processing, ensemble-based learners such as Random Forest, Naive Bayes, and Linear SVM are utilized, and the majority voting technique is used to get the greatest accuracy.

By utilizing multiple machine learning algorithms, the research in [4] primarily focuses on estimating the crop's production. Logistic Regression, Naive Bayes, and Random Forest are the classifier models employed, with Random Forest offering the highest level of accuracy. By considering variables like temperature, rainfall, area, etc., the forecast provided by the machine learning algorithms will assist farmers in choosing which crop to cultivate to induce the greatest yield.

Sandhya Tara and Sonal Agrawal, the authors of [5], present a framework that uses machine learning and deep learning techniques to suggest the best crop based on soil and climate parameters. Area, Relative Humidity, pH, Temperature, and Rainfall are the predictive variables in the dataset. Once the dataset has been pre-processed, the information is divided into a training set and a test set. The response is then depicted graphically for each of the parameters, including fertilizer use, pesticide use, area, UV exposure, and water, using the above-mentioned algorithms, and the yield is forecasted using the data for these parameters. Thus, with little loss and a high yield, the results can assist farmers in growing suitable crops.

The authors of [6] proposed a model that uses previous farmland data as the dataset. It consists of various attributes such as county name, state, humidity, temperature, NDVI, wind speed, and yield. The model is trained to identify the soil requirements necessary for yield prediction. The algorithms applied to the dataset are Random Forest, Decision Tree, and polynomial regression. Among the three, Random Forest provides better yield prediction than the other algorithms.

In the paper [7], the factors used by the proposed system include soil pH, temperature, humidity, rainfall, nitrogen, potassium, and phosphorus. Various crops are also included in the dataset. After utilizing the dataset to train and test the model, a variety of algorithms, including Decision Tree, Random Forest, XGBoost, Naive Bayes, and LR, are used to forecast a specific crop under specific environmental conditions and parameter values that aid in growing the best crop. Thus, evaluating the accuracy of the algorithms and selecting the one with the greatest accuracy will assist farmers in selecting the appropriate seed and aid in boosting agricultural yield.

The authors of [8] implemented precision farming, where a variety of Internet of Things (IoT) sensors and devices are used to collect data on environmental conditions for farming, the amount of fertilizer to be used, the amount of water needed, and the levels of soil nutrients. Through wired or wireless connectivity, the data gathered by the numerous IoT sensors at the end node is then saved in the cloud or on remote servers. Afterward, relevant meanings and interpretations are inferred from the data using a variety of data analytic techniques, which are then applied to make precise and correct decisions. Then, several algorithms are used to select crops, and the analysed data can be used to understand agricultural conditions, whether they are favourable, and to forecast the crops with the highest yield.

In [9], the right crop is advised using the proposed approach based on details like soil pH, temperature, humidity, rainfall, nitrogen, potassium, and phosphorus. Historical data with the above-mentioned parameters is included in the dataset. To eliminate outliers and missing values, the gathered data is pre-processed. The model is subsequently trained and tested. The method utilizes a variety of machine learning classifiers, including a Deep Sequential Model, KNN, XGB, Decision Tree, and Random Forest, to select a crop accurately and effectively for site-specific factors. This research will assist farmers in growing appropriate crops with the highest yield.

choices and proposes a district-by-district forecasting model for the Tamil Nadu state. To raise the quality of incoming data, the paper suggests employing pre-processing and clustering techniques. Furthermore, it recommends employing artificial neural networks (ANN) to predict agricultural productivity and daily precipitation using meteorological data. In order to improve the system's success rate, the study suggests a hybrid recommender system that makes use of Case-Based Reasoning (CBR). The effectiveness of the proposed hybrid technique is evaluated against conventional collaborative filtering.

III. PROPOSED WORK

3.1 Data Description

The dataset was collected from the website of Smart AI Technologies. It consists of 18 years of crop data (1997–2014) for 35 districts of Maharashtra State. The crop data includes Season Name, Crop Name, Area, Temperature, Wind_Speed, Pressure, Humidity, Soil_Type, NPK_Nutrients, Production, and Yield. A small snippet of the data is shown below [Fig 1], where Area is measured in hectares, crop Production is measured in tonnes per hectare, and crop Yield is measured as crop production weight (in kg) per area of land harvested or planted (in hectares). This dataset was previously used for crop yield prediction, but we are using it for crop recommendation using machine learning.
3.2 Data Pre-Processing

Before building a model on any data, it is essential to perform a data pre-processing step, in which raw data is cleaned and transformed to provide quality data for further analysis. This CSV dataset has a total of 17 attribute columns, of which 12 are numerical and the remaining 5 are categorical; across these 17 attribute columns, the dataset contains 12,628 records of crop data. Because the earlier records had not stabilized, we considered data from the year 2000 onward; this lets us get rid of the high variance and low bias that may cause overfitting and lead to wrong predictions by the model.
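The five categorical columns mentioned here are encoded during feature engineering (Section 3.4) with label and one-hot encoding; a minimal pandas sketch, using hypothetical column names rather than the actual CSV schema:

```python
# Hypothetical sketch of encoding categorical crop-data columns.
# Column names and values are illustrative, not the real dataset.
import pandas as pd

df = pd.DataFrame({
    "Season":    ["Kharif", "Rabi", "Kharif"],
    "Soil_Type": ["Loamy", "Clay", "Sandy"],
    "Area":      [120.0, 80.5, 95.0],
})

# Label encoding: map each category to an integer code.
df["Season_code"] = df["Season"].astype("category").cat.codes

# One-hot encoding: one binary column per soil type.
df = pd.get_dummies(df, columns=["Soil_Type"])

print(sorted(c for c in df.columns if c.startswith("Soil_Type_")))
```

Label encoding suits ordinal or tree-based inputs, while one-hot encoding avoids imposing a spurious ordering on nominal categories such as soil type.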
3.3 EDA
3.4 Feature Engineering

Sometimes it is very difficult to draw conclusions and build methods from raw data; this is where feature engineering comes into the picture and makes that easier. In this data we also had categorical columns from which to draw conclusions. We used feature engineering techniques such as one-hot encoding and label encoding so that the machine can understand the data properly and provide useful insights for drawing conclusions.

3.5 Algorithms

3.5.1 Decision Tree:

This algorithm can be used in a crop recommendation machine learning project to provide farmers with valuable insights and recommendations. To make informed predictions about the best crop choice, the algorithm can learn from historical data such as previous crop yields and agricultural practices. Decision tree [7] models can also take multiple decision paths into account, allowing for complex decision-making based on a variety of factors. The interpretability of decision trees makes them especially useful in explaining the reasoning behind crop recommendations, which can help farmers make decisions. Furthermore, decision tree models can be easily updated with new data, allowing for continuous crop recommendation system refinement and improvement. Overall, the decision tree algorithm can be a useful tool in crop recommendation projects by providing data-driven insights to optimize agricultural practices and increase crop yields.

3.5.2 Random Forest:

Because it can handle complex datasets and produce precise predictions, the Random Forest algorithm can be very helpful in a crop recommendation project. Random Forest can identify a variety of patterns and connections between various factors, including soil characteristics, climatic conditions, and crop attributes, by using a number of decision trees. With the help of this ensemble method, recommendations can be generated that are strong and trustworthy despite noise and overfitting. Additionally, Random Forest [1] provides feature importance rankings that can be used to pinpoint the crop selection variables that have the greatest influence. Random Forest is also effective at processing large amounts of data in parallel, allowing for real-time recommendations. Overall, Random Forest can improve crop recommendations' precision and interpretability, helping farmers make wise decisions and maximizing their crop selection tactics.

3.5.3 Naive Bayes:

Because of its ease of use, effectiveness, and capacity for both categorical and discrete data, the Naive Bayes algorithm can be helpful in a crop recommendation project. The conditional probability that a crop will be suitable for a given set of features, such as soil type, weather conditions, and crop attributes, is determined by the probabilistic classifier Naive Bayes [3]. Given that it only needs a small amount of training data to produce predictions, it is especially well suited for projects with little available data. Naive Bayes is quick and effective for real-time recommendations because it has low computational requirements. Additionally, Naive Bayes offers results that are easy to interpret, enabling farmers to comprehend the rationale behind the suggestions. Overall, Naive Bayes can be a useful tool in crop recommendation projects because it provides precise and comprehensible predictions for the best crop choice.

3.5.4 XGBoost:

Due to its ability to handle complex and non-linear data relationships, XGBoost, an advanced gradient boosting algorithm, can be extremely useful in a crop recommendation project. XGBoost [7] is well-known for its high accuracy and predictive power, which makes it ideal for making precise crop recommendations based on a variety of factors such as soil quality, weather conditions, historical crop data, and more. It can handle large datasets efficiently and automatically handle missing data, making it suitable for real-world agricultural scenarios. XGBoost also provides feature importance rankings, which help farmers understand which features influence crop recommendations, and it supports parallel processing, making it suitable for large-scale crop recommendation applications. Overall, XGBoost has the potential to be a powerful tool for crop recommendation projects, providing accurate predictions as well as valuable insights for optimal crop selection.

3.5.5 KNN:

The K-nearest neighbours (KNN) algorithm can be useful in a crop recommendation project due to its simplicity and ability to handle both numerical and
categorical data. KNN [9] is a lazy learner, which means it does not need to be trained in advance and can be used to make real-time recommendations. Based on the similarity of neighbouring data points, KNN can make crop recommendations using historical data on crop performance, soil quality, weather conditions, and other relevant factors. It can also adapt to changing environmental conditions, making it ideal for fast-paced agricultural settings. KNN is interpretable, allowing farmers to comprehend the reasoning behind the recommendations. It is simple to implement and has a low computational overhead, making it appropriate for resource-constrained environments. KNN, on the other hand, may necessitate the careful tuning of hyperparameters such as the number of neighbours (K) and the distance metric. Overall, KNN has the potential to be a useful and interpretable method for crop recommendation projects, providing real-time recommendations based on the local similarity of data points.

3.6 Metrics

Accuracy score [Equation 1] is defined as the total number of correct predictions out of the total predictions made on the testing data. Given below is the accuracy score formula in terms of the confusion matrix [Fig 5].

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (Equation 1: Accuracy)

Precision [Equation 2] is defined as the ratio of true positives, i.e. the number of crops that were accurately forecast, to all the predicted crops. A high precision score means the model can correctly determine the best crops to produce in a certain area.

Precision = TP / (TP + FP)    (Equation 2: Precision)

Recall [Equation 3] quantifies the ratio of true positives to the overall number of crops that should have been advised. A high recall score means that the model can accurately identify a significant fraction of the crops that should be grown in a specific area.

Recall = TP / (TP + FN)    (Equation 3: Recall)

The accuracy, recall, and AUC scores of a successful crop recommendation model should be high. The model should be able to correctly identify the right crops to grow in a certain place while minimizing the number of false positives (crops that are projected to be acceptable but are not suited for that site). The model should also be highly confident in its ability to discriminate between the proper crops and the incorrect ones. It is crucial to assess a crop recommendation model's performance using these measures on a sample dataset before recommending it. If the model does well on the assessment dataset, it may be a feasible alternative for advising suitable crops in a specific area.
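The three metrics above can be computed directly from confusion-matrix counts; the counts below are hypothetical, not the paper's results:

```python
# Worked sketch of accuracy, precision, and recall from raw
# confusion-matrix counts (hypothetical values).
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

# Example: 80 crops correctly recommended (TP), 10 wrongly
# recommended (FP), 5 suitable crops missed (FN), 105 correctly
# rejected (TN).
tp, fp, fn, tn = 80, 10, 5, 105
print(accuracy(tp, tn, fp, fn))  # 0.925
print(precision(tp, fp))         # ~0.889
print(recall(tp, fn))            # ~0.941
```

Note how precision penalizes false positives (unsuitable crops recommended) while recall penalizes false negatives (suitable crops missed), matching the trade-off discussed above.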
goal of providing farmers with reliable and
effective crop recommendation solutions.
V. FUTURE WORK
Recommendation Platform for Machine Learning-Driven Precision Farming," Sensors, vol. 22, no. 16, p. 6299, Aug. 2022.
[9] "Crop Recommendation System to Maximize Crop Yield Using Deep Neural Network," vol. 12, issue 11, Nov. 2021, ISSN No: 0377-9254.
[10] Dighe, Deepti, Harsh H. Joshi, Aishwarya Katkar, Snehal S. Patil and Shrikant Kokate, "Survey of Crop Recommendation Systems," 2018.
[11] Bangaru Kamatchi and R. Parvathi, "Improvement of Crop Production Using Recommender System by Weather Forecasts."
[12] D. Ramesh and B. Vishnu Vardhan, "Data Mining Techniques and Applications to Agricultural Yield Data," International Journal of Advanced Research in Computer and Communication Engineering, vol. 2, issue 9, September 2013.
[13] N. N. Jambhulkar, "Modeling of Rice Production in West Bengal," International Journal of Scientific Research, vol. 2, issue 7, July 2013.
[14] Li Hong-ying, Hou Yan-lin, Zhou Yong-juan and Zhao Hui-ming, "Crop Yield Forecasted Model Based on Time Series Techniques," Journal of Northeast Agricultural University (English edition), vol. 19, issue 1, 2012, pp. 73-77, ISSN 1006-8104, https://doi.org/10.1016/S1006-8104(12)60042-7.
[15] Masood, M. A., Raza, I. and Abid, S. (2019), "Forecasting Wheat Production Using Time Series Models in Pakistan," Asian Journal of Agriculture and Rural Development, 8(2), pp. 172-177.
[16] R. Kingsy Grace, K. Induja and M. Lincy, "Enrichment of Crop Yield Prophecy Using Machine Learning Algorithms."
[17] Thomas van Klompenburg, Ayalew Kassahun and Cagatay Catal, "Crop yield prediction using machine learning: A systematic literature review," Computers and Electronics in Agriculture, vol. 177, 2020, 105709, ISSN 0168-1699.
[18] Nabila Chergui and Mohand Tahar Kechadi, "Data analytics for crop management: a big data view."
[19] "Crop Yield Prediction in Agriculture Using Data Mining Predictive Analytic Techniques," IJRAR - International Journal of Research and Analytical Reviews, E-ISSN 2348-1269, P-ISSN 2349-5138, vol. 5, issue 4, pp. 783-787, December 2018.
[20] Champaneri, Mayank, Chachpara, Darpan, Chandvidkar, Chaitanya and Rathod, Mansing (2020), "Crop Yield Prediction Using Machine Learning," International Journal of Science and Research (IJSR), 9, 2.
[21] G. Vishwa, J. Venkatesh and C. Geetha, "Crop Variety Selection Method using Machine Learning," http://dx.doi.org/10.21172/ijiet.124.05.
[22] N. L. Chourasiya, P. Modi, N. Shaikh, D. Khandagale and S. Pawar, "Crop Prediction using Machine Learning," IOSR Journal of Engineering (IOSRJEN), ISSN (e): 2250-3021, ISSN (p): 2278-8719, pp. 06-10.
BIKE CRASH DETECTION SYSTEM
RUHMA FATIMA, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 20201LCS0017
G RAKESH, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 2019CSE0179
SPANDAN MANDAL, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 2020LCS0004
SALEM PAUL, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 20201LCS0020
B A KEERTHI, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 20201LCS0021
Abstract— Bike accidents are leading cause of road providing emergency services. If the delay is often reduced
accident-related deaths all over the world especially in Asian the person may get saved. For associate in nursing accident
countries. A lot of deaths around the world occur due to road victims, it is terribly tough to alert the police room or the
accidents but Asian countries face the highest amount of bike relations concerning the accidents. The projected system is
accident-related deaths which can be seen in the government
road transport survey where the quantitative relation of road
employed to scale back the time delay between the accident
accidents in 2018 was 4.61 lakhs out of which 1.47 lakh that and providing emergency services. The vehicle pursuit and
were due to bike accidents. This means that almost 402 people accident detection device are often put in in any
die every day in Asian countries due to road accidents
especially bike accidents. Most of these accidents occur due to vehicle. Whenever a vehicle is taken, or an associate
rush driving such as speeding, drunken driving and not accident happens to the vehicle the coordinates are taken
following the proper road rules. According to a survey the through international positioning system (GPS) module and
leading cause of death due to road accidents is the delay in are regenerated into Google map link through the formula
providing emergency services. If the time to deliver emergency within the microcontroller. The formula is preinstalled
services can be reduced the person might get saved. Usually
due to these accidents, it is exceedingly difficult for the person
within the microcontroller. In the event of associate
in an accident to alert the emergency services such as police accident, the traveler should
and medical. The proposed system is going to solve this very
issue. Whenever a vehicle the that is equipped with the system receive facilitate promptly and the folks related to the
gets into an accident the coordinates of the vehicle are taken person should be notified immediately proposes a system
through global positioning system (GPS) and a Google maps wherever label sensors mounted on the vehicle will observe
link is generated by the microcontroller through a formula. a crash and signal the small controller that successively
Through this, the affected person will get the emergency passes the information containing the
services promptly and send an emergency message about the
accident to the people related to the person, such as their
family members and friends. The process through which this
coordinate location of the crash beside the identification
will be done is the sensors mounted on the vehicle will detect a details to the cloud server. The google map link is
crash and signal the microcontroller which will pass the distributed through International System of Units for mobile
information containing the coordinates of the location of the communication GSM module to a predefined mobile sort of
crash and the identification details to the Cloud server. The members of the family and near police headquarters. The
Cloud server will then generate a Google map link which will accident is detected through measuring device and the price
be distributed to the affected person's family members and the compared with the formula's brink price. The friend will get
nearest police headquarters. (Abstract) the exact location of the vehicle by clicking on the google
map link provided among the SMS
Keywords—leading, highest, road, cause, cloud
LXII. INTRODUCTION

Bike accidents are a very big problem in India and in other countries too. Most of the deaths in the world are due to road accidents. India faces the highest death rate in the world: according to the government road transport survey, the number of road accidents in 2018 was 4.61 lakh, in which the number of deaths was 1.47 lakh, i.e., 402 people die per day in India. Reasons for road accidents include over-speeding, drunk driving, and not following traffic rules. According to some surveys, the main reason for deaths in road accidents is the delay in medical help reaching the victims.

LXIII. RELATED WORKS

Design of accident detection and alert system for motorcycles [2013]

The idea of vehicle accident detection is not new, and the automotive companies have made lots of progress in perfecting that technology. Hitherto the same in motorcycles is lying dormant waiting to reach its peak. This paper is an attempt to contribute to that area of technology. Here we are trying to detect accidents through three parameters: acceleration/deceleration, tilt of the vehicle, and the pressure change on the body of the vehicle. Using these minute data values and an apt algorithm, the accident can be detected with a reasonable success rate. The coordinates of the vehicle, found using GPS technology, are then sent to the emergency services for help.

Vehicle Tracking and Locking Based GSM and GPS [2013]

Currently, most of the public have their own vehicles; theft happens in parking lots and sometimes while driving in insecure places. The safety of vehicles is therefore essential. A vehicle tracking and locking system is installed in the vehicle to track its location and lock the engine motor. The location of the vehicle is identified using the Global Positioning System (GPS) and the Global System for Mobile communication (GSM). These systems constantly watch a moving vehicle and report its status on demand. When a theft is identified, the responsible person sends an SMS to the microcontroller, and the microcontroller then issues control signals to stop the engine motor. Authorized people need to send a password to the controller to restart the vehicle and open the door. This is more secure, reliable, and lower in cost.

Incident Detection Algorithm Based on Non-Parameter Regression [2002]

We first describe the traffic congestion problem that many countries are facing. We then propose a traffic incident detection algorithm based on non-parametric regression to solve the congestion problem. Finally, we compare the algorithm with other incident detection algorithms on detection rate, false alarm rate, and mean detection time. A simulation result shows that the proposed algorithm has a higher detection rate, a lower false alarm rate, and a shorter mean detection time. Furthermore, we state the direction of our next study.

Study on the Method of Freeway Incident Detection Using Wireless Positioning Terminal [2008]

Improving the incident detection system's performance is essential to minimize the effect of incidents. A new method of incident detection was brought forward in this paper, based on an in-car terminal consisting of a GPS module, a GSM module, and a control module, as well as some optional parts such as airbag sensors, a mobile phone positioning system (MPPS) module, etc. When a driver or vehicle discovered a freeway incident and initiated an alarm report, the incident location information located by GPS, MPPS, or both would be automatically sent to a transport management center (TMC); the TMC would then confirm the accident with closed-circuit television (CCTV) or other approaches. In this method, detection rate (DR), time to detect (TTD), and false alarm rate (FAR) were the most important performance targets. Finally, some feasible means, such as a management mode, an education mode, and suitable accident-confirming approaches, have been put forward to improve these targets.

Wireless Vehicular Accident Detection and Reporting System [2010]

In this paper, we suggest a method to intelligently detect an accident at any place and any time and report the same to the nearby `service provider'. The service provider arranges for the necessary help. The Accident Detection and Reporting System (ADRS), which can be placed in any vehicle, uses a sensor to detect the accident. The sensor output is monitored and processed by the PIC16F877A microcontroller. The microcontroller takes decisions on traffic accidents based on the input from the sensors. The RF transmitter module, which is interfaced with the microcontroller, transmits the accident information to the nearby Emergency Service Provider (ESP). This information is received by the RF receiver module at the `service provider' control room in the locality. The RF transceiver module used has a range of up to 100 meters (about 328 ft) under ideal conditions. The service provider can use this information to arrange for ambulances and inform the police and hospital. We used low-cost RF modules, a microcontroller by Microchip, an LCD module, and an accelerometer. This system can be installed at accident-prone areas to detect and report accidents. MPLAB IDE and Proteus software are used to simulate part of the system. ADRS also implements an intelligent Accident Detection and Reporting Algorithm (ADRA) for the purpose.

Accident Detection and Reporting System using GPS, GPRS and GSM Technology [2012]

Speed is one of the basic reasons for vehicle accidents. Many lives could have been saved if emergency services could get accident information and reach the spot in time. Nowadays, GPS has become an integral part of vehicle systems. This paper proposes to use a GPS receiver's capability to monitor the speed of a vehicle, detect accidents based on the monitored speed, and send the accident location to an Alert Service Centre. The GPS will monitor the speed of a vehicle and compare it with the previous speed every second through a Microcontroller Unit. Whenever the speed drops below the specified speed, the system will assume that an accident has occurred. It will then send the accident location acquired from the GPS, along with the time and the speed, by utilizing the GSM network. This will help the rescue service reach the spot in time and save valuable human life.

Design and Development of GPS GSM based tracking system with Google map-based monitoring [2013]

GPS is one of the technologies used in many applications today. One of the applications is tracking your vehicle and keeping regular monitoring on it. This tracking system can inform you of the location and route travelled by the vehicle, and that information can be observed from any other remote location. It also includes a web application that provides you with the exact location of the target. This system enables us to track targets in any weather conditions. This system uses GPS and GSM technologies. The paper includes the hardware part, which comprises GPS, GSM, an ATmega microcontroller, MAX 232, and a 16x2 LCD; the software part is used for interfacing all the required modules, and a web application is also developed at the client side. The main objective is to design a system that can be easily installed and to provide a platform for further enhancement. [9]

Design and Implementation Vehicle Tracking System using GPS & GSM/GPRS Technology and Smartphone Application [2014]

An efficient vehicle tracking system is designed and implemented for tracking the movement of any equipped vehicle from any location at any time. The proposed system made beneficial use of a popular technology that combines a Smartphone application with a microcontroller. This makes it easy to build and inexpensive compared to others. The designed in-vehicle device works using Global Positioning System (GPS) and Global System for Mobile communication / General Packet Radio Service (GSM/GPRS) technology, which is one of the most common ways of vehicle tracking. The device is embedded inside a vehicle whose position is to be determined and tracked in real-time. A microcontroller is used to control the GPS and GSM/GPRS modules. The vehicle tracking system uses the GPS module to get geographic coordinates at regular time intervals. The GSM/GPRS module is used to transmit and update the vehicle location to a database. A Smartphone application is also developed for continuously monitoring the vehicle location. The Google Maps API (application programming interface) is used to display the vehicle on the map in the Smartphone application. Thus, users are able to continuously monitor a moving vehicle on demand using the Smartphone application and determine the estimated distance and time for the vehicle to arrive at a given destination. To show the feasibility and effectiveness of the system, this paper presents experimental results of the vehicle tracking system and some experiences from practical implementations.

Automatic road accident detection techniques: A brief survey [2017]

Many precious lives are lost due to road traffic accidents every day. The common reasons are drivers' mistakes and late response from emergency services. An effective road accident detection and information communication system is needed to save injured persons. A system that sends information messages to nearby emergency services about the accident location for a timely response is needed. In the research literature, many automatic accident detection systems have been proposed. These include accident detection using smartphones, GSM and GPS technologies, vehicular ad-hoc networks, and mobile applications. The implementation of an automatic road accident detection and information communication system in every vehicle is very crucial. This paper presents a brief review of automatic road accident detection techniques used to save affected persons. An automatic road accident detection technique based on low-cost ultrasonic sensors is also proposed.

Design and development of GPS/GSM based vehicle tracking and alert system for commercial inter-city buses [2012]

In this paper, we proposed the design, development, and deployment of a GPS (Global Positioning System)/GSM (Global System for Mobile Communications) based Vehicle Tracking and Alert System which allows inter-city transport companies to track their vehicles in real-time and provides an alert system for reporting armed robbery and accident occurrences.

3 PROPOSED SYSTEM

We have avoided the false alarm situation caused by some conditions and increased the accuracy of accident detection by using more than one sensor. To avoid a false alarm, we have a manual switch in the vehicle itself, which must be pressed within a certain amount of time when an accident has been falsely detected, hence avoiding any false intimation. We are using a front bumper sensor, a GPS sensor, and a position encoder along with the MEMS sensor to increase the accuracy of accident detection. The bumper sensor tells the microcontroller how much force/pressure has been applied to it, and obviously the pressure will be higher in the case of an accident. The position encoder is used for calculating the speed of the vehicle, which is expected to change drastically when an accident occurs, adding another layer of reliability. The MEMS sensor, as usual, tells the microcontroller if there is a sudden change in acceleration. The GPS and GSM modules are used to get the accident spot location and send the SMS.

LXIV. PROBLEM STATEMENT

Whenever an accident occurs, the nearby people call the ambulance. The problem associated with this is that the victims depend on the mercy of nearby people. There is a chance that there are no people near the accident spot, or that the people who are around neglect the accident. This is the flaw in the manual system.

According to a statistical projection of traffic fatalities, the most obvious reason for a person's death during accidents is the unavailability of first aid, due to the delay in the information about the accident reaching the ambulance or the hospital.

LXV. OBJECTIVES

Existing System

There are many solutions proposed for the concerned problem, and each one has some advantages over the others. Among the GSM and GPS solutions, some proposed finding the accident condition using only an accelerometer sensor, which may be a problem, as it may lead to false alarms in some cases. Our system uses more than one sensor to increase the accuracy of the system, and we also have a provision to avoid the intimation in case of a false alarm. The existing systems also use Wi-Fi modules, which do not work when there is no network.
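The multi-sensor decision logic of the proposed system above can be sketched in software. This is a minimal illustration only: the threshold values, sensor readings, and vehicle identifier below are hypothetical, the manual cancel switch is omitted, and on the real hardware the readings would come from the microcontroller's sensor interfaces, with the SMS sent through the GSM module.

```python
ACCEL_THRESHOLD_G = 4.0    # MEMS sensor: sudden acceleration change (g), assumed value
BUMPER_THRESHOLD = 60.0    # bumper sensor: force/pressure units, assumed value
SPEED_DROP_KMPH = 40.0     # position encoder: drastic speed drop (km/h), assumed value

def detect_accident(accel_g, bumper_force, speed_drop):
    """Require at least two sensors above threshold, to cut false alarms."""
    hits = [
        accel_g >= ACCEL_THRESHOLD_G,
        bumper_force >= BUMPER_THRESHOLD,
        speed_drop >= SPEED_DROP_KMPH,
    ]
    return sum(hits) >= 2

def maps_link(lat, lon):
    """Google Maps link included in the alert SMS."""
    return f"https://maps.google.com/?q={lat},{lon}"

def alert_message(lat, lon, vehicle_id):
    return (f"Accident detected for vehicle {vehicle_id}. "
            f"Location: {maps_link(lat, lon)}")

# Example reading: hard deceleration plus bumper impact trips two sensors
if detect_accident(accel_g=5.2, bumper_force=75.0, speed_drop=10.0):
    print(alert_message(12.9716, 77.5946, "KA-01-AB-1234"))
```

Requiring two of the three sensors to agree is one simple way to realise the paper's goal of suppressing single-sensor false alarms.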
LXVI. MOTIVATION FOR THE WORK
Road accidents contribute to the majority of accidental deaths in India. Most of these lives could be saved if the victims received medical help quickly, in time. Over 1.51 lakh people died in road accidents in the year 2019.

LXVII. METHODOLOGY
In the event of deaths or severe conditions due to accidents, the GSM technologies are used so that immediate action can be taken by the ambulance/police service, which would reduce the severity.

References

[1] Government of India, Ministry of Road Transport and Highways, Lok Sabha Unstarred Question No. 374, answered on 19-07-2018.

[2] F. B. Basheer, J. J. Alias, C. M. Favas, V. Navas, N. K. Farhan and C. V. Raghu, "Design of accident detection and alert system for motorcycles," 2013 IEEE Global Humanitarian Technology Conference: South Asia Satellite (GHTC-SAS), Trivandrum, 2013, pp. 85-89.

[3] R. Ramani, S. Valarmathy, N. Suthanthira, S. Selvaraju, M. Thiruppathi, R. Thangam, "Vehicle Tracking and Locking Based GSM and GPS," Sept. 2013.

[11] "Cellular networks for massive IoT," Ericsson White Paper, Jan. 2016.

[12] C. Veness, "Calculate distance and bearing between two latitude/longitude points: haversine formula in JavaScript," 2016.

[13] J. White, C. Thompson, H. Turner, B. Dougherty, and D. C. Schmidt, "WreckWatch: Automatic traffic accident detection and notification with smartphones," Mobile Networks and Applications, vol. 16, no. 3, pp. 285-303, 2011.

[14] U. Khalil, T. Javid, and A. Nasir, "Automatic road accident detection techniques: A brief survey," International Symposium on Wireless Systems and Networks (ISWSN), IEEE, 2017, pp. 1-6.

[15] P. B. Fleischer, A. Y. Nelson, R. A. Sowah and A. Bremang, "Design and development of GPS/GSM based vehicle tracking and alert system for commercial inter-city buses," 2012 IEEE 4th International Conference on Adaptive Science & Technology (ICAST), Kumasi, 2012, pp. 1-6.

[16] R. Kannan, R. Nammily, S. Manoj, A. Vishwa, "Wireless Vehicular Accident Detection and Reporting System," International Conference on Mechanical and Electrical Technology (ICMET 2010).
FraudShield: Detection of Fraud in Credit Card based on Machine
Learning Techniques with integration of web-based Framework
1. INTRODUCTION
Consumers and financial organisations are equally impacted by the severe issue of credit card theft. Fraudulent actions can harm the credibility of the financial company as well as result in significant monetary losses for both parties. In order to promptly identify forged transactions and reduce losses, it is necessary to design successful and effective fraud detection systems. Credit card fraud has been successfully identified using machine learning techniques. These methods entail building a model from a dataset of confirmed legitimate and fraudulent transactions, then using the model to forecast the likelihood of fraud for new transactions. In comparison to conventional rule-based systems, the application of techniques based on machine learning to the identification of fraud in credit cards has a number of benefits, such as the ability to spot patterns and irregularities in massive datasets that human analysts could miss. In contrast to rule-based systems, which need manual updates to be successful, they are also able to adjust to new fraud patterns as they appear. In this regard, the project's goal is to look into how well different machine learning approaches work to identify credit card fraud. In order to determine which method is most efficient, the research will examine multiple datasets and use a variety of preprocessing strategies, feature selection techniques, and modelling algorithms [1,2].
2. OBJECTIVES
The objectives of credit card fraud detection are: (a) to identify and foresee the outcome of unauthorised credit card activity; (b) to analyse a few effective machine learning algorithms, identify the one with the best accuracy, and suggest a model; (c) to add the machine learning model to a web-based framework for a better user interface and user experience; (d) to find pertinent dataset features that can aid in the detection of fraud, identifying relevant attributes that capture the patterns and traits of fraudulent transactions by extracting and engineering them; and (e) to create a system that can process incoming credit card transactions in real-time and identify whether they are counterfeit or genuine using the trained machine learning models [2].
3. METHODOLOGY
3.1 Existing methods

In the current system, research on an instance of credit card fraud detection, where data normalisation was applied before cluster analysis and outcomes were obtained through clustering and neural networks, demonstrated that by accumulating attributes, the number of neural-network inputs can be minimised. Additionally, normalised data should be used, and MLP training is recommended [3]. This study was built on unsupervised learning. Finding innovative strategies for identifying fraudulent activity and improving the accuracy of outcomes were the two main purposes of this article. Personal information in the data set used for this study is kept isolated, and it is based on real transactional figures collected by a major European corporation. The algorithm typically has a 50% accuracy rate. Discovering an algorithm and lowering the cost measure were the two main purposes of this paper. The result was 23%, and the chosen algorithm had the lowest risk [2,3].

Disadvantages

1. The gains and losses attributable to fraud detection are adequately represented in this study by a novel collative comparison metric.

2. The suggested cost measure is used to offer a cost-sensitive strategy centred around Bayes minimum risk.

3.2 Proposed method

The model proposed in the system suggested here identifies fraudulent behaviour in credit card transactions. The bulk of the essential characteristics required to distinguish between legitimate and illegal transactions can be offered by this method. With the development of technology, it becomes more difficult to identify the idea and pattern of faked transactions. The advancement of artificial intelligence (AI), machine learning, and other relevant information technology disciplines has made it possible to automate this process and minimise part of the intense labour that is necessary to detect credit card fraud [4]. To identify credit card fraud and discover which machine learning algorithm works best, comparisons are made between several algorithms, including random forests, decision trees, logistic regression, and Naive Bayes, to determine the best algorithm that credit card merchants can use to identify fraudulent transactions. Finally, the machine learning model is integrated with the web-based framework Streamlit for a better user interface and user experience; menus, input fields for prediction, classification reports, and model graphs are then created in the web framework [5].
Fig.1. Architecture
4. MODULES
Data collection is the first stage of the project; the dataset gathered consists of a number of transactions, some of which are genuine and others of which are fraudulent. The credit card dataset was obtained from the Kaggle website, through which a credit card payment information set may be accessed. Data preprocessing: in this module, the selected data is prepared, cleaned up, and sampled. Dataset loading: a variety of library functions can be used to load the dataset; the read_csv function of the Python pandas module was used in this case to load a data collection in CSV or Microsoft Excel format. Model creation: the training data is used to create the model after the data is divided into training and test samples with a 70% and 30% weighting, respectively. Accuracy determination: this stage determines the model's correctness using a variety of algorithms. Streamlit web framework: the web application incorporates the machine learning algorithm graphs, user input, and accuracy results.
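The loading and 70%/30% split steps above can be sketched as follows. The inline CSV and its column names are illustrative assumptions standing in for the real Kaggle credit card dataset.

```python
import io
import pandas as pd

# Toy stand-in for the Kaggle credit card dataset (columns are assumptions)
csv_text = """amount,repeat_retailer,used_chip,fraud
20.5,1,1,0
900.0,0,0,1
35.0,1,1,0
700.0,0,1,1
25.0,1,0,0
15.0,1,1,0
810.0,0,0,1
40.0,1,1,0
650.0,0,0,1
30.0,1,1,0
"""

df = pd.read_csv(io.StringIO(csv_text))       # pandas read_csv, as in the text
train = df.sample(frac=0.7, random_state=42)  # 70% training sample
test = df.drop(train.index)                   # remaining 30% held out for testing
print(len(train), len(test))                  # -> 7 3
```

With a real file, `pd.read_csv("creditcard.csv")` (hypothetical filename) would replace the `StringIO` wrapper.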
A decision tree is a machine learning algorithm for regression analysis and classification. The decision tree paradigm has a tree-like structure, with each internal node indicating a test of an attribute, each branch reflecting the test result, and each leaf node representing a class label or a numerical value [6]. The tree can be "learned" by subdividing the source set depending on the outcome of attribute tests. This method is repeated recursively on each derived subset, which is known as recursive partitioning. Because the development of a classifier that uses decision trees requires no domain expertise or parameter setup, it is suitable for exploratory learning and discovery. High-dimensional data can be handled via decision trees, and in general decision tree classifiers have high accuracy. Decision tree inference is a common inductive way of learning classification information [7]. Decision trees categorise instances by moving them through the tree from the root to a leaf node that provides the instance's classification. Beginning at the root of the tree, an instance is categorised by checking the attribute indicated by that node and then moving along the tree branch according to the value of the attribute.
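As a minimal illustration of the recursive partitioning described above, the following pure-Python sketch learns a tiny decision tree on made-up transaction features; the features, thresholds, and labels are hypothetical, not taken from the paper's dataset.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Find the (feature index, threshold) minimising weighted impurity."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [lab for r, lab in zip(rows, labels) if r[f] <= t]
            right = [lab for r, lab in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels):
    """Recursively partition until no split improves impurity."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, lab) for r, lab in zip(rows, labels) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [lab for _, lab in left]),
            build_tree([r for r, _ in right], [lab for _, lab in right]))

def predict(tree, row):
    """Walk from the root to a leaf, testing one attribute per node."""
    while isinstance(tree, tuple):
        f, t, lo, hi = tree
        tree = lo if row[f] <= t else hi
    return tree

# Toy transactions: [amount, distance_from_home]; 1 = fraud, 0 = genuine
X = [[20, 1], [35, 2], [900, 50], [700, 40], [25, 3], [800, 45]]
y = [0, 0, 1, 1, 0, 1]
tree = build_tree(X, y)
print([predict(tree, r) for r in X])  # reproduces the training labels
```

In practice a library implementation such as scikit-learn's `DecisionTreeClassifier` would be used; this sketch only shows the mechanics of impurity-based splitting.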
5.3 Naive Bayes
Naive Bayes is a common classification technique based on Bayes' probability theory. It is a straightforward yet effective algorithm that is commonly used in text classification, spam filtering, and recommendation systems. The name "naive" comes from the assumption that each of the features is independent of the others. To begin, the algorithm computes the estimated likelihood of each class given a set of features. This is accomplished by the use of Bayes' theorem, which states that the probability of a hypothesis (in this case, the class) given the data (the features) is proportional to the probability of the data given the hypothesis multiplied by the prior probability of the hypothesis [8]. The advantages are: (a) Naive Bayes is a straightforward algorithm that is simple to grasp and apply; it does not necessitate the use of complex iterative algorithms, as many other machine learning techniques do. (b) The Naive Bayes principle is a quick approach that can handle big, high-dimensional datasets. (c) To make accurate predictions, Naive Bayes needs only a small amount of training data. (d) Naive Bayes can deal with insignificant features and is unaffected by them [9]. The disadvantages are: (a) Naive Bayes presupposes that the features are independent of one another, which is not necessarily the case in real-world datasets. (b) The Naive Bayes technique has limited expressive capacity and may be incapable of capturing complicated feature interactions. (c) Naive Bayes presupposes a predefined probability distribution for the attributes, which may or may not be appropriate for the dataset; because it assumes a discrete probability distribution for the features, it is unsuitable for continuous data. (d) Naive Bayes is best suited for categorical information and may struggle with continuous or numerical features [10].
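The Bayes rule described above, P(class | features) proportional to P(features | class) times P(class) under the independence assumption, can be sketched by hand for categorical features. The toy features and labels below are invented for illustration; they are not the paper's dataset.

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Estimate class priors and per-feature conditional value counts."""
    priors = Counter(y)
    cond = defaultdict(Counter)        # (class, feature_idx) -> value counts
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            cond[(c, i)][v] += 1
    return priors, cond, len(y)

def predict_nb(model, row):
    priors, cond, n = model
    scores = {}
    for c, pc in priors.items():
        score = pc / n                 # prior P(class)
        for i, v in enumerate(row):    # likelihoods P(x_i | class), Laplace-smoothed
            score *= (cond[(c, i)][v] + 1) / (pc + 2)  # +2: two values per feature
        scores[c] = score
    return max(scores, key=scores.get)  # maximum a-posteriori class

# Toy features: (used_chip, repeat_retailer); labels: 'fraud' / 'ok'
X = [("no", "no"), ("yes", "yes"), ("yes", "yes"), ("no", "no"), ("no", "yes")]
y = ["fraud", "ok", "ok", "fraud", "fraud"]
model = train_nb(X, y)
print(predict_nb(model, ("yes", "yes")))  # prints: ok
```

The Laplace `+1` smoothing keeps unseen feature values from zeroing out a class score, a standard fix for the discrete Naive Bayes weakness noted above.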
5.4 KNN
In statistics, the k-nearest neighbours technique (k-NN) is a non-parametric supervised learning method. The output of k-NN classification is a class membership: an object is classified by a vote of its neighbours, with the object being assigned the class that is most common among its k (a positive, frequently small, integer) nearest objects. When k is equal to 1, the item is simply assigned the class of its single nearest neighbour [11]. With the k-NN classification approach, all processing is deferred until the function has to be evaluated, and the model is only constructed locally. The accuracy of the method can be greatly improved by normalising the source data if the features represent a variety of physical measurements or arrive at vastly different scales, because the method uses distances for categorisation [12]. Applying weights to the neighbours' contributions, so that the near neighbours contribute more to the average than the distant ones, is an effective approach for both regression and classification. Assigning each neighbour a weight of 1/d, where d represents its distance from the query point, is a common way to weigh objects. In both k-NN classification and k-NN regression, the neighbours are selected from a group of elements for which the class or property value has been established. This is the algorithm's training set; however, no explicit training step is necessary [13,14].
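The 1/d neighbour weighting described above can be sketched in a few lines of pure Python; the toy 2-D points and labels below are illustrative only.

```python
import math

def knn_predict(X, y, query, k=3):
    """Classify `query` by the distance-weighted vote of its k nearest neighbours."""
    nearest = sorted(
        (math.dist(row, query), label) for row, label in zip(X, y)
    )[:k]
    votes = {}
    for d, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (d + 1e-9)  # weight = 1/d
    return max(votes, key=votes.get)

# Toy 2-D points: class 0 clusters near the origin, class 1 far away
X = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
y = [0, 0, 0, 1, 1, 1]
print(knn_predict(X, y, (0.5, 0.5), k=3))    # prints: 0
print(knn_predict(X, y, (10.5, 10.5), k=3))  # prints: 1
```

The small `1e-9` term only guards against division by zero when the query coincides with a training point; a library implementation (e.g. scikit-learn's `KNeighborsClassifier` with `weights='distance'`) handles this case internally.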
5.5 Streamlit
For the detection of fraud in credit cards, with the help of the freely available web application framework Streamlit, programmers may use Python to build interactive data-driven apps. With Streamlit, developers can easily create data visualizations, interactive dashboards, and machine learning models that can be deployed as web applications. Streamlit provides a simple and intuitive interface for creating applications, allowing developers to focus on the content and functionality of their applications rather than the technical details of web development. Streamlit provides a number of features to make building web applications easier, including: a simple and intuitive API for creating user interfaces and data visualizations; automatic reactivity, which enables developers to construct interactive applications that are updated in real time as the user interacts with them; built-in support for popular data science libraries such as Pandas, Matplotlib, and Plotly; and easy deployment to a variety of cloud platforms, including Heroku and Google Cloud [15]. Overall, Streamlit is a powerful tool for creating interactive data-driven applications with Python, and it is well-suited for data scientists and developers who want to quickly prototype and deploy web applications. Streamlit is a versatile web application framework that can be used for a wide variety of applications in data science, machine learning, and beyond. Here are some examples of the uses of Streamlit:
(a) Interactive data exploration: Streamlit makes it easy to create interactive data visualizations and exploration tools, allowing users to explore and analyze data in a more intuitive and engaging way.
(b) Machine learning model development and deployment: Streamlit can be used to develop and deploy machine learning models as web applications, allowing users to interact with and test models in real-time.
(c) Dashboard creation: Streamlit is well-suited for creating interactive dashboards that allow users to explore and analyze data from a variety of sources.
(d) Prototyping and experimentation: Streamlit provides an easy-to-use interface for prototyping and experimenting with new data science ideas and techniques, allowing users to quickly test and iterate on new ideas.
(e) Education and training: Streamlit can be used to create interactive educational tools and tutorials for students and learners [16,17].
6. TECHNIQUES
6.1 Repeat retailer
Credit card fraud detection using the repeat retailer technique utilizes the history of transactions made at a particular retailer to identify potentially fraudulent transactions. The basic idea is that if a cardholder has made several legitimate transactions at a particular retailer in the past, then any future transactions at that retailer are more likely to be legitimate as well. The system maintains a history of transactions made by each cardholder at each retailer. When a new transaction is made, the system checks to see whether the cardholder has made any previous transactions at the same retailer. If so, the system calculates various metrics, such as the average transaction amount, the time between transactions, and the location of the transactions [18]. The system compares the metrics of the new transaction to the historical metrics of the cardholder's previous transactions at the retailer. If the metrics of the new transaction are significantly different from the historical metrics, the system flags the transaction as potentially fraudulent and triggers a review process. Repeat retailer is just one of many techniques used in the detection of credit card fraud, and it is often used in combination with other techniques, such as anomaly detection and machine learning. By leveraging the history of transactions made by each cardholder, repeat retailer analysis can help identify potentially fraudulent transactions and reduce the incidence of credit card fraud. A graph model depicts the analysis of the dataset's 'repeat retailer' column: the predicted percentage of 'yes' is 88.2% and of 'no' is 11.8%.
6.2 Used chip

Credit card fraud detection using used_chip is a technique that utilizes the information stored on the chip of a credit card to identify potentially fraudulent transactions. The basic idea is that the information stored on the chip can provide additional authentication and validation that can help verify the legitimacy of a transaction. The system reads the information stored on the chip of the credit card, including the card number, expiration date, and other information [19]. The system compares this information to the information provided by the merchant, such as the transaction amount, the merchant name, and the location of the transaction. If the information provided by the merchant matches the information stored on the chip, the system assumes that the transaction is valid and approves it. If it does not match, the system flags the transaction as potentially fraudulent and triggers a review process. Used_chip is just one of many techniques used in the detection of credit card fraud, and it is often used in combination with other techniques, such as repeat retailer analysis and machine learning [20]. By utilizing the information stored on the chip of a credit card, used_chip can help verify the legitimacy of a transaction and reduce the incidence of credit card fraud. A graph model depicts the analysis of the dataset's 'used chip' column: the predicted percentage of 'yes' is 65.0% and of 'no' is 35.0%.
7. RESULT
Based on the precision and accuracy scores obtained, it is critical to analyse the fraud-detection application's particular evaluation standards and goals. If accuracy is the priority, Naive Bayes has the highest accuracy score. It is worth mentioning, however, that Naive Bayes had the lowest precision score, indicating a higher false-positive rate. If precision is the priority, Decision Tree obtained the highest precision score, meaning it was more accurate in classifying fraudulent transactions; on the other hand, it had a somewhat lower accuracy score. When accuracy and precision were considered together, logistic regression performed moderately in both measurements, producing a fair balance of the two. K-Nearest Neighbours (KNN) also performed well in both accuracy and precision. Based on these ratings, logistic regression appears to be the model of choice since it achieves a decent mix of accuracy and precision. However, the most appropriate model is ultimately determined by the application's specific requirements and goals; other variables such as computational complexity, interpretability, and scalability must also be considered.
Fig.6. Comparison graphs of four models based on accuracy and precision
8. CONCLUSION
Credit card theft is a serious worry for both financial institutions and customers, and machine learning algorithms have been shown to be useful in the real-time detection of fraudulent transactions. In this paper, we established a system for detecting credit card fraud by combining supervised and unsupervised algorithms to discover patterns that signal fraudulent behaviour, and combined the learned model with a web-based structure to create a straightforward user experience for real-time identification of fraud. The experimental results indicated that the suggested framework detected fraudulent transactions with high accuracy while minimising false positives. The proposed methodology can be used by banking organisations to enhance their fraud-identification abilities and avoid the financial losses caused by credit card theft. The suggested credit card fraud identification framework, based on Streamlit and machine learning, addresses this challenge effectively: to accurately identify fraudulent transactions, the platform includes several machine learning algorithms such as decision tree, XGBoost, random forest, and logistic regression. Future research can concentrate on increasing the framework's accuracy and performance and investigating the use of more sophisticated machine learning approaches.
9. REFERENCES

[1] Raj, S. Benson Edwin, and A. Annie Portia. "Analysis on credit card fraud detection methods." In 2011 International Conference on Computer, Communication and Electrical Technology (ICCCET), pp. 152-156. IEEE, 2011.

[2] Ghosh, Sushmito, and Douglas L. Reilly. "Credit card fraud detection with a neural-network." In System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on, vol. 3, pp. 621-630. IEEE, 1994.

[3] Chaudhary, Khyati, Jyoti Yadav, and Bhawna Mallick. "A review of fraud detection techniques: Credit card." International Journal of Computer Applications 45, no. 1 (2012): 39-44.

[4] Srivastava, Abhinav, Amlan Kundu, Shamik Sural, and Arun Majumdar. "Credit card fraud detection using hidden Markov model." IEEE Transactions on Dependable and Secure Computing 5, no. 1 (2008): 37-48.

[5] Awoyemi, John O., Adebayo O. Adetunmbi, and Samuel A. Oluwadare. "Credit card fraud detection using machine learning techniques: A comparative analysis." In 2017 International Conference on Computing Networking and Informatics (ICCNI), pp. 1-9. IEEE, 2017.

[6] Sahin, Yusuf, and Ekrem Duman. "Detecting credit card fraud by ANN and logistic regression." In 2011 International Symposium on Innovations in Intelligent Systems and Applications, pp. 315-319. IEEE, 2011.

[7] Kiran, Sai, Jyoti Guru, Rishabh Kumar, Naveen Kumar, Deepak Katariya, and Maheshwar Sharma. "Credit card fraud detection using Naïve Bayes model based and KNN classifier." International Journal of Advance Research, Ideas and Innovations in Technology 4, no. 3 (2018): 44.

[8] Husejinovic, Admel. "Credit card fraud detection using naive Bayesian and C4.5 decision tree classifiers." (2020): 1-5.

[9] Saheed, Yakub K., Moshood A. Hambali, Micheal O. Arowolo, and Yinusa A. Olasupo. "Application of GA feature selection on Naive Bayes, random forest and SVM for credit card fraud detection." In 2020 International Conference on Decision Aid Sciences and Application (DASA), pp. 1091-1097. IEEE, 2020.

[10] Varmedja, Dejan, Mirjana Karanovic, Srdjan Sladojevic, Marko Arsenovic, and Andras Anderla. "Credit card fraud detection - machine learning methods." In 2019 18th International Symposium INFOTEH-JAHORINA (INFOTEH), pp. 1-5. IEEE, 2019.

[11] Yee, Ong Shu, Saravanan Sagadevan, and Nurul Hashimah Ahamed Hassain Malim. "Credit card fraud detection using machine learning as data mining technique." Journal of Telecommunication, Electronic and Computer Engineering (JTEC) 10, no. 1-4 (2018): 23-27.

[12] Malini, N., and M. Pushpa. "Analysis on credit card fraud identification techniques based on KNN and outlier detection." In 2017 Third International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), pp. 255-258. IEEE, 2017.

[13] Ganji, Venkata Ratnam, and Siva Naga Prasad Mannem. "Credit card fraud detection using anti-k nearest neighbor algorithm." International Journal on Computer Science and Engineering 4, no. 6 (2012): 1035-1039.

[14] Vengatesan, K., A. Kumar, S. Yuvraj, V. Kumar, and S. Sabnis. "Credit card fraud detection using data analytic techniques." Advances in Mathematics: Scientific Journal 9, no. 3 (2020): 1185-1196.

[15] Zareapoor, Masoumeh, K. R. Seeja, and M. Afshar Alam. "Analysis on credit card fraud detection techniques: based on certain design criteria." International Journal of Computer Applications 52, no. 3 (2012).

[16] Nancy, A. Maria, G. Senthil Kumar, S. Veena, NA S. Vinoth, and Moinak Bandyopadhyay. "Fraud detection in credit card transaction using hybrid model." In AIP Conference Proceedings, vol. 2277, no. 1, p. 130010. AIP Publishing LLC, 2020.

[17] Kaur, Darshan. "Machine Learning Approach for Credit Card Fraud Detection (KNN & Naïve Bayes)." In Proceedings of the International Conference on Innovative Computing & Communications (ICICC), March 30, 2020.

[18] Saheed, Yakub Kayode, Usman Ahmad Baba, and Mustafa Ayobami Raji. "Big Data Analytics for Credit Card Fraud Detection Using Supervised Machine Learning Models." In Big Data Analytics in the Insurance Market, pp. 31-56. Emerald Publishing Limited, 2022.

[19] Adewumi, Aderemi O., and Andronicus A. Akinyelu. "A survey of machine-learning and nature-inspired based credit card fraud detection techniques." International Journal of System Assurance Engineering and Management 8 (2017): 937-953.

[20] Mehbodniya, Abolfazl, Izhar Alam, Sagar Pande, Rahul Neware, Kantilal Pitambar Rane, Mohammad Shabaz, and Mangena Venu Madhavan. "Financial fraud detection in healthcare using machine learning and deep learning techniques." Security and Communication Networks 2021 (2021): 1-8.

Handa, Akansha, Yash Dhawan, and Prabhat Semwal. "Hybrid analysis on credit card fraud detection using machine learning techniques." Handbook of Big Data Analytics and Forensics (2022): 223-238.
[21] Tiwari, Pooja, Simran Mehta, Nishtha
Sakhuja, Ishu Gupta, and Ashutosh Kumar Singh.
"Hybrid method in identifying the fraud detection
in the credit card." In Evolutionary Computing and
Mobile Sustainable Networks: Proceedings of
ICECMSN 2020, pp. 27-35. Springer Singapore,
2021.
[22] Kazemi, Zahra, and Houman Zarrabi. "Using deep networks for fraud detection in the credit card transactions." In 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI), pp. 0630-0633. IEEE, 2017.
[23] Faraji, Zahra. "A Review of Machine
Learning Applications for Credit Card Fraud
Detection with A Case study." SEISENSE Journal
of Management 5, no. 1 (2022): 49-59.
[24] Prusti, Debachudamani, and Santanu Kumar
Rath. "Web service based credit card fraud
detection by applying machine learning
techniques." In TENCON 2019-2019 IEEE Region
10 Conference (TENCON), pp. 492-497. IEEE,
2019.
[25] Ahammad, Jalal, Nazia Hossain, and
Mohammad Shafiul Alam. "Credit card fraud
detection using data pre-processing on imbalanced
data-Both oversampling and undersampling." In
Proceedings of the International Conference on
Computing Advancements, pp. 1-4. 2020
Early Prediction of Lifestyle Diseases Using ML

B. R. VENKATESH, 20191COM0025, Computer Science, COM-G04, Bangalore, India, 201910101142@presidencyuniversity.in
D. ABHIRAM, 20191COM0053, Computer Science, COM-G04, Bangalore, India, 201910101803@presidencyuniversity.in
DIVESH CHANDRABOINA, 20191COM0038, Computer Science, COM-G04, Bangalore, India, 201910101107@presidencyuniversity.in
Abstract— A doctor app is a web application created to make getting medical care, a diagnosis, and treatment as simple as possible for users. Due to their accessibility and convenience, doctor apps are growing in popularity, especially in areas where travelling to medical institutions is difficult or time-consuming. The system analyses real-time patient data, including symptoms, medical history, and test results, to suggest appropriate therapies and identify probable diagnoses. Machine learning algorithms analyse massive amounts of medical data, including patient history, test results, and symptom data, to uncover patterns and trends and to predict likely diagnoses, treatments, and patient outcomes. Numerous uses exist, such as customised and remote consultations, health monitoring, etc. This project focuses on enabling patients to easily schedule appointments with their physicians and specialists while providing professionals with access to patient information and real-time scheduling information via a user interface (UI). Machine learning algorithms are used to examine appointment data, identify scheduling trends and patterns, anticipate appointment length, and predict patient no-show rates, among other things. This enables doctors to schedule their time more effectively, reduce wait times, and boost patient satisfaction. Additionally, classification is used to verify and reschedule appointments if necessary. Overall, the ability to book appointments using a doctor's app has the potential to significantly raise the efficacy and level of healthcare.

Keywords—Machine Learning (ML), User Interface (UI), and Classification.

I. INTRODUCTION

Technology advancement is hastening the transformation of the healthcare sector due to the rise of global influence and evolving societal attitudes. Today, it is easy to identify the areas of healthcare delivery that have failed. Doctors and patients have both experienced stress as a result of cancelled and postponed appointments. One of the options that, in our opinion, can improve communication between a doctor and a patient is the ability to quickly schedule an appointment online. A medical appointment scheduling programme also makes patients and physicians more at ease in situations that are progressively becoming more typical. For instance, even if a
C) Input Symptoms
D) Medicine Bookings
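The symptom-input step (C) feeds a classifier that suggests a likely disease. A minimal rule-based sketch of that flow, with a hypothetical symptom-to-disease mapping (illustrative only, not the trained model described in the paper):

```python
# Hypothetical symptom -> disease knowledge base (illustrative only).
DISEASE_SYMPTOMS = {
    "diabetes": {"frequent thirst", "fatigue", "blurred vision"},
    "hypertension": {"headache", "dizziness", "chest pain"},
    "migraine": {"headache", "nausea", "light sensitivity"},
}

def predict_disease(symptoms: set) -> str:
    """Return the disease whose symptom set overlaps most with the input."""
    scores = {d: len(symptoms & s) for d, s in DISEASE_SYMPTOMS.items()}
    return max(scores, key=scores.get)

print(predict_disease({"headache", "nausea"}))  # migraine
```

The actual system replaces this lookup with the best-performing trained ML model, served through the web framework, but the input/output contract (symptom set in, predicted disease out) is the same.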
Ⅵ. CONCLUSION

In this project, a machine learning based early lifestyle-disease detection system with a medical assistant is created by evaluating a variety of machine learning algorithms against an early lifestyle-disease dataset. The Flask web application framework then uses the most efficient method to forecast disease. This paradigm was used to develop a health portal with a single interface, including scheduling doctor appointments, ordering medications, and illness prediction.

Future Work:
In future, disease data for different diseases can be collected and trained using deep learning methods to obtain more effective results and accuracy. Segmentation of MRI scans can be applied, and the resulting dataset can be integrated into the website.
ABSTRACT:

A record-breaking volume of publicly accessible user-generated data is now available because of social networks' widespread use. This data can be analyzed to learn about people's thoughts and feelings.

On the other hand, text communication over web-based networking media might be a little overwhelming. Due to social media platforms, a significant volume of unstructured data is produced on the Internet every second. To understand human psychology, the data must be analyzed as quickly as it is generated. This can be accomplished with the use of sentiment analysis, which recognizes polarity in texts: it determines whether the user has a negative, positive, or neutral attitude towards a product, administration, person, or place. In some applications sentiment analysis is insufficient, and emotion detection, which identifies a person's emotional or mental state precisely, is necessary.

A Twitter post's text is retrieved and preprocessed to get rid of unnecessary words and useless information. After that, the text is tokenized, which involves breaking it up into separate words or phrases. As the meaning of the text might vary depending on the context in which specific words are used, this stage is crucial for analyzing the sentiment of the post. Several natural language processing techniques are then used to analyse the text for sentiment. This can involve examining the polarity of certain words or phrases as well as the post's general tone and context. After that, a model is trained to identify various emotions in tweets, making use of large collections of tweets with known emotions to train the model and then using it to determine the sentiment of new tweets from their text.

The goal of the research discussed in this paper is to identify and examine the sentiment and emotion people express through text in their tweets, then use that information to generate recommendations. It is possible to identify tweets that indicate negative emotions like rage or worry using Twitter emotion recognition, which enables the early identification of potential crises or public health problems. It can also be used to measure changes in feelings over time, giving information about how certain things or policies affect people's attitudes.
Keywords:- Emotion detection, Natural language processing (NLP), tweets, Twitter, sentiment analysis, emotion.

I. INTRODUCTION:-

Social media has evolved into a global forum for people to express their ideas and feelings in the current digital era. Research has demonstrated that people may express a wide range of emotions through textual communication in addition to nonverbal clues like facial expressions and voice tone. People now frequently express their emotions through written communication, on social media sites in particular. Users can express their ideas and opinions on a variety of subjects on prominent social media sites like Twitter, and these tweets may express a variety of feelings, including joy, sadness, anger, love, fear, and surprise. Grouping such tweets by conventional means is challenging and leads to inaccurate analysis, limited understanding of customer sentiment, difficulty in identifying trends, and inability to personalize content. That is where emotion detection enters the scene. Emotion detection in tweets refers to the process of automatically identifying the emotions expressed in a tweet, and it has several uses, such as customer feedback analysis and sentiment analysis. Businesses and organizations can learn a lot about how their customers feel about their goods or services by examining the emotions portrayed. Among the many techniques used for emotion identification in tweets, we will utilize natural language processing (NLP), which involves training models on huge datasets of labeled tweets to discover patterns in the text that correspond to distinct emotions.

The accuracy of emotion detection in tweets can vary depending on the quality of the datasets used for training, the complexity of the emotions being detected, and the specific techniques and algorithms employed.

The objective of this study is to construct a Twitter emotion recognition platform to find and examine the emotions conveyed in tweets, such as joy, sadness, love, anger, fear, and surprise, in order to provide insights into human behaviour, attitudes, and trends.
Fig 1. Example histogram showing the number of tweets for the different classes.
Sentiment analysis is a technique for identifying the emotional undertone of a text or document. It has a wide range of uses in marketing, politics, social media analysis, and customer feedback analysis, among other fields.

Since Twitter is a well-liked microblogging network that generates copious amounts of data in real-time, it is the perfect source of information for sentiment analysis. Researchers can learn more about the beliefs, attitudes, and feelings of Twitter users by examining the sentiment of tweets about diverse subjects, brands, events, and people.

The project calls for the use of the Python programming language and some of its well-known libraries for NLP, including NLTK, SpaCy, and scikit-learn. The Twitter API is used to gather tweets about a specific subject, company, or event. Stop words, punctuation, and URLs are among the noise and extraneous information that are preprocessed out of the captured tweets. The proper emotion is then assigned to each tweet using an emotion vocabulary, such as the NRC Emotion Lexicon.

The tweets are categorised into positive, negative, and neutral emotions using machine learning methods like Support Vector Machines (SVM), Naive Bayes, or Random Forest. The effectiveness of the emotion detection model is measured using evaluation measures like precision, recall, F1-score, and accuracy.

Finally, the outcomes of the sentiment analysis are visualised using visualisation packages like Matplotlib or Seaborn. The information from the analysis can assist researchers and organisations in making data-driven decisions, such as enhancing brand reputation, resolving client issues, and monitoring public opinion.
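The preprocessing and lexicon-assignment steps described above can be sketched in a few lines. The tiny word-emotion lexicon here is a made-up stand-in for a resource like the NRC Emotion Lexicon:

```python
import re
from collections import Counter

# Tiny stand-in for an emotion lexicon such as NRC (illustrative only).
EMOTION_LEXICON = {
    "happy": "joy", "love": "joy", "great": "joy",
    "sad": "sadness", "cry": "sadness",
    "angry": "anger", "hate": "anger",
    "scared": "fear", "worry": "fear",
}

STOP_WORDS = {"i", "am", "so", "the", "a", "and", "this", "is"}

def tokenize(tweet: str) -> list:
    """Lowercase, strip URLs and punctuation, drop stop words."""
    tweet = re.sub(r"https?://\S+", "", tweet.lower())
    tokens = re.findall(r"[a-z']+", tweet)
    return [t for t in tokens if t not in STOP_WORDS]

def detect_emotion(tweet: str) -> str:
    """Assign the emotion whose lexicon words appear most often."""
    counts = Counter(EMOTION_LEXICON[t] for t in tokenize(tweet)
                     if t in EMOTION_LEXICON)
    return counts.most_common(1)[0][0] if counts else "neutral"

print(detect_emotion("I am so happy, love this! https://t.co/x"))  # joy
```

A trained classifier (SVM, Naive Bayes, Random Forest) replaces the lexicon lookup in the full pipeline, but the tokenization front end stays essentially the same.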
Fig 2. Graph showing accuracy per epoch
Fig 3. Graph showing loss per epoch
V. CONCLUSION:-

In this paper we discussed Twitter sentiment analysis and how it can be applied to other social media platforms. We took data from Hugging Face and processed it using natural language processing (NLP). The experiment in this project was intended to classify texts into categories such as positive, neutral, or negative. Importing the data had three main steps: 1. importing the Tweet Emotion dataset, 2. creating train, validation, and test sets, and 3. extracting tweets and labels from the examples. These steps helped us understand the procedure for sentiment analysis; multiple models were created, trained on the existing data, and used to derive conclusions.

This is a small step towards creating a robust system for sentiment analysis that can derive the actual subjective meaning behind a text. This can be helpful in a day and age where people rely heavily on social media platforms for their daily news, and through our project we could help detect harmful, violent, offensive, and hateful content and report it. This project has limitless capabilities which can be utilized for the betterment of social media platforms and can lead to a safer and more secure internet for all. The project can be updated with time, and we as a team will make sure it is updated and serves the purpose it is intended to; with time there will be more use cases we are not even aware of. So, we consider this project one step in the right direction in the progression of sentiment analysis.
Effective Conversational AI Platform For Tourism Chatbot Using RASA

1st BHAVANA NP, 20191ISE0025, Dept. of ISE, Presidency University, Bengaluru, Karnataka
2nd SABIRA BI, 20191ISE0141, Dept. of ISE, Presidency University, Bengaluru, Karnataka
3rd MADHUSHREE C, 20191ISE0089, Dept. of ISE, Presidency University, Bengaluru, Karnataka

Software Configuration:
• Operating System: Windows 10
• Server-side Script: Python 3.7.9
• IDE: VS Code
A chatbot is an NLP software that can simulate a conversation (or a chat) with a user in natural language through messaging applications, websites, mobile apps, or the telephone.

5.1.1 Turn human language into structured data

Rasa Open Source provides open source natural language processing to turn messages from your users into intents and entities that chatbots understand. Based on lower-level machine learning libraries like TensorFlow and spaCy, Rasa Open Source provides natural language processing software that's approachable and as customizable as you need. Get up and running fast with easy-to-use default configurations, or swap out custom components and fine-tune hyperparameters to get the best possible performance for your dataset.

5.1.2 What is natural language processing?

Natural language processing is a category of machine learning that analyzes freeform text and turns it into structured data. Natural language understanding is a subset of NLP that classifies the intent, or meaning, of text based on the context and content of the message. The difference between NLP and NLU is that natural language understanding goes beyond converting text to its semantic parts and interprets the significance of what the user has said.

Rasa Open Source is a robust platform that includes natural language understanding and open source natural language processing. It's a full toolset for extracting the important keywords, or entities, from user messages, as well as the meaning or intent behind those messages. The output is a standardized, machine-readable version of the user's message, which is used to determine the chatbot's next action.

5.1.3 Why open source NLP?

Rasa Open Source is licensed under the Apache 2.0 license, and the full code for the project is hosted on GitHub. Rasa Open Source is actively maintained by a team of Rasa engineers and machine learning researchers, as well as open source contributors from around the world. This collaboration fosters rapid innovation and software stability through the collective efforts and talents of the community.

Unlike NLP solutions that simply provide an API, Rasa Open Source gives you complete visibility into the underlying systems and machine learning algorithms. NLP APIs can be an unpredictable black box: you can't be sure why the system returned a certain prediction, and you can't troubleshoot or adjust the system parameters. Rasa Open Source is completely transparent. You can see the source code, modify the components, and understand why your models behave the way they do. Open source NLP also offers the most flexible solution for teams building chatbots and AI assistants. The modular architecture and open code base mean you can plug in your own pre-trained models and word embeddings, build custom components, and tune models with precision for your unique data set. Rasa Open Source works out of the box with pre-trained models like BERT, HuggingFace Transformers, GPT, spaCy, and more, and you can incorporate custom modules like spell checkers and sentiment analysis.

VI. Results

Future enhancement

Future enhancements for AI technologies in Indian tourism sectors providing online services include:
Personalized recommendations: Implementing advanced machine learning algorithms can enable personalized recommendations based on user preferences, previous bookings, and browsing history, enhancing the overall customer experience.
Voice-enabled interactions: Integrating voice assistants and natural language processing capabilities can allow users to interact with online tourism services using voice commands, providing a more convenient and hands-free experience.
Augmented reality (AR) experiences: Utilizing AR technologies can offer immersive experiences, allowing users to virtually explore destinations, view hotel rooms, or take virtual tours, enabling them to make more informed decisions.
Sentiment analysis: Employing sentiment analysis techniques on customer reviews and social media data can provide valuable insights into customer satisfaction levels, enabling tourism sectors to address concerns and improve service quality.
Data-driven forecasting: Utilizing AI algorithms to analyse historical data and trends can help in forecasting demand, optimizing pricing strategies, and resource allocation, leading to improved operational efficiency.

Conclusion

Chatbots are a thing of the future which is yet to uncover its potential, but with their rising popularity and craze among companies, they are bound to stay here for long. With new types of chatbots being introduced, it is of great excitement to witness the growth of a new domain in technology while surpassing the previous threshold. We are inventing the system because of the needs of the increasing population of our country. As we know, if we want to travel from one place to another (national/international), we need to go to travel centres to get all the information about the travel arrangements: how the travel is maintained, the availability of travel options (buses, trains, flights), timings, stops, availability of seats, food facilities (inside flights), and travel management (the staff such as driver, conductor, pilot, food providers, and first-aid facilities while traveling), etc. Thus, the tourism chatbot will give this assistance to students and passengers, with no need to visit the travel centres.
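The intent-and-entity extraction that Rasa Open Source performs (Section 5.1.1) is configured through YAML training data. A short sketch with hypothetical tourism intents (the intent and entity names are illustrative, not taken from the paper):

```yaml
version: "3.1"

nlu:
- intent: greet
  examples: |
    - hi
    - hello there
- intent: ask_travel_info
  examples: |
    - which buses go to [Mysore](destination)
    - are there trains to [Chennai](destination) tomorrow
    - show flights to [Delhi](destination)
```

Running `rasa train` on data like this produces the NLU model that maps a free-text message to an intent plus extracted entities, which in turn drives the chatbot's next action.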
Organizations and firms compete with each other based on the productivity of their workforce, which is highly dependent on the working environment. The human resource (HR) department plays a critical role in creating and maintaining a suitable environment that promotes stable and collaborative employees. HR can achieve this by analyzing the employees' database records to improve decision-making and prevent employee attrition [1,2]. Employee attrition occurs when productive employees leave the organization due to reasons such as work pressure, an unsuitable environment, or dissatisfaction with salary, which negatively affects the organization's productivity as it loses productive employees and other resources, such as HR staff efforts in recruiting [3] and training new employees.

To prevent or reduce the impact of employee attrition, predicting it before it occurs is crucial. Studies have shown that happy and motivated employees tend to be more creative, productive, and perform better [4]. Artificial intelligence (AI) has recently been utilized in many different fields, including predicting employee attrition.

II. LITERATURE REVIEW

In the literature, employee attrition has been investigated from various perspectives. Some studies focused on analyzing employees' behavior to identify the reasons behind their decision to leave or stay with the organization [6,7]. Other studies utilized machine learning algorithms to predict employee attrition based on their records. Alduayj and Rajpoot [8] utilized several machine learning models, including random forests, k-nearest neighbors, and support vector machines with different kernel functions, and used different forms of the IBM attrition dataset. However, their system's accuracy with the original class-imbalanced dataset was not satisfactory, despite achieving high accuracy with the synthetic dataset. Usha and Balaji [9] used the same dataset to compare several machine learning algorithms such as decision tree, naive Bayes, and k-means for prediction, but their work lacked the data preprocessing stage, resulting in poor accuracy. Fallucchi et al. [3] studied the reasons that drive an employee to leave the organization and utilized various machine learning techniques, including naive Bayes, logistic regression, k-nearest neighbor, decision tree, random forests, and support vector machines, to select the best classifier. Although they validated their work using cross-validation and train-test split, their results included only the 70%:30% train-test split without discussing cross-validation. The test accuracy was better than the training accuracy, indicating potential improvement. Zangeneh et al. proposed a three-stage framework for attrition prediction, utilizing the "max-out" feature selection method for data reduction, a logistic regression model for prediction, and confidence analysis for prediction model validation. However, their system was highly complex, and the accuracy was unsatisfactory.

Overall, the prediction accuracy of these studies still needs improvement to achieve higher confidence. This work proposes using deep learning and data preprocessing techniques to increase the prediction accuracy and improve upon the state-of-the-art methodologies utilizing the IBM HR dataset.

III. METHODOLOGY

The proposed work analyses the respective dataset to detect the most influential features that affect the prediction and builds a predictive model according to the following phases.

1. Dataset Description

Table 1. IBM dataset features

Feature name             Type      Feature name              Type
Age                      Number    MonthlyIncome             Number
BusinessTravel           Category  MonthlyRate               Number
DailyRate                Number    NumCompaniesWorked        Number
Department               Category  Over18                    Category
DistanceFromHome         Number    OverTime                  Category
Education                Category  PercentSalaryHike         Number
EducationField           Category  PerformanceRating         Number
EmployeeCount            Number    RelationshipSatisfaction  Category
EmployeeNumber           Number    StandardHours             Number
EnvironmentSatisfaction  Category  StockOptionLevel          Category
Gender                   Category  TotalWorkingHours         Number
HourlyRate               Number    TrainingTimesLastYear     Number
JobInvolvement           Category  WorkLifeBalance           Category
JobLevel                 Category  YearsAtCompany            Number
EducationField           Category  YearsInCurrentRole        Number
JobRole                  Category  YearSinceLastPromotion    Number
JobSatisfaction          Category  YearsWithCurrentManager   Number
MaritalStatus            Category  Attrition                 Category
2.3 Rescaling
The standard score of a sample x is calculated as:

z = (x - u) / s

where u is the mean of the training samples and s is their standard deviation.
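As a sketch of this rescaling step (pure Python; using the population standard deviation, i.e. ddof = 0, which matches scikit-learn's StandardScaler default — an assumption, since the paper does not name its implementation):

```python
from statistics import fmean, pstdev

def standardize(values):
    """Rescale a feature column to zero mean and unit variance (z-scores)."""
    u = fmean(values)    # mean of the training samples
    s = pstdev(values)   # population standard deviation (ddof = 0)
    return [(x - u) / s for x in values]

ages = [25, 30, 35, 40, 45]
z = standardize(ages)
# The rescaled column has mean 0 and standard deviation 1;
# the middle sample equals the mean, so its z-score is 0.
```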
Fig 2. Imbalanced and balanced dataset. (a) Original imbalanced dataset. (b)
Synthetic balanced dataset.
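This excerpt does not state which balancing technique produced the synthetic set in Fig. 2(b) (SMOTE is the usual choice for synthetic samples); as a minimal stand-in, simple random oversampling of the minority class already yields equal class counts:

```python
import random
from collections import Counter

def oversample(rows, label_of, seed=0):
    """Balance a dataset by resampling minority-class rows with replacement."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Draw extra copies until this class reaches the majority count.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced

# Toy attrition data: 6 "No" rows vs 2 "Yes" rows.
data = [("No", i) for i in range(6)] + [("Yes", i) for i in range(2)]
balanced = oversample(data, label_of=lambda r: r[0])
print(Counter(r[0] for r in balanced))  # both classes now have 6 rows
```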
Abstract— Centralised exchanges have been the traditional
method of trading cryptocurrencies since their inception.
These exchanges are owned and operated by a single entity
that controls all aspects of the platform, including the
matching engine, wallet management, and asset
custodianship. However, with the advent of blockchain
technology, decentralized exchanges (DEXs) have emerged
as an alternative to centralised exchanges. DEXs operate on
a peer-to-peer network, allowing users to trade
cryptocurrencies without the need for a central authority.
A comparison of DEX platforms reveals issues with
performance, security, privacy, and adoption. While DEXs
allow for trustless, transparent trading, regulatory
compliance and liquidity constraints limit their widespread
adoption. This paper evaluates the state and potential of
DEXs to transform how cryptocurrencies and digital assets
are exchanged by examining DEX mechanisms as an
upgrade over centralised authority and control.

Keywords - Decentralized Exchange, Transparency,
Smart contracts, Automated market makers (AMMs),
Order book models, Cross-chain
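The automated market makers (AMMs) listed in the keywords typically price trades with the constant-product rule x · y = k popularized by Uniswap; the sketch below illustrates that general rule, not any specific mechanism described in this paper:

```python
class ConstantProductAMM:
    """Minimal constant-product market maker: reserves satisfy x * y = k."""

    def __init__(self, reserve_x, reserve_y, fee=0.003):
        self.x, self.y, self.fee = reserve_x, reserve_y, fee

    def swap_x_for_y(self, dx):
        """Trade dx of token X into the pool; return the amount of Y paid out."""
        dx_after_fee = dx * (1 - self.fee)   # fee stays in the pool
        k = self.x * self.y
        dy = self.y - k / (self.x + dx_after_fee)  # preserves x * y = k
        self.x, self.y = self.x + dx, self.y - dy
        return dy

# With no fee, swapping 100 X into a 1000/1000 pool pays out
# 1000 - 1_000_000 / 1100 ≈ 90.91 Y, and the product of reserves is unchanged.
pool = ConstantProductAMM(1_000.0, 1_000.0, fee=0.0)
out = pool.swap_x_for_y(100.0)
```

Note the price impact: the trade receives less than the 100 Y a fixed 1:1 price would give, which is how the pool resists being drained.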
4th Pagidela Venkata Mokshith Reddy, 20191CSE0406, Department of Computer Science and Engineering, Presidency University, Bangalore, India. 201910100960@presidencyuniversity.in

5th Pachipulusu Akash Kumar, 20191CSE0405, Department of Computer Science and Engineering, Presidency University, Bangalore, India. 201910101141@presidencyuniversity.in

6th Sarojini T Habbli, 20191CSE0534, Department of Computer Science and Engineering, Presidency University, Bangalore, India. 201910102074@presidencyuniversity.in
Abstract— Online social networks (OSNs) have grown in
popularity and are now more closely tied to people's social
activities than ever before. People use OSNs to communicate
with one another, exchange news, plan activities, and even
run their own online businesses. Attackers and imposters
have been drawn to OSNs by their explosive growth and the
vast quantity of personal data they collect from their users,
in order to steal personal information, spread malicious
activity, and share false information. In response, academics
have begun to look into effective methods for spotting
suspicious activity and bogus accounts using account
features and classification algorithms. However, some of the
account characteristics that are exploited have an adverse
effect on the results or no effect at all, and using
classification algorithms independently does not always
produce satisfactory results. In this paper, three feature
selection and dimension reduction techniques were used to
create the decision tree, which is suggested to provide
effective detection of fake Instagram accounts. To determine
whether a target account was genuine or fake, three machine
learning classification algorithms were used: Decision Tree,
Random Forest, and Logistic Regression.

Keywords—Decision Tree, Random Forest, Logistic Regression.

Introduction

Online social networks (OSNs) such as Facebook, Twitter,
LinkedIn, and Google+ have become increasingly popular
over the last few years. People use OSNs to stay in contact
with each other, share news, organize events, and even run
their own e-businesses. Between 2014 and 2018, around 2.53
million U.S. dollars were spent by non-profits on sponsoring
political advertisements on Facebook. The open nature of
OSNs and the massive
amount of personal data on their subscribers have made them
vulnerable to Sybil attacks. In 2012, Facebook reported abuse
on its platform, including the publishing of false news, hate
speech, and sensational and polarizing content, among others.
At the same time, OSNs have attracted the interest of
researchers for mining and analysing their massive amounts
of data, exploring and studying user behaviour, and detecting
abnormal activities. Researchers have, for example, studied
how to predict, analyse, and explain customers' loyalty
towards a social-media-based online brand community by
identifying the most effective cognitive features that predict
customer attitude. The Facebook community continues to
grow, with more than 2.2 billion monthly active users and 1.4
billion daily active users, an increase of 11% on a
year-over-year basis. In the second quarter of 2018 alone,
Facebook stated that its total income was $13.2 billion, with
$13.0 billion from advertisements alone. Similarly, in the
second quarter of 2018, Twitter reported reaching about one
billion subscribers, with 335 million monthly active users. In
2017, Twitter reported consistent revenue growth of 2.44
billion U.S. dollars, with profit 108 million U.S. dollars lower
than in the previous year. In 2015, Facebook estimated that
almost 14 million of its monthly active users were in fact
undesirable, representing malicious fake accounts created in
violation of the site's terms of service. In the first quarter of
2018, Facebook for the first time shared a report on the
internal guidelines used to enforce its community standards,
covering its efforts between October 2017 and March 2018.
The report details the amount of undesirable content removed
by Facebook across six categories: graphic violence, adult
nudity and sexual activity, terrorist propaganda, hate speech,
spam, and fake accounts: 837 million spam posts were taken
down, about 583 million fake accounts were disabled, and
around 81 million further pieces of content violating the
remaining categories were removed. However, even after
stopping tens of millions of fake accounts, Facebook
estimated that around 88 million accounts were nevertheless
fake. For such OSNs, the existence of fake accounts leads
advertisers, developers, and investors to mistrust their
reported user metrics, which can negatively affect their
revenues; recently, banks and financial institutions in the
U.S. have begun to analyse the Twitter and Facebook
accounts of mortgage applicants before actually granting a
loan. Attackers follow the idea that OSN user accounts are
"keys to walled gardens", so they pass themselves off as
someone else, using images and profiles that are either taken
from a real person without his or her knowledge or generated
artificially, to spread fake news and steal personal
information. These fake accounts are commonly known as
imposters. In both cases, such fake accounts have a harmful
impact on users, and their motives are rarely good: they
usually flood spam messages or steal private data, and they
are eager to phish naive users into phony relationships that
lead to romance scams, human trafficking, and even political
astroturfing. Statistics show that 40% of parents in the
United States and 18% of teens have great concern about the
use of fake accounts and bots on social media to sell or
influence products. As another example, during the 2012 US
election campaign, the Twitter account of challenger Romney
experienced a sudden jump in the number of followers; the
great majority of them were later claimed to be fake
followers. To increase their effectiveness, these malicious
accounts are frequently armed with stealthy automated
tweeting programs that mimic real users, known as bots. In
December 2015, Adrian Chen, a reporter for the New Yorker,
mentioned that he had seen a lot of the Russian accounts he
was monitoring switch to pro-Trump efforts, though many of
those were better described as troll accounts managed by real
people meant to mimic American social media users.
Similarly, before the general Italian elections of February
2013, online blogs and newspapers reported statistical data
on a supposed percentage of fake followers of the major
candidates. Detecting these threatening accounts in OSNs
has become a must in order to avoid various malicious
activities, ensure the protection of user accounts, and defend
private information. Researchers try to develop automatic
detection tools for identifying fake accounts, a task that
would be labour-intensive and expensive if performed
manually. This line of research may enable an OSN operator
to detect fake accounts efficiently and effectively; it would
enhance the experience of its users by blocking annoying
spam messages and other abusive content. The OSN operator
could additionally strengthen the credibility of its user
metrics and allow third parties to trust its user accounts.
Information security and privacy are among the foremost
requirements of social network users, and maintaining and
providing them increases a network's credibility and, in
consequence, its revenues. OSNs use a variety of detection
algorithms and mitigation methods to tackle the growing
danger of fake and malicious accounts. Researchers focus on
identifying fake accounts by analysing user-level activity,
extracting features from recent users (e.g. number of posts,
number of followers, profile attributes) and applying trained
machine learning methods to classify accounts as real or
fake. Another approach works at the graph level, where the
OSN is modelled as a set of nodes and edges: each node
represents an entity (e.g. an account) and each edge
represents a relationship (e.g. a friendship). Though Sybil
accounts find ways to cloak their behaviour with patterns
resembling real accounts, they still manifest distinctive
profile elements and activity patterns. Even so, automated
Sybil detection is not always robust against adversarial
attacks and does not always yield acceptable accuracy. In
this work, the Random Forest classification algorithm has
been run on the decision values obtained from the support
vector machine (SVM). We also verified the detection
capabilities of our classifiers using two additional sets of
genuine and fake accounts that were unrelated to the initial
training dataset, gave a summary of the studies done on the
Twitter network and earlier work on fake profile detection,
and showed how the data was pre-processed and how the
results were used to categorize accounts as fake or genuine.
The overall accuracy rates have been examined and evaluated
in relation to all the other applied techniques.

LITERATURE SURVEY

Facebook political advertising is the most recent in a long
line of advancements in campaign strategy, and it has been
widely used in elections all over the world. We [1] argue that
existing measures provide little insight into current campaign
trends, raising analytical, methodological, and normative
issues for academics and electoral authorities alike.
Large-scale peer-to-peer systems face security risks from
unreliable or malicious remote computing components. To
counter these dangers, many of these systems employ
redundancy; however, the redundancy can be undermined if
a single flawed entity can assume several [2] identities and
control a sizable chunk of the system. Another paper
discusses numerous anomaly types and their novel [3]
categorization according to distinct traits, along with a
variety of approaches for preventing and identifying
anomalies, the underlying presumptions and causes of such
anomalies, and several data mining techniques for finding
abnormalities. The objective of a further study was to [4]
ascertain how much perceived value, service quality, and
social variables influenced users' inclination to stay with the
social-media-based online brand community of a major
automaker.

PROPOSED METHOD

The proposed application can be considered a useful system,
since it helps to reduce the limitations of traditional and
other existing methods. The system is built around a
powerful classification algorithm in a Python-based
environment.

ADVANTAGES
• Good accuracy.
• No need for skilled personnel.

The overall accuracy rates obtained were:
Logistic regression - 89.95
Decision tree classifier - 89.47
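The three-classifier comparison can be sketched with scikit-learn. The account features below (followers, followings, posts, profile-picture flag) are synthetic assumptions for illustration, not the paper's feature set, so the printed accuracies will not reproduce the 89.95 / 89.47 figures above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Hypothetical profiles: genuine accounts have more followers and a profile
# picture; fake accounts follow many users but have few followers and posts.
genuine = np.column_stack([rng.poisson(300, n), rng.poisson(200, n),
                           rng.poisson(80, n), np.ones(n)])
fake = np.column_stack([rng.poisson(20, n), rng.poisson(900, n),
                        rng.poisson(5, n), rng.integers(0, 2, n)])
X = np.vstack([genuine, fake])
y = np.array([0] * n + [1] * n)   # 0 = genuine, 1 = fake
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

for model in (DecisionTreeClassifier(random_state=1),
              RandomForestClassifier(random_state=1),
              LogisticRegression(max_iter=1000)):
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, round(100 * acc, 2))
```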
Abstract - Prediction of the stock market is an important area of research that has received a lot of
attention from academics and practitioners alike. This paper presents a survey of recent advances in
stock market prediction strategies and proposes a long short-term memory (LSTM) neural network
for the study of deep learning methods. The majority of algorithms fail in practice due to the
market's non-stationarity and high volatility. Consequently, the combination of Elliott Wave Theory
and LSTM neural network models is proposed as a novel approach to predicting stock market
prices. Elliott Wave Theory is a method of technical analysis that studies price patterns and wave
structures in financial markets, while LSTM is a type of neural network model capable of
identifying sequential patterns in time-series data. After normalizing and pre-processing the stock
data, Elliott Wave Theory is applied to the processed data to determine the current market phase.
We then construct a deep LSTM model to anticipate a retracement point that serves as an ideal
entry point for maximising profits. Using the four evaluation criteria RMSE, MAE, MAPE, and
RME, the rationality of the LSTM neural network can be thoroughly examined. Overall, this study
demonstrates the potential of combining conventional technical analysis with deep learning
techniques for stock market price prediction.

Keywords - Elliott Wave Theory, LSTM, RMSE, RME, Neural Network, Deep learning, Stock Market
Prediction.
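The evaluation criteria named in the abstract can be computed directly from the actual and predicted price series; a minimal sketch of RMSE, MAE, and MAPE follows (RME is not defined in this excerpt, so it is omitted):

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error (actual values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative closing prices, not data from this paper.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [99.0, 103.0, 100.0, 107.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```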
F. Model evaluation

Table 2: Expected predicted values of open and close price

The LSTM model may need to be adjusted in light of the
backtesting findings. Changing the model's features or its
hyperparameters may be necessary to achieve this. The
model's resistance to changes in the input data and market
circumstances should also be assessed.

Figure 4: Visualization graph of the actual and predicted values for open price

I. Deployment

At last, we can set up the LSTM model in a real-world
setting and keep track of how it performs over time. To
guarantee that the model keeps performing well, we should
also carry out routine upgrades and maintenance.

IV. Results and Discussion

Stock market prediction is a highly debated topic among
experts and investors alike. Some believe that it is possible
to accurately forecast the market's course, while others argue
that the stock market is inherently unpredictable and that
attempting to make accurate predictions is a fool's errand.

The combination of Elliott Wave Theory with LSTM for
stock market price prediction is an interesting research
direction that has shown promising results. The proposed
method leverages the strengths of both approaches, with
Elliott Wave Theory providing a framework for analysing
the historical data and LSTM providing a powerful tool for
time series analysis and prediction.

One of the key advantages of combining Elliott Wave
Theory with LSTM is the potential for improved prediction
accuracy. Elliott Wave Theory provides a unique perspective
on stock price data, by identifying wave patterns and trends
that can reveal underlying market dynamics. By
incorporating this information into the LSTM model, it is
possible to enhance the model's ability to capture patterns
and trends in the data, leading to potentially more accurate
predictions. Another advantage is the potential for increased
interpretability. Elliott Wave Theory provides a structured
framework for analyzing stock price data, which allows for
better understanding of the market dynamics and trends.
This can help in interpreting the LSTM model's predictions,
as the wave patterns identified by Elliott Wave Theory can
provide meaningful insights into the expected direction of
stock prices.

Furthermore, the combination of Elliott Wave Theory with
LSTM has the potential to capture both short-term and
long-term price movements. Elliott Wave Theory is known for its
ability to identify both impulse waves (short-term trends)
and corrective waves (long-term trends), which can provide
a holistic view of the market dynamics. LSTM, on the other
hand, is capable of capturing both long-term and short-term
dependencies in time series data, making it a suitable tool
for analyzing both types of waves identified by Elliott Wave
Theory.

However, there are also limitations to this combined
approach. One limitation is the subjective nature of Elliott
Wave Theory. The identification and classification of wave
patterns can be subjective and prone to human biases, which
may introduce uncertainties and errors into the analysis.
This can impact the accuracy and reliability of the combined
approach.

Another limitation is the potential for overfitting. LSTM
models are known to be prone to overfitting, especially
when dealing with noisy and complex financial data.
Incorporating additional features from Elliott Wave Theory
may increase the complexity of the model, leading to
potential overfitting issues. Careful feature selection and
regularization techniques may be needed to mitigate this
limitation.

In addition, a variety of factors, including macroeconomic
statistics, market mood, geopolitical developments, and
more, have an impact on the stock market. While Elliott
Wave Theory and LSTM can capture some of the patterns in
stock price data, they may not fully capture all the relevant
factors that impact stock prices. It is important to consider
the limitations and uncertainties associated with using any
predictive model in the complex and dynamic stock market
environment.

V. Conclusion

In conclusion, forecasting the stock market is difficult, and
no one can be certain of their predictions. The stock market
is influenced by numerous factors, and these factors are
often complex and unpredictable. Nevertheless, the
combination of Elliott Wave Theory with LSTM for stock
market price prediction holds great promise for improving
the accuracy of stock price predictions. The proposed
method leverages the wave structure identified by Elliott
Wave Theory to provide additional insights into the
historical data, and then uses LSTM to analyse the time
series data and make predictions for future stock prices.

The results obtained from this research are encouraging, as
they demonstrate that the combined approach can achieve
better prediction accuracy compared to using LSTM alone
or Elliott Wave Theory alone. The accuracy of the
predictions can potentially benefit investors and traders in
making more informed decisions in the stock market,
leading to improved investment outcomes.

It is crucial to keep in mind that stock market forecasting is
a difficult and constantly changing topic, and no strategy
can ensure 100% accuracy. There are inherent risks and
uncertainties associated with stock market investments;
therefore, when making financial decisions based on
projections, caution should always be used.
DETECTION OF BRAIN STROKE USING MACHINE LEARNING
Abstract - In many nations, stroke is the main cause of
mortality and disability. This study preprocesses data to
improve the image quality of CT scans of stroke patients,
optimising image quality to improve results and reduce
noise, and applies machine learning algorithms to classify
the patients' images into two subtypes of stroke disease:
ischemic stroke and haemorrhagic stroke. In this work, the
categorization of brain stroke disease is done using four
machine learning algorithms: K-Nearest Neighbours, Naive
Bayes, Cat Boost, and Random Forest. A doctor may inject
tissue plasminogen activator (TPA) or give blood thinners
like aspirin. TPA is highly effective at breaking up clots;
however, the injection must be administered within 4.5
hours of the onset of stroke symptoms. A haemorrhagic
stroke can be caused by blood spilling into the brain. The
goal of treatment is to stop the bleeding and release the
pressure on the brain. Taking medications to lower brain
pressure, regulate blood pressure overall, stop seizures, and
stop any sudden blood vessel constriction is frequently the
first step in treatment. Patients on blood-thinning
anticoagulant or antiplatelet drugs such as warfarin or
clopidogrel can be given medication to counteract their
effects.

Keywords - TPA, Brain stroke, Machine learning, Deep
learning, Medical imaging, Feature selection, Real-time
monitoring, Interpretability.
1. INTRODUCTION

A stroke happens when the blood supply to the brain is
interrupted or reduced due to a blockage or leak in the blood
vessels. When this occurs, the brain's cells begin to
deteriorate because they are not getting enough nourishment
or oxygen. Stroke is a cerebrovascular disease: it affects the
blood vessels that carry oxygen to the brain, and damage can
begin if the brain does not get enough oxygen. It is a
medical emergency; even though many strokes are treatable,
others can be fatal or leave a person disabled.

An ischemic stroke is caused by blocked or constricted
arteries. The goal of treatment is usually to improve blood
flow to the brain, and the first step is taking medications to
dissolve existing clots and stop new ones from developing.
A doctor may inject tissue plasminogen activator (TPA) or
give blood thinners like aspirin. TPA is highly effective at
breaking up clots; however, the injection must be
administered within 4.5 hours of the onset of stroke
symptoms. A hemorrhagic stroke can be caused by blood
spilling into the brain. The goal of treatment is to stop the
bleeding and release the pressure on the brain. Taking
medications to lower brain pressure, regulate blood pressure
overall, stop seizures, and stop any sudden blood vessel
constriction is frequently the first step in treatment. A person
can receive drugs to counteract the effects of blood thinners
if they are on anticoagulant or antiplatelet medication, such
as warfarin or clopidogrel.

2. LITERATURE REVIEW

A hybrid machine learning approach to cerebral stroke based
on an imbalanced medical dataset, by Tianyu Liu, Wenhui
Fan and Cheng Wu: the method recommended in this study
successfully decreased the false negative rate while retaining
a respectably high overall accuracy, indicating a successful
reduction in the stroke prediction misdiagnosis rate.

Trends and Challenges of Wearable Multimodal
Technologies for Stroke Risk Prediction, by Yun-Hsuan
Chen and Mohamad Sawan: this study looks at
wearable-technology-based tools for tracking stroke-related
physiological markers in real time.

A hybrid feature extraction based optimized random forest
learning model for brain stroke prediction, by G Vijayadeep
and Dr N Naga Malleswara Rao: this paper addresses one of
the biggest concerns created by noise and feature selection
issues in stroke disorders, namely disease prediction in the
vertebral column dataset.

A Machine Learning Approach to Detect the Brain Stroke
Disease, by Bonna Akter and Aditya Raibongsh: regardless
of social or cultural background, reasonably predicting the
risk of a brain stroke could have a considerable impact on
human long-term death rates; early detection is critical to
achieving this goal.

• Zhang et al. (2020) developed a deep learning-based
approach for detecting acute ischemic stroke using CT
perfusion images. The proposed method achieved a high
accuracy of 90.9% and a sensitivity of 93.8% in detecting
stroke.

• Using multimodal MRI data, including diffusion-weighted
imaging, perfusion-weighted imaging, and fluid-attenuated
inversion recovery, Gong et al. (2020) proposed a deep
learning-based technique for stroke identification. The
proposed method identified strokes with a sensitivity of
91.5% and an accuracy of 90.1%.

• A machine learning-based method for estimating the risk
of stroke in people with
atrial fibrillation was created by Shen et al. in 2020. The
area under the curve (AUC) for predicting the probability of
having a stroke was 0.794 using the suggested strategy,
which combined several machine learning models.

• Using clinical and genetic data, Fuentes et al. (2019)
proposed a machine learning-based method for stroke
detection. The proposed method identified strokes with an
accuracy of 85.7% and a sensitivity of 81.4%.

• A machine learning-based method for forecasting the
course of stroke patients using MRI data was developed by
Bhattacharya et al. in 2019. The proposed method achieved
an accuracy of 73.5% and a sensitivity of 81.8% in
predicting the outcome of stroke patients.

• Niu et al. (2018) proposed a deep learning-based approach
for detecting acute ischemic stroke using CT angiography
images. The proposed method achieved an accuracy of
94.8% and a sensitivity of 92.7% in detecting stroke.

• Using clinical and neuroimaging data, Zhao et al. (2018)
created a machine learning-based method for stroke
identification. The proposed method identified strokes with
an accuracy of 94.8% and a sensitivity of 93.6%.

• Kim et al. (2017) proposed a machine learning-based
method for CT image-based stroke detection. The proposed
method achieved an accuracy of 90.8% and a sensitivity of
91.1% in detecting stroke.

3. METHODOLOGY

Textual qualities, link structures, webpage contents, DNS
data, and network traffic are only a few of the discriminative
features that our system makes use of. Many of these
features are innovative and quite powerful. 32,000 malicious
URLs and 40,000 benign URLs from the real Internet were
used in our experimental research. We also discuss the
readability of each group of discriminative traits and present
the results of our experiments on their efficacy.

Many machine learning algorithms are available for the
prediction and diagnosis of a brain stroke, including KNN,
Decision Tree, Random Forest, Multi-layer Perceptron
(MLP), SVC, and Cat Boost. We employed the
recommended Analysing Brain Stroke data. At this step, we
implemented the Cat Boost Classifier algorithm and the
individual algorithms on these datasets, and then applied the
Voting Ensemble method to combine these findings and
compute the final accuracy.

K-Nearest Neighbour:

One of the simplest machine learning techniques based on
supervised learning is K-Nearest Neighbour. The K-NN
algorithm places a new case in the category that is most
similar to the available categories, by assuming that the new
case and the existing cases are comparable. The K-NN
algorithm saves all the information that is accessible and
categorises fresh data based on similarity, which means that
new data can be quickly and accurately sorted into a suitable
category. The K-NN approach can be used for both
classification and regression problems, but it is more
frequently utilised for classification. Because K-NN is a
non-parametric method, it makes no assumptions about the
underlying data. Because it does not immediately learn from
the training set, it is also known as a lazy learner algorithm:
rather than fitting a model, it stores the data set and acts on it
only at classification time.

Random Forest:

A random forest is a machine learning method for tackling
classification and regression issues. It makes use of
ensemble learning, a method for solving complicated
problems by combining a number of classifiers. In a random
forest algorithm there are many different decision trees: the
algorithm creates a "forest" that is trained via bagging
(bootstrap aggregation), an ensemble meta-algorithm that
increases the accuracy of machine learning algorithms. The
random forest determines its result from the predictions of
the decision trees, making predictions by averaging or
voting over the results from the different trees, and the
accuracy of the result grows as the number of trees
increases. It avoids excessive fitting of the dataset, offers
increased precision, and produces predictions without
needing numerous package configurations (unlike
Scikit-learn).

The Random Forest Algorithm's Features:
• Compared to the decision tree algorithm, it is more accurate.
• It offers a practical method for dealing with missing data.
• Without hyper-parameter adjustment, it can generate a fair prediction.
• It addresses the issue of decision trees' overfitting.
• At the node's splitting point in every random forest tree, a subset of features is chosen at random.

Cat Boost:

Cat Boost is a high-performance open-source library for
gradient boosting on decision trees. It was created by
Yandex engineers and researchers, and it is used by Yandex
and many other businesses, such as CERN, Cloudflare, and
Careem taxi, for search, recommendation systems, personal
assistants, self-driving cars, weather forecasting, and many
other jobs. Everyone is welcome to use it because it is
open-source. The new kid on the block, Cat Boost, has been
around for a little over a year and is already posing a threat
to XG Boost. Cat Boost gets the greatest scores on
benchmarks, and the improvement becomes considerable
and obvious on datasets where categorical variables are
heavily weighted.

NAIVE BAYES:

A probabilistic machine learning model called a Naive
Bayes classifier is utilised for classification tasks. The Bayes
theorem serves as the foundation of the classifier:

P(A|B) = P(B|A) P(A) / P(B)

When B has already happened, we may use the Bayes
theorem to calculate the likelihood that A will also occur.
Here, A is the hypothesis and B is the supporting evidence.
It is assumed that the predictors and features are
independent; in other words, the presence of one trait has no
impact on the others, hence the term "naive". Let's use an
illustration to comprehend it: consider a training data set for
the weather, along with the target variable "Play" (which
denotes the possibility of playing). We must now
categorise whether participants will medical images and patient data, the
participate in games based on the accuracy of stroke diagnosis can be
weather. improved.
expensive.
I. A rise in the accuracy of stroke
diagnoses thanks to machine learning
algorithms' ability to examine vast IV. Improved patient outcomes: Early
volumes of data and spot patterns that detection and treatment of stroke can
might be hard for people to see. By improve patient outcomes by reducing
training algorithms on large datasets of the risk of long-term disability and
improving survival rates. Machine 6. References
learning algorithms can help in
[1] V. L. Feigin et al., Update on the global burden
predicting the likelihood of a stroke
occurring in a patient and provide early of ischemic and hemorrhagic stroke in
warning signs. Healthcare professionals 19902013: The GBD 2013 study, vol. 45, no. 3.
can take proactive steps to prevent 2015.
stroke by identifying individuals who https://pubmed.ncbi.nlm.nih.gov/26505981/
are at risk.
[2] N.Venketa subramanian, B.W.Yoon, J.Pandian,
and J.C.Navarro, Stroke Epidemiology,
South,East, and South-East Asia: A Review, vol.
V. Development of new tools and 20, no. 1. 2018.
techniques: Machine learning
algorithms can help in developing new https://pubmed.ncbi.nlm.nih.gov/29037005/
tools and techniques for stroke [3] Gur Amrit Pal Singh, P. K. Gupta Performance
diagnosis and treatment. For example, analysis of various machine learning-based
ML can assist in the development of approaches for detection and classification of
wearable devices or mobile apps that lung cancer in humans, vol. 3456789. Springer
can monitor patients' health and provide London, 2018.
early warning signs of stroke.
http://ir.juit.ac.in:8080/jspui/bitstream/123456789/905
5/1/Performance%20analysis%20of%20various
%20m%20achine%20learning-
Overall, a machine learning project for brain %20%20based%20approaches%20for%20detection%2
stroke detection has the potential to greatly
increase the precision and speed of stroke
diagnosis, lower medical expenses, and
enhance patient outcomes.
5. Conclusion
In this study, stroke data on CT scan image data
is classified using machine learning methods.
picture processing and feature extraction are
done on the picture data before classification.
The classification is then performed using a
comparison of (Four) techniques, namely K-
Nearest Neighbours, Naive Bayes, Random
Forest, and Cat boost. Compared to other
examined classification algorithms, the
algorithm using the Random Forest approach
offers the highest level of accuracy, according to
our testing. The accuracy of the classification
algorithm with the default optimisation
parameter value has not, however, been tested.
From this point forward, the categorization
model may be enhanced to accomplish. The
machine learning algorithm utilised has to have
its parameters tuned in order to improve
accuracy.
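To make the Naive Bayes step of the comparison concrete, the sketch below is a minimal Gaussian Naive Bayes classifier in pure Python. It is for illustration only: the feature values and class labels are invented, and the study itself classifies features extracted from CT images rather than these toy numbers.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature mean/variance."""
    stats = {}
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    n = len(X)
    for label, rows in by_class.items():
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9  # smoothed
            for col, m in zip(zip(*rows), means)
        ]
        stats[label] = (len(rows) / n, means, variances)
    return stats

def predict(stats, x):
    """Pick the class maximizing log P(class) + sum of log P(feature|class)."""
    best, best_score = None, float("-inf")
    for label, (prior, means, variances) in stats.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best, best_score = label, score
    return best

# Toy data: two features per sample, invented labels "stroke" / "normal".
X = [[1.0, 5.0], [1.2, 4.8], [3.0, 1.0], [3.2, 0.8]]
y = ["normal", "normal", "stroke", "stroke"]
model = fit_gaussian_nb(X, y)
print(predict(model, [3.1, 0.9]))  # sample lies in the "stroke" cluster
```

In practice one would use a library implementation (for example scikit-learn's GaussianNB or the CatBoost package), but the log-probability scoring above is the whole idea behind the classifier.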
Smart Agriculture Aid Using Renewable Energy
5. OVERVIEW
Technology and sustainable energy sources improve agricultural practices and increase productivity while reducing the environmental impact of farming. This approach can help farmers confront challenges such as climate change, soil degradation, and water scarcity.
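Before surveying the individual technologies, it helps to see how small the core control logic of such a system is. The snippet below is an illustrative (not the authors') decision rule that an ESP32-class controller could run, mapping a raw soil-moisture reading to a pump on/off decision; the calibration values and thresholds are invented for the example.

```python
def moisture_percent(raw, dry=4095, wet=1500):
    """Convert a raw ADC reading (values are illustrative; 4095 is the
    full-scale value of a 12-bit ADC such as the ESP32's) to 0-100%."""
    raw = max(min(raw, dry), wet)          # clamp to the calibrated range
    return 100.0 * (dry - raw) / (dry - wet)

def pump_command(percent, low=30.0, high=60.0, pump_on=False):
    """Hysteresis control: start irrigating below `low`, stop above `high`,
    and keep the current pump state inside the dead band."""
    if percent < low:
        return True
    if percent > high:
        return False
    return pump_on

reading = 3800                              # simulated raw sensor value (dry soil)
pct = moisture_percent(reading)
print(round(pct, 1), pump_command(pct))
```

On real hardware the `reading` would come from the ADC pin the sensor is wired to, and the boolean would drive a relay; the hysteresis band prevents the pump from chattering on and off around a single threshold.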
There are several technologies and renewable energy sources that can be used in smart agriculture, including:

i. Solar Power: Solar panels can be installed on farms to generate electricity for irrigation pumps, lighting, and other farm equipment. This minimizes the need for fossil fuels and lowers greenhouse gas emissions.
ii. Wind Power: Wind turbines can be deployed to generate electricity on farms. They are especially useful in areas with high wind speeds, where wind power can be a cost-effective alternative to grid electricity.
iii. Biogas: Biogas can be produced from organic waste such as animal manure and crop residues. This can be used as a renewable energy source for cooking and heating on farms.
iv. Precision Agriculture: Precision agriculture uses sensors and data analytics to optimize crop yields and reduce waste. This can include technologies like GPS mapping, drones, and soil sensors.
v. Vertical Farming: Vertical farming involves growing crops in vertically stacked layers using LED lights and hydroponic systems. This approach can increase crop yields and reduce water usage.

Smart agriculture aid using renewable energy can have several benefits, including:

Increased productivity: By using technology and renewable energy, farmers can increase crop yields and reduce waste. [1]

Reduced costs: Renewable energy sources like solar and wind can reduce the cost of electricity for farmers, while precision agriculture can reduce the amount of water and fertilizer needed. [2]

Environmental benefits: Using renewable energy sources and sustainable farming practices can reduce greenhouse gas emissions and help mitigate the impacts of climate change. [3]

Overall, smart agriculture aid using renewable energy is a promising approach to improving agricultural practices and sustainability. By adopting these technologies and practices, farmers can increase productivity while reducing their environmental impact.

A. ESP32 microcontroller

ESP32 is a low-cost, low-power microcontroller with Wi-Fi and Bluetooth connectivity, a dual-core processor, and built-in security features, suitable for a wide range of IoT applications. It is compatible with the Arduino IDE and has a variety of peripheral interfaces. The ESP32 is shown in Fig.2.

Fig.2: ESP32

B. Soil moisture sensor

Soil moisture sensors measure or estimate the amount of water in the soil. These sensors can be stationary or portable, such as handheld probes. Stationary sensors are placed at predetermined locations and depths in the field, whereas portable soil moisture probes can measure soil moisture at several locations.

Fig.3: Soil moisture sensor

Smart agriculture is the use of technology to improve the efficiency and sustainability of farming practices. The application of renewable energy in agriculture can help to reduce the dependence on fossil fuels, decrease carbon emissions, and lower operational costs. Renewable energy sources such as solar, wind, and biomass energy can be used to power irrigation systems, pumps, and other equipment used in agriculture. The use of renewable energy in agriculture can also help to promote sustainable farming practices by reducing the carbon footprint of farming operations. Sustainable agriculture practices aim to reduce the environmental impact of farming while ensuring that food production remains economically viable.

6. MOTIVATION

Agriculture could undergo a revolution as a result of the use of renewable energy, which could also change how we handle our natural resources, raise cattle, and cultivate crops. By harnessing the potential of the sun, wind, and other renewable sources of energy, we can lessen our reliance on fossil fuels, reduce our carbon footprint, and make the farming process more sustainable and environmentally friendly.

The goal of smart agriculture is to maximize resource utilization and increase agricultural yields through the integration of technology and agriculture. Farmers can monitor and manage multiple elements of their farms, including soil moisture, temperature, and nutrient levels, in real time with the use of cameras, drones, and other cutting-edge technologies. They can use this to make data-driven choices and adjust their farming practices as necessary.

7. PROBLEM STATEMENT

The problem statement for employing renewable energy in smart agriculture aid might be the following: the difficulties facing the agricultural sector, such as depletion of resources, global warming, and food security, are becoming more urgent, and the use of non-renewable energy sources in agriculture has considerably raised carbon dioxide emissions and environmental damage.

9. EXPECTED OUTCOMES

The exact outcomes of a water quality monitoring system using IoT technology will depend on the specific goals and objectives of the system, as well as the methodology and technologies used. However, some potential outcomes could include:

• By monitoring key indicators of water quality in real time, such as pH, temperature, and TDS, water treatment facilities can identify and address issues that may impact the safety and quality of the water supply. This can lead to improved water quality and a reduced risk of waterborne illnesses.

• IoT-based water quality monitoring systems can help water treatment facilities improve their operational efficiency by providing real-time data on key indicators of water quality. This can help facilities optimize their treatment processes and reduce waste, leading to cost savings and improved sustainability.

10. EXISTING SYSTEM:
DR. C KOMALAVALLI
Professor, Dept. of CSE
Presidency University
Bengaluru, India
komalavalli@presidencyu
ABSTRACT niveraity.in
For example:

Chatbot: Hi there! Welcome to our restaurant chatbot. I'm here to help you with any questions or assistance you need. How can I assist you today?

2. Reservation and Booking: The chatbot should be able to handle reservation and booking requests. It can ask for the date, time, and number of guests, and check the availability of tables. For example:

Chatbot: Sure! I can help you with a reservation. Please provide me with the date, time, and number of guests for your booking.
User: I'd like to make a reservation for two on May 5th at 7:00 PM.
Chatbot: Great! Let me check our availability for that date and time.

3. Menu and Specials: The chatbot should be able to provide information about the restaurant's menu, including special dishes or promotions. It can also accommodate dietary restrictions and provide recommendations. For example:

Chatbot: Our menu includes a variety of cuisines, such as Italian, Asian, and American. We also have vegetarian and gluten-free options. Would you like me to recommend any dishes?
User: What are your current specials?
Chatbot: Our current special is a 3-course meal with a choice of appetizer, main course, and dessert for Rs 899.

4. Order and Payment: The chatbot should be able to take orders and facilitate payments. It can provide options for delivery or pickup, and handle payment processing securely. For example:

Chatbot: Would you like to place an order for delivery or pickup?
User: I'd like to place an order for delivery.
Chatbot: Sure! What items would you like to order?
User: I'll have a margherita pizza and a Caesar salad.
Chatbot: Great! I'll add that to your order. How would you like to pay? We accept credit cards and online payments.

5. Additional Information: The chatbot should be able to provide general information about the restaurant, such as hours of operation, location, and contact details. It can also answer frequently asked questions (FAQs) about the restaurant's policies, events, or services. For example:

Chatbot: We are located at 5th Cross, Whitefield, and our hours of operation are from 07:00 AM to 11:00 PM, 7 days a week. Is there anything else you would like to know?

6. Personalization and Engagement: The chatbot should be able to engage users in a personalized and interactive manner. It can remember user preferences, offer recommendations based on past orders, and provide a pleasant and engaging conversation experience. For example:

Chatbot: I see that you've dined with us before. Welcome back! Would you like to order your favorite dish, the spaghetti Bolognese?
User: Yes, please! That's my favorite.
Chatbot: Great choice! Anything else I can assist you with today?

7. Error Handling and Escalation: The chatbot should be able to handle errors, misunderstandings, or ambiguous queries gracefully. It can ask clarifying questions, offer suggestions, or escalate to a human agent when necessary. For example:

Chatbot: I'm sorry, I didn't understand your request. Could you please provide more details or rephrase your question?

SCOPE

This model is a small-scale prototype of a chatbot system in the artificial intelligence field. Built on a dataset compiled from research on various institutions, it serves as an exploratory sample of a chatbot system that could be adapted into a large-scale solution for restaurants and deployed widely. The model demonstrates that, with artificial intelligence technologies suitably adapted for large restaurant businesses, the burden on reception staff of resolving routine queries can be reduced to almost nothing, and enterprises can serve users at any time once the system is deployed on platforms such as the restaurant's public website. This also creates job opportunities in the near future.

In the present era, it is expected that a web application can be used for ordering or pre-ordering food. Using a phone, tablet, or PC with an internet connection, consumers can interact with the menu through a secure login, view and place orders, and receive instant updates and invoices on the device itself. This is convenient, productive, and simple, enhancing the work of the restaurant's staff and making the dining experience more engaging. Staff will have access to the specialized hardware and software they require to do their specific duties on schedule. Development depends upon a framework provided by NLTK. The Kaggle dataset, which is used to train and evaluate the chatbot, is a requirement for this model. If the chatbot is unable to respond, it is expected to hand over management of the system to a human assistant who will be able to respond and address issues. While using the website, the user must be connected to the internet, and the system should be able to show various places upon user request.

USER CHARACTERISTICS

1. Users can ask questions regarding table booking.
2. Users can learn about the availability of accommodation.
3. Users can learn the venue of the restaurant.
4. Users can learn about the menu of the restaurant.
5. Prompt response to questions.
6. Detailed answers to queries.
7. Settlement of grievances and disputes.
8. Contacting an available service professional.
9. Consumers can contact the restaurant if not satisfied.
2.1 ASSUMPTION AND DEPENDENCY

FUNCTIONAL REQUIREMENT

SYSTEM FEATURES

A conversational bot utilizes natural language processing (NLP) to process user input and can apply the naive Bayes algorithm to deliver precise outcomes quickly. Chatbots include Natural Language Understanding, a capability that enables them to comprehend user input and provide the appropriate output depending on it. The chatbot can handle a variety of restaurant-related questions, such as reservations, cuisine, the sort of reservation needed, etc. It can also hand off management to a person if necessary. If an enquiry cannot be answered by the chatbot, it can provide a phone number so that a person can assist the user. A user can easily acquire a solution to their question by chatting with the chatbot from anywhere, and the chatbot replies in text form to the inquired question. Moreover, chatbots might be used on the websites of certain eateries.

DATASET

The dataset was acquired from Kaggle.com, which offers a variety of data sets for artificial intelligence models. The data set is divided in half, with the first half used for model training and the other half for testing. The dataset includes intents and entities; on the basis of these, the most likely probability is determined and saved in the context variable. The intent, composed of user inputs and entities, is identified by the chatbot.

1. Conversation service
2. Natural Language Processing
3. Natural Language Generation
4. Natural Language Understanding
5. Creation of an AI model with appropriate language
6. Uploading the model's training information, comprising entities and intents
7. Creating context variables to hold data from user conversations
8. Constructing illustrations for training and comprehension purposes
9. Testing using a dummy discussion for the provided purpose and entities

NON-FUNCTIONAL REQUIREMENT

PERFORMANCE REQUIREMENT

1. The platform must be able to respond promptly.
2. The software UI should be simple to navigate.
3. The platform should be able to function even when multiple people are using it at the same time.
4. If an inquiry cannot be answered by the conversational bot, the system should consult the prime administrator.
5. The platform must provide accurate data.
6. To keep up with changes in the restaurant, the software has to be updated on a regular basis.
7. The platform must be able to give the user the address, email id, and contact information on request.
8. The platform must learn from the numerous inputs provided by users.
9. The platform needs to comprehend multiple entities and motives.

SAFETY REQUIREMENT

1. The system must seek human assistance if a question is asked that is not in scope.
2. The system must be able to safeguard restaurant-related data.

For example:

1. Enhancing customer experience: A restaurant chatbot can offer personalized recommendations, take orders, and provide relevant information, enhancing the overall customer experience and improving customer satisfaction.

2. Automating tasks: A chatbot can handle routine tasks such as taking reservations, providing operating hours, and answering frequently asked questions, freeing up restaurant staff to focus on other important responsibilities.
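The intent-matching core described above (classify the user's utterance, escalate to a human when nothing matches) can be sketched in a few lines. The snippet below is a toy bag-of-words intent scorer, not the paper's NLTK/naive-Bayes implementation; the intent names and keyword sets are invented for the example.

```python
# Hypothetical intents and trigger words; a real system would learn these
# from the training half of the Kaggle dataset instead of hard-coding them.
INTENTS = {
    "reservation": {"book", "reservation", "table", "reserve"},
    "menu": {"menu", "specials", "dishes", "vegetarian"},
    "hours": {"hours", "open", "close", "location"},
}

def classify(utterance, threshold=1):
    """Score each intent by keyword overlap; below threshold, escalate."""
    words = set(utterance.lower().replace("?", "").split())
    scores = {name: len(words & keywords) for name, keywords in INTENTS.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return "escalate_to_human"   # hand off to a human assistant
    return best

print(classify("I'd like to book a table for two"))   # reservation
print(classify("What are your current specials?"))    # menu
print(classify("Can you fix my car?"))                # escalate_to_human
```

A trained classifier (naive Bayes over word counts, as the paper suggests) replaces the keyword overlap with learned per-intent probabilities, but the escalation fallback works the same way: if no intent scores above a confidence threshold, the query goes to a person.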
Abstract— Virtual assistants are now an integral part of our everyday life, and at present we can observe that people are neglecting their physical fitness and health. To monitor a person's health, we came up with the concept of an AI-based diet and exercise plan that depends on the user's BMI and BMR.

The core technology used in this project is artificial intelligence (AI): the system automatically produces a diet plan for the user based on his body weight and provides motivational films to encourage him to lose weight and get in shape. Additionally, users can chat with chatbots to get advice on how to make their diet and exercise regimens more manageable.

What sets this proposal apart from others is the use of artificial intelligence to construct the diet plan and chatbot. Many systems present us with diet plans to choose from, but this project creates them automatically with the aid of AI; the program generates a customized nutrition plan and sends notifications to ensure good adherence.

The system sends reminders to drink water, and if a user is seated in one spot for more than an hour, it sends a warning to change position and walk. We added a few motivating videos for users who want to be inspired to keep up their diet and exercise routine.

Benefits of utilizing this system include its simplicity, user-friendliness, efficiency, and dependability. Compared to maintaining all client information in record books or on spreadsheets, maintaining an entirely secure, maintenance-free database on the server that is available whenever the user needs it is very efficient.

The advantages: the system is accessible from anywhere at any time; users can chat with the system to ask questions about fitness and receive responses; and it is easy to use and access.

Keywords— AI diet plan, chatbot, BMI and BMR, customized nutrition plan, monitoring, notification reminder

I. INTRODUCTION

Exercise and diet are important for maintaining good health and well-being. A balanced diet provides the necessary nutrients for optimal bodily function and can help manage weight. Together, exercise and diet can reduce the risk of chronic diseases and promote longevity.

AI Based Diet with Fitness App is designed to help individuals maintain their health and weight using AI. The system will create an individual diet plan based on the user's BMR. The system will also offer a variety of exercises to users. The overall aim is to maintain the optimal health and weight of the user with the help of Artificial Intelligence.

II. EXISTING SYSTEM

There are a lot of programs available on the market right now, like Simplify, Google Fit, Samsung Health, and others. All these applications focus on tracking physical activity like cycling, walking, and running while also encouraging users to develop healthy eating habits, rather than creating a suitable diet using the user's height and weight.

Drawbacks of the existing system: it was limited to providing exercises to individuals, and it did not provide a chatbot feature to the users.

The diet plans are stored in the database, which can be added to or updated. All the videos based on BMI will be added directly to the database, which can likewise be added to or updated. The questions and answers should be added to the Dialogflow dashboard, which will be trained and uses artificial intelligence. Google Fit will give us the results if the same Google account is used in the band/watch.

For this project, XML is used on the front end and MSSQL on the back end. The programming language is Java. The IDE used is Android Studio.

IV. LITERATURE REVIEW

Interest in using artificial intelligence (AI) in healthcare has risen recently, particularly in the area of diet and nutrition. Applications with AI capabilities can offer tailored suggestions and counsel based on a person's particular health information, enhancing overall health outcomes.
• XML

XML is the markup language used to design the user interface of Android applications. It gives programmers the ability to design a visual representation of the app's layout, complete with views for buttons, text fields, photos, and other elements. The structure, arrangement, and functionality of UI elements in an Android app are specified using XML.

• DIALOGFLOW

Dialogflow is a natural language processing (NLP) platform that enables programmers to create and incorporate conversational user interfaces into bots, mobile apps, and web applications. It responds to user requests in a conversational manner by using machine learning techniques to comprehend them. Chatbots, voice assistants, and other conversational interfaces can be developed with Dialogflow and incorporated into Android Studio applications.
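The BMI and BMR values that drive the diet plan are standard formulas. Below is a small sketch using the Mifflin-St Jeor equation for BMR; this is an assumption for illustration, since the paper does not state which BMR formula the app uses, and the calorie-target heuristic is likewise only a common convention, not the app's actual logic.

```python
def bmi(weight_kg, height_cm):
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    h = height_cm / 100.0
    return weight_kg / (h * h)

def bmr_mifflin_st_jeor(weight_kg, height_cm, age, sex):
    """Resting calories/day via Mifflin-St Jeor (sex: 'male' or 'female')."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age
    return base + (5.0 if sex == "male" else -161.0)

def daily_calorie_target(bmr, activity_factor=1.2, deficit=500):
    """A common weight-loss heuristic: activity-scaled BMR minus a deficit."""
    return bmr * activity_factor - deficit

# Example user: 70 kg, 175 cm, 30 years old, male.
b = bmr_mifflin_st_jeor(70, 175, 30, "male")
print(round(bmi(70, 175), 1), round(b))
```

Values like these (BMI for classifying the user, BMR scaled by activity level for sizing the meal plan) are the numeric inputs a diet-plan generator works from.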
Secure Chat Web Application Using JWT Creation,
Encryption, Decryption And Parsing For User Authentication And Login
Abstract— The paper analyzes the advantages and limitations of each strategy, as well as their suitability for diverse sorts of applications and users. Biometric authentication, for instance, can be highly secure and convenient but may raise privacy concerns. Behavioural authentication, which analyzes a user's patterns of interaction with a system, can provide continuous authentication but may be difficult to implement successfully. Token-based authentication, such as one-time passwords, can provide an extra layer of security but may be awkward for users.

Keywords—user authentication, biometrics, client certificates

LXXXIV. INTRODUCTION

In secure systems like e-commerce, proper authentication of users is crucial. The traditional method of authentication is through a username and password, but it has become inadequate due to users choosing weak passwords, not using password management systems, and reusing passwords across multiple sites. As a result, alternative or additional authentication methods are necessary. It is important to consider different scenarios for authentication, such as authenticating to a device, remote authentication through the web, and other protocols, as the best method varies depending on the situation. The paper focuses on remote authentication through the Internet. It is important to avoid replacing a weak authentication method with one that is equally or more vulnerable. The security arrangements provided by institutions dictate the measures users have to take. The paper discusses various authentication schemes on the internet and provides an example of a complex security system in Korea that uses multiple authentication methods.

LXXXV. PREVIOUS RESEARCH

There are three common types of methods used for user identification and authentication:

1. An authentication method that involves possessing a one-time password generator, certificate, or smart card is referred to as "something the user possesses."

2. To authenticate, the user must provide something that only they know, such as a password or the answer to a security question. The system must then be able to verify the user's response to ensure proper authentication.

3. The user's identity can be confirmed through biometric characteristics, such as a fingerprint or iris scan, which represent something unique to the individual.

Wryly put, the authentication process can also involve three things: something that you have, something that you have forgotten, or something that you used to possess. The traditional approach of authentication, which involves a username and password, falls under the category of "something you have forgotten".
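The "one-time password generator" mentioned in category 1 above is typically an HOTP/TOTP device or app. A compact HOTP implementation following RFC 4226 (HMAC-SHA-1 over a counter, with dynamic truncation) can be written with only the standard library:

```python
import hmac
import hashlib
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA-1 over an 8-byte counter, dynamically truncated."""
    msg = struct.pack(">Q", counter)                   # big-endian 64-bit counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                         # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 4226 Appendix D test secret; the first two codes are 755224 and 287082.
print(hotp(b"12345678901234567890", 0))
print(hotp(b"12345678901234567890", 1))
```

TOTP, used by most authenticator apps, is the same construction with the counter replaced by the current Unix time divided by a 30-second step, which is why a stolen code expires almost immediately.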
Various alternative authentication methods have been suggested, such as biometrics, graphical passwords, and public key authentication. However, each of these methods has its own limitations and disadvantages, and none has fully replaced the traditional username and password combination that is widely used. Some of these alternative methods have been used as secondary authentication measures.

A. Token-Based Authentication

Token-based authentication is a type of authentication method that relies on the possession of an object, such as a code book, a card, a smart card, or a public-key-based certificate. In practice, user PKI certificates are not commonly used due to their complicated deployment and users' lack of comprehension. While this method is more secure than traditional credential-based authentication, it carries the risk of the token being lost or stolen. To mitigate this risk, the system must prevent replay attacks and protect the token with a password.

B. Biometric Authentication

Biometric authentication systems are used to identify and/or authenticate users based on their physical characteristics. Common methods include fingerprint recognition, iris recognition, and facial recognition. Biometric authentication systems suffer from a number of problems:

o Ensuring confidentiality is a desirable attribute for an authentication system, but it is challenging to achieve in biometric systems.
o Biometric systems are vulnerable to mimic attacks unless they are supervised.
o It is generally impractical to utilize biometrics for remote authentication over the internet, since users may not have access to the necessary sensors.
o It should be noted that biometrics, such as fingerprints, are not as exclusive to a person as they are often believed to be.

C. Alternative Knowledge-Based Systems

Several alternatives to traditional text-based passwords have been suggested, including graphical authentication systems. Supporters of this method argue that humans can recall pictures more easily than text, but such claims are often based on overly optimistic estimates of human memory capabilities, and graphical authentication systems can be limited in terms of usability. Additionally, many graphical authentication systems are vulnerable to "shoulder surfing," where unauthorized individuals can observe the user's login credentials. De Angeli, Coventry, and Renaud have categorized graphical authentication systems into three groups:

o Drawmetric schemes are a type of authentication method that requires users to create a unique drawing or pattern; to authenticate, users must then recreate this drawing or pattern. Examples of drawmetric schemes include Pattern Lock, which is used for authentication on Android phones, and Picture Password, which is used for authentication on Microsoft Windows 8.
o Cognometric systems, also referred to as searchmetric, involve a user choosing a familiar image (usually pre-determined by the user) from a group of other images intended to confuse or distract.
o Locimetric systems, which are also referred to as cued-recall-based systems, involve the identification of a sequence of positions within an image.

Graphical authentication systems are frequently used to authenticate personal devices, including smartphones, and are also used for internet authentication. Although they have not replaced traditional text-based passwords, they are often used as an additional authentication method. Another commonly used knowledge-based authentication scheme is the security question, where the user provides an answer to a question assumed to be private, such as their mother's maiden name. However, in practice, this information is often not entirely private and can be easily discovered by others.

LXXXVI. COMMON AUTHENTICATION PRACTICE

Authentication with a username and password is the most common method, with a security question as a backup option to reset the
password. However, security questions can be difficult for users to answer and easy for attackers to guess. Some analysts think that lying makes security questions harder to guess, but research shows that lies are harder to remember. Texting or calling the user's cell phone is another option for password recovery, but this method depends on the user having a cell phone with them. Another email address is also an option, but SMS is more reliable. Data centers can enforce more complex passwords, but they have no control over users who use the same password across multiple websites. Users often have trouble remembering multiple passwords and security questions for different sites, causing them to reuse the same information across sites. A safer option is to use a password manager to create and store strong passwords, security questions, and usernames. There is some controversy over this advice, but it is similar to Warren Buffett's advice to "put all your eggs in one basket, but be careful in that basket." It is also recommended that users consider using separate email addresses for different accounts and sign in via a single-use email message from other sites instead of logging in with a password.

LXXXVII. EXTREME AUTHENTICATION: KOREAN BANKING

In contrast to the low usage of user PKI certificates in other countries, South Korea is a notable outlier with a high adoption rate of around 60 percent of the population. This widespread use of PKI has allowed Korean banks to develop advanced authentication systems. For instance, when transferring a large amount of money to another bank, customers of a Korean bank must follow a series of steps:

1. The user must first install the bank's ActiveX plugins and acquire a digital certificate.
2. To access the bank's system, the user utilizes their digital certificate by providing the certificate password during login.
3. The account PIN must be entered by the user.
4. The bank has provided the user with a card that contains a set of two numbers that the user must enter for authentication.
5. The user receives a number from the bank via SMS on their cell phone, which they must input.
6. The user provides confirmation once more using their certificate.

This authentication method combines two factors that the user knows (PIN and certificate password) with three factors that the user has (certificate, code card, and cell phone) for increased security. However, despite its apparent robustness, there are several weaknesses and drawbacks in practice. The security plugins used for encryption, antivirus, anti-keystroke logging, and firewall are inadequate for the task, and malware can bypass the password protection of certificates stored on hard disks or USB keys by key logging or brute-force attack. Furthermore, the use of ActiveX has resulted in a Microsoft monopoly in Korea, leading to poor web accessibility, and Korean users tend to install ActiveX controls without realizing the potential security risks. As a result, Korean users are conditioned to "Click on O.K. all the time. Never, ever choose No!"

LXXXVIII. FUTURE WORK

Many articles promote alternative authentication methods because traditional ones are often easily compromised. However, this is mainly due to incorrect infrastructure configuration, lack of security measures, and
file. You are now ready to create your unclear policies. It's crucial to have multi-
document; Use the scroll-down window to the faceted keys for authentication that are created,
left of the MS Word formatting toolbar. distributed, and maintained on a different
communication channel to prevent common
channel attacks. Authentication servers are
1. To gain authentication, the user is maintained through directory services like
required to install no less than four LDAPs and Active Directory. Although security
measures are applied to the authentication
server, connections to the directory servers are Summary of Discussions at the 2014
often unsecure, and the database itself may be Raymond and Beverly Sackler U.S.-U.K.
transparent to front-facing services. The Scientific Forum. The National Academies
development of novel authentication
Press, 2015.
mechanisms for legacy systems requires specific
infrastructure development, which could be [118]J.
Bonneau, E. Bursztein, I. Caron, R.
carried out without changing everything in the Jackson, and M. Williamson, “Secrets, Lies,
existing infrastructure. and Account Recovery: Lessons from the
Use of Personal Knowledge Questions at
LXXXIX.CONCLUSION Google,” pp. 141–150, May 2015.
Protecting computer systems is challenging, especially because many users lack knowledge and expertise in this area, and providers often prioritize meeting minimum security requirements. Upgrading from the standard username and password authentication method has been a challenge. Nonetheless, users can take measures to enhance their security.
V. RESULTS

Fig 2: Heart Disease Prediction

Deep Learning: Applying neural network architectures like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to heart disease prediction can potentially provide better performance than traditional machine learning models.
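The note above names CNNs and RNNs, but for tabular clinical features of the kind this paper uses, the simpler neural baseline is a small feed-forward network. The sketch below is illustrative only: it uses scikit-learn's MLPClassifier on synthetic stand-in data, and every array shape and hyperparameter here is an assumption, not a value from the paper.

```python
# Illustrative sketch: a small feed-forward neural network on synthetic
# tabular data standing in for heart-disease features. Not the paper's model.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 rows, 13 numeric predictors (shapes are assumptions).
X = rng.normal(size=(200, 13))
# Make the label depend on two features so the problem is learnable.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Scaling matters for neural networks, mirroring the preprocessing step above.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Two hidden layers; sizes chosen arbitrarily for the sketch.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_scaled, y)

print(round(clf.score(X_scaled, y), 2))
```

A real comparison against the tree-based models would of course use the actual patient dataset and a held-out test split rather than training accuracy.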
Detection of Alzheimer's Disease Using Hybrid CNN Compared With Deep Learning Models: MobileNet Algorithm

KISHOR. B. K (20191CSE0260), School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India, kishorbkgowda36@gmail.com
KEERTHANA T (20191CSE0253), School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India, keerthanat2907@gmail.com
Abstract— This study primarily aims to investigate measures to reduce phishing in online transactions. Phishing is a cybercrime where hackers try to obtain sensitive information such as passwords and credit card details by pretending to be trustworthy. We have developed a website in which we propose a novel approach for reducing phishing in online transactions by applying machine learning algorithms that can be used to analyse a transaction's legitimacy, alert the user if any suspicious activity is detected, and prevent phishing attacks in real time. Our results show that the proposed approach is effective in reducing the number of phishing attacks in online transactions. This study offers valuable insights and practical recommendations that individuals and organizations can leverage to enhance their defences against phishing attacks during online transactions. The developed website also focuses on user experience (UX) and user interface (UI) design, considering elderly users, users from villages, those who are not technically savvy, and kids. It is also cost-effective, efficient, and accurate. We have chosen three algorithms for our study: random forest, support vector classification, and the LGB (LightGBM) classifier. Their advantages are as follows. Random Forest: high accuracy, robustness, feature importance, scalability, non-parametric, resilience to noise, interpretability, and versatility. Support vector classification: regularization capabilities, efficient handling of non-linear data, ability to solve classification and regression problems, and stability. LGB (LightGBM): highly accurate, efficient handling of categorical features, good handling of imbalanced data, flexibility, and customizability. Since these three algorithms have the most advantages and are highly effective, we have decided to use them in our study.

Keywords— Phishing, Machine Learning Algorithms: Random Forest, Support Vector, LGB (LightGBM) classifier.

INTRODUCTION

In recent years, the field of machine learning has experienced tremendous growth. It entails utilizing statistical models and techniques to let computer systems learn from data without being explicitly programmed. The way we analyse data, resolve difficult problems, and make judgments could be completely transformed by machine learning.

The ability of machine learning to process and analyse enormous amounts of data rapidly and effectively is crucial nowadays. Applications for machine learning algorithms can be found in several industries, including social media, marketing, healthcare, and finance. For instance, machine learning provides individualized suggestions in marketing and social media, disease diagnosis in healthcare, and fraud detection in the financial sector. Machine learning is now a vital tool for businesses and organizations to get insights, make wise decisions, and maintain market competitiveness due to the increasing amount of data available today.

Popular machine learning algorithms for classification and regression problems include Support Vector Machines (SVM), Logistic Regression, Decision Trees, and Random Forests.

Support Vector Machine (SVM) is a binary classification algorithm that uses a hyperplane to divide the data points into two categories to the best possible extent. SVM has a high degree of accuracy and is particularly helpful when working with datasets that have high-dimensional feature spaces. Another binary classification approach, which functions by modelling the likelihood that an event will occur, is logistic regression. It is applied when the independent variables are continuous or categorical and the dependent variable is binary. Decision trees are a well-liked machine learning approach utilized for classification and regression problems. They operate by dividing the data into subsets based on the values of the independent variables and then building a decision tree from the resulting subsets. Random Forest is an ensemble learning approach that combines multiple decision trees to increase the model's robustness and accuracy; it is especially helpful when dealing with noisy or complex datasets.

LightGBM (LGB) has proven to be an effective tool for phishing detection, a critical task in cybersecurity. In phishing detection, LGB is employed to analyse a diverse set of features extracted from URLs, email headers, and content. These features encompass domain characteristics, IP addresses, URL length, the presence of suspicious keywords, and more. By training LGB on a carefully labelled dataset comprising both legitimate and phishing instances, the model learns to identify complex patterns and relationships indicative of phishing attacks. During the training process, LGB employs its gradient boosting framework to construct an ensemble of decision trees. Through successive iterations, LGB continuously improves the model's ability to differentiate between legitimate and phishing instances by rectifying errors made by preceding trees. This approach allows LGB to effectively capture the nuances and subtle indicators of phishing attempts. The interpretability of LGB also proves valuable in phishing detection: it provides insights into the importance of different features, allowing analysts to understand the contributions of various indicators in identifying phishing attempts. The integration of LGB into a phishing detection system involves data preparation, feature extraction, model training, and real-time prediction. The trained LGB model becomes a crucial component of a larger system that incorporates real-time data collection, pre-processing, and user notification. By leveraging LGB's advantages, such as its efficient handling of features, accurate predictions, scalability, and interpretability, organizations can enhance their defence against phishing attacks, mitigate risks, and safeguard sensitive information.

Machine learning algorithms show promise in reducing phishing in online transactions, enabling faster and more accurate decision-making. In this paper, a pioneering method is introduced for classifying phishing prevention in online transactions, employing a range of widely adopted machine learning algorithms: Logistic Regression, Random Forest, SVM, and Decision Tree. The proposed model can efficiently and accurately reduce phishing, enabling faster and more accurate user decision-making. Furthermore, our model includes a user-friendly graphical user interface (GUI) that can be used to alert users. Overall, this research presents a significant contribution to cybersecurity, providing a powerful tool for users to make decisions and improve the reduction of phishing outcomes.

LITERATURE SURVEY

[1] Employed NB algorithms to identify malicious websites. NB is a slow learner and does not store previous results in memory; thus, the efficiency of the URL detector may be reduced.
[2] Utilized multiple ML methods for classifying URLs and compared the performance of different types of ML methods. However, there was no discussion of the retrieval capacity of the algorithms.
[3] Applied multiple classification algorithms for detecting malicious URLs. The outcome of the experiments demonstrated that the system's performance was better than other ML methods. However, it lacks the ability to handle a larger volume of data.
[4] Proposed a deep learning-based URL detector. The authors argued that the method could produce insights from URLs. Deep learning methods demand more time to produce an output; in addition, the method processes the URL and matches it against a library to generate an output.
[5] Developed a crawler to extract URLs from data repositories and applied a lexical-features approach to identify phishing websites. The performance evaluation was based on a crawler-built dataset, so there is no assurance of the effectiveness of the URL detector with real-time URLs.
[6] A CNN-based detection system for identifying phishing pages. A sequential pattern is used to find URLs. Existing research shows that the performance of CNN is better for retrieving images rather than text.

I. EXISTING WORK

Email filtering: Machine learning algorithms can be used to analyse the content and metadata of emails to determine whether they are likely to be phishing attempts. Emails identified as phishing attempts can be filtered out or flagged for review.

Website classification: Machine learning algorithms can be trained to classify websites as legitimate or phishing sites based on characteristics such as URL structure, content, and SSL certificate. Users can be warned or prevented from accessing known phishing sites.

User behaviour analysis: Machine learning algorithms have the capability to analyse user behaviour, allowing them to detect potential phishing attacks by identifying patterns such as a sudden increase in visits to unfamiliar websites or frequent input of login credentials, which could serve as indicative signals of a phishing attempt.

Domain analysis: Machine learning algorithms can analyse domain names and identify patterns commonly used in phishing attacks, such as misspellings or variations of well-known domains.

II. DATASET PREPARATION

The dataset includes 22 phishing-prediction-related factors and 651,190 observations. It was taken from Kaggle, and it consists of the following variables:

1) use_of_ip: a function to check whether the given URL contains an IP address (either IPv4 or IPv6).
2) abnormal URL: abnormal URLs may include misspellings or variations of popular websites, such as "g00gle.com" instead of "google.com."
3) google index: a function to see if the URL is indexed on Google.
4) count (.): a function to count the number of dots (.) in the given URL.
5) count (www): a function to count the number of occurrences of "www" in the URL.
6) count (@): a function to count the number of @ characters in the URL.
7) count_dir ("/"): a function to count the number of /'s in the given URL.
8) count_embed_domian ("//"): a function to count the number of //'s in the given URL.
9) short URL: a function to see if the URL is shortened.
10) count (https): a function to count the number of occurrences of "https" in the URL.
11) count (http): a function to count the number of occurrences of "http" in the URL.
12) count (%): a function to count the number of % characters in the URL.
13) count (?): a function to count the number of ? characters in the URL.
14) count (-): a function to count the number of - characters in the URL.
15) count (=): a function to count the number of = characters in the given URL.
16) URL length: a function to get the length of the URL.
17) hostname length: a function to get the hostname length.
18) sus_url: a function to detect suspicious words, if any.
19) count-digits: a function to count the number of digits in the URL.
20) count-letters: a function to count the number of letters in the given URL.
21) fd_length: a function to get the first-directory length.
22) tld_length: a function to get the length of the TLD from the tld column created above.

Several procedures would be involved in creating the dataset for the study, including:

1) Data cleaning: identifying any incorrect or missing data and determining how to deal with it (for example, imputing missing values or removing observations with missing data). It could also entail looking for outliers and deciding how to deal with them.
2) Data transformation: this could entail scaling, normalizing, or establishing new variables based on existing ones to make the data more analytically useful.
3) Feature selection: choosing a portion of the available variables for analysis depending on how well they relate to and predict the research topic.
4) Data division: the data will be divided into distinct training and testing sets, allowing for robust evaluation and verification of the model's performance.
5) Model training and evaluation: using the training set as the basis, several machine learning models might be developed and assessed, and their performance compared on the testing set.
6) Reporting the findings: the study would present the analysis' findings, along with any conclusions and suggestions based on them. The dataset would also need to be correctly referenced in the study to guarantee proper credit to the data source.

In conclusion, there are several critical processes in the preparation of this phishing-prediction dataset, including data cleaning, transformation, feature selection, data splitting, model training and evaluation, and reporting of the results. The dataset must be prepared correctly to produce accurate and trustworthy results and to guarantee the validity of any conclusions or suggestions made from the study.

III. ALGORITHM DETAILS

We are using the following machine learning algorithms.
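As groundwork, a few of the URL feature functions listed above can be sketched with the Python standard library. The original Kaggle-notebook implementations are not reproduced in the paper, so the names, the suspicious-word list, and the exact definitions below are assumptions for illustration only.

```python
# Hedged sketch of a few URL features described above (use_of_ip, counts,
# lengths, suspicious-word flag). Definitions are assumed, not the authors'.
import re
from urllib.parse import urlparse

# Assumed suspicious-word list; the paper does not give the real one.
SUS_WORDS = re.compile(
    r"paypal|login|signin|bank|account|update|free|bonus", re.IGNORECASE
)
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def extract_features(url: str) -> dict:
    # Add a scheme if missing so urlparse fills in the hostname.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.netloc
    return {
        "use_of_ip": int(bool(IPV4.match(host))),
        "count_dot": url.count("."),
        "count_www": url.count("www"),
        "count_at": url.count("@"),
        "count_dir": parsed.path.count("/"),
        "url_length": len(url),
        "hostname_length": len(host),
        "count_digits": sum(c.isdigit() for c in url),
        "count_letters": sum(c.isalpha() for c in url),
        "sus_url": int(bool(SUS_WORDS.search(url))),
    }

feats = extract_features("http://192.168.0.1/paypal-login/update")
print(feats["use_of_ip"], feats["sus_url"], feats["count_dir"])  # 1 1 2
```

Rows of such feature dictionaries, labelled phishing or legitimate, are exactly the kind of tabular input the classifiers below consume.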
• Random Forest

Random forest is a machine learning algorithm that can be used to prevent phishing attacks. It creates many decision trees, each trained on a subset of the available data. The decision trees are then combined to create a single, more robust model that can make accurate predictions about new, unseen data.

In phishing prevention, the random forest can be used to build a classifier that distinguishes between legitimate and phishing websites. The classifier can be trained on a dataset of known phishing and legitimate websites and then used to predict the likelihood that a new website is phishing.

The key advantage of using the random forest in phishing prevention is its ability to handle large and complex datasets and to identify the most important features distinguishing phishing from legitimate websites. This allows the algorithm to generalize well to new and unseen data, making it a powerful tool for detecting and preventing phishing attacks.

• LightGBM (LGB)

LightGBM focuses on instances with larger gradients to prioritize learning from informative examples. One of its key features is computational efficiency, achieved through several optimizations, such as histogram-based binning, which reduces memory usage and speeds up training. LightGBM also supports parallel and GPU learning, allowing it to handle large-scale datasets efficiently. In addition, LightGBM provides built-in support for handling categorical features, which are common in real-world datasets, converting them into numerical representations that can be processed by the algorithm. LightGBM offers a wide range of hyperparameters that can be tuned to optimize the model's performance; these control various aspects of the algorithm, such as tree structure, boosting parameters, regularization, learning rate, and more. Overall, LightGBM is known for its ability to handle large datasets, its speed, and its accuracy, and it has gained popularity in both research and industry due to its efficiency and effectiveness in solving a variety of machine learning tasks.
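The random-forest step described above can be sketched with scikit-learn. This is a stand-in, not the paper's code: the data below is synthetic, the five "feature" columns are hypothetical placeholders for extracted URL features, and all hyperparameters are arbitrary.

```python
# Hedged sketch: train a random forest on synthetic stand-in data and
# inspect feature importances (the "feature importance" advantage above).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Five hypothetical URL features; only columns 0 and 2 drive the label.
X = rng.normal(size=(400, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)  # 1 = phishing, 0 = legitimate

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=42
)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_tr, y_tr)

# The forest should rank the two informative columns (0 and 2) highest.
top_two = np.argsort(forest.feature_importances_)[::-1][:2]
print(sorted(top_two.tolist()), round(forest.score(X_te, y_te), 2))
```

On real data, the `feature_importances_` ranking is what lets analysts see which URL characteristics actually separate phishing from legitimate sites.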
• Support Vector Classification

Support vector classification is another machine learning algorithm that can be used to prevent phishing attacks. Like the random forest, it is a supervised learning algorithm that can be trained on a dataset of known phishing and legitimate websites to build a classifier that distinguishes between them.

Support vector classification works by finding the hyperplane that maximally separates the two data classes. In other words, it identifies the line (in two dimensions) or the plane (in three dimensions) that best separates the phishing and legitimate websites in the feature space. The algorithm then uses this hyperplane to classify new, unseen websites as either phishing or legitimate.

The key advantage of using support vector classification in phishing prevention is its ability to handle complex and non-linear data. Phishing websites can be very sophisticated and use various techniques to mimic legitimate websites, making it difficult to identify them based on simple features. Support vector classification can overcome this challenge by identifying the hyperplane that best separates the two data classes, even when the data is highly non-linear.

LGB's predictions can be used to trigger warning messages or flags when an email or URL is identified as potentially malicious. These warnings can be integrated into email clients, web browsers, or security software, providing users with alerts and advising them to exercise caution or avoid interacting with the flagged content. By integrating LGB's phishing detection capabilities into prevention systems, organizations can enhance their overall defence against phishing attacks. However, it is important to note that prevention efforts involve a combination of techniques, including user education, email and web filtering, multi-factor authentication, and other security practices, to effectively mitigate the risk of falling victim to phishing attacks.

IV. IMPLEMENTATION
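The support-vector step described above can be sketched with scikit-learn. The kernelized SVC handles the non-linear case via the kernel trick rather than a literal hyperplane in the input space; the data, decision rule, and parameters below are all illustrative assumptions.

```python
# Hedged sketch: an RBF-kernel SVC inside a scaling pipeline, trained on
# synthetic, non-linearly separable data standing in for real
# phishing/legitimate feature vectors.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
# Non-linear decision rule: class depends on distance from the origin,
# which no straight line in the input space can separate.
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X, y)

print(round(model.score(X, y), 2))
```

The RBF kernel implicitly maps points into a higher-dimensional space where a separating hyperplane exists, which is exactly the property the text appeals to for sophisticated phishing sites.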
VI. PROPOSED WORK

VII. CONCLUSION

Using machine learning algorithms such as Random Forest, Support Vector Classification, and the LGB classifier can greatly improve the prevention of phishing attacks on websites. By analysing large amounts of data, these algorithms can identify patterns and indicators of phishing attempts, allowing websites to take proactive measures to prevent them.

Random Forest algorithms can analyse features such as email sender information, URL characteristics, and message content to determine the likelihood of a phishing attempt. Support Vector Classification algorithms can classify emails and web pages as either phishing or legitimate based on their features and characteristics.

Using a combination of these algorithms, websites can improve their ability to detect and prevent phishing attempts, protecting their users from potentially harmful scams. Website owners need to prioritize the implementation of these machine-learning techniques in their security systems to maintain their users' safety and trust.

VIII. ACKNOWLEDGEMENT

We would like to express our sincere appreciation to Professor Rama Krishna K, Assistant Professor in the Department of Computer Science at Presidency University, for his invaluable guidance and support throughout our academic journey. His encouragement and insightful feedback have been instrumental in shaping our research work, and we would like to thank him for his mentorship and for providing us with opportunities to engage in meaningful research projects and inspiring lectures, which have contributed to our intellectual growth and development.

IX. REFERENCES

[3] E. Gandotra and D. Gupta, "An Efficient Approach for Phishing Detection using Machine Learning," Algorithms for Intelligent Systems, Springer, Singapore, 2021, https://doi.org/10.1007/978-981-15-8711-5_12.
[4] Hung Le, Quang Pham, Doyen Sahoo, and Steven C.H. Hoi, "URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection," Conference'17, Washington, DC, USA, arXiv:1802.03162, July 2017.
[5] Hong, "Lexical and Blacklisted Domains," Autonomous Secure Cyber Systems, Springer, https://doi.org/10.1007/978-3-030-33432-1_12.
[6] A. Aljofey, Q. Jiang, Q. Qu, M. Huang, and J. P. Niyigena, "An effective phishing detection model based on URL's character-level convolutional neural network," Electronics, 2020 Sep; 9(9):151.
Abstract—Heart disease is a significant cause of death and high-fat diets, can lead to hypertension, which can cause
worldwide, and its early detection and prediction can heart diseases. Heart diseases account for a significant
prevent its fatal consequences. Machine learning number of deaths worldwide, with more than 10 million
techniques have shown promise in predicting heart people succumbing to this condition each year. Early
disease accurately by utilizing patient data. This paper aims to explore the application of various machine learning models, including Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Gradient Boost Classifier, K-nearest neighbor, Naïve Bayes, Stochastic Gradient Descent, Support Vector Machine, and other ensemble methods, to predict heart disease in patients. The study utilizes a publicly available dataset that includes 303 patients with 14 features such as age, sex, chest pain type, blood pressure, and cholesterol levels. Data preprocessing involved handling missing values, encoding categorical features, and scaling numerical features. The models were trained and tested, and their performance was evaluated based on accuracy, precision, recall, and F1 score. The results indicated that the Random Forest Classifier outperformed the other models, achieving an accuracy of 90.0% in predicting heart disease. This study demonstrates that machine learning models can predict heart disease effectively and can be used as an early detection tool in clinical settings.

I. INTRODUCTION

[1] Heart Disease Prediction using Machine Learning Models is an innovative project aimed at developing an accurate and reliable predictive model to identify individuals at risk of developing heart disease. The project uses a variety of machine learning algorithms to create a powerful tool that can assist medical professionals in the early detection and prevention of heart disease.

[2] Any anomaly in the normal functioning of the heart can be categorized as a heart disease, and it can result in disturbances in other parts of the body. [3] Unhealthy lifestyle choices, such as smoking and alcohol consumption, increase this risk; early detection and adopting a healthy lifestyle are essential for preventing heart diseases. Medically, a healthy pulse rate should be between 60 and 100 beats per minute, and blood pressure should range between 120/80 and 140/90. Although heart diseases can affect both men and women of all ages, factors such as gender, diabetes, and BMI can contribute to their development.

The main focus of the healthcare industry today is to provide high-quality services and accurate diagnoses to patients. Although heart diseases have been identified as a leading cause of death worldwide, they can still be effectively managed and controlled. The timely detection of a disease is crucial to ensure its proper management and control. In this regard, our proposed work aims to detect heart diseases at an early stage, thus preventing any severe or fatal consequences.

[4] The primary objective of this project is to design a system that can analyze patient health records and identify the most critical features that contribute to the development of heart disease. By leveraging the power of machine learning models, the system can predict the likelihood of a patient developing heart disease, providing valuable insights to medical professionals on how to manage and treat their patients. This project is highly relevant in today's healthcare landscape, where heart disease is still a leading cause of death worldwide. By using cutting-edge machine learning techniques, this project has the potential to significantly improve patient outcomes and reduce healthcare costs associated with the management and treatment of heart disease.
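The preprocessing described in the abstract (imputing missing values, encoding categorical features, scaling numerical ones) can be sketched with scikit-learn; the toy records and column names below are illustrative stand-ins, not the paper's actual dataset:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy records standing in for the 303-patient dataset (columns are illustrative).
df = pd.DataFrame({
    "age": [63, 45, None, 52],
    "trestbps": [145, 130, 120, None],      # resting blood pressure
    "cp": ["typical", "atypical", "typical", "asymptomatic"],  # chest pain type
})

numeric = ["age", "trestbps"]
categorical = ["cp"]

preprocess = ColumnTransformer([
    # Impute missing numeric values with the median, then standardize.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    # One-hot encode the categorical chest-pain types.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X = preprocess.fit_transform(df)
print(X.shape)  # 4 rows: 2 scaled numeric columns + 3 one-hot columns
```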
II. LITERATURE REVIEW

Numerous studies have been conducted to predict heart diseases using machine learning. These studies have employed various data mining techniques and achieved varying levels of accuracy. The following section elaborates on these techniques:

[1] S. Srinivasan et al. propose the use of decision trees and random forest classifiers for predicting heart disease. The advantages of these methods include their interpretability and their ability to handle missing data. The limitations include their susceptibility to overfitting and their inability to handle non-linear relationships between features.

[2] A. K. Singh et al. compare the performance of naive Bayes and K-nearest neighbor algorithms for heart disease prediction. The advantages of these methods include their simplicity and efficiency. The limitations include their sensitivity to irrelevant features and the need for proper feature scaling.

[3] J. Chen et al. investigate the use of support vector machines and neural networks for heart disease prediction. The advantages of these methods include their ability to handle complex relationships between features and to generalize well to new data. The limitations include their computational complexity and the need for large amounts of training data.

[4] H. Wang et al. propose the use of ensemble methods, such as bagging and boosting, for heart disease prediction. The advantages of these methods include their ability to reduce overfitting and to combine the strengths of multiple models. The limitations include their increased complexity and the need for proper parameter tuning.

[5] A. Esteva et al. explore the use of deep learning methods, such as convolutional neural networks and recurrent neural networks, for heart disease prediction. The advantages of these methods include their ability to learn complex representations of data and to handle sequential data. The limitations include their high computational complexity and the need for large amounts of training data.

[6] S. K. Singh et al. propose the use of genetic programming for heart disease prediction. The advantages of this method include its ability to automatically discover complex relationships between features and to handle non-linear relationships. The limitations include its computational complexity and the need for proper parameter tuning.

[7] H. Raza et al. investigate the use of fuzzy logic for heart disease prediction. The advantages of this method include its ability to handle uncertainty and to incorporate expert knowledge. The limitations include its sensitivity to parameter tuning and its susceptibility to overfitting.

[8] A. Sharma et al. propose the use of principal component analysis for heart disease prediction. The advantages of this method include its ability to reduce dimensionality and to remove redundant features. The limitations include its inability to handle non-linear relationships and its sensitivity to outliers.

III. PROPOSED SYSTEM

1. Data Collection: Collect a dataset containing information about patients with and without heart disease, including demographics, medical history, lifestyle factors, and diagnostic test results.
2. Data Pre-processing: Clean and pre-process the data to remove any missing values, handle outliers, and normalize the data. This step also involves feature selection, where the most relevant features are selected to build the predictive model.
3. Model Selection: Evaluate the performance of different machine learning algorithms, such as KNN, decision trees, random forest, and support vector machines, for heart disease prediction. Select the most appropriate algorithm based on the evaluation metrics and the objectives of the project.
4. Model Development: Build the predictive model using the selected machine learning algorithm. Train the model on a portion of the dataset and evaluate its performance on the remaining portion using evaluation metrics such as accuracy, precision, recall, and F1-score.
5. Model Optimization: Fine-tune the model parameters and hyperparameters to achieve optimal performance. This step may involve using techniques such as grid search or Bayesian optimization to search for the best combination of parameters.
6. Model Validation: Validate the model on a new dataset to ensure its generalizability and robustness. This step may involve splitting the dataset into training, validation, and test sets, or using cross-validation techniques.
7. Model Interpretation: Interpret the results of the model and identify the most critical features associated with heart disease risk. This step may involve using techniques such as feature importance or partial dependence plots.
8. Evaluation: Evaluate the performance of the deployed model in real-world settings and monitor its performance over time. This step may involve collecting feedback from medical professionals and patients to improve the model's accuracy and usability.

Logistic Regression: A supervised learning algorithm that uses the logistic function in a logistic regression model to predict the
probability of an individual developing heart disease.
Support Vector Machine: A supervised learning algorithm
used for classification and regression. It may be used to
classify individuals as having or not having heart disease
based on their medical history, lifestyle, and other
characteristics.
These algorithms can be trained on heart disease datasets
and evaluated using various performance metrics such as
accuracy, sensitivity, and specificity. Based on their
performance, the most effective algorithm can be chosen for
heart disease prediction.
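The model comparison described above can be sketched with scikit-learn. Since the dataset itself is not reproduced here, a synthetic stand-in of the same shape (303 samples, 13 predictors) is generated; the scores it yields are therefore illustrative only, not the paper's reported 90.0%:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the 303-patient, 13-predictor heart dataset.
X, y = make_classification(n_samples=303, n_features=13, n_informative=8,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = (accuracy_score(y_test, pred), f1_score(y_test, pred))

best = max(scores, key=lambda n: scores[n][0])  # pick the best by accuracy
for name, (acc, f1) in scores.items():
    print(f"{name}: accuracy={acc:.3f} f1={f1:.3f}")
print("best:", best)
```

The same loop extends naturally to the other classifiers listed in the abstract (Naïve Bayes, SGD, SVM, gradient boosting) by adding entries to the `models` dictionary.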
V. RESULTS
Abstract— The farming community in India faces numerous challenges, including unpredictable weather, pests, and fluctuations in crop prices. To empower farmers with the necessary information to make informed decisions about crop prices, we have developed a website called "Smart System for Crop Price Prediction using Machine Learning." We use advanced machine learning algorithms like XGBoost, ARIMA, and VAR to analyze historical data and identify trends and patterns that help predict future crop prices accurately. Our website uses a comprehensive approach to determine the best algorithm for predicting crop prices, ensuring farmers have access to the most reliable and up-to-date information. Our ultimate goal is to provide farmers with the tools and knowledge to manage their crops effectively, minimizing financial risks and contributing to the growth of the agricultural sector. By leveraging the power of machine learning, we hope to make crop price predictions more accessible, reliable, and accurate, resulting in better planning and management of crop production.

Keywords—Smart System for Crop Price Prediction using Machine Learning, XGBoost, crop price predictions.

I. INTRODUCTION

Agriculture plays a crucial role in India's economy, with over 58% of the population relying on it for their livelihoods. However, the sector faces challenges such as labor shortages, changing consumer preferences, and price fluctuations, which can significantly impact farmers' incomes and the country's GDP. To address these issues, innovative technologies such as machine learning and farm automation have been adopted in agriculture. Machine learning algorithms can enhance crop productivity and quality by accurately predicting and estimating farming parameters. Accurate forecasting of crop yields and prices can also assist farmers in selling their produce at the right time and for a good price, thereby mitigating the financial risks faced by farmers due to price fluctuations after the harvest.

Our website, "Smart System for Crop Price Prediction Using Machine Learning," provides farmers with accurate crop price forecasts that enable them to plan and manage their crops better, resulting in fewer losses and better price management. By leveraging machine learning algorithms, our platform can improve farmers' decision-making capabilities and help mitigate the risks associated with agriculture, leading to better crop management and improved incomes for farmers. With the adoption of innovative technologies such as machine learning, the agriculture sector in India can become more efficient, productive, and sustainable.

II. LITERATURE REVIEW

[1] This research paper discusses an automated agriculture commodity price prediction system that utilizes machine learning techniques. The system was tested on datasets from the Malaysian agriculture industry, with the random forest algorithm found to be the most accurate and stable. The paper emphasizes the system's potential to assist farmers in decision-making.

[2] The article proposes using supervised machine learning algorithms to predict crop prices. The study compares the performance of six different algorithms and concludes that Random Forest and Support Vector Regression are the most effective in predicting crop prices, based on historical data.

[3] The paper explores the use of predictive analytics in agriculture to forecast the prices of Areca nuts in Kerala, India. The study employs a hybrid model that combines Artificial Neural Network (ANN) and Autoregressive Integrated Moving Average (ARIMA) models. The results suggest that the proposed model provides accurate forecasts for Areca nut prices.

[4] The paper proposes a crop prediction system using machine learning algorithms. The system uses historical data of crop yield and weather conditions to predict the yield of the upcoming crop season. The study compares the performance of different algorithms and concludes that the Random Forest algorithm provides the best accuracy for crop yield prediction.

[5] This review article discusses the application of random forest and decision tree regression for crop price prediction. The authors provide an overview of the importance of crop price prediction in agriculture and review various studies that have used these algorithms. They also suggest areas for future research in this field.

III. OBJECTIVE

A preliminary study will be conducted to investigate the viability of utilizing machine learning for two purposes:
• Firstly, to predict the modal price of specific crops through the application of machine learning algorithms.
• Secondly, to develop and deploy a website that utilizes an appropriate machine learning approach for crop price prediction.

IV. EXISTING METHODS

Decision Tree Regression, LSTM, ARIMA, and Vector Autoregression are popular methods that find use in machine learning and statistical applications. Decision Tree Regression uses recursive splitting of data based on input features to construct a tree-like model for predicting continuous numerical values. LSTM is a recurrent neural network that can handle long-term dependencies in sequential data, making it well-suited for tasks such as language modeling, speech recognition, and sequence prediction. ARIMA is a statistical model used for time series analysis and forecasting, which assumes stationarity of the time series and comprises an autoregressive component, a differencing component, and a moving average component. Vector Autoregression is a statistical model that analyzes the relationship between multiple time series variables, with each variable modeled as a linear function of its own lagged values. These methods have unique features and find applications in various data analysis and prediction tasks.
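The shared idea behind ARIMA's autoregressive component and Vector Autoregression — each value expressed as a linear function of its own lagged values — can be illustrated with a least-squares AR(1) fit in NumPy. The series below is simulated; a real project would typically use a dedicated library such as statsmodels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: x[t] = 0.8 * x[t-1] + noise.
true_phi = 0.8
x = np.zeros(500)
for t in range(1, 500):
    x[t] = true_phi * x[t - 1] + rng.normal()

# Estimate the lag coefficient by least squares on (x[t-1], x[t]) pairs --
# exactly the "linear function of its own lagged values" idea.
X_lag, y = x[:-1], x[1:]
phi_hat = (X_lag @ y) / (X_lag @ X_lag)

print(round(phi_hat, 2))          # should land close to the true 0.8
forecast = phi_hat * x[-1]        # one-step-ahead forecast
```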
A. Drawbacks of the existing methods

Decision Tree Regression can be susceptible to overfitting, especially if the tree is overly complex and deep, leading to poorer generalization performance on new data. Additionally, small changes in the data can significantly impact the tree structure and the predictions made.

LSTMs, on the other hand, while effective in handling long-term dependencies in sequential data, are computationally expensive and require more training time and resources compared to simpler models. Furthermore, LSTMs may encounter issues such as the vanishing gradient problem, especially for longer sequences, which can lead to suboptimal model performance.

V. PROPOSED METHODS

Our proposed solution is a web platform that uses machine learning algorithms to predict crop prices and suggest the best crop to cultivate. The platform aims to help farmers make informed decisions to avoid market fluctuations and maximize profits. By utilizing algorithms such as VAR, XGBoost, and ARIMA, our solution will improve the livelihoods of farmers and contribute to the growth of the agricultural sector in India.

Fig. 11. Proposed Architecture (Collections of Agricultural Datasets → Selection of the parameters → Prediction based on previous datasets → Result and suggestions)

A. Step 1:
The datasets have undergone collection and refinement, with a focus on historical data to identify trends and patterns that can help in predicting future crop prices.

B. Step 2:
Various analyses have been conducted to develop a prediction model; the input datasets provide information on the price and date in particular regions.

C. Step 3:
The prediction model is built using the XGBoost algorithm; crop analysis and prediction have been performed, taking into account various datasets.

D. Step 4:
Through the process of crop analysis and prediction, the price of a particular crop can be predicted, providing better insights to farmers. Armed with this information, farmers can make informed decisions on which crops to sow in order to decrease their losses.
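Steps 1–4 can be sketched end to end by framing a price series as a supervised problem with lagged prices as features and fitting a gradient-boosted tree model. Since the xgboost package and the real agmarknet data are not assumed here, scikit-learn's GradientBoostingRegressor and a synthetic seasonal price series stand in:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic monthly modal prices with trend and seasonality (illustrative only).
t = np.arange(120)
prices = 2000 + 10 * t + 300 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 50, 120)

# Build lag features: predict price[t] from the previous three months.
LAGS = 3
X = np.column_stack([prices[i:len(prices) - LAGS + i] for i in range(LAGS)])
y = prices[LAGS:]

# Chronological split: train on the past, hold out the most recent year.
X_train, X_test = X[:-12], X[-12:]
y_train, y_test = y[:-12], y[-12:]

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

mape = np.mean(np.abs(pred - y_test) / y_test) * 100
print(f"MAPE on held-out year: {mape:.1f}%")
```

Swapping in `xgboost.XGBRegressor` with the same `X`/`y` framing is a drop-in change; the chronological split matters because a random split would leak future prices into training.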
VI. SYSTEM REQUIREMENTS

A. Hardware Requirements
System: Intel i5.
Hard Disk: 512 GB.
RAM: 8 GB.
Network: Wi-Fi/mobile network.
Any desktop/laptop system with the above configuration or a higher level.

B. Software Requirements
Operating system: Windows 7 or higher.
Coding languages: Python, HTML, JavaScript.
Version: Python 3.7.0.
IDE: Python 3.7.0 IDLE.
ML packages: NumPy, Pandas, Sklearn, Flask, PymySql.
ML algorithms: ARIMA, Vector Autoregression, and XGBoost.
Other requirements: a verified resource for gathering the right dataset [6]: www.agmarknet.gov.in

VII. EXPECTED OUTCOMES

Our proposed website is designed to cater to the needs of farmers by incorporating advanced machine learning techniques like Vector Autoregression (VAR), XGBoost, and ARIMA to predict crop prices with a high degree of accuracy. These algorithms will aid farmers in making informed decisions about when to plant and harvest crops, which in turn can help minimize crop wastage and increase profits.

Moreover, our platform aims to provide comprehensive information about different crop types, including their characteristics, growth requirements, and harvest times. By offering insights on the optimal growing conditions for each crop, such as the type of soil, water requirements, and temperature range, farmers can make informed decisions about the most suitable crops to cultivate based on their specific geographic region and climate.

In addition, our website will offer detailed information on the different stages of crop growth, including planting, irrigation, fertilization, and pest control. With this valuable information, farmers can take the necessary measures to ensure optimal crop yield and reduce the risk of crop loss.

Overall, our website's combination of crop price prediction and detailed crop information will provide farmers with the necessary tools to make data-driven decisions about crop cultivation. This, in turn, will lead to higher profitability and reduced wastage.

VIII. CONCLUSION

In recent years, the adoption of machine learning algorithms in crop price forecasting has gained considerable attention due to its potential to provide more accurate predictions based on historical data and other relevant factors. After extensive research on various forecasting techniques, our team chose to utilize a multi-variate time series algorithm, specifically Extreme Gradient Boosting (XGBoost), for our project.

XGBoost is a high-level machine learning algorithm that has gained widespread popularity for its application in statistical projects, including time series forecasting. It is an ensemble algorithm that combines the predictions of multiple decision trees to generate more precise predictions. XGBoost can handle both categorical and continuous variables and is known for its efficiency, scalability, and accuracy.

The resulting web page will enable farmers to input details about their crops and receive real-time price forecasts based on the XGBoost algorithm. The implementation of XGBoost in this project will result in more precise predictions, enabling farmers to make better-informed decisions about the optimal time to sell their produce.

The use of machine learning algorithms, such as XGBoost, in crop price forecasting has the potential to bring a revolutionary change in agriculture in India, providing farmers with the necessary tools to make informed decisions and enhance their income. Our team's project showcases the potential of machine learning in agriculture and highlights the possibility of future innovations in this field.

IX. ACKNOWLEDGMENT

We would like to acknowledge the support and guidance of our project supervisor, Dr. Mohammadi Akheela Khanum, Presidency University, who provided invaluable insights and direction throughout the duration of this work. We also extend our thanks to the academic community, whose research and publications provided the foundation of our project. Special thanks go to the authors of the various papers and articles that we referenced in our work.

REFERENCES

[2]. Ranjani Dhanapal et al., "Crop price prediction using supervised machine learning algorithms," 2021 J. Phys.: Conf. Ser. 1916 012042.
[3]. Kiran M. Sabu, "Predictive analytics in Agriculture: Forecasting prices of Areca nuts in Kerala," Procedia Computer Science 171 (2020) 699–708.
[4]. Pavan Patil, Virendra Panpatil, Prof. Shrikant Kokate, "Crop Prediction System using Machine Learning Algorithms," Volume: 07 Issue: 02, Feb 2020, e-ISSN: 2395-0056.
CHANDANA S (20191COM0040), UZMA FATHIMA SHAIK (20191COM0213)
B.Tech Computer Engineering, Presidency University, Bangalore, India
201910100497@presidencyuniversity.in, 201910101646@presidencyuniversity.in
Abstract— Food quality is a key concern worldwide, and to reduce the rate of deterioration, it is crucial to keep the environment in food storage warehouses at a suitable temperature.

In general, most cooking methods will keep food fresh. Various chemicals or ingredients are added to food to make it look fresh or attractive. Most food is now preserved with chemicals that make it unhealthy. These pollutants can cause many diseases, which leads consumers to crave healthy food.

Today's food inspection methods are limited to weight, volume, color, and detection, so they cannot provide much of the information needed to judge food quality. The quality of the food must be tested and protected against decay and deterioration due to atmospheric factors such as heat, humidity, and darkness.

The Internet of Things (IoT)-based system to detect the quality of food products is an integrated detection and management information system made up of smart devices. It uses sensors to assess the quality and freshness of food and can identify food spoilage early, before symptoms appear. The proposed approach for managing food quality is highlighted in the article. The study improves people's quality of life by using intelligent sensor networks to alert people when food is about to expire or when particular aspects of the food packaging have changed. These methods can make greater use of this information in the future to reduce food spoilage.

Keywords—Food quality, IoT, Detection, Sensors

XCVII. INTRODUCTION

For any type of living thing to sustain the energy necessary for survival, food is a basic requirement. Nutrients and energy from a nutritious diet keep the body strong and active. Pesticides are frequently employed by farmers in agriculture to increase productivity, and these pesticides play a significant role in food contamination. Eating unhealthy food exposed to pesticides is like opening the door to sickness. Unhealthy food leads to disease, obesity, and nutrient deficiency. Young people today are very interested in living healthy lifestyles and taking care of their physical health. The quality of a meal is therefore crucial for maintaining fitness. In today's world, food poisoning is also a serious issue; it becomes the cause of numerous ailments. A thorough investigation is conducted to determine the food's quality. In order to better serve the needs of people, scientists are focusing on the types of bacteria that are present in food. The ability to know about food quality is largely made possible by science and technology. It is clear from the current scenario that we require a device that can
A research paper on a Food Quality Detection and Monitoring System was presented at the 2020 IEEE International Students' Conference on Electrical, Electronics and Computer Science by Atkare Prajwal, Patil Vaishali, Zade Payal, and Dhapudkar Sumit. Food plays a big part in our day-to-day life. With the development of globalization, the quality of food is diminishing day by day. In general, most cooking styles will keep food fresh. Varied chemicals or constituents are added to food to make it look fresh or appealing. Most food is now conserved with chemicals that make it unhealthy. These impurities can cause numerous illnesses, which multiplies the demand for healthy food from consumers. People need organic food for health. Thus, to avoid food problems without human intervention, we need tools like these that help determine food quality. Such tools should be used to guide us in our consumption of clean food.

Fig 1. Block diagram of proposed method

In this project we propose to make an electronic device that is capable of detecting food spoilage and gives an indication of whether the food substance is fit or unfit for human consumption. In this system, we use an ESP32 module as the basis of the system, connecting a gas sensor, a temperature sensor, and an LCD screen to display related information. The sensors calculate the freshness level and quality level of the food, and the interpreted readings are output to us on the LCD so that we can check the quality of the food. This is done with great care and sensor sensitivity.
CI. METHODOLOGY

Hardware Components

Arduino Uno

The Arduino Uno is a microcontroller board built around the ATmega328P. It boasts several features, including 6 analogue inputs, a 16 MHz quartz crystal, a USB connection, a power jack, an ICSP header, and a reset button. In addition, it comes with 14 digital input/output pins, 6 of which can be used as PWM outputs. The board contains everything needed to support the embedded controller, making it an excellent choice for both beginners and experts. Getting started is as simple as plugging the board into a computer using a USB cable, connecting it to an AC-to-DC adapter for power, or using a battery.

The gas sensor module comes equipped with both digital and analogue output pins. The digital pin sends a high signal when the concentration of these gases in the air exceeds a specific threshold level. The threshold can be adjusted using the on-board potentiometer. On the other hand, the analogue output pin generates a voltage signal that can provide an approximate measurement of the gas concentration in the surrounding air.

Gas Sensor

Gas sensors are instrumental in determining the concentration of gas in the surrounding environment and how it changes. By using electrical signals, gas sensors can provide information about the type and quantity of gas present, as well as any changes in gas concentration [91-93].

Pressure Sensor
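The gas sensor's threshold behaviour described above can be mimicked in plain Python; on the actual board this logic runs on the microcontroller, and the ADC values and threshold below are made-up illustrations, not calibrated figures:

```python
# Hypothetical ADC readings from the gas sensor's analogue pin (0-1023 scale);
# higher values correspond to a higher concentration of spoilage gases.
ADC_THRESHOLD = 400  # illustrative threshold, set via the on-board potentiometer

def classify_freshness(adc_reading: int) -> str:
    """Mirror the digital pin: report 'spoiled' once the threshold is exceeded."""
    return "spoiled" if adc_reading > ADC_THRESHOLD else "fresh"

readings = [120, 250, 410, 730]
for r in readings:
    print(r, classify_freshness(r))
```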
Abstract - Number plate detection is an image recognition technique that identifies vehicles by their number (licence) plates. The goal is to create and put into use a reliable vehicle identification system that uses the licence plate to identify the vehicle. This system can be put in place at the entryway of parking areas, toll stations, or any other private location, such as a college, in order to keep track of arriving and departing cars. It can be utilised to limit entrance to the building to authorised cars only. The created system takes a picture of the front of the car, finds the licence plate, and then scans the plate. Using image processing, the car licence plate is retrieved from the picture. Character recognition is accomplished via optical character recognition (OCR).

Python programming is used to identify licence plate numbers. For this project, we'll use Python Pytesseract to extract the letters and numbers from the licence plate and OpenCV to identify the licence number plates. We'll create a Python programme to automatically identify the licence plate.

Key words: Vehicle licence plate images, OpenCV, Pytesseract OCR, licence plate recognition
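Raw Pytesseract output is often noisy (stray punctuation, spaces, lowercase), so a common post-processing step is to clean it and validate it against the expected plate layout. The sketch below uses Python's re module; the Indian plate pattern is an illustrative assumption, not something specified in this paper:

```python
import re
from typing import Optional

# Typical Indian plate layout: 2 letters, 2 digits, 1-2 letters, 4 digits.
PLATE_RE = re.compile(r"[A-Z]{2}\d{2}[A-Z]{1,2}\d{4}")

def normalize_plate(ocr_text: str) -> Optional[str]:
    """Strip non-alphanumerics from raw OCR text and validate the result."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", ocr_text).upper()
    return cleaned if PLATE_RE.fullmatch(cleaned) else None

print(normalize_plate(" ka-01 ab 1234 "))  # KA01AB1234
print(normalize_plate("garbage"))          # None
```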
Summary: The suggested solution utilises the smart parking service (SPANS), a system for locating available parking spaces using computer vision methods. The proposed system makes use of the SPANS camera to collect images and information about parking spaces. The proposed system takes a picture of a vehicle when it is discovered and uses that image to determine the vehicle's number plate. As a result, the device saves the identification number, and public entities like traffic departments can access this information.

• Create a real-time ANPR system using OpenCV that integrates these components to enable reliable and efficient recognition of license plates from images or video streams.

VI. DESIGN CONSIDERATIONS

The design of the system aims at the following five points:
• Affordable: The system must be affordable, as price is one of the main factors kept in mind during the design phase.
• Movable: The system is to be movable and easy to use; this web app can be accessed through a phone as well.
• Accurate: The system must be accurate; thus, the most accurate algorithms have been chosen.

VIII. IMPLEMENTATION

Only the administrator can log in, and only if the username and password are correct; if not, an error message is shown. The user must sign in and have their identity validated before they can check the details and perform operations. Before acquiring access, they must first open the login page and submit the necessary data.

Page for the administrator: By logging into the page, the administrator can perform many operations: adding vehicle details to the database, removing vehicle details, and accessing the details of the gates through which vehicles enter and exit.

Website home page: From the home page, the website admin can start the camera feed of the system, where the detection of the vehicle and of the number plate takes place.

IX. CONCLUSION

This ANPR system developed using OpenCV has shown promising results in accurately recognizing number plates and has successfully achieved the primary objective of automatic number plate recognition.

This vehicle gate management system is fully automated and can be tailored to any commercial or industrial setting with minimal human intervention and programming. This system detects the vehicle without any high-end sensors, making it a cost-effective system. This system, with proper mechanical assistance and design, can be implemented for real-time use in any industrial or institutional parking area. This application can be broadly integrated with parking ticket vending machines, monitoring systems, RFID-enabled boom barriers, and so on.

X. FUTURE WORK

• Improve accuracy: ANPR systems are heavily dependent on the accuracy of the OCR engine. Various techniques can be explored to improve accuracy, such as pre-processing steps like noise reduction, image thresholding, and image enhancement.
• Integration with other systems: It can be integrated with other systems like traffic management, toll collection, and parking management systems to make these systems more efficient to use.
• Smartphone integration: It is possible to develop this project further as a mobile application that can be installed on a phone to make it much easier to use.

XI. REFERENCES

[1] "License plate recognition system using OpenCV in python" by G. Naresh Reddy and M. Veerraju. International Journal of Advanced Research in Computer Science, Volume 8, Issue 4, July-August 2017.
[2] "Automatic Number Plate Recognition System Based on OpenCV" by S. K. Singh and A. K. Singh. Proceedings of the 2017 International Conference on Computing and Communication Technologies (ICCCT), Volume 2, 10 October 2017.
[3] "Automatic Vehicle License Plate Recognition Using OpenCV and SVM" by D. Li, X. Li, and Q. Liu. Proceedings of the 2015 IEEE International Conference on Progress in Informatics and Computing (PIC), Volume 1, 2 December 2015.
[4] "Automatic number plate recognition system using OpenCV and Tesseract OCR" by S. S. Kumar and S. S. S. Sree. 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pages 1-6, 3 December 2017.
[5] "License Plate Recognition System using OpenCV and Convolutional Neural Network" by S. Goyal, V. Singh, and K. Kumar. 2020 IEEE 7th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), pages 1-5, 9 August 2020.
[6] Mahalakshmi, S.; Tejaswini, S. Study of Character Recognition Methods in Automatic License Plate Recognition (ALPR) System. Int. Res. J. Eng. Technol. IRJET 2017, 4, 1420–1426.
[7] Patel, C.I.; Shah, D.; Patel, A. Automatic Number Plate Recognition System (ANPR): A Survey. Int. J. Comput. Appl. 2013, 69, 21–33.
[8] Cheng, G.; Zhou, P.; Han, J. Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7405–7415.
[9] Cheng, G.; Zhou, P.; Han, J.; Xu, D. Learning Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection. IEEE Trans. Image Process. 2019, 28, 265–278.
[10] Saha, S.; Basu, S.; Nasipuri, M. Automatic Localization and Recognition of License Plate Characters for Indian Vehicles. Int. J. Comput. Sci. Emerg. Technol. IJCSET 2011, 2, 520–533.
Bike Crash Detection And Alert System Using IMU

Y Rashmi, Computer Science and Engineering, Presidency University, Bangalore, India, rashmirajuy@gmail.com
Yashaswini R, Computer Science and Engineering, Presidency University, Bangalore, India, yashaswini.r.121@gmail.com
Dr. Neha Singh, Assistant Professor, CSE, Presidency University, Bangalore, India, singhgaur.neha@gmail.com
IV. METHODOLOGY
The system works on 5 V with a 1 A load current. It has an ADXL335 accelerometer, a tilt sensor, a limit switch, and a push button as inputs to the Arduino, and a GSM 800C module, a GPS module, a buzzer, and an LED as outputs.

Fig. 2. Arduino UNO

The Arduino UNO is the brain of the project. It is an open-source microcontroller board based on the Microchip ATmega328P, to which the main project code is uploaded.

B. Tilt Sensor

C. Limit Switch

Fig. 4. Limit Switch

A limit switch is used to control a parameter and stop it from going too far; it does this automatically, without manual intervention. Here it serves as a sensor at the front and back of the bike: when the bike is hit, the switch registers the impact as a crash or accident. The limit switch does not need any power supply to function, as it works like a push-button switch; it simply transfers its state to the Arduino UNO.

E. GPS Module

Fig. 6. GPS Module

The GPS module is used to send the location coordinates of the bike crash site to the registered mobile numbers.

F. GSM 800C Module

Fig. 7. GSM 800C Module

The GSM module automatically sends data such as the GPS coordinates to the registered mobile contacts of the crash victim through an SMS.

EXPECTED RESULT
● The project complied with all of the high-level specifications established at its outset.
● The device will be able to distinguish crashes from drops of the bike or quick controlled stops, and will detect crashes with over 10 g of force with accuracy.
● The device can convey a message swiftly, with the time and location of the accident, to the emergency contact(s) within two minutes of the collision.
● Finally, the device is small enough to enable mounting on the majority of motorcycles.

CONCLUSION
This article discusses the pressing need for an efficient bicycle accident detection and notification system to reduce fatalities and injuries resulting from traffic accidents. To address this issue, we propose the development of a self-contained bicycle crash detection device that can accurately detect accidents and promptly notify relevant authorities with precise location information to expedite emergency response and save lives. The proposed device uses advanced technology and algorithms to detect accidents based on parameters such as impact force, collision angle, and sudden changes in speed or direction, and employs a cellular network to send a notification to emergency contacts. The implementation of this device has the potential to significantly reduce the fatality rate associated with bicycle accidents by ensuring prompt notification and response from emergency services. We conclude that the device represents a crucial step towards mitigating the alarming impact of bicycle accidents and saving lives by enabling timely and targeted intervention.

REFERENCES
[1] Nicky Kattukkaran, Mithun Haridas T. P. & Arun George, "Intelligent Accident Detection and Alert System for Emergency Medical Assistance".
[2] Aboli Ravindra Wakure, Apurva Rajendra Patkar, "Vehicle Accident Detection and Reporting System Using GPS and GSM," IJERGS, April 2014.
[3] Damini S. Patel & Namrata H. Sane, "Real Time Vehicle Accident Detection and Tracking Using GPS and GSM."
[4] N. Srinivasa Gupta, M. Nandini, "Smart System for Rider Safety and Accident Detection," IJERT, Vol. 9, Issue 06, June 2020.
[5] C. Prabha, R. Sunitha, R. Anitha (2014), "Automatic Vehicle Accident Detection and Messaging System Using GSM and GPS Modem," International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering.
[6] "Road Vehicle Alert System Using IOT," 2017 25th International Conference on Systems Engineering (ICSEng).
[7] A. Cismas, I. Matei, V. Ciobanu and G. Casu, "Crash Detection Using IMU Sensors," 2017 21st International Conference on Control Systems and Computer Science (CSCS), Bucharest, 2017, pp. 672-676. doi: 10.1109/CSCS.2017.103.
[8] M. M. Islam, A. E. M. Ridwan, M. M. Mary, M. F. Siam, S. A. Mumu and S. Rana, "Design and Implementation of a Smart Bike Accident Detection System," 2020.
[9] Brian Lin, Dhruv Mathur, and Alex Tam, "Bike Crash Detection," 2019.
[10] Jussi Parviainen, Jussi Collin, Timo Pihlstrom, Jarmo Takala, "Automatic crash detection for motorcycles."
[11] S. Kailasam, Karthiga, Kartheeban, R. M. Priyadarshani, K. Anithadevi, "Accident Alert System using Face Recognition," IEEE, 2019.
[12] Rajvardhan Rishi, Sofiya Yede, Keshav Kunal, Nutan V. Bansode, "Automatic Messaging System for Vehicle Tracking and Accident Detection," Proceedings of the International Conference on Electronics and Sustainable Communication Systems (ICESC), 2020.
[13] Md. Syedul Amin, Mamun Bin Ibne Reaz, Salwa Sheikh Nasir and Mohammad Arif Sobhan Bhuiyan, "Low Cost GPS/IMU Integrated Accident Detection and Location System," 2017.
[14] Jussi Parviainen, Jussi Collin, Timo Pihlstrom, Jarmo Takala, "Automatic crash detection for motorcycles."
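The crash rule the paper describes (an impact above roughly 10 g, or a front/back limit-switch hit, triggers an SMS with the GPS coordinates) can be sketched as pure detection logic. This is an illustrative sketch only, not the authors' firmware; the function names, threshold constant, and message format are assumptions.

```python
import math

# Assumed threshold from the paper's expected results: >10 g counts as a
# crash; smaller magnitudes are treated as drops or hard controlled stops.
CRASH_THRESHOLD_G = 10.0

def total_acceleration(ax, ay, az):
    """Magnitude of the 3-axis accelerometer vector, in g."""
    return math.sqrt(ax**2 + ay**2 + az**2)

def is_crash(ax, ay, az, limit_switch_hit=False):
    """Flag a crash on a >10 g impact or a front/back limit-switch hit."""
    return limit_switch_hit or total_acceleration(ax, ay, az) > CRASH_THRESHOLD_G

def alert_message(lat, lon):
    """SMS body the GSM module would send to registered contacts."""
    return f"Crash detected! Location: https://maps.google.com/?q={lat},{lon}"

# A hard controlled stop (~1.9 g) vs. a genuine impact (~11.2 g).
print(is_crash(1.5, 0.5, 1.0))   # braking -> False
print(is_crash(8.0, 6.0, 5.0))   # impact  -> True
print(alert_message(12.9716, 77.5946))
```

On real hardware the same check would run in the Arduino loop over ADXL335 readings, with the GSM module sending the message.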
Dr Chinnaiyan R
Professor, Department of CSE
Presidency University
Bengaluru, India
chinnayaiyan@presidencyuniversity.in

Ranjith Kumar Tallam (20191CSE0624)
Department of CSE
Presidency University
Bengaluru, India
201910100941@presidencyuniversity.in

Allada Bhanu Sai Subba Rao (20191CSE0027)
Department of CSE
Presidency University
Bengaluru, India
201910101058@presidencyuniversity.in
Introduction:
The traditional process of verifying educational certificates is often lengthy, costly, and prone to fraud. Educational institutions, employers, and other third-party verification agencies have to rely on intermediaries to verify the authenticity of certificates, which can take a long time and result in errors or inconsistencies. Blockchain technology offers a solution to this problem by providing a secure and decentralized platform for verifying educational certificates. This paper presents an educational certificate verification system using blockchain technology, which aims to streamline the verification process and make it more reliable, efficient, and cost-effective. The proposed system utilizes the immutability and transparency of blockchain to provide a tamper-proof and direct verification process that offers a secure and reliable method of verifying educational certificates. The proposed system eliminates the need for intermediaries, such as universities or third-party verification agencies, and allows for a direct verification process that is transparent, efficient, and cost-effective. The system uses smart contracts to automate the verification process, ensuring that all data is accurate and tamper-proof. The use of blockchain technology provides an immutable and transparent record of all verified certificates, enhancing the integrity and trust of the verification process. This paper presents the proposed Educational Certificate Verification System Using Blockchain, highlighting its key features, benefits, and potential impact on the education sector. The proposed system has the potential to transform the way educational certificates are verified, providing a more secure, efficient, and reliable method for all stakeholders involved.

Literature Survey:
[1] Ahmed, S., Yaqoob, I., Hashem, I. A. T., Khan, I., & Ahmed, E. (2019). Blockchain Technology: A Survey on Applications and Challenges. Journal of Network and Computer Applications, 126, 50-70. https://doi.org/10.1016/j.jnca.2018.09.017
Ahmed et al. (2019) wrote a comprehensive review paper on the application of blockchain technology. The authors first introduced the concept of blockchain and its key characteristics, including decentralization, transparency, immutability, and security. They then discussed the history and evolution of blockchain technology, from its inception as the underlying technology of Bitcoin to its current applications in various industries. The paper also delved into the technical aspects of blockchain, including consensus mechanisms, smart contracts, and cryptographic techniques.
Summary: The article explored blockchain's technical underpinnings, such as consensus processes, smart contracts, and cryptography methods. The authors emphasised the benefits of implementing blockchain technology, including improved security, reduced costs, and increased productivity.

[2] Brinkmann, M., & Böhme, R. (2019). Blockchain-Based Certificate Verification with Privacy-Preserving Revocation Checking. Computers & Security, 83, 267-283. https://doi.org/10.1016/j.cose.2019.01.008
Brinkmann and Böhme (2019) proposed a blockchain-based certificate verification system that provides privacy-preserving revocation checking. The authors identified the limitations of traditional certificate revocation mechanisms, which rely on centralized authorities to maintain and distribute revocation lists. These mechanisms can be slow, inefficient, and susceptible to attacks. The proposed system utilizes blockchain technology to create a decentralized, tamper-proof ledger of certificate revocation data.
Summary: The authors noted the drawbacks of conventional certificate revocation procedures, which depend on centralised authorities to maintain and disseminate revocation lists. These systems may be unreliable, ineffective, and vulnerable to intrusions. The suggested method develops a decentralised, tamper-proof ledger of certificate revocation information using blockchain technology.

[3] Lee, J. H., & Kim, T. H. (2019). A Blockchain-Based Certificate Issuance and Verification System for University Diplomas. International Journal of Distributed Sensor Networks, 15(2), 1550147719834755. https://doi.org/10.1177/1550147719834755
Lee and Kim (2019) proposed a blockchain-based certificate issuance and verification system for university diplomas. The authors identified the limitations of traditional paper-based diploma systems, including the risk of counterfeit diplomas and the difficulty of verifying the authenticity of diplomas from different institutions. The proposed system utilizes blockchain technology to create a tamper-proof and decentralized ledger of diploma information. The system also employs smart contracts to automate the diploma issuance and verification process, and incorporates digital signature technology to ensure the authenticity of the diploma issuer. The authors implemented a prototype of the proposed system and conducted experiments to evaluate its performance and effectiveness. The results demonstrated that the system can efficiently issue and verify diplomas while ensuring the authenticity and security of the diploma information.
Summary: The authors mentioned the risk of fake credentials and the difficulty of determining the legitimacy of degrees from various universities as drawbacks of traditional paper-based diploma systems.

Existing Method:
The existing method for issuing education certificates using blockchain is through the use of digital badges. Digital badges are electronic representations of achievements, skills, and knowledge that can be shared online. They are often linked to a blockchain, which serves as a secure and tamper-proof ledger of the badges.

Disadvantages:
1. Technical Complexity: Blockchain technology can be complex to implement and maintain, and may require specialized technical expertise. This can make it difficult for some educational institutions to adopt the technology.
2. Initial Cost: Implementing a blockchain-based system for education certificates can be expensive. There may be significant upfront costs associated with developing the system, purchasing and maintaining hardware and software, and training personnel.

Proposed System:
The proposed system implements education certificates using blockchain technology. By following the steps outlined, institutions can create a secure and tamper-proof certificate system that provides learners with greater control over their educational credentials.
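The tamper-evidence property the survey keeps returning to (an edited certificate no longer matches its on-chain record) can be illustrated with a hash-based sketch. This is not the paper's implementation: the record fields are invented, and a plain Python set stands in for the blockchain ledger.

```python
import hashlib
import json

def certificate_hash(record: dict) -> str:
    """Deterministic SHA-256 digest of a certificate record."""
    canonical = json.dumps(record, sort_keys=True)  # stable field order
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Stand-in for the on-chain store of issued-certificate hashes.
ledger = set()

def issue(record: dict) -> str:
    h = certificate_hash(record)
    ledger.add(h)          # in a real system: a blockchain transaction
    return h

def verify(record: dict) -> bool:
    """A certificate verifies only if its hash matches an issued entry."""
    return certificate_hash(record) in ledger

cert = {"student": "A. Kumar", "degree": "B.Tech CSE", "year": 2023}
issue(cert)
print(verify(cert))                               # True
print(verify({**cert, "degree": "M.Tech CSE"}))   # tampered -> False
```

Because any change to the record changes its digest, a verifier needs only the hash, not the issuing institution, which is the "direct verification" the proposed system relies on.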
Block Diagram:

Figure 21: Block Diagram

METHODOLOGY:
Admin:
Login: First, the Admin logs into the system with a username and password.
Upload details: After login, the Admin uploads the details of the company.
Upload to blockchain: The details are uploaded to the blockchain.
Hash code: A hash code is then generated.
Digital Signature: The system shows the signature of the company.

Company:
Login: The company must log into the system.
Signup: After login, the company should sign up into the system.
Scan: The system then scans the signature of the company.
Upload certificates: Next, the company uploads the certificate details into the system.
Generate signature: The system then generates the signature of the company.
Successful: The system shows the result, i.e., successful or unsuccessful.

Advantages:
1. Increased security: Blockchain technology provides a secure and tamper-proof system for storing and sharing certificates. Each certificate is stored in a block that is cryptographically secured, ensuring that the certificate cannot be altered or duplicated without leaving a trace.
2. Improved trust: With a blockchain-based certificate system, learners can be sure that their credentials are authentic and trustworthy. Employers and other third parties can verify the authenticity of a certificate without relying on the issuing institution.
3. Greater accessibility: With digital certificates stored on the blockchain, learners can access their credentials from anywhere in the world using a computer or mobile device. This makes it easier for learners to share their credentials with employers or educational institutions.

Software And Hardware Requirements:
System Specifications:
H/W Specifications:
1. Processor : Intel i5
2. RAM : 8 GB (min)
3. Hard Disk : 128 GB
S/W Specifications:
• Operating System : Windows 10
• Front end : React JS
• Technology/backend : Python
• IDE : VS Code

Architecture:

Figure 30: Company details

In the above screen, the company logs in; after login, the screen below appears. In the above screen, the admin can view the list of registered companies, and can now log out and sign up a new company to perform verification.
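The Admin/Company flow in the methodology (upload details, compute a hash code, generate a digital signature, then scan and verify) can be sketched as follows. All names are hypothetical, and a shared-key HMAC stands in for the signature step; a production system would use asymmetric signatures rather than a secret shared with verifiers.

```python
import hashlib
import hmac

ADMIN_KEY = b"institution-secret"  # assumption: signing key held by the issuer

def upload_to_chain(details: str):
    """Admin side: hash the uploaded details and sign the hash code."""
    hash_code = hashlib.sha256(details.encode()).hexdigest()
    signature = hmac.new(ADMIN_KEY, hash_code.encode(), hashlib.sha256).hexdigest()
    return hash_code, signature

def scan_and_verify(details: str, signature: str) -> str:
    """Company side: recompute and compare, yielding the paper's
    'successful' / 'unsuccessful' result."""
    hash_code = hashlib.sha256(details.encode()).hexdigest()
    expected = hmac.new(ADMIN_KEY, hash_code.encode(), hashlib.sha256).hexdigest()
    return "successful" if hmac.compare_digest(expected, signature) else "unsuccessful"

h, sig = upload_to_chain("Acme Corp, reg no. 12345")
print(scan_and_verify("Acme Corp, reg no. 12345", sig))   # successful
print(scan_and_verify("Acme Corp, reg no. 99999", sig))   # unsuccessful
```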
REFERENCES
1. Gafurov, I., & Khusanov, R. (2021). Blockchain
technology for securing education certificates: A systematic
literature review. Sustainability, 13(5), 2451.
2. Sari, S., & Celik, E. (2020). An analysis of blockchain
Abstract - The rapid advancement of technology has transformed the way we travel and explore new cities. In today's digital era, having a comprehensive city information guide at your fingertips has become essential for tourists and locals alike. Introducing CityScape, a cutting-edge website and app that provides an all-in-one city information guide, offering a wealth of essential information about any city in India. CityScape is a user-friendly and intuitive platform that caters to the needs of travelers and locals alike, providing comprehensive and up-to-date information on all the must-visit places in the city. One of the standout features of CityScape is its transportation information: the platform provides information on bus and train services, including routes and schedules. This enables users to easily plan their commute and navigate the city using public transportation, saving time and money. In addition to attractions and transportation, CityScape also includes practical information such as local weather forecasts and emergency contact numbers, making it a one-stop shop for all essential city information. Users can also look up attractions, restaurants, and hotels, gaining valuable insights to help them make informed decisions.

City information guide websites have emerged as a valuable resource for providing comprehensive information about a city's various amenities, services, and attractions. These websites aim to serve as a one-stop platform for users to access information about places to visit, hotels, schools, police stations, transportation details, and more. The purpose of this research paper is to explore the development and implementation of a city information guide website, which can be a useful tool for residents and visitors alike.

II. CURRENT STATE OF CITY INFORMATION GUIDE
Currently, there are numerous city information guide websites and apps available, ranging from government-run platforms to third-party options. These platforms offer varying degrees of comprehensiveness, usability, and accuracy. Some platforms rely on user-generated content, while others source information from official databases. In recent years, there has been a trend towards incorporating more interactive features, such as augmented reality and virtual tours, to enhance the user experience. However, there is still room for improvement in terms of standardization, accessibility, and data quality.

A. Abbreviations and Acronyms
• CityScape – website name

A. Architecture
The architecture of a municipal information guide website consists of a number of parts that operate in concert to deliver a thorough and user-friendly platform. Several of the architecture's essential elements include the following:

B. Modules
1) Landing Page:
REFERENCES
3. Vaishnavi Desai; Isha Ghiria; Twinkle Bagdi; Sanjay Pawar — Krishi Bazaar
Description: Krishi Bazaar is a mobile web-app that enables farmers to sell their produce directly to consumers without intermediaries. It allows farmers to post their products and prices, and consumers can browse and purchase them on the web-app.
Advantages: Krishi Bazaar offers a transparent marketplace for farmers to showcase their products, receive fair compensation, connect with potential customers, and save costs by eliminating intermediaries.
Limitations: Krishi Bazaar's impact in promoting fair market access for farmers may be limited due to poor internet connectivity and a lack of access to technology and digital literacy skills in rural areas.

4. Aina Marie Joseph; Nurfauza Jali; Amelia Jati Robert Jupit; Suriati Khartini Jali — eMarket for Local Farmers
Description: The study used the Rapid Web-application Development (RAD) methodology for the development of the eMarket web-application, which provides a solution for local farmers to sell their crops at a proper price.
Advantages: The eMarket web-application provides a solution for local farmers to vend their fresh produce through a mobile web-application, and also helps customers to acquire fresh produce easily and conveniently.
Limitations: Limited technology access or skills may hinder some farmers from using the platform.

However, the web-app has some weaknesses that limit its effectiveness. Firstly, the web-app's reach is limited to farmers who have access to smartphones and the internet, which may exclude those in rural areas with poor internet connectivity. Secondly, some farmers may not be comfortable using technology, which could limit their participation in the platform. Lastly, the information and recommendations provided by the web-app may not be sufficient for farmers with limited knowledge and resources. Overall, the web-app has the potential to be an effective tool in addressing the challenges faced by farmers, but its limitations and weaknesses need to be addressed to ensure it can reach and benefit all farmers.
Fig 7. Relationship between poverty and income in different states

A. Existing system
In the existing system, farmers need to struggle a lot to sell vegetables and grains. A farmer needs to pay a brokerage amount to a broker to sell his own products. Farmers also need to keep all records manually, which may take huge storage, and most of the existing applications are not user-friendly.

B. Drawbacks
• Storing information requires huge storage
• Need to maintain quantity records
• Need to keep records for selling and purchasing agriculture products
• No accuracy in work
• Extra security is needed to protect the data

C. Advantages of proposed system
• Provides searching facilities based on various factors, such as different forms of products in different seasons
• Manages the information of seasons and vegetables
• Shows the information and description of the seasons, vegetables, and grains
• Adding and updating of records for proper management of buying and selling vegetables and grains
• Weather forecast
• Bidding system
• Crop-related information
• Multiple languages
• Membership facility

VI. Future Work
Based on our analysis and evaluation of Farmers E-portal, we have identified several areas where the web-app can be improved to better address the needs of farmers. Firstly, the web-app could benefit from a more user-friendly and intuitive interface. While the web-app offers a range of features and functions, navigating through them can be confusing and overwhelming for some users. Therefore, simplifying the user interface and making it more intuitive would greatly enhance the user experience. Secondly, the web-app could include more detailed and tailored information for farmers, such as crop-specific advice and localized weather forecasts. This could be achieved through partnerships with local agricultural experts and meteorologists to provide more accurate and relevant information.

VII. Conclusion
In conclusion, the Farmers E-portal web-app offers a promising solution to address the challenges faced by farmers in India. Through its features such as easy access to market information, a bidding system for selling produce, and an open forum for discussion, the web-app provides a comprehensive platform for farmers to make informed decisions and connect with other stakeholders in the agriculture industry. While there are areas for improvement, such as enhancing the user interface and addressing connectivity issues in rural areas, the web-app's overall effectiveness in addressing the needs of farmers is significant. With continued efforts to improve and expand its reach, the Farmers E-portal web-app has the potential to significantly improve the livelihoods of farmers in India and contribute to the growth of the agriculture sector.

ACKNOWLEDGMENT
We would like to express our sincere gratitude to the team behind the Farmers E-Portal web-app for their cooperation and support in providing us with the necessary information and access to the web-app, which allowed us to conduct a thorough analysis and evaluation of its effectiveness in addressing the needs of farmers in India.

REFERENCES
[1] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3rd ed., Massachusetts: Pearson Education, 2005.
[2] D. Carrington, CSSE3002 Course Notes, School of ITEE, University of Queensland, 2008.
[3] IEEE Recommended Practice for Software Requirements Specifications, IEEE Standard 830, 1998.
[4] The Quint, news on agriculture issues.
[5] Nethrapal, on how much farmers earn.
[6] Nutr, "Recipe Menu Dev", 2005.
[7] Bayou and Bennet, "Agriculture Farming System", 1992.
[8] Software Engineering of Airline Reservation Systems by Web Services.
[9] GHIRS: Integration of OOPS System by Web Services.
[10] V. Swapna, M. Fridouse Ali Khan, "Design and Implementations of Web Application," International Journal of Engineering Research & Technology.
Websites: www.google.com, www.w3schools.com, www.javatpoint.com, www.java2s.com
Voice Assistant for Disease Diagnosis Using Machine Learning and Natural Language Processing

Dr Swati Sharma
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
swati.sharma@presidencyuniversity.in

Smitha Reddy S (20191CCE0061)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100730@presidencyuniversity.in

Shilpa N (20191CCE0058)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100306@presidencyuniversity.in

Thanusha M (20191CCE0076)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100204@presidencyuniversity.in

Sowhardh C K (20191CCE0065)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100737@presidencyuniversity.in
In recent times, healthcare applications have been increasingly adopting machine learning and natural language processing techniques. Among these applications, voice assistants for disease

In this section, a detailed description of dataset collection, model development, disease prediction, and voice assistant creation is given. The initial step in constructing a machine learning model is to collect data. Datasets were obtained from Kaggle, a data science platform. After data collection, the data is processed and divided into training and testing datasets. The datasets were then trained and tested with machine learning algorithms such as SVM, Naïve Bayes, Decision Trees, and Random Forest (RF); when compared for accuracy, RF was selected. This model is then integrated with the voice assistant program.

The following are the steps involved in the creation of the Voice Assistant for Disease Diagnosis.

3.1. Data Collection.

3.2. Data Preprocessing. Once preprocessing is done, the data is ready for training and testing.

3.3. Disease Prediction Using Random Forest. The proposed system uses the Random Forest algorithm to predict acute diseases. The processed dataset is split into train and test data using the train_test_split function from the sklearn library. The split data (symptoms and diseases) is then fitted to the Random Forest model for training. Later, the model is tested on the test dataset. An illustration of a Random Forest consisting of 3 different decision trees is shown in Fig 1. Each decision tree was trained using a random subset of the training data.
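The split-and-train procedure of Section 3.3 can be sketched with scikit-learn, whose train_test_split and Random Forest the paper names. The symptom matrix below is a tiny synthetic stand-in for the Kaggle dataset, not the study's data.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in: each row is a binary symptom vector,
# each label a disease class (0/1 here for brevity).
X = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 1, 0, 1],
     [1, 0, 1, 1], [0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0]] * 5
y = [0, 0, 1, 1, 0, 1, 0, 1] * 5

# Split into train and test sets, as described in Section 3.3.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Each tree in the forest is trained on a random (bootstrap) subset
# of the training data, matching the Fig 1 illustration.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

In the proposed system, the trained model's prediction would then be handed to the voice-assistant component for spoken output.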
Fig 2: Architecture of the proposed Voice Assistant for disease diagnosis system.

4. EXPERIMENTAL RESULTS
The experimental results of the proposed Voice Assistant system indicate that it achieved the desired outcomes. The accuracy of the Random Forest

5. CONCLUSION

6. FUTURE SCOPE
Future work can involve expanding the dataset to include a wider range of diseases and symptoms, and incorporating other technologies such as image recognition. The study can further be incorporated with neural networks to achieve a better understanding of the user's input and generate more accurate disease diagnoses. The system only accepts the user's input in the English language; it can be extended to include other regional and international languages. Future systems could explore the potential use of the system in clinical settings, where it could aid healthcare professionals in making more informed diagnosis and treatment decisions.

[4] Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu, "NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality", arXiv preprint arXiv:2205.04421, 2022.
[8] Dong Jin Park, Min Woo Park, Homin Lee, Young-Jin Kim, Yeongsic Kim & Young Hoon Park, "Development of machine learning model for diagnostic disease prediction based on laboratory tests", Nature Portfolio, 2021.
10) Sodium (sod): Sodium aids in the transmission of nerve impulses, the contraction and relaxation of muscles, and the preservation of the ideal ratio of water and minerals in the body.

17) Diabetes Mellitus (dm): Diabetes mellitus is a collective term for a number of conditions that have an impact on how your body utilizes glucose (blood sugar).

18) Coronary Artery Disease (cad): The coronary arteries, which supply blood to the heart, develop plaque buildup, which results in coronary artery disease.

19) Appetite (appet): A hunger-driven desire to eat.

20) Pedal edema (pe): Pedal edema is characterized by an abnormal fluid buildup in the ankles, feet, and lower legs, which results in swelling in the feet and ankles.

21) Anemia (ane): Anemia is a condition in which your body doesn't produce enough healthy red blood cells to adequately oxygenate your tissues.

XIII. METHODOLOGY FLOW DIAGRAM

XIV. FLOW DIAGRAM DETAILS
Dataset: A dataset is an example of how machine learning can aid in prediction, with labels representing the outcome of a specific prediction.

Data Pre-processing: Today's real-world datasets, particularly clinical datasets, are prone to missing, noisy, redundant, and inconsistent data. Working with poor-quality data yields poor-quality results. As a result, the first step in any machine learning application is to explore and understand the dataset in order to prepare it for the modelling stage. This is commonly referred to as data pre-processing: the stage in which distorted or encoded data is transformed so that the machine can easily analyse it. A dataset may be regarded as a set of data objects, which are labelled by a number of features that capture the basic characteristics of an object, such as the mass of a physical object or the time at which it was created. Missing values in the dataset can be either eliminated or estimated; the most common way to deal with them is to fill them in with the mean, median, or mode value of the corresponding feature. Because object values cannot be used for analysis, we must convert numeric values with object type to float64 type. Null values in categorical attributes are replaced by the most frequently occurring value in that attribute column. Label encoding is used to convert categorical attributes into numeric attributes by associating each unique attribute value with an integer; this converts the attributes to the int type. The mean value is calculated for each column and used to replace all missing values in that column; we use the imputer function to find the mean value of each column. After the data has been replaced and encoded, it should be split for training, validation, and testing. Training the data is the process by which our algorithms are taught to build a model. Validation is the portion of the dataset that is used to validate or improve our various model fits. Data testing is used to put our model hypothesis to the test.

Feature Selection: The method of computationally selecting the features that contribute the most to our prediction variable or output is known as feature selection. We used Ant Colony Optimization (ACO) to select the best features from the dataset in this study. It is a method for solving computational
problems that can be reduced to finding good paths through graphs. Artificial Ants are multi-agent methods inspired by real ant behaviour; the pheromone-based communication of biological ants is frequently used as the primary paradigm. Combinations of Artificial Ants and local search algorithms have emerged as the preferred method for a wide range of optimisation tasks involving some form of graph. Rather than accumulating pheromone intensities, this algorithm evaluates them during each iteration. The proposed algorithm alters a small number of features in subsets chosen by selecting the best ants. To evaluate the performance of the subsets, a classification algorithm must be used as the wrapper evaluation function.

Classification: This study used four classification algorithms: support vector machine (SVM), k-nearest neighbours (KNN), decision tree, and random forest. All of the classification algorithms performed well; the random forest algorithm outperformed all the other algorithms used.

SVM (Support Vector Machine): SVM is a supervised learning model that is commonly used in classification problems. The SVM algorithm is designed to find the optimal hyperplane that best separates all objects of one class from those of another class with the greatest margin between the two classes. To achieve satisfactory computational efficiency, objects that are far from the boundary are discarded from the calculation, while data points that are close to the boundary are kept and designated as "support vectors". The kernel functions of the SVM algorithm are radial basis function (RBF), linear, sigmoid, and polynomial.

KNN (K-Nearest Neighbour): There are three types of variables: continuous, nominal, and binary. Nominal variables such as specific gravity, albumin, and sugar are therefore used; we use KNN classification to convert all nominal variables to binary, and k values are selected. K-Nearest Neighbour is a simple machine learning algorithm based on the supervised learning technique. The KNN algorithm assumes similarity between the new case and the available cases, and places the new case in the category most similar to the available categories. The K-NN algorithm stores all available data and uses similarity to classify new data points; this means that when new data appears, it can be easily classified into a well-suited category. The K-NN algorithm can be used for both regression and classification.

Decision Tree: Decision Tree is a supervised learning technique that can be used for both classification and regression problems, but it is most commonly used for classification. It is a tree-structured classifier in which internal nodes represent dataset features, branches represent decision rules, and each leaf node represents the result. A decision tree has two kinds of nodes: decision nodes and leaf nodes. Decision nodes have multiple branches and are used to make decisions, while leaf nodes are the outcomes of those decisions and do not have any additional branches. The decisions or tests are based on the characteristics of the given dataset. A decision tree is a graphical representation of every option for solving a problem or making a choice under specific circumstances. It is called a decision tree because, like a tree, it begins with the root node and branches out from there.
polynomial. The radial basis function was chosen for Classification and Regression Tree algorithm, or
this study based on the results of nested cross CART, is used to construct a tree. A decision tree
validation. merely poses a query and divides into subtrees in
accordance with the response (Yes/No).
KNN: Knn Classification employs the Euclidean Random Forest: Breiman's "decision tree" machine
distance. It computes the distance between the new learning mechanism is the foundation of the bagging
element and other known element classes. In this ensemble method known as random forest (RF). In a
paper, the Chronic Kidney Disease dataset from the random forest, decision trees are the "weak learners"
UCI database is used, which has 25 variables and 400 in an ensemble. Random forest forces the diversity of
each tree separately by choosing a random feature.
After producing lots of trees, they cast their votes for XVII OBJECTIVES
Early-stage CKD goes undetected, and patients only
the class that is the most prevalent. The runtimes for
become aware of the severity of the condition once it
the random forest algorithm are significantly reduced, has progressed.
and it can handle unbalanced data. The supervised Consequently, a major challenge today is to identify
learning technique includes the well-known machine such a disease at an earlier stage. This initiative
learning algorithm Random Forest. In machine promotes early diagnosis and awareness among the
public. Early diagnosis and effective treatments may
learning, it can be used to solve both classification be able to halt or slow the progression of this
and regression issues. To solve a challenging issue persistent illness. The main criterion for the success
and enhance the performance of the model, RF of this project is the use of machine learning to
combines multiple classifiers. The random forest uses recognize behaviors or patterns of behavior in the
early stages of CKD in order to improve the quality
the predictions from each decision tree to predict the of life of patients.
final result based on the majority vote of predictions
rather than relying solely on one tree. The accuracy XVIII. GRAPHS
and risk of overfitting increase with the size of the
forest's tree cover. Classification
Feature Selection
Pus cell clumps
Hypertension
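The majority-vote mechanism of the random forest described in the methods can be illustrated with a toy stand-in. This is not the paper's model: the "trees" below are deliberately simplistic one-feature threshold stumps, and binary 0/1 labels are assumed; only the feature-bagging and voting ideas are shown.

```python
from collections import Counter
import random

def majority_vote(predictions):
    """Final label = the class most trees voted for (the RF bagging vote)."""
    return Counter(predictions).most_common(1)[0][0]

class TinyStumpForest:
    """Toy stand-in for a random forest: each 'tree' is a one-feature
    threshold rule on a randomly chosen feature (illustrative only)."""

    def __init__(self, n_trees=25, seed=0):
        self.n_trees = n_trees
        self.rng = random.Random(seed)
        self.stumps = []

    def fit(self, X, y):
        n_features = len(X[0])
        for _ in range(self.n_trees):
            f = self.rng.randrange(n_features)      # random feature: forces diversity
            t = sum(row[f] for row in X) / len(X)   # mean value as split threshold
            above = [label for row, label in zip(X, y) if row[f] > t]
            label_above = majority_vote(above) if above else 0
            self.stumps.append((f, t, label_above))

    def predict(self, x):
        # Binary labels assumed: the "below threshold" side predicts 1 - label.
        votes = [la if x[f] > t else 1 - la for f, t, la in self.stumps]
        return majority_vote(votes)                 # majority vote, not one tree

# Tiny two-feature example (values are made up)
X = [[0.0, 0.1], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]
forest = TinyStumpForest()
forest.fit(X, y)
```

An odd number of trees avoids tied votes, which is one reason ensemble sizes are often chosen odd for binary problems.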
XIX. OUTCOMES
This investigation allows us to propose a model for predicting chronic renal failure. Early diagnosis and treatment of CKD can be performed inexpensively, reducing the burden of ESRD, improving outcomes for diabetes and cardiovascular disease (including hypertension), and significantly reducing patient morbidity and mortality. Our intention is to provide, in the simplest possible way, an effective system that helps both physicians and patients to predict chronic kidney disease at an early stage. To better predict chronic renal insufficiency, future research should address a variety of supervised and unsupervised machine learning strategies, as well as feature selection strategies with additional performance measurements. Physicians and radiologists can benefit from a computer-aided diagnostic system that helps them reach better diagnostic conclusions, and our method enables doctors to treat more patients in less time. Appropriate feature selection methods reduce the number of features required by the prediction algorithm and thus the number of medical tests required.

XX. CONCLUSION
As a result, we believe that the practical diagnosis of CKD could benefit from employing this method. It could be used to gauge a person's likelihood of developing CKD in the future, which would be incredibly helpful and economical. This model might be integrated with routine blood report generation: if a person is at risk, the system flags them automatically, and patients would not need to see a doctor unless the algorithms flagged them. For the modern and busy person, this would make screening more affordable and simple. Using machine learning techniques, we developed a novel method for detecting CKD. We evaluated a dataset of 400 patients, 250 of whom were in the early stages of CKD. There are some noisy and missing values in this dataset, so we require a classification algorithm that can handle missing and noisy values. Additionally, in actual medical diagnosis, this method may be applicable to the clinical data of other diseases. Furthermore, through cost analysis of all 24 attributes, we identify a cost-effective, highly accurate detection classifier that uses only 8 attributes: specific gravity, diabetes mellitus, hypertension, haemoglobin, albumin, appetite, red blood cell count, and pus cell. Importantly, the findings of this study introduce new factors that classifiers can use to detect CKD more accurately than the current state of the art. The model's generalization performance may nevertheless be limited, and the model is unable to determine the severity of CKD because there are only two categories of data samples in the set: ckd and notckd. To predict CKD at an early stage, this system offered the best prediction algorithm. The models are trained and validated using the input parameters obtained from the CKD patients in the dataset. To perform the CKD diagnosis, learning models for the K-Nearest Neighbours classifier, decision tree classifier, logistic regression, and artificial neural networks are created. In order to train the model to detect the severity of the disease and improve its generalization performance, a large amount of more complex and representative data will be collected in the future. We believe that as the data grow in size and quality, this model will get better and better. To improve the identification of CKD, more research and studies in this field are required; this will help doctors spot the disease earlier and give patients the chance to regain their renal function.

REFERENCES

[1] Reshma S., Salma Shaji, S. R. Ajina, Vishnu Priya S. R., Janisha A., "Predicting Chronic Kidney Disease by Machine Learning", International Journal of Engineering Research and Technology (IJERT), Vol. 9, Iss. 7, 2020, pp. 137-140.

[2] Chen, G.; Ding, C.; Li, Y.; Hu, X.; Li, X.; Ren, L.; Ding, X.; Tian, P.; Xue, W., "Prediction of Chronic Kidney Disease Using Adaptive Hybridized Deep Convolutional Neural Network on the Internet of Medical Things Platform", IEEE Access, Vol. 8, 2020, pp. 100497-100508.

[3] Marwa Almasoud, Tomas E. Ward, "Detection of Chronic Kidney Disease using Machine Learning Algorithms with Least Number of Predictors", International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 10, No. 8, 2019, pp. 89-96.

[4] J. Xiao et al., "Comparison and development of machine learning tools for chronic kidney disease progression prediction", Journal of Translational Medicine, Vol. 17, No. 1, 2019, p. 119.

[5] I. A. Pasadana, D. Hartama, M. Zarlis, A. S. Sianipar, A. Munandar, S. Baeha, A. R. M. Alam, "Chronic Kidney Disease Prediction by Using Different Decision Tree Techniques", Journal of Physics: Conference Series, Vol. 1255, 2019.

[6] Cheng, L. C.; Hu, Y. H.; Chiou, S. H., "Applying the Temporal Abstraction Technique to the Prediction of Chronic Kidney Disease Progression", Journal of Medical Systems, Vol. 41, 2019, p. 85.

[7] Pinar Yildirim, "Predicting Chronic Kidney Disease from Unbalanced Data by Multilayer Perceptron", IEEE, July 2017, doi:10.1109/COMPSAC.2017.

[8] H. D. Mehr, A. Cetin, H. Polat, "Diagnosis of chronic renal disease based on support vector machine using feature selection approaches", Journal of Medical Systems, Vol. 41, No. 4, 2017, p. 55.
robinrohit.vincent@presidencyuniversity.in, 201910101478@presidencyuniversity.in, 201910101669@presidencyuniversity.in, 20191010998@presidencyuniversity.in, 201910102008@presidencyuniversity.in
ABSTRACT.
This project enables real-time tracking of a vehicle, mainly a bike, and seeks to minimize deaths caused by delays in the arrival of aid by alerting the concerned people about a mishap involving the vehicle. According to a government survey, drowsiness and drunk driving account for 22 and 33 percent of accidents respectively in India. The number of lives lost can be reduced if assistance is procured at the earliest. To develop a system that can notify the concerned people about the mishap, a GPS module, a GSM module, and an accelerometer are interfaced with a NodeMCU, which acts as the controller. The accelerometer detects the accident from a change in the vehicle's orientation and sends the location through the GPS module to a registered SIM card via the GSM module, without any intervention by the driver or passengers. The information can also be relayed to a guardian through IoT. The proposed system aims to reduce road-accident deaths by more than nine percent.
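The orientation-change detection the abstract describes might look roughly like the following sketch. The tilt threshold, axis convention, and SMS format are assumptions for illustration, not details from the paper:

```python
import math

TILT_THRESHOLD_DEG = 60.0  # assumed threshold; would be tuned on the real bike

def tilt_angle(ax, ay, az):
    """Tilt from the vertical, computed from accelerometer axes (in g units)."""
    return math.degrees(math.atan2(math.sqrt(ax * ax + ay * ay), az))

def crash_detected(ax, ay, az):
    """Flag a probable fall when the bike leans past the threshold."""
    return tilt_angle(ax, ay, az) > TILT_THRESHOLD_DEG

def alert_sms(lat, lon):
    """Text the GSM module would send to the registered SIM (format assumed)."""
    return f"Accident detected! Location: https://maps.google.com/?q={lat},{lon}"
```

On the actual NodeMCU the same logic would run in firmware, with the accelerometer polled in a loop and the message handed to the GSM module.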
yamanu.sjce06@gmail.com, rockchiranjeevi07@gmail.com, deepakcharie82965@gmail.com, chithrachithra4382@gmail.com, bencymohan21514525@gmail.com, saiprasadnaga332@gmail.com
ABSTRACT.
This paper presents a server-based FPGA resource pooling approach for cloud computing using a software implementation with JDK 8 (64-bit), MySQL, Apache, and HeidiSQL technologies. The proposed approach enables multiple users to share FPGA devices, improving resource utilization and reducing costs. We propose a web-based interface that allows users to access FPGA resources and allocate them based on their needs. The underlying infrastructure is built using the Apache web server, a MySQL database, and JDK 8 (64-bit) with HeidiSQL for database management. Our approach includes a resource allocation algorithm that ensures efficient use of FPGA resources while providing fair access to all users. We demonstrate the feasibility of our approach through a proof-of-concept implementation and performance evaluation. Our results show that our approach can significantly improve resource utilization and reduce costs compared to dedicated FPGA devices. The server-based implementation also simplifies FPGA resource management, as the FPGA devices can be centrally managed and allocated to users as needed. Overall, our software-based FPGA resource pooling approach can help accelerate the development of FPGA-based applications in cloud computing environments, particularly for users who cannot afford dedicated FPGA devices.
KEYWORDS: FPGA (field-programmable gate array), Java-based simulation, resource pooling, cloud service.
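The abstract names a fair allocation algorithm without specifying it; a minimal first-come-first-served sketch of the pooling idea might look like this (board names, queueing policy, and method names are illustrative, not the paper's implementation):

```python
from collections import deque

class FpgaPool:
    """Minimal sketch of FPGA pooling: a fixed set of boards shared among
    users, handed out first-come-first-served with a waiting queue."""

    def __init__(self, board_ids):
        self.free = deque(board_ids)   # boards nobody holds
        self.owner = {}                # board_id -> current user
        self.waiting = deque()         # users queued for the next free board

    def request(self, user):
        """Give the user a free board, or queue them if none is available."""
        if self.free:
            board = self.free.popleft()
            self.owner[board] = user
            return board
        self.waiting.append(user)
        return None

    def release(self, board):
        """Return a board; hand it straight to the next queued user, if any."""
        del self.owner[board]
        if self.waiting:
            self.owner[board] = self.waiting.popleft()
        else:
            self.free.append(board)
```

In the server-based design described above, this bookkeeping would live behind the web interface, with the allocation state persisted in the MySQL database.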
Fetal health classification

Arun Kumar S, Umar Haseeb, Rahul Kumar, Gopal Krishna Birabar, Vinay Gupta

arunkumar.s@presidencyuniversity.in, 201910100133@presidencyuniversity.in, 201910102192@presidencyuniversity.in, 201910101575@presidencyuniversity.in, 201910101636@presidencyuniversity.in, 201910102178@presidencyuniversity.in
ABSTRACT.
Fetal health classification is an essential aspect of modern obstetrics, and prenatal care aims to
prevent adverse pregnancy outcomes. Currently, one of the most reliable methods to assess fetal
health is through ultrasound imaging. However, manual interpretation of ultrasound images by
medical professionals can be subjective, time-consuming, and prone to human error. Recently,
deep learning models, such as Convolutional Neural Networks (CNN), have shown promising
results in medical image recognition tasks. In this article, we will explore the potential application of CNN models in fetal health classification. CNN models are a type of artificial
neural network commonly used in image recognition tasks. These models have shown high
accuracy in image classification, segmentation, and object detection tasks. In medical imaging,
CNN models have been used in various applications, such as breast cancer detection, skin lesion
diagnosis, and lung disease detection. In fetal health classification, CNN models can be trained
to recognize patterns and features in ultrasound images that are indicative of fetal health status.
One approach to training CNN models for fetal health classification is to use a large dataset of
ultrasound images labeled with fetal health status. These labels can be binary (e.g., healthy vs.
unhealthy) or multi-class (e.g., healthy, mild, moderate, and severe health conditions). Once the
dataset is labeled, it can be split into training, validation, and testing sets. The training set is used
to train the CNN model to recognize patterns and features in the ultrasound images, while the
validation set is used to tune the hyperparameters of the model. Finally, the testing set is used to
evaluate the performance of the trained CNN model. In fetal health classification, the
performance of the CNN model can be evaluated using metrics such as accuracy, sensitivity,
specificity, and area under the receiver operating characteristic (ROC) curve. The accuracy of the
model indicates the percentage of correctly classified images, while the sensitivity and
specificity measure the proportion of true positives and true negatives, respectively. The area
under the ROC curve is a measure of the overall performance of the model, where a value of 1
indicates perfect classification and a value of 0.5 indicates random classification.
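The evaluation metrics defined above follow directly from the confusion matrix; a small self-contained sketch (the convention 1 = unhealthy, 0 = healthy is assumed here for illustration):

```python
def confusion_counts(y_true, y_pred):
    """Counts of true/false positives and negatives (1 = unhealthy, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),  # fraction correctly classified
        "sensitivity": tp / (tp + fn),        # true-positive rate
        "specificity": tn / (tn + fp),        # true-negative rate
    }
```

The ROC curve mentioned in the text is obtained by sweeping the model's decision threshold and plotting sensitivity against 1 - specificity at each setting; the area under it summarizes the whole sweep.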
soumya@presidencyuniversity.in, leonejacob2001@gmail.com, Poornimachowdary9562@gmail.com, kumarswamymp2002@gmail.com, harshithkurapati23@gmail.com, ajaykumarmagham7@gmail.com
ABSTRACT.
Facial recognition technology is an emerging field that has revolutionized the way we interact with machines. It has numerous applications, one of which is voting systems. In this paper, we present a face recognition voting system that utilizes facial recognition technology to ensure a more secure and dependable voting process. The proposed system consists of three main components: face detection, face recognition, and voting. The system operates by first detecting faces in a given image or video feed, followed by recognition of the detected faces using a trained machine learning model. Once a face is recognized, the system retrieves the corresponding voter ID and checks whether the voter is eligible to cast their vote. If the voter is eligible, the system allows them to cast their vote, and the vote is recorded. The system also ensures that voters cannot vote more than once by maintaining a record of voters who have already cast their votes. Our experiments show that the proposed system is accurate, effective, and can be a valuable tool for ensuring a fair and secure voting process.
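The double-voting guard described above amounts to keeping a record of voter IDs that have already voted; a minimal sketch of that bookkeeping (the status strings and identifiers are illustrative, and the face-recognition step that produces `voter_id` is assumed to happen upstream):

```python
class VotingSession:
    """Sketch of the one-person-one-vote guard: once a recognized voter ID
    has voted, further attempts by the same ID are rejected."""

    def __init__(self, eligible_ids):
        self.eligible = set(eligible_ids)
        self.voted = set()   # record of voters who have already cast a vote
        self.tally = {}

    def cast(self, voter_id, choice):
        if voter_id not in self.eligible:
            return "not eligible"
        if voter_id in self.voted:
            return "already voted"   # duplicate attempt blocked
        self.voted.add(voter_id)
        self.tally[choice] = self.tally.get(choice, 0) + 1
        return "recorded"
```

In a deployed system this state would live in persistent, audited storage rather than in memory, but the eligibility and duplicate checks are the same.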
riyaz@presidencyuniversity.in, 201910100226@presidencyuniversity.in, 201910101191@presidencyuniversity.in, 201910101438@presidencyuniversity.in, 201910100578@presidencyuniversity.in, 201910101279@presidencyuniversity.in
ABSTRACT.
This project, an IoT-based smart garbage management system, is a smart system that will help keep our villages and cities clean. In our cities, public dustbins are often overloaded, which creates unhygienic conditions for people and leaves the place with a bad smell. To avoid all these problems, we are going to implement an IoT-based smart garbage management system. The dustbins are interfaced with an Arduino-based system having an ultrasonic sensor, along with a central system showing the current status of the garbage on a display and a web server with a GSM/GPRS module. To increase cleanliness in the country, the government has started various projects