NCRACIT-2023
ISBN: 978-93-5906-181-8
Organized By
www.presidencyuniversity.in
PROCEEDINGS of National Conference NCRACIT-2023
Committee List
CHIEF PATRON
Dr. Nissar Ahmed, Chancellor
PATRONS
Dr. D. Subhakar, Vice-Chancellor
Dr. Surendra Kumar A M, Pro Vice-Chancellor
Dr. Abdul Sharief, Dean - SOE
Dr. C. S. Ramesh, Dean - Research & Development
Dr. Sivaperumal, Director - International Relations
Dr. Sameena Ahmed, Registrar
GENERAL CHAIR
Dr. Md Sameeruddin Khan Dean - SoCSE & IS
GENERAL Co-CHAIR
Dr. Kalairasan C. Asso. Dean - SoCSE & IS
ADVISORY COMMITTEE
All HoDs, SoCSE & IS
CONFERENCE CHAIRs
Dr. Gopal K. Shyam, Prof. & HoD, COM & CEI
Dr. Manujakshi B C., Asso. Prof., SoCSE & IS
ORGANIZING COMMITTEE
Dr. Mujeer Mulla
Dr. Preethi
Mr. Vetrimani
Mr. Riyaz
PUBLICITY COMMITTEE
Dr. Madhusudan M V
Mr. Rama Krishna
Mr. Sanjeev K.
Mr. Mrutyunjaya M. S.
WEBSITE COMMITTEE
Mr. Amogh P. Kulkarni
Ms. Sreelatha P K
PUBLICATION COMMITTEE
Mr. Muthuraju V.
Mr. Yamanappa
Ms. Shilpa C.N.
SESSION COMMITTEE
Dr. Ila Chandrakar
Ms. Smitha Patil
Ms. Amirtha Preeya
Mr. Shivalingappa
REVIEW COMMITTEE
Dr. Sandeep Albert Mathias
Ms. Galiveeti Poornima
Dr. Harishkumar K S
CERTIFICATE COMMITTEE
Ms. Sneha Bagalkot
Ms. Priyanka V.
I am delighted to know that the School of Computer Science Engineering & Information Science is
hosting a National conference on April 28-29, 2023. This conference is an opportunity for researchers,
industry professionals, and students to come together and share their latest findings and innovations in
the field of computer science and engineering.
As we all know, the field of computer science and engineering is rapidly evolving and it significantly
impacts various sectors of our society. I encourage all of you to participate in this event by attending or
submitting your research work for presentation.
I would like to thank the organizing committee for their hard work and dedication in putting together
this conference. I also extend my gratitude to all the participants for their active engagement and
contribution toward making this event a success.
Chancellor,
Presidency University,
Bengaluru, India
I am delighted to note that the School of Computer Science Engineering & Information Science is
organizing a National Conference on Recent Advancements and Challenges in Information
Technology (NCRACIT – 2023). Certainly, this type of conference not only brings all the researchers
and students onto one platform but also inculcates a research culture among the entire fraternity of
education in the country, thereby contributing to the development of the nation.
I hope that this conference will induce innovative ideas among the participants, paving the
way for new inventions and technologies in Computing and Information Technology. I congratulate the
School and all faculty members for initiating the conduct of such a conference.
Vice-Chancellor
Presidency University,
Bengaluru, India
I am pleased to note that our university will be hosting a National Conference on Recent Advancements
and Challenges in Information Technology (NCRACIT – 2023). The conference proceedings will
cover a wide range of topics including Artificial Intelligence, Big Data, Cyber Security, the Internet of
Things, Cloud Computing, and many others. We hope that this event will provide a unique opportunity
for participants to engage in discussions, network with peers, and gain new perspectives on the latest
trends and challenges in Information Technology.
As the field of IT continues to rapidly evolve, it is crucial for us to stay updated with the latest research
and advancements. This conference will not only serve as a platform to share new ideas and concepts
but also help foster collaborations and partnerships within the academic and industrial communities.
We look forward to your active participation and contributions towards making this conference a
success.
Registrar
Presidency University,
Bengaluru, India
It is my pleasure to extend a warm welcome to all of you attending the National Conference on Recent
Advancements and Challenges in Information Technology (NCRACIT – 2023).
The field of IT is rapidly evolving, and this conference is an excellent opportunity for participants to
learn about the latest advancements, research, and challenges in the field. We are proud to host this
event, which brings together experts, academicians, industry professionals, and students from across
the country to share their knowledge and insights.
The conference proceedings will cover a broad range of topics, including but not limited to Artificial
Intelligence, Cybersecurity, Data Science, and the Internet of Things. We are confident that the
conference will provide a unique platform for participants to learn, share, and network with peers from
various domains of IT.
I would like to thank the organizing committee for their efforts in putting together this conference, and
I would also like to express my appreciation to all the participants for their active participation and
contribution to the event. We hope that this conference will be an enlightening and enriching
experience for all involved and will lead to further advancements in the field of IT.
It gives me great pleasure to welcome you all to Presidency University's National Conference on
Recent Advancements and Challenges in Information Technology (NCRACIT – 2023). As the Dean
of Computer Science Engineering & Information Science, I am thrilled to be a part of this important
event that brings together experts and professionals from different fields of IT.
This conference is a platform for researchers, academicians, industry professionals, and students to
share their knowledge and insights into the latest advancements in IT. The conference proceedings will
cover a wide range of topics, including Artificial Intelligence, Machine Learning, Cyber security, the
Internet of Things, and many others.
We believe that this conference will help us identify the key challenges and opportunities in the field
of IT and develop new strategies for addressing them. We also hope that it will foster collaborations
and partnerships between academia and industry, leading to innovative research and development in
the field.
As the world becomes increasingly reliant on technology, it is more important than ever to stay up-to-
date with the latest advancements and challenges. This conference provides an excellent opportunity
for us to do just that. I would like to express my gratitude to the organizing committee for their hard
work and dedication in making this conference possible. I also extend my warmest thanks to all the
participants for their active engagement and contribution to the conference.
Research at Presidency University is a culture and, to promote this, the university offers various
research programs. Dedicated faculty members and research scholars are undertaking
research in cutting-edge technologies. Research circles mentored by senior researchers
provide guidance to young members and instill research culture in the schools. The university
encourages research and aspires to become one of the best universities known for applied
research, and also encourages the dissemination of research outcomes through forums such
as this, one being organized by the School of Computer Science Engineering and Information
Science. I congratulate the school for organizing the National Conference on Recent
Advancements and Challenges in Information Technology (NCRACIT – 2023). I convey my
best wishes to the organizers and the participants and hope that the conference will open up
new avenues to tackle the latest technological issues.
I hope you all have a productive and enjoyable conference and look forward to seeing the
valuable insights and research that will be presented.
Dr. C. KALAIRASAN
Associate Dean - CSE & IS
Presidency University,
Bengaluru, India
Lung Cancer Detection using YOLO CNN
Algorithm
Presidency University
Bangalore, India
email: siraj.ahmed@presidencyuniversity.in
email: yadagurusai@gmail.com
Abstract— The main objective of this research is to create a computer vision algorithm which uses the YOLO (You Only Look Once) convolutional neural network (CNN) architecture to identify lung cancer in medical photographs. A series of computed tomography (CT) scan pictures will be used as the input for the proposed method, which will then output the likelihood that lung cancer is present in the input image. The input photos will be subjected to object detection using the YOLO CNN architecture, allowing for the location of possible malignant spots. In order to further refine the discovered areas and categorize them as cancerous or non-cancerous, the output of the YOLO network will be processed through further layers of convolutional neural networks. A sizable collection of CT data will be used to train the suggested method.

Keywords—YOLO, CNN, CT

I. INTRODUCTION

Big data in health has grown over the past few years due to the quick advancement of computer technology and medical data. The use of technology in medicine has become more prevalent in recent years. Numerous domains that merge medical, computer science, biology, mathematics, and other sciences are involved in this technique of employing medicine. It is supported by vast biological data and sophisticated computer technologies. It makes use of artificial intelligence to uncover the underlying principles and physiological causes of human illnesses by examining vast volumes of data. Following that, clinical diagnoses are made using this knowledge, and medical services are provided.

Contrary to conventional machine learning methods, deep learning does not require manual feature extraction, which boosts time and resource efficiency. Deep learning is carried out via neural networks, which are composed of neurons. In neural networks, which include many neurons in each layer, the input of the following layer is regarded as the upper layer's output. The neural network may use nonlinear processing and connections between layers to change the input into the output. More importantly, the high-level network automatically learns more abstract and generalized characteristics from the input, overcoming the limitation that machine learning requires explicit feature extraction.

II. PROPOSED METHODOLOGY

The model process contains 4 phases of work:
1. Image Preprocessing
2. VGG16 Implementation
3. Comparison of VGG16 with ML Algorithms
4. Deployment of Model in App
1. Image Preprocessing:
The initial stage of our model is this. Our dataset of CT pictures of lung cancer was taken from Kaggle. The dataset was divided into three categories: adenocarcinoma, squamous cell carcinoma, and normal. Test, Train, and Validate categories are used to categorize each sort of cancer cell picture. We executed picture data augmentation procedures including rescale, horizontal flip, and rotation of the photos after resizing them to 350*350.

2. VGG16 Implementation:
In addition to the input and output layers, a CNN also has a number of hidden layers. An instance of a CNN is VGG16. The model's creators studied the networks and increased the depth using an architecture with extremely tiny (3x3) convolution filters, which demonstrated a considerable advancement over the state-of-the-art setups. The depth was raised to 16–19 weight layers, or around 138 million trainable parameters.

Fig. 1.1. VGG16 Architecture

1. In VGG16, the number 16 denotes 16 weighted layers. VGG16 consists of 21 layers overall—13 convolutional layers, 5 max-pooling layers, and 3 dense layers—but only 16 of them are weight layers, also referred to as learnable-parameter layers.

2. The input tensor for VGG16 has three RGB channels and a size of 224 x 224.

3. The unique characteristic of the VGG16 model is that it constantly uses the same padding, with a 2x2 max-pool filter with stride 2, and uses 3x3 convolution filters with stride 1 rather than a lot of hyper-parameters.

4. Both convolution and max-pool layers are distributed evenly across the design.

5. Conv-1 has 64 filters, Conv-2 has 128 filters, Conv-3 has 256 filters, and Conv-4 and Conv-5 have 512 filters each.

6. A stack of convolutional layers is followed by three fully connected layers: the first two have 4096 channels each, while the third has 1000 channels and performs 1000-way ILSVRC classification. The final layer is a soft-max layer.

VGG16 USED FOR:
VGG16 is an object identification and classification approach that, when used to classify 1000 images into 1000 separate categories, has an accuracy rate of 92.7%. It is a popular method for categorizing photos and is easy to use with transfer learning.

3. Comparison of VGG16 with ML Algorithms:
We have constructed the following models:
1. Support Vector Machine (SVM)
2. K-Nearest Neighbours (KNN)
3. Random Forest Classifier (RFC)
We employed these ML algorithms and compared their accuracies and various parameters with VGG16 (the CNN model).

4. Deployment of Model in App:
To deploy our model in an application using TensorFlow converter function libraries, we transformed it to TensorFlow Lite. The app was developed using Android Studio, and after deploying the TensorFlow model and using a CT picture as input, it can detect the kind of cancer and display some of its symptoms and therapies. The app also includes information on lung cancer and its many kinds.

III. MERITS AND DEMERITS

1. "Deep Learning Predicts Lung Cancer Treatment Response from Serial Medical Imaging": In order to predict lung cancer, this study employed techniques like recurrent neural networks (RNN) and convolutional neural networks (CNN). However, the paper was only able to look into a particular type of scanner from a single CT provider.

2. "Pancreatic Ductal Adenocarcinoma: Machine Learning Based Quantitative Computed Tomography Texture Analysis For Prediction Of Histopathological Grade": The authors of this work employed the Support Vector Machine (SVM) and Logistic Regression Analysis methods. But, due to the small number of enrolled patients, overfitting may result. Here, the CT imaging parameters vary.

3. "Lung Cancer Detection: A Deep Learning Approach": In this study, they presented a method for applying deep residual learning to identify lung cancer from CT images. To determine the possibility that a CT scan contains cancer, they combined the predictions from various classifiers, including XGBoost and Random Forest.

Tab. 1.1 Literature survey

Recurrent Neural Network, Logistic Regression Analysis, Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA) are some of the current methodologies utilized for lung cancer prediction.

1) Recurrent Neural Network
Recurrent neural networks (RNNs), a type of neural network, use the outcome from the previous stages as the input for the following phase. In scenarios where it is important to predict the following word in a phrase, conventional neural networks contain separate inputs and outputs. As a result, the RNN was created, utilizing a hidden layer to discover a resolution to this issue. The basic and most important property of RNNs is that the hidden state preserves part of the sequence's information.

2) Logistic Regression Analysis
Using prior observations from a data set, a statistical analysis method known as logistic regression predicts a binary result, such as yes or no. A logistic regression model predicts a dependent data variable by looking at the association between one or more already existing independent variables.

3) Support Vector Machine
The classification process uses the ML algorithm SVM. It is an algorithm for supervised learning. SVM is mostly employed to create a hyperplane between two classes that may categorize n-dimensional space, so that plotting future data points in the appropriate category is simple. For instance, suppose we need to separate two classes efficiently. If you group them according to only one attribute, there could be some overlap, as the graph below shows. As a result, we continue to add attributes to ensure accurate classification. We obtain improved accuracy using the deep learning methodology in comparison to other ML techniques. Our algorithm performs effectively even with smaller data sets since we enhance the supplied dataset with picture data augmentation. The model cannot accurately predict the kind of cancer if the CT scan images are not acquired properly or if the image is not clear. Only one CT picture viewpoint is presently supported by the model.

IV. CONCLUSION
We used machine learning methods in this study and compared the outcomes to the VGG16 model. To find the most effective algorithm for determining if a CT picture contains cancer or not, we calculated the Accuracy and Precision of every machine learning algorithm as well as the Accuracy of VGG16. This model may be used to forecast real-time CT scans. As a result, we employed Android Studio to create an app for user experience.

REFERENCES
[1] Huang, T. et al. Distinguishing lung adenocarcinoma from lung squamous cell carcinoma by two hypomethylated and three hypermethylated genes: a meta-analysis. PLoS ONE 11, e0149088 (2016).
[2] Davidson, M. R., Gazdar, A. F. & Clarke, B. E. The pivotal role of pathology in the management of lung cancer. J. Thorac. Dis. 5(Suppl 5), S463-S478 (2013).
[3] Aisner, D. L. et al. The impact of smoking and TP53 mutations in lung adenocarcinoma patients with targetable mutations-the lung cancer mutation consortium (LCMC2). Clin. Cancer Res. 24, 1038-1047 (2018).
[4] Hosny, A. et al. Deep learning for lung cancer prognostication: a retrospective multi-cohort radiomics study. PLoS Med. 15, e1002711 (2018).
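The comparison the paper describes between the classical classifiers (SVM, KNN, RFC) and the CNN can be sketched with scikit-learn. The synthetic feature matrix below stands in for flattened CT-image features, since the actual Kaggle dataset and training pipeline are not reproduced here; `compare_models` and all parameter values are illustrative, not taken from the paper.

```python
# Hedged sketch: compare SVM, KNN and Random Forest test accuracies,
# as in the "Comparison of VGG16 with ML Algorithms" phase. Synthetic
# data replaces the real CT features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def compare_models(X, y, seed: int = 0) -> dict:
    """Return held-out accuracy for each classical model."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    models = {
        "SVM": SVC(kernel="rbf"),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "RFC": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    return {name: m.fit(X_tr, y_tr).score(X_te, y_te)
            for name, m in models.items()}

if __name__ == "__main__":
    # 3 classes mirror adenocarcinoma / squamous cell carcinoma / normal.
    X, y = make_classification(n_samples=300, n_features=64,
                               n_informative=16, n_classes=3,
                               random_state=0)
    for name, acc in compare_models(X, y).items():
        print(f"{name}: {acc:.2f}")
```

In the paper's workflow the same accuracy numbers would then be set against the VGG16 result to pick the deployed model.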
Prediction of diseases using machine learning
algorithms
V. OVERVIEW
The dataset we have considered consists of 132
symptoms, the combination or permutations of which
leads to 41 diseases. Based on the 4920 records of
patients, we aim to develop a prediction model that takes
in the symptoms from the user and predicts the disease they are most likely to have.
The considered symptoms are:
The disease prediction system is implemented using three data mining algorithms, i.e. the Decision tree classifier, Random forest classifier and Naïve Bayes classifier. The description and working of the algorithms are given below.

A known issue with the decision tree algorithm is overfitting: it appears as if the tree has memorized the data. Random Forest prevents this problem; it is a version of ensemble learning. Ensemble learning refers to using multiple algorithms, or the same
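The three classifiers named above can be sketched with scikit-learn. The binary symptom vectors and disease labels below are synthetic stand-ins for the 132-symptom / 41-disease dataset, which is not reproduced here; combining the models by majority vote is one possible design, since the paper does not specify how their predictions are merged.

```python
# Hedged sketch of the three-classifier disease predictor described
# above. Synthetic 0/1 symptom vectors replace the real patient records.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

N_SYMPTOMS, N_DISEASES = 132, 41  # sizes stated in the overview

def train_predictors(X, y):
    """Fit the three data-mining algorithms used by the system."""
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "naive_bayes": GaussianNB(),
    }
    for m in models.values():
        m.fit(X, y)
    return models

def predict_disease(models, symptoms):
    """Majority vote over the three models for one symptom vector."""
    votes = [int(m.predict(symptoms.reshape(1, -1))[0]) for m in models.values()]
    return max(set(votes), key=votes.count)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(500, N_SYMPTOMS))   # 0/1 symptom flags
    y = rng.integers(0, N_DISEASES, size=500)        # disease labels
    models = train_predictors(models := None or X, y) if False else train_predictors(X, y)
    print(predict_disease(models, X[0]))
```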
VI. DASHBOARD:

New Relic provides all of the performance information on the dashboard; none of it has to be modified. On the "browser page load time," you

APM, Browser, Synthetics, Infrastructure, and Insights are just a few of the capabilities and technologies that New Relic offers. Try out several features to see whether they can help you optimize your infrastructure or apps.

Learn from community resources: New Relic has a sizable user base that frequently exchanges advice and best practices. Use this community to your advantage by reading the material, watching the tutorials, and participating in user groups or forums.

Fig. 3

You may test out New Relic and discover how to utilize its tools to optimize your apps or infrastructure by following these steps.

We built an e-commerce website called 'The Gadget House' and we are monitoring it using New Relic.
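The dashboard metrics discussed in this section (page load time, error rate, throughput) are aggregations over raw request samples. As a rough illustration of the kind of summary an APM dashboard computes for a monitored site like 'The Gadget House', here is a toy aggregator; this is not New Relic's API or implementation, and the `Request` shape and field names are assumptions.

```python
# Toy illustration of APM-style aggregation (avg load time, error
# rate, throughput). NOT New Relic's API; it only mimics the kind of
# summary an APM dashboard shows for a monitored site.
from dataclasses import dataclass

@dataclass
class Request:
    path: str
    duration_ms: float
    status: int

def summarize(requests, window_seconds: float) -> dict:
    """Aggregate raw request samples into dashboard-style metrics."""
    if not requests:
        return {"avg_load_ms": 0.0, "error_rate": 0.0, "throughput_rpm": 0.0}
    n = len(requests)
    return {
        "avg_load_ms": sum(r.duration_ms for r in requests) / n,
        "error_rate": sum(r.status >= 500 for r in requests) / n,
        "throughput_rpm": n / (window_seconds / 60.0),
    }

if __name__ == "__main__":
    samples = [
        Request("/cart", 120.0, 200),
        Request("/checkout", 480.0, 500),
        Request("/", 90.0, 200),
    ]
    print(summarize(samples, window_seconds=60.0))
```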
Fig. 4

Fig. 9

Fig. 7

3. CLS is 0, which indicates there is no shift during loading.

Fig. 10

6. Throughput

Fig. 11

7. User Centric Page Load Times

Fig. 12

IX. CONCLUSION

A potent application performance monitoring (APM) tool, New Relic offers information on the efficiency of web applications' front-end and back-end elements. Real user monitoring, browser and mobile monitoring, APM, infrastructure monitoring, and serverless monitoring are some of its characteristics. With New Relic, you can keep an eye on crucial performance indicators like page load time, response time, errors, and throughput, and learn what is causing performance problems at the source. In order to monitor performance over time, you may also create reports and set up alerts depending on performance thresholds.

By using New Relic, we can optimize the performance of our web applications, improve the user experience, and ensure that our applications meet our performance requirements. Overall, New Relic is a powerful tool for any organization that wants to deliver high-performing web applications and ensure that they meet their business objectives.

Firms may gain a lot from website research utilizing an APM solution like New Relic. The platform's real-time monitoring and analytics features make it possible to swiftly identify and fix application performance problems. Additionally, New Relic's platform offers enterprises insight into the performance of their applications, which can be utilized to optimize software and make informed decisions.

X. REFERENCES

[1] Alam, A., Muqeem, M., & Ahmad, S. (2021). Comprehensive review on Clustering Techniques and its application on High Dimensional Data. International Journal of Computer Science & Network Security, 21(6), 237-244.
[2] Alam, A., Qazi, S., Iqbal, N., & Raza, K. (2020). Fog, edge and pervasive computing in intelligent internet of things driven applications in healthcare: Challenges, limitations and future use.
[3] Shrestha, R. (2021). Performance Evaluation and Optimization of Web-Based Applications: A Survey. Journal of Network and Systems Management, 29(1), 261-293. doi: 10.1007/s10922-020-09576-1.
[4] Kaushik, A. (2015). Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity. Wiley.
INTRODUCTION:

In recent years, due to technological advancements, the automobile sector is manufacturing a higher number of vehicles; every year almost 253 million vehicles are manufactured and sold to customers. A developed city's road network has only a limited ability to absorb this traffic, so a real-time solution may resolve most of the problem. Providing real-time traffic control based on the density of vehicles could be the best solution so far.
LITERATURE REVIEW:
...suitable for small-scale applications.

...dataset is used to train and evaluate the performance of the YOLOv4 object detection model. YOLOv4 Object Detection Model Training: The YOLOv4 object detection model is trained on the

Title: Real-time Traffic Light Control using Image Processing Techniques
Author: Yi Zhou et al.
Year: 2017
Overview: This paper proposes a real-time traffic
light control system using image processing
techniques to optimize traffic flow at intersections.
The system is designed to detect vehicles and
adjust the traffic light timings based on real-time
traffic conditions. The authors use image
processing techniques to detect and track vehicles
on the road.
Advantages: The proposed system achieves high
accuracy in detecting and tracking vehicles on the
road, leading to efficient traffic flow and reduced
waiting times for vehicles.
Limitations: The
proposed system relies on image processing
techniques which may not be as accurate as deep
learning models in detecting and tracking vehicles.
The system may also be affected by poor lighting
conditions or adverse weather conditions.
METHODOLOGY:
The proposed Automatic Traffic Lights by Image
Classification using Machine Learning system
utilizes computer vision techniques and machine
learning algorithms to automate traffic light control
based on the number of vehicles detected on the
road. The methodology can be described as
follows:
DATA USED/COLLECTED
We have taken a data set which is suitable for our
project from the given website.
In this data set we have five columns
1. Date and Time
2. Junction (1,2,3,4)
3. No. of vehicles crossed at a particular time at a particular junction.
4. Time allotted for green signal.
5. No. of vehicles still left at the signal.
For example:
At 11/1/2015 10:00 AM at junction 1 in total 15
vehicles crossed the signal.
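The dataset example above (15 vehicles at junction 1, with a green-signal time allotted per record) suggests deriving green time from the vehicle count. The sketch below allocates green time proportionally to the detected count; the per-vehicle seconds and the min/max clamps are illustrative assumptions, not values taken from the paper or its dataset.

```python
# Hedged sketch: density-based green-signal allocation. All constants
# are assumptions for illustration only.
MIN_GREEN_S = 10.0          # assumed lower bound so a phase is never skipped
MAX_GREEN_S = 60.0          # assumed upper bound so one junction cannot starve others
SECONDS_PER_VEHICLE = 2.0   # assumed average clearance time per vehicle

def green_time(vehicle_count: int) -> float:
    """Green duration (seconds) proportional to detected vehicles, clamped."""
    raw = vehicle_count * SECONDS_PER_VEHICLE
    return max(MIN_GREEN_S, min(MAX_GREEN_S, raw))

if __name__ == "__main__":
    # 15 vehicles, as in the junction-1 example above.
    print(green_time(15))  # 30.0 seconds under these assumptions
```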
REFERENCES:
[1] "Real-Time Traffic Light Control System Based on
Machine Learning," by Hamid Fardoun, et al. (2019)
Link: https://ieeexplore.ieee.org/document/8916763
[2] "A Machine Learning Based Intelligent Traffic Signal
Control System," by Zhe Zhang, et al. (2020)
Link: https://www.mdpi.com/1999-4893/13/11/267
[3] "Real-Time Traffic Signal Control Based on Machine
Learning," by Wei Wang, et al. (2018)
Link: https://ieeexplore.ieee.org/document/8529264
Link: https://ieeexplore.ieee.org/document/8373008
[5] "A Real-Time Traffic Light Control System using
Convolutional Neural Networks," by Yingjie Li, et al. (2019)
Link: https://ieeexplore.ieee.org/document/8738484
[6] "Intelligent Traffic Light Control System based on
Machine Learning," by Qianwei Yu, et al. (2018)
Link: https://www.sciencedirect.com/science/article/pii/S2212017318305727
[7] "An Intelligent Traffic Light Control System based on Machine Learning and Computer Vision," by Tariqul Islam, et al. (2020)
Link: https://www.mdpi.com/2071-1050/12/15/6151
[8] "A Traffic Light Control System Based on Machine
Learning and Wireless Sensor Networks," by Yichen
Cheng, et al. (2021)
Link: https://www.mdpi.com/2076-3417/11/8/3682
[9] "A Machine Learning-based Traffic Light Control System for Urban Traffic," by Xiaofan Li, et al. (2021)
Link: https://www.mdpi.com/1996-1073/14/1/172
[10] "A Machine Learning-based Traffic Light Control System for Pedestrian Safety," by
Alessandro De Palma, et al. (2021)
Link: https://www.mdpi.com/2076-3417/11/11/5051
Optimizing Garbage Collection: A Smart System For A Smarter City
Abstract— Tons of scrap are dumped in open areas every day. Environmental impurity and unsanitary conditions are brought on by improper scrap operation and transportation. Irrespective of their size, location, or economic status, every metropolitan area spends a significant amount of money on waste collection. Waste disposal is the overall conditioning and conduct required to handle waste, from generation to final disposal. The traditional approaches are relatively tedious and unsanitary. There is no proper tracking system for the garbage-carrying vehicles or the waste cargo, and the procedure requires manual monitoring. Point-to-point collection of waste is already being undertaken by many Chinese cities. The garbage is brought to the dumping yard after being collected at the source. To achieve route and garbage-collection optimization, a new system can be created to track the garbage vehicles in a certain ward of a firm. Proper segregation must be carried out at the disposal site, where the trucks discharge the trash onto the conveyor belt, which should have distinct sections for dry and moist waste.

Keywords—Tracking system, Sensors, Arduino UNO, GPS/GSM.

X. INTRODUCTION

In general, solid trash is categorized as coming from homes, businesses, hospitals, markets, yards, and street sweepings. In the past, waste was hauled outside the town by horse-drawn carts. It is typically challenging to manage the collection and transportation of garbage, as well as the tracking of vehicle locations, without the aid of advanced technology. Urbanization and economic growth, among the most significant activities in developing nations, are increasing. The challenge of garbage disposal arises in India as a result of the country's rising population and rising waste creation. The government is concentrating on waste management as a result of the rise in waste. A survey indicates that Mumbai produced 16,200 tons of waste per day in 2001, which rose to 19,100 tons in 2005. To address this issue, there is a need for timely and effective waste collection. Due to rapid population growth in recent years, the amount of waste requiring disposal has increased, making it crucial to have a proper solid waste management system to prevent the spread of dangerous illnesses. The proposed system monitors the condition of the smart bins and makes decisions based on that information. The mission's goal is to visit every part of the nation, whether urban or rural, in order to promote it as the perfect nation to the rest of the globe. The procedure of collecting trash is laborious, ineffective, and time-consuming. There is no tracking mechanism for the procedure, which involves manual monitoring of waste loads and garbage-carrying vehicles. To achieve route and garbage-collection optimization, a new system can be created to track the garbage vehicles in a certain ward of a firm. The push carts and garbage trucks may be equipped with sensors, and they could be tracked based on their GPS location to cover the entire ward. The garbage-collection push carts are improperly constructed; the majority of the trash falls onto the road while being transported.

Based on the "Consolidated Annual Review Report On Implementation Of Solid Waste Management Rules, 2016," India produces approximately 62 million tons of waste on an annual basis. Due to the difficulties in the collection process and operation of the carts, and the lack of tracking installations on the vehicles, only 43 million tons of the waste are collected. The remaining 19 million tons of waste remain uncollected, leading to displeasure and the spreading of infections. Of the collected waste, 11.9 million tons are treated at the dumpsite while the remainder is used as compost for landfills. The lack of automation in carts leads to inefficient scrap collection and increased complexity, thereby motivating us to come up with a solution for it.

XI. LITERATURE REVIEW

A. Existing Methods

In the past, garbage bins were emptied by cleaners at specified intervals. The person who cleaned the garbage can ran a significant health risk due to the toxic gases. There has never been any automated planning, scheduling, or monitoring of the waste from its source (a residence) to its destination (dump yard) in any previous study publications or projects addressing effective garbage collection and treatment.

B. Research On Few Affiliated Papers

TABLE I. LITERATURE SURVEY ON FEW PAPERS

1. "Smart Garbage Monitoring System Using IoT Technology" by S. G. Han and S. Y. Lee.
Method: Developed a smart garbage monitoring system using IoT technology. Sensors were placed inside the garbage bins to monitor the garbage level, and data was transmitted to a server for analysis.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

2. "Smart Garbage Collection System Using Wireless Sensor Networks and RFID Technology" by M. U. Chowdhury and M. I. Hossain.
Method: Utilized wireless sensor networks and RFID technology to collect and monitor garbage data.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact, increased public awareness of waste management.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

3. "Smart Waste Management System Based on IoT: Towards Urban Sustainable Development" by S. G. Prabhu and P. Selvi.
Method: Developed a smart waste management system using IoT technology that can detect the type of waste and its composition. Sensors were placed inside the garbage bins to monitor the waste, and data was transmitted to a server for analysis.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact, improved recycling rates.
Limitations: More complex and expensive than systems that only monitor the level of garbage; requires additional technology and processing power.

4. "Development of a Smart Garbage Bin System Using IoT Technology" by J. K. Kim, K. H. Kim, and C. W. Chung.
Method: Utilized a smart bin system that includes sensors to detect the level of garbage in bins and optimize collection routes.
Advantages: Increased capacity of garbage bins, improved waste management efficiency, reduced costs, reduced environmental impact.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

5. "A Smart Garbage Bin System Using Sensor Fusion for Waste Segregation and Recycling" by A. M. S. Alam, S. S. Hasan, and S. S. Roy.
Method: Developed a smart garbage bin system that uses sensors and camera images to detect and sort different types of waste. Data is transmitted to a central server for analysis.
Advantages: Improved recycling rates, reduced environmental impact, reduced costs.
Limitations: More complex and expensive than systems that only monitor the level of garbage; may require significant processing power.

6. "Smart Waste Management System Using Machine Learning and IoT" by S. B. Singh and R. Singh.
Method: Utilized a smart waste management system that incorporates sensors and machine learning algorithms to predict the level of garbage.
Advantages: Improved waste management efficiency, reduced costs, reduced environmental impact.
Limitations: Limited to monitoring the level of garbage only; cannot detect the type of waste or its composition.

7. "Smart Waste Management System for Efficient Garbage Collection and Disposal" by N. K. Khanuja and P. Goyal.
Method: Developed a smart waste management system that uses RFID tags to identify and sort different types of waste, and sensors to monitor the level of garbage in bins. Data is transmitted to a central server for analysis.
Advantages: Improved recycling rates, reduced environmental impact, reduced costs.
Limitations: More complex and expensive than systems that only monitor the level of garbage; may require significant processing power.

• Limiting human involvement.
• Minimizing human effort and time.
• Creating an atmosphere that is free of trash and healthy.

XIV. METHODOLOGY

A. Design Procedure
XII. PROPOSED METHOD a) The proposed model has been divided into
Three separate ultrasonic sensors are used in four parts:
the project, and when each dustbin is filled with i. Garbage Collection: The garbage truck
garbage, the ultrasonic sensor automatically has a robotic arm that helps the colony or
detects the amount of waste in the bin and sends community transport trash from each
information to the Arduino uno. This will direct home to the trash can. The residents have
the garbage collection robot using an RF module, the ability to track the vehicle.
a transmitter, and a receiver module so that it may ii. Monitoring and Overload Detection: The
go collect trash from the chosen trashcan. The GSM module can be used to alert the
robotic arm needs to be manually operated using a appropriate officials about the overload
mobile application to collect trash from the status. Each garbage can has an ultrasonic
dustbin after the robot automatically travels to it sensor, LCD screen, and SMS capability
using a track that we placed for it to follow.
that may be used to determine the amount
The robot will arrive at its destination with the of filling.
aid of an IR sensor, and once there, the IR sensor iii. Tracking Mechanism: The garbage truck's
will send data to Arduino to stop the robot. The location may be determined via the app or
robot subsequently transports the collected by sending an SMS in conjunction with
garbage to the main dumping area, where it is the GPS/GSM module. The location of the
divided into dry, moist, and metallic waste. The garbage truck may be found out by the
rubbish will be placed on a conveyer belt, and we public using the app or through SMS.
will have a blower to remove any dust or other dry iv. Segregation Mechanism: Garbage will be
waste from the belt. Wet waste will remain on the
dumped on the conveyer belt as soon as
belt itself, and we will use a powerful magnet to
the garbage truck pulls up to the landfill.
remove any metallic waste from the belt.
We will separate the scrap through the
The whole procedure, from garbage collection conveyer belt by blowing air through it,
to garbage segregation, would be communicated to which causes wet waste and metallic
the public via SMS. debris to go across the belt, assisting in the
OBJECTIVES
XIII. separation of dry and wet waste. We will
set up a powerful magnet for metallic
Smart waste management is a concept that
debris, which will draw the metal through
allows us to handle many issues that bother
it. So, we may separate the garbage into
society, such as pollution and infections. Waste
management must be completed immediately; dry, moist, metallic, and non-metallic
else, irregular management would result, harming waste by following the respective
the environment. The idea of smart cities and operation.
smart waste management are primarily
compatible. b) Architecture Diagram :
The following are the key goals of our suggested Presented below is a visual depiction of
system: the architectural diagram for the proposed
technique. This diagram provides a clear
• Keeping an eye on garbage disposal. understanding of how waste management
• Offering intelligent waste management processes are carried out and how information
technologies. is effectively conveyed to the public.
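To make the overload-detection step concrete, here is a minimal, hypothetical sketch (not the authors' code) of how an ultrasonic distance reading could be turned into a fill percentage and an SMS-style alert. The function names, bin depth, and 90% threshold are all illustrative assumptions:

```python
# Hypothetical sketch: an ultrasonic sensor reports the distance from the
# bin lid down to the garbage surface; the controller converts that into a
# fill percentage and decides whether to raise an overload alert.

def fill_percentage(distance_cm, bin_depth_cm):
    """Convert an ultrasonic distance reading into a 0-100 fill percentage."""
    distance_cm = min(max(distance_cm, 0.0), bin_depth_cm)  # clamp sensor noise
    return (bin_depth_cm - distance_cm) / bin_depth_cm * 100.0

def overload_message(bin_id, distance_cm, bin_depth_cm=100.0, threshold_pct=90.0):
    """Return an SMS-style alert string when the bin crosses the threshold,
    or None when no alert is needed."""
    pct = fill_percentage(distance_cm, bin_depth_cm)
    if pct >= threshold_pct:
        return f"Bin {bin_id} is {pct:.0f}% full - dispatch collection robot"
    return None

print(overload_message("B1", 5.0))   # 95% full -> alert string
print(overload_message("B2", 50.0))  # 50% full -> None
```

In a real deployment the alert string would be handed to the GSM module rather than printed, and the threshold would be tuned per bin.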
d) Data Flow Diagram:
Abstract—Healthcare is an essential aspect of living a healthy life, although it is not accessible to most rural areas. Nowadays, medical professionals use machine learning as a tool to manage patients and clinical data. Natural Language Processing is a subcategory of machine learning that enables computers to comprehend, analyze, and produce human language. With the help of natural language processing, communication between humans and machines becomes much simpler and possible. To overcome this gap, the project acts as a bridge between healthcare and the people. In this research project, using machine learning algorithms such as Artificial Neural Network, Natural Language Processing, and the Naive Bayes classifier, a medical chatbot is created to make decisions about risks and predict treatment outcomes. The chatbot engages in a conversation with the user; the symptoms are presented in the form of a query, which is processed by the machine learning algorithm to predict the disease, the preventive measures, and the treatment plan. It forwards patient queries to the doctor based on the risk factor of the predicted disease.

Keywords—Artificial Neural Network (ANN), Naive Bayes Classifier, Natural Language Processing (NLP), Machine Learning, Artificial Intelligence, Decision Tree, Medical Chatbot, Healthcare

XVI. INTRODUCTION
Healthcare is of utmost importance in our daily lives. It is essential for the well-being of individuals, families, communities, and nations. Access to quality healthcare services is a fundamental human right, and it plays a crucial role in maintaining and improving people's health and quality of life. One of the significant advantages of healthcare is that it helps in the prevention and treatment of diseases. Regular check-ups, immunizations, and screenings can help detect and prevent illnesses before they become serious. Early diagnosis and treatment of illnesses can also lead to better outcomes and improved health. Healthcare also helps in the management of chronic conditions such as diabetes, heart disease, and cancer, among others. Proper management of these conditions can reduce complications, improve the quality of life, and increase life expectancy. In India the doctor-to-patient ratio in cities is 1:854, but in rural areas it is 1:2000, as most hospitals and doctors are in the district towns, which makes it hard for people, especially the old and disabled, to travel long distances. They reach out for professional help only when their health condition deteriorates drastically, creating irreversible changes to the body that could have been prevented had they sought help earlier. Artificial Intelligence (AI) is widely used in the medical field, from storing patient records to assisting doctors in surgery. Using AI to our advantage, we can provide better healthcare to people in rural areas where it is difficult to access.

By using AI's machine learning application, it is possible to program computers to emulate human thought processes. It enables medical equipment to make predictions and offer insights from vast amounts of data that healthcare professionals might overlook. A constantly changing patient data set is essential when using machine learning in the healthcare industry. These data can be used to identify trends that help medical practitioners identify new diseases, assess risks, and forecast treatment outcomes. Using machine learning, the medical chatbot that will be developed will help eliminate the issue by creating a bridge between the patient and healthcare. It can provide immediate responses to patient queries, eliminating the need for patients to travel the distance to hospitals and the expense. Artificial Neural Network (ANN), Naive Bayes Classifier, Natural Language Processing, and the Natural Language Toolkit are the machine learning tools that are used. ANN is a technology used to produce computer-generated outcomes that are comparable to diagnoses made by human reasoning. Deep learning, the capacity to learn from enormous quantities of data, is based on ANNs. The goal of natural language processing, a subcategory of machine learning, is to enable computers to comprehend, analyse, and produce human language. To
interact with the machine and communicate with it, one employs natural language processing. An algorithm called the Naive Bayes Classifier is utilised to classify data accurately, which in turn enables it to predict outcomes quickly. Here, it interprets user communications and analyses user inquiries.

XVII. LITERATURE REVIEW

A. Diabot: A Predictive Medical Chatbot using Ensemble Learning
Author: Manish Bali, Samahit Mohanty, Subarna Chatterjee, Manash Sarma, Rajesh Puravankara

This paper presents a generic text-to-text 'Diabot' chatbot which engages patients in conversation, using advanced Natural Language Understanding (NLU) techniques to provide personalised prediction on the general health dataset, depending on the many symptoms the patient is asked about. The idea is further developed into a diabetes chatbot for specialised diabetes prediction utilising the Pima Indian diabetes dataset, to suggest proactive preventative actions. The paper presents a Diabot design with a simple front-end user interface for the average person built with React UI, RASA NLU-based text pre-processing, and a quantitative performance comparison of individual machine learning algorithms used as classifiers and integrated into an ensemble with a majority vote. The accuracy of the ensemble model is balanced for general health prediction and highest for diabetes prediction among all the weak learners, which motivates further exploration of ensemble techniques in this domain.

This research uses NLU and advanced ML algorithms to first diagnose a generic disease using a text-to-text conversational Diabot, and then extends the study into deeper-level predictions of diabetes as a specialisation. Diabetes is a non-communicable disease, and early detection can make people aware of its serious consequences and help save lives. It is one of the major healthcare epidemics faced by Indians, with close to 40 million people suffering from diabetes, a number estimated to touch 70 million by 2025. Diabetes also causes blindness, amputation, and kidney failure. To diagnose diabetes, a doctor has to study a person's past history, diagnostic reports, age, weight, etc.

This work combines an ensemble of five classifiers - Multinomial Naïve Bayes (MNB), Decision Tree (DT), Random Forest (RF), Bernoulli Naïve Bayes (BNB), and Support Vector Machine (SVM) - to predict various diseases generically and specifically. The system consists of a front-end User Interface (UI) for the patient to chat with the bot; the chatbot communicates with the NLU engine via API calls, and two models are trained at the backend using the general health and Pima Indian diabetes datasets. The challenges in a real-time implementation are mainly related to accuracy.

B. Artificial Intelligence based Smart Doctor using Decision Tree Algorithm
Author: Rida Sara Khan, Asad Ali Zardar, Zeeshan Bhatti

An AI-based health physician system that could communicate with patients, make diagnoses, and recommend an immediate fix or therapy for their issue was proposed and put into practice in the work of Rida Sara Khan et al. This system asks users questions about their symptoms in order to make a diagnosis and prescribe a course of treatment based on those answers. Using a top-down approach, the decision tree method is used to identify, diagnose, and locate potential solutions. The database could be updated and voice input added in the future to improve the system.

C. An Intelligent Web-Based Voice Chatbot
Author: S. J. du Preez, M. Lall, S. Sinha

This paper presents the design and development of an intelligent voice recognition chatbot. The paper presents a technology demonstrator to verify a proposed framework required to support such a bot (a Web service). While a black box approach is used, by controlling the communication structure to and from the Web service, the Web service allows all types of clients to communicate with the server from any platform. The service is accessible through a generated interface which allows for seamless XML processing; the extensibility improves the lifespan of such a service. By introducing an artificial brain, the Web-based bot generates customised user responses, aligned to the desired character. Questions asked of the bot which are not understood are further processed using a third-party expert system (an online intelligent research assistant), and the response is archived, improving the artificial brain's capabilities for future generation of responses.

D. A Self-Diagnosis Medical Chatbot Using Artificial Intelligence
Author: Divya S, Indumathi V, Ishwarya S, Priyasankari M, Kalpana Devi S

According to the study by Divya S. et al., their suggested system offers a text-to-text conversational agent that can diagnose a user's illness by posing a series of questions to them. By matching the retrieved symptoms to the papers' symptoms and the database's classifications, it classifies the illness as a minor or serious sickness. In addition to a tailored diagnosis, a suitable specialist is recommended. The user-provided information is also kept in a database for later use. The performance of this bot's symptom recognition and diagnosis could be enhanced in the future, and it might also offer more medical features for a more thorough symptom prediction.

E. The Application of Medical Artificial Intelligence Technology in Rural Areas of Developing Countries
Author: Jonathan Guo and Bin Li

Artificial intelligence (AI), a fast-evolving branch of computer science, is now being actively applied in the medical industry to enhance clinical work's professionalism and effectiveness while also reducing the risk of medical errors. The disparity in access to healthcare between urban and rural areas is a critical issue in developing nations, and the lack of skilled healthcare professionals is a major factor in the unavailability and poor quality of healthcare in rural areas. According to several studies, using AI or computer-assisted medical approaches could lead to better healthcare outcomes in developing nations' rural areas. Thus, it is worthwhile to discuss and investigate the creation of medical AI technology that is appropriate for rural locations.
Many MDDS systems, including those for internal, forensic, veterinary, pathology, radiology, psychiatry, and other fields, were created in the 1990s. AI technology is bringing revolutionary changes across the healthcare field, and will play a huge role in electronic health records (EHRs), diagnosis, treatment protocol development, patient monitoring and care, personalized medicine, robotic surgery, and health system management. This article introduces the potential of medical artificial intelligence (AI), reviews healthcare disparities and quality in developing countries' rural areas, and discusses the functions of technologies related to AI in medicine, such as computer-assisted diagnosis and mobile clinical decision support systems (mCDSS). Additionally, it suggests a multilayer medical AI service network with the goal of enhancing the usability and standard of rural healthcare in developing nations.

F. Technical Aspects of Developing Chatbots for Medical Applications
Authors: Zeineb Safi, Alaa Abd-Alrazaq, Mowafa Househ

Applications known as chatbots can have natural language discussions with users. Chatbots have been developed and used in the medical field to serve different purposes. The most notable instance is the usage of chatbots like Apple's Siri and Google Assistant as personal assistants. Chatbots have been created and utilised for a variety of purposes, including marketing and offering various services. Chatbots are being more widely used in the medical industry as a tool to make it easier for patients to get information and to lighten the strain on clinicians. For connecting with patients, many commercial chatbot programmes accessible as online or mobile applications have been developed. It is important to know the current state of the different methods and techniques being employed in developing chatbots in the medical domain, for many reasons. By conducting this survey, researchers will be able to recognise the various approaches that have been utilised and build on them in the future to create chatbots that are more intelligent and provide users a more natural experience.

G. A Literature Survey of Recent Advances in Chatbots
Authors: Guendalina Caldarini, Sardar Jaf, and Kenneth McGarry

Chatbots are intelligent conversational computer systems designed to enable automated online guidance and support. Chatbots are currently applied to a variety of different fields and applications, spanning from education to e-commerce, encompassing healthcare and entertainment. Improvements in their implementation and assessment are significant research subjects since chatbots are so common and used in so many different sectors.

The main contributions of this paper are: (i) a thorough analysis of the research on chatbots in the literature as well as the most recent techniques for implementing them, with a concentration on deep learning algorithms; (ii) the identification of the challenges and limitations of chatbot implementation and application; and (iii) recommendations for future research on chatbots.

H. A Personalized Medical Assistant Chatbot: MediBot
Author: Gajendra Prasad K. C, Assistant Professor

The MediBot is a personalized medical assistant chatbot that can predict disease using the Apriori algorithm and a Recurrent Neural Network. It can be used as a tool of communication and can help people keep track of their health regularly and properly without going anywhere. Machine learning has had a major impact in the field of medical science due to its ability to learn and analyze from the examples provided. In today's fast-paced life, people often don't take proper care of their health and end up ignoring their health conditions. The MediBot can help people find their health problem just by entering symptoms.

Chatbots are highly personalized virtual assistants that mimic human conversation using machine learning algorithms. They are becoming increasingly popular among business groups due to their ability to reduce customer service cost and handle multiple users at a time. Chatbots are an advanced and time-saving technology, but there is a need to make them efficient in the medical field as well. This project provides a platform where a human can interact with a chatbot that is highly trained on datasets. Machine learning algorithms take a more natural approach to computation rather than a purely logical one, and the output depends on the dataset they are trained on. One can apply those methods and gain from them even without knowing the exact rationale behind them.

I. Survey Paper on Medical Chatbot
Author: Dev Vishal Prakash, Prof. Shweta Barshe, Anishaa Karmakar, Vishal Khade

The literature survey discusses the potential of medical chatbots in healthcare services to improve healthcare service quality, reduce the workload of healthcare professionals, and streamline healthcare services. The survey highlights the challenges and negative consequences of medical chatbots, such as negative perceptions about their ability to provide accurate information and secure user privacy, and lack of user interest. The survey aims to identify key factors that can motivate individuals to accept services delivered through medical chatbots by healthcare organizations, and to help formulate appropriate strategies for better designing medical chatbots. The study adopts a two-stage mixed-method approach involving interviews and surveys based on the theory of planned behavior to obtain a deeper understanding of individuals' motivations for using medical chatbots. Emotional preference was shown to be a significant element promoting the adoption of medical chatbots. The study also highlights the need for more research on medical chatbots to ensure their successful implementation.

XVIII. METHODOLOGY
XIX. DESIGN PROCEDURE

The design procedure for the medical chatbot tool for problem identification involves the use of machine learning algorithms and natural language processing (NLP), with Python as the coding language and Jupyter Notebook as the IDE. The minimum operating system requirement is Windows XP Professional, Windows 7, or later.

The front-end of the tool consists of two modules: the registration module and the query module. The registration module enables users to register with their details and log in to the chatbot using a username and password. Even doctors are required to register in the same way as other users.

The query module enables users to ask questions about their symptoms and diseases in the automatic chatbot, which responds according to the user's disease. The disease prediction module uses machine learning logic such as Naive Bayes and decision tree algorithms to recognize and analyze the symptoms described by the patient, predict the disease in a particular area, and even provide an accuracy score for the prediction. The tool can also prescribe accurate medicines and suggest further precautions that the patient needs to take, in an efficient manner.

The input requests from the patient are sent to the chatbot server, which uses the bot controller logic to determine how to respond to the user's request. The data preprocessing model is then used to prepare the raw data based on the user's input and provide accurate responses to the user through the chatbot client.

The data is stored in SQLite as a single row of instance data or a collection of instance data, depending on how it was trained in the dataset. The chatbot server stores both the training and test data, and feeds the appropriate data based on the user's details. If there are no key patterns present in the user's input, the virtual doctor prescribes medicines based on the symptoms and uses machine learning logic, specifically the support vector machine (SVM) algorithm, to identify and predict the disease. The SVM algorithm analyzes the disease and prescribes medicine.

The objective of this work is to predict the diagnosis of a disease from a number of attributes and provide a solution to the patient through the chatbot. The classifier model is used to identify the key patterns or features in the medical data, and classification techniques are then used to predict the diagnosis of the disease after reducing the number of attributes provided by the user.
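The storage-and-lookup flow described above can be sketched with Python's built-in sqlite3 module. This is a hypothetical illustration: the table layout and symptom data are invented, and the paper's SVM step is stood in for by a simple symptom-overlap score, since the point here is the server-side flow, not the classifier.

```python
import sqlite3

# In-memory database standing in for the chatbot server's store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE disease (name TEXT, symptoms TEXT)")
conn.executemany(
    "INSERT INTO disease VALUES (?, ?)",
    [("influenza", "fever,cough,fatigue"),
     ("migraine", "headache,nausea,light sensitivity")],
)

def predict_disease(user_symptoms):
    """Score each stored disease by its symptom overlap with the user's
    query and return the best match (the paper uses an SVM here; overlap
    counting is only a placeholder)."""
    best, best_score = None, 0
    for name, symptoms in conn.execute("SELECT name, symptoms FROM disease"):
        score = len(set(symptoms.split(",")) & set(user_symptoms))
        if score > best_score:
            best, best_score = name, score
    return best

print(predict_disease(["fever", "cough"]))  # prints "influenza"
```

A real system would store per-user instance rows as the text describes, and hand unmatched queries to the trained classifier instead of returning None.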
XX. ALGORITHMS

Neural Net: A neural net is a machine learning algorithm that is based on the structure and function of the human brain. It consists of layers of interconnected nodes, or neurons, which process input data and produce output predictions. Neural nets are particularly useful in pattern recognition tasks such as image or speech recognition.

KNN: KNN, or k-nearest neighbors, is a classification algorithm that assigns new data points to the class of the nearest neighbors in the training data. The value of k determines the number of nearest neighbors considered. KNN is a simple and easy-to-understand algorithm, but can be computationally intensive for large datasets.

SVM: Support vector machine (SVM) is a classification technique that uses a hyperplane in a high-dimensional space to divide input points into classes. Finding the hyperplane that maximises the margin between the two classes is the aim of SVM. SVM is very helpful for managing data that cannot be separated linearly.

Decision Tree: A decision tree is a classification technique that employs a tree-like structure to represent a succession of decisions and their outcomes. Each node represents a judgement based on an attribute, and each branch the decision's result. Decision trees can handle categorical and numerical data and are simple to comprehend.

Logistic Regression: Logistic regression is a classification method that forecasts the likelihood of a binary outcome based on input factors. It uses a logistic function to transform a linear combination of input variables into a probability between 0 and 1. Logistic regression is particularly useful when the outcome variable is dichotomous.

1R: 1R is a classification algorithm that chooses a single attribute as the best predictor of the class label. It calculates the error rate of each attribute and selects the one with the lowest error rate as the final classifier. 1R is a simple and fast algorithm, but can be limited in its accuracy.

Ensemble: Ensemble algorithms combine multiple individual algorithms to create a more accurate and robust prediction. Examples of ensemble algorithms include bagging, boosting, and random forests.

Bag of Words Model: The bag-of-words model is a text classification technique that represents text as a bag of individual words, ignoring their order and context. It counts the frequency of each word in a document and uses these counts as input features for classification.

Cross Validation Function: Cross-validation is a statistical method used to evaluate machine learning models by dividing data into subsets for training and testing. The cross-validation function helps to optimize model performance by identifying potential issues such as overfitting and underfitting. It is particularly useful for determining the best hyperparameters for a given model.
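The bag-of-words representation described above can be shown in a few lines of plain Python: each document becomes a vector of word counts over a shared vocabulary, with word order discarded. This is an illustrative sketch, not the project's code; the tokenizer here is a bare whitespace split.

```python
from collections import Counter

def bag_of_words(documents):
    """Turn each document into a word-count vector over a shared,
    alphabetically sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted(set(word for doc in tokenized for word in doc))
    vectors = [[Counter(doc)[word] for word in vocab] for doc in tokenized]
    return vocab, vectors

docs = ["fever and cough", "cough and cold and cough"]
vocab, vectors = bag_of_words(docs)
# vocab   -> ['and', 'cold', 'cough', 'fever']
# vectors -> [[1, 0, 1, 1], [2, 1, 2, 0]]
```

These count vectors are exactly the input features a Naive Bayes classifier would consume in the pipeline described earlier.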
Real-time Estimation of Heart Rate under lighting using Web Camera

Deepak M D, Assistant Professor, Department of CSE, Presidency University, Bangalore, India, deepak.md@presidencyuniversity.in
Galam Anusha Priya, Department of CSE, Presidency University, Bangalore, India, 201910101473@presidencyuniversity.in
Gonuguntla Goutham Sai, Department of CSE, Presidency University, Bangalore, India, 201910101442@presidencyuniversity.in
Abstract—The early detection of variation in heart rate is essential for effective treatment because cardiovascular disease is one of the leading causes of death worldwide. In the field of medical diagnostics, heartbeat detection is a crucial task, but conventional methods call for specialized tools and qualified personnel. The use of signal processing and computer vision techniques has gained popularity in recent years. This study describes a technique for real-time heart rate monitoring using a webcam and JavaScript. The suggested method extracts the facial region from the webcam's video frames and uses signal processing to estimate the heart rate. In particular, the technique detects the subtle color changes brought on by blood flow in the skin and uses the chrominance information of the facial region to estimate the heart rate. The method also employs motion compensation algorithms to decrease the impact of head motions and facial expressions on heart rate measurement. The suggested approach can offer a low-cost and non-invasive way to detect a person's heart rate and has potential applications in healthcare, fitness monitoring, and wellness tracking.

Keywords—Cardiovascular disease, Computer vision (CV), Signal processing, Heart rate (HR), Motion compensation.

XXI. INTRODUCTION
For many years, physiological signal analysis has been used extensively in the field of medical research. Research has demonstrated that heart rate is a source of data that reveals a person's psychophysiological status. Aside from HR change measures, several medical diagnoses also make use of breathing rate and ECG signals. The progression of novel pulse-measuring methods and machine intelligence algorithms has enabled the identification of tension, sleepiness, and different emotions. The advancement of non-invasive physiological sensing technologies will result in a slew of new applications, since they will be rapid, simple, and attainable in real time. This research presents a real-time face-video heart rate tracking system utilizing a web camera, estimating the variation in skin color produced by the heartbeat.

The idea of monitoring cardiovascular system parameters without contact with the human body has developed. The […] with each beating, and HR may be calculated using this colour variation.

Previously, a few techniques for pulse detection using a camera were developed; however, such procedures have restrictions on the elements impacting colour values, such as variances in ambient illumination during video recording and variations in blood parameters produced by the heartbeat. Most non-touch methods use the RGB colour space to produce face footage that is best suited for lab settings or constant ambient lighting. Because ambient light is not constant, these approaches are not appropriate for real-time software and cannot recover the heart rate.

The proposed technique employs the LAB colour space for non-intrusive heart rate detection, hence eliminating ambient light changes while extracting face pictures. The suggested process starts by locating the Region of Interest (ROI) on the face and identifying the likely skin region, followed by conversion to the LAB colour space. The colour fluctuations of every pixel are then examined over time and amplified to obtain a closer picture of the signal from the chosen ROI. Finally, the captured area is used to extract the signals from which the HR is obtained using peak detection algorithms.

XXII. ASSOCIATED WORK
For the past decade, academics have been engaged in computer vision (CV) technologies. The first proposals used facial observations to measure physiological parameters in human beings. Verkruysse provided an example of how to use PPG to calculate HR from a person's face in natural light. The key concept behind these techniques is to get the pulse based on transient changes in facial colour using blind source separation (BSS). Additionally, researchers applied algorithms for various techniques of optical processing and noise reduction to specifically analyse HR estimations.

Earlier systems detected pulses from collected video by calculating small head motions resulting from the Newtonian reaction to the flow of blood at every heartbeat. Here, compliance with artery and head mechanics, as well as erratic and inadequate illumination circumstances, may have
circulatory system allows blood to flow throughout the body an impact on how well the features are monitored. Medical
due to the heart's continuous blood pumping. The resulting applications for ambient light-based virtual plethysmography
blood supply creates colour change in the skin on the face imaging include vascular skin lesion characterization and
vital signs are remotely monitored for sports or triage. The b) Integral Imaging: Integral pictures analyze
prerequisite in this instance is to quantify heart and breathing rectangular characteristics in a consistent amount of
rates will result in less accurate findings. time. Compared to previous systems with more
attributes, this increases computation time while
With adaptive filtering techniques like Normalised Least improving computation speed. The quantity does not
Mean Square, remote HR observations from face films are affect how quickly features are processed.
conducted under controlled conditions to quantify HR.
Changes in ambient illumination and movements that
enhance subject interference reduce their effectiveness.
Webcam-based face footage in RGB colour space is used for
HR monitoring. The HR was calculated using a
straightforward webcam in an indoor setting with continuous
ambient light from the colour change in the skin caused by
heartbeat. This approach can't determine where HR is
fluctuating due to the surrounding light, making it unsuitable
for real-time applications.
Our goal is to create a non-contact heart rate estimating
device that uses a camera to track the variation in skin tone Fig. 2. Integral Image creation Features
caused by each heartbeat. The gathered footage is converted
from RGB to LAB colour space using signal processing c) OpenCV Algorithm: This method is used to train
techniques like the Fast Fourier Transform (FFT), and a face classifiers as well as facial recognition algorithms to
identification algorithm is used to remove the impact of choose the best features.
ambient light. The HR is then determined from the frequency d) Cascade in OpenCV: There is a potential classifier at
that was obtained using a peak detection method. every level of the cascade. The purpose of each stage
is to establish if a certain plane is unmistakably a
XXIII. PROPOSED METHODOLOGY face. The recognized pane will be automatically
This technique's fundamental premise is that blood discarded if it is not a face.
flowing down the face alters the skin's hue in a way that is
apparent to the camera but invisible to the human eye.
There isn't much information in the sections for the eyes,
lips, and nose. So, to obtain regions with skin probability, we
apply a skin mask. The signals in the skin mask that are
accessible from that place are the next step after obtaining
the skin mask. Then, using this signal's peak detection
method, the heart rate is calculated.
E. Signal Amplification: The three signals used in the LAB XXV. IMPLEMENTATION
colour space are ‘L’, ‘A’, and ‘B’, where ‘L stands for A. Video Recording: The camera is currently linked. The
the brightness of the image and ‘A’ and ‘B together webcam captures video and reads picture metadata to
stand for its fusion of the other two channels. L lacks extract frame capture times.
color information, thus you must separate A and B
channels of color to from it. Blood pulses cause minute B. Face Recognization: Eliminates extraneous data, such
changes in brightness and intensity that are recorded in as the background, since we are only interested in a
the LAB colour space. Independent Component person's face. Establish the face's limiting box. The
Analysis or Principal Component Analysis reduces video is trimmed to the face's bounding box.
dimensionality when utilised.
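The pipeline ends with a peak search in the frequency domain. A minimal sketch of that final step, assuming the face and skin-mask stages have already produced a one-dimensional chrominance trace sampled at the webcam frame rate (the paper's front end is JavaScript; Python is used here purely for illustration):

```python
import math

def estimate_bpm(signal, fps, lo_hz=0.75, hi_hz=4.0):
    """Estimate heart rate (bpm) from a 1-D chrominance trace sampled
    at `fps` frames per second: remove the DC offset, take the DFT,
    and pick the strongest frequency inside the physiologic band
    (0.75-4 Hz, i.e. 45-240 bpm)."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]        # drop the DC component
    best_freq, best_power = None, -1.0
    for k in range(1, n // 2):                  # positive frequency bins
        freq = k * fps / n
        if not (lo_hz <= freq <= hi_hz):
            continue
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(centred))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(centred))
        power = re * re + im * im               # spectral power in this bin
        if power > best_power:
            best_freq, best_power = freq, power
    return 60.0 * best_freq                     # Hz -> beats per minute
```

At 30 fps, ten seconds of the mean A-channel value inside the skin mask could be passed as `signal`; a production version would use an FFT library and the motion-compensated ROI rather than this naive DFT loop.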
JWT Library
Mohammed Nabeel
Department of CSE
Presidency University
Bengaluru, India
201910101660@presidencyuniversity.in
XXIX. BACKGROUND
Traditionally, web applications relied on session-based authentication to authenticate and authorize users. Session-based authentication involves storing a session ID on the server and sending it to the client in a cookie. The client then sends the session ID back to the server with each request, allowing the server to identify the user and their session data. While session-based authentication is still used in many web applications, it has several limitations. For example, it is vulnerable to session hijacking, where an attacker steals the user's session ID and impersonates the user. This can lead to unauthorized access to sensitive data and resources. JWT was introduced as a more secure alternative to session-based authentication, and it has become increasingly popular in recent years as web development has evolved to become more secure and efficient.

JSON Web Token (JWT) was first introduced in 2010 as a standard for securely transmitting information between two parties over the internet. JWT is an open standard designed to be a compact and self-contained way to transmit information between parties as a JSON object. JWTs consist of three parts: a header, a payload, and a signature. The header contains information about the type of token and the cryptographic algorithm used in the processes of information exchange and authentication. Each part is separated by a dot symbol (.); refer to the figure.

The Header section specifies the type of token (in our example, "JWT") and the signature algorithm; the whole section is Base64 encoded. The Payload part includes the token data, such as the username, token production date, and expiration date; all of that is written in JSON and encoded in Base64. The username or the user's application rights can both be included in this second part of the message; however, the JWT specifications make clear which keywords should be used, including "iat" (date and time of token generation) and "exp" (expiration date). Finally, the Signature section is formed by concatenating the Header and Payload sections and signing the result with a private key. The figure shows a JWT with the previous header and payload encoded, signed with a secret.

XXX. EXISTING PLATFORMS AND THEIR LIMITATIONS

There are several existing platforms which provide similar functionality but use different methods such as session IDs, cookies, etc.

Session IDs: Session IDs were commonly used in web applications as a way to maintain a user's authentication state. When a user logs in, the server generates a unique session ID and stores it on the server side. This session ID is then passed back to the client and included in subsequent requests to the server, allowing the server to identify and authenticate the user. While session IDs are still used in some applications, they have some drawbacks compared to token-based authentication methods like JWT. For example, session IDs are stateful, which can make them less scalable in high-traffic environments.

Basic Auth: Basic Auth is a simple authentication scheme that has been around since the early days of the web. It involves sending a user's credentials (i.e., username and password) as a Base64-encoded string in the Authorization header of an HTTP request. While Basic Auth is easy to implement, it has some significant drawbacks, including the fact that credentials are sent in plaintext, which makes them vulnerable to interception and theft.

OAuth: OAuth is a protocol for delegated authorization. It allows users to grant third-party applications access to their resources without sharing their login credentials. OAuth can be used in combination with other authentication methods, such as JWTs. While OAuth can provide a secure and scalable method for authentication and authorization, it can also be complex to implement and may not be necessary for simple web applications.

XXXI. PROPOSED FRAMEWORK

The proposed framework for building a blog web application using the MERN stack and a custom JWT library built with TypeScript provides a powerful and flexible platform for building secure and scalable web applications. The custom JWT library includes functions such as Sign, Verify, and Decode, which can be used to generate, validate, and parse JWTs for user authentication and authorization. With this library, developers can implement a robust and secure authentication and authorization system in their blog web application.

In addition to the custom JWT library, the proposed framework also utilizes the MERN stack for the blog web application. MongoDB is used as the database, Express as the web framework, React as the frontend library, and Node.js as the server-side runtime. Together, these technologies provide a flexible and scalable platform for building blog web applications that can handle large amounts of traffic and user data. With this framework, developers can focus on building the core functionality of their blog web application, while the MERN stack and custom JWT library handle the rest.

A. Architecture

C. Potential benefits

The scalability of stateless apps is the main benefit of Node.js authentication using JWT over the conventional authentication procedure. And given that companies like Facebook and Google are starting to use it, its popularity across the industry is probably only going to increase. The advantages include:

Secure: JWTs are protected from being altered by the client or an attacker thanks to digital signatures that use either a secret (HMAC) or a public/private key pair (RSA or ECDSA).

Effective/Stateless: Since a JWT doesn't call for a database lookup, it can be verified quickly. Particularly in large distributed systems, this is a major advantage.

REFERENCES
[1] Node.js HERE
[2] React.js HERE
[3] Express.js HERE
[4] Material UI HERE
[5] MongoDB HERE
[6] Brute Forcing HS256 is Possible: The Importance of Using Strong Keys in Signing JWTs HERE
[7] https://auth0.com/docs/secure/tokens/json-web-tokens
[8] https://supertokens.com/blog/what-is-jwt
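The paper describes, but does not list, the custom library's Sign, Verify, and Decode functions. A minimal sketch of equivalent HS256 operations is shown below, written in Python rather than the paper's TypeScript; the function names mirror the library's, but the implementation is illustrative only:

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe Base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict, secret: str) -> str:
    """Produce a header.payload.signature token signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    head_b64 = _b64url(json.dumps(header, separators=(",", ":")).encode())
    body_b64 = _b64url(json.dumps(payload, separators=(",", ":")).encode())
    signing_input = f"{head_b64}.{body_b64}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{head_b64}.{body_b64}.{_b64url(sig)}"

def verify(token: str, secret: str) -> bool:
    """Recompute the signature and compare in constant time."""
    head_b64, body_b64, sig_b64 = token.split(".")
    signing_input = f"{head_b64}.{body_b64}".encode()
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(_b64url(expected), sig_b64)

def decode(token: str) -> dict:
    """Parse the payload without verifying; restore Base64 padding first."""
    body_b64 = token.split(".")[1]
    padded = body_b64 + "=" * (-len(body_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Note that the signature is an HMAC, so verification needs no database lookup, which is exactly the stateless property the benefits section describes.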
Votemate – Secured Online Voting Application
Abstract— This research paper explores the use of OAuth 2.0 in an online voting application. OAuth 2.0 is a widely used authorization protocol that enables secure and efficient communication between different web applications. The proposed online voting application aims to improve the current voting system by providing voters with a more accessible, user-friendly and secure platform.

With the OAuth 2.0 protocol, the voting application can securely access user data from various web applications, such as social media platforms, without requiring users to share their login information. This ensures the transparency of the voting process and the confidentiality of user data. The paper discusses the architecture, implementation and security features of the proposed online voting application and evaluates its effectiveness in improving the voting process. The results of this study show that OAuth 2.0 can be effectively used in online voting applications to improve security, user accessibility, and transparency.

Keywords—Authorization, Authentication, OpenID Connect, Login, Registration, Security

I. INTRODUCTION

Online voting applications have become increasingly popular in recent years as more and more organizations seek to increase voter participation and simplify the voting process. However, these applications must also be secure, reliable and user-friendly in order to gain user trust and ensure the integrity of the voting process.

One way to improve the security and usability of online voting applications is to implement OAuth 2.0 authentication. OAuth 2.0 is a widely accepted authorization framework that allows users to grant third-party applications access to their resources without revealing their credentials. Using OAuth 2.0, online voting applications can ensure that users are authenticated and authorized to vote, while protecting their sensitive information from unauthorized access. In this study, we explore the benefits and challenges of implementing OAuth 2.0 authentication in online voting applications. We also discuss best practices for designing and implementing such applications, taking into account the unique requirements and constraints of the voting process. Finally, we evaluate the effectiveness of OAuth 2.0 authentication in improving the security and usability of online voting applications and provide recommendations for further research and development in this area.

OVERVIEW

OAuth 2.0 is an authorization framework that allows users to access resources using a secure token system without the need to share their credentials with the service provider.

In the context of online voting, OAuth 2.0 can be used to ensure that only authorized users can access the voting application. The voting application can be integrated with a third-party authentication service, such as Google, Facebook, or Twitter, to enable users to log in using their existing credentials. The OAuth 2.0 protocol is used to securely exchange authentication and authorization data between the voting application and the authentication service.

One of the benefits of using OAuth 2.0 for online voting applications is that it provides enhanced security. User credentials are not shared with the voting application, reducing the risk of credential theft. Additionally, OAuth 2.0 uses access tokens to grant users access to the application, and these tokens can be revoked at any time. This helps prevent unauthorized access to the voting application.

Another benefit of using OAuth 2.0 for online voting applications is that it simplifies the login process for users. Users can log in using their existing social media or email credentials, reducing the need to remember multiple usernames and passwords.

Overall, online voting applications using OAuth 2.0 provide enhanced security and a simplified login process, making them an attractive option for research focused on improving online voting systems.
II. EXISTING SYSTEM

a) Security vulnerability: One of the most important concerns about electronic voting is the risk of security breaches, hacking and vote manipulation. Malicious actors could potentially alter vote counts or access sensitive voter information.

b) Lack of transparency: Some electronic voting systems are not transparent, making it difficult for voters and election officials to ensure the accuracy and reliability of the voting process. Without a clear audit trail, it can be challenging to detect and correct errors or potential fraud.

c) Accessibility issues: Electronic voting may not be accessible to everyone, especially voters with disabilities or limited technical skills. The user interface of the application may also be challenging to navigate, potentially leading to errors or confusion.

d) Technical problems: Voting applications may experience technical glitches or problems that may cause delays, long queues or other disruptions. In some cases, the application may not be able to handle high traffic volumes, resulting in crashes or other problems.

III. PROPOSED SYSTEM

A. User Registration: The first step would be for the user to register on the online voting platform and provide their personal details such as name, address, date of birth and a valid email address.

B. OAuth 2.0 Authorization: After registration, the user will be asked to allow the online voting platform to use the authentication server of their OAuth 2.0 provider. This is used to confirm their identity and credentials for future logins.

C. Voter Verification: Once a user is logged in, the online voting application verifies their voting status using a database of eligible voters provided by the Election Commission. This verification process ensures that only eligible voters can participate in online voting.

D. Ballot Choice: The user is then presented with the ballot choices. The ballots would be pre-populated based on the user's registered address and the candidates for the various positions.

E. Voting: The user can select suitable candidates and vote. The application would ensure that each user can only cast one vote and prevent multiple votes from the same user.
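The single-vote rule in the Voting step can be sketched as a small in-memory guard; the class and method names are hypothetical, and a real deployment would persist votes durably and tie the voter id to the OAuth-verified identity:

```python
class Ballot:
    """Minimal in-memory sketch of the one-vote-per-voter rule:
    a verified voter id may be recorded at most once."""

    def __init__(self, candidates):
        self.tallies = {c: 0 for c in candidates}
        self._voted = set()          # voter ids that have already cast a vote

    def cast(self, voter_id, candidate):
        """Record a vote; return False if this voter already voted."""
        if voter_id in self._voted:
            return False             # duplicate vote rejected
        if candidate not in self.tallies:
            raise ValueError("unknown candidate")
        self._voted.add(voter_id)
        self.tallies[candidate] += 1
        return True
```

The same tallies dictionary would back the Result feature described later, since the final counts per candidate are just its contents.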
The aim of this study is to propose an online voting architecture that is secure, user-friendly and efficient. The proposed architecture consists of a login page, a dashboard, voter registration, polls, results and a helpline.

i. Login page: The login page is the first point of entry for users. It provides users with a secure interface that allows them to authenticate themselves using credentials such as username and password. The authentication process is supported by a series of APIs that communicate with the database to verify the user's identity. Using APIs ensures that the authentication process is simplified, efficient and secure.

ii. Dashboard: The dashboard provides users with a personalized view of their account information, including profile information and other related details. The panel also provides access to the various electoral functions such as voter registration, polls, results and the helpline.

iii. Voter registration: The voter registration feature allows users to register to vote in elections. For registration, the user receives a unique 10-digit Aadhaar duplicate number, which they need to link with their mobile number via OTP. This process ensures that each user's identity is verified and that they are allowed to vote.

iv. Polls: The voting feature allows users to view the political parties and candidates participating in the current election. Each political party has a brief description along with its candidates. Users can vote by selecting the desired candidate from the list. To ensure the security and integrity of the voting process, users must enter their duplicate Aadhaar number and authenticate using OAuth.

v. Result: The results feature shows the election results, including the votes received by each party and candidate. The system automatically calculates and displays the final results, indicating the winning party and candidate.

Factors such as user experience, security and accessibility are important to consider when designing a voting application. The program must be simple and easy to use as well as provide secure and private voting that prevents fraud or malicious activity.

User authentication and authorization using OAuth 2.0 and OpenID Connect can provide additional security and customization capabilities to a voting application, allowing users to log in with their existing credentials and ensuring that only authorized users can access application functions. However, it is important to note that while OAuth 2.0 may improve the security of an online voting system, it is not a panacea. Other security measures such as two-factor authentication and end-to-end encryption should also be implemented to strengthen system security. In addition, the system must be rigorously tested and verified to identify and fix potential vulnerabilities.

Overall, a well-designed and secure voting program can be a valuable tool for organizations and communities that want to collect input and feedback from voters. It can help promote openness and inclusiveness while providing valuable information and insights that can be useful in decision-making processes. An online voting application using OAuth 2.0 as the authentication mechanism can provide a robust and reliable platform for conducting secure and transparent elections.

REFERENCES
[1] Arora, S., Singh & M. Aggarwal (2019). A secure online voting system using blockchain technology. Journal of Information Security and Applications.
[2] Kumar, R. & Wadhwani (2019). An efficient and secure electronic voting system based on blockchain technology and business intelligence.
[3] Kshetri, N. (2020). Blockchain's roles in meeting key supply chain management objectives. International Journal of Information Management.
[4] Mercuri, R. & Neff, C.A. (2019). Defending digital democracy: A multidisciplinary approach. Springer.
[5] Grewal, R. (2019). A review of online voting systems. International Journal of Advanced Research in Computer Science.
[6] Popovic, M. & Bojanic (2018). Security issues of electronic voting systems. Journal of Applied Engineering Science.
[7] Chaum, D. (2018). Scantegrity: End-to-end verifiability for optical scan voting systems using invisible ink confirmation codes. In Towards Trustworthy Elections.
[8] Haldar & Saha (2017). Secure remote electronic voting system using hybrid cryptosystem. In Intelligent Computing and Control Systems.
[9] Jin, Lu & Yang (2016). Security analysis of a recent online voting protocol. Security and Communication Networks.
[10] Sun, R. & Zhang, L. A hybrid voting system for high-integrity and verifiability.
Patient Case Similarity
The doctors can use this technique to help them choose wisely.

Drawbacks:
• High complexity.
• Highly inefficient.
• Requires skilled persons.

II. PROPOSED METHOD

This approach is employed to forecast disease based on symptoms. The system evaluates the model using a decision tree classifier and is used directly by end users. The technology forecasts disease based on symptoms and makes use of machine learning capabilities. We call this system "AI Therapist". It is intended for individuals who are always concerned about their health; as a result, we have included several elements that acknowledge this concern and also work to improve the user's mood. The "Disease Predictor" function for health awareness can thus identify diseases based on their symptoms.

ALGORITHMS AND METHODS:

Random Forest: Ensemble learning is the act of using multiple models that have all been trained on the same data and averaging their results to get more precise predictions or categorizations. The underlying premise of ensemble learning is that each model's flaws (in this case, a decision tree's) are unique and unrelated to one another.

Naive Bayes: The Naive Bayes method determines the probability that an object with specific characteristics belongs to a given group or class. Using an orange-coloured, globular, and pungent fruit as an example, you would probably infer that it was an orange if you were attempting to identify a fruit solely by its shade, shape, and flavour. Since all of these features combined increase the chances that the fruit is an orange, but are treated as independent, the method is referred to as being "naive". The name "Bayes" comes from the mathematician Thomas Bayes, whose theorem serves as the basis of the Naive Bayes algorithm.

SVM: Both classification and regression problems can be solved using the support vector machine (SVM), a supervised machine learning method. It is a crucial tool in fields such as speech recognition, machine learning, and bioinformatics. SVM uses a strategy based on a two-class, linearly separable setting and a hyperplane that maximises the geometric margin and minimises the classification error.

INTERFACES:

1. System:

1.1 Create Dataset: The dataset, including the symptoms to be categorised, is separated into training and testing sets, with the test size set at 20–30%.

1.2 Pre-processing: The data is scaled and rearranged into the right format for training our model.

1.3 Training: CNN deep learning, machine learning, and SVM techniques are utilised to train our model using the pre-processed training dataset.

1.4 Classification: The results of our model are displayed.

2. Patient:

2.1 Upload Symptoms: The user has to upload symptoms.

2.2 View Results: The predicted disease is displayed.
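The Naive Bayes description above can be made concrete with a toy predictor over binary symptom sets; the diseases, symptoms, and Laplace smoothing choice here are illustrative, not the paper's trained model:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSymptoms:
    """Toy Naive Bayes disease predictor over binary symptom vectors,
    with Laplace smoothing so unseen symptoms never zero out a class."""

    def __init__(self):
        self.class_counts = Counter()                # disease -> #cases
        self.feature_counts = defaultdict(Counter)   # disease -> symptom -> count
        self.symptoms = set()

    def fit(self, rows):
        """rows: iterable of (set_of_symptoms, disease) training cases."""
        for symptoms, disease in rows:
            self.class_counts[disease] += 1
            self.symptoms |= set(symptoms)
            for s in symptoms:
                self.feature_counts[disease][s] += 1

    def predict(self, symptoms):
        """Return the disease with the highest log-posterior."""
        total = sum(self.class_counts.values())
        best, best_score = None, float("-inf")
        for disease, n in self.class_counts.items():
            score = math.log(n / total)              # class prior
            for s in self.symptoms:
                p = (self.feature_counts[disease][s] + 1) / (n + 2)  # Laplace
                score += math.log(p if s in symptoms else 1 - p)
            if score > best_score:
                best, best_score = disease, score
        return best
```

The naive part is visible in the inner loop: each symptom contributes its probability independently, exactly as in the orange example.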
ARCHITECTURE DIAGRAM

We have therefore come to the conclusion that machine learning can be utilised to track our health in an efficient manner. We can maintain our health by periodically getting a free health check. When the machine learning technique is built and deployed using a Python web framework and later turned into a website on that domain, it will be freely accessible to everyone. In order for our model to forecast the optimal outcome, the user only needs to visit the relevant page and choose 5 to 8 symptoms. After receiving the prediction, the user will gain insight into their health and, if necessary, contact the appropriate doctor.

REFERENCES
[5] Balasubramanian, Satyabhama, and Balaji Subramanian. "Symptom based disease prediction in medical system by using K-means algorithm." International Journal of Advances in Computer Science and Technology 3.
[6] Dhenakaran, K. Rajalakshmi, Dr. S. S. "Analysis of data mining prediction techniques in healthcare management system." International Journal of Advanced Research in Computer Science and Software Engineering 5.4 (2015).
[7] Maniruzzaman, M., Rahman, M., Ahammed, B. and Abedin, M., 2020. Classification and prediction of diabetes disease using machine learning paradigm. Health Information Science and Systems, 8(1), pp. 1-14.
Networking Platform for E-Sport Players
Abstract— This research paper aims to investigate the need for a networking platform for e-sport players. With the rapid growth of the e-sport industry and the increasing number of players and enthusiasts, there is a pressing need for a dedicated networking platform to connect e-sport players. It can be challenging for players to find the right team, and for clubs to find the best players to recruit. A centralized platform would serve as a social hub for players to interact and collaborate with each other, and for clubs and organizations to identify and connect with talented players. This paper will explore the potential benefits of such a platform and analyze the current state of e-sport player networking. The findings of this research will provide insights into the feasibility and viability of a dedicated networking platform for e-sport players, and the potential impact it could have on the e-sport industry as a whole.

Keywords—E-sports, Online Gaming Community, Networking Platform

VI. INTRODUCTION

The term "e-sports" pertains to the world of competitive video gaming, wherein players or teams engage in head-to-head battles or tournaments featuring a range of video games. These events often draw large audiences and can be seen as the digital equivalent of traditional sports competitions. E-sports has grown rapidly in recent years, with the industry estimated to be worth over $1.38 billion in 2022 [1]. This growth can be attributed to several factors, including the increasing popularity of video games, the rise of streaming platforms like Twitch and YouTube, and the growing acceptance of e-sports as a legitimate form of entertainment.

Online gaming has become a widespread phenomenon in recent years, with millions of people across the world participating in various online games. Along with this growth in popularity, the emergence of online gaming communities has become a significant aspect of the online gaming experience. These communities can range in size from small groups of gamers who enjoy playing together to massive groups of thousands of people, and they can exist both within and outside of the game.

Communities play a crucial role for e-sport players and teams, as they can provide numerous benefits that enhance their career prospects and overall success in the industry. Networking allows players and teams to build relationships with other professionals in the industry. These connections can provide opportunities for collaboration, teaming up for tournaments, or even securing sponsorship deals and endorsements. Networking also provides players and teams with access to valuable industry knowledge and insights. By connecting with others in the industry, players can learn about new strategies, tactics, and gameplay techniques that can help them improve their performance and increase their visibility within the industry.
VII. BACKGROUND

The e-sports industry has experienced significant growth in recent years and is expected to continue growing. The COVID-19 pandemic further accelerated this growth as more people turned to online entertainment. E-sport market revenue is anticipated to reach 1.87 billion US dollars in 2025 [1]. According to a survey [2] in 2022, global e-sport viewership reached 532 million, and by 2025 the viewership count is expected to be over 640 million. Asia and North America are currently the biggest markets for e-sports.

One of the main reasons is the accessibility of online gaming. With the widespread availability of high-speed internet, more and more people can now play video games online and connect with other players from around the world. This has led to the formation of online communities, which provide players with a platform to socialize, compete and improve their skills [3].

Fig. 1. eSports market revenue worldwide from 2020 to 2025 (in million U.S. dollars) [1]

Networking plays a crucial role in e-sports, both for individual players and for teams. E-sports is a highly competitive field, and the competition for the best players is fierce. Networking can help players and teams to connect with each other and form partnerships that can lead to success. Teams are constantly looking for talented players to join their rosters, and networking can help them find those players.

According to Hsiao, C. C., & Chiou, J. S. [4], social networks and relationships can have value and benefits for individuals. The study finds that players who have a higher position in the online community, such as those with more connections and influence, are more likely to have higher levels of community trust and perceive more social value. Players who have built a strong network within the e-sports community are more likely to be noticed by teams looking to recruit new talent. Additionally, players who have a strong reputation within the community are more likely to be recommended to teams.

Networking can also help players to build their personal brands and establish themselves as valuable members of the e-sports community. By networking with others, players can gain exposure and build relationships that can help them advance their careers. This can lead to opportunities for sponsorship deals, endorsements, and other forms of income that can support their gaming careers.

However, finding and connecting with each other can be a daunting task for players and clubs, as they may encounter various challenges that hinder the process. Geographic barriers can make it difficult for them to physically meet, and there is a lack of networking opportunities, particularly for players who may not be part of established leagues or organizations. Barriers that obstruct effective communication, such as differences in language and cultural norms, can also hinder connections between people. Clubs may have limited resources to find players or to host try-outs. Legal and regulatory challenges related to contracts, work permits, and eligibility can further hinder the process of connecting players and clubs.
In the study [5] author C. Won Jung examines
the relationship between game playing activities
and community involvement and self-
Fig. 2. eSports audience size worldwide from 2020 to 2025, by type
identification as a gamer. It found that game
of viewers(in millions) [2] communities serve as public spheres and that game
playing encourages social consciousness and
behaviour such as engaging in public discourse
55
and community activities. The study extends the and offer players and clubs a more comprehensive
subject of game studies beyond the notion of solution for their networking needs.
addiction vs. education and fitness, and suggests There is a need for more community
that games are a social simulator that allows for engagement features. While many existing
social experience that may be transferred to platforms offer basic communication and social
positive real-life consequences. networking features, there is a need for more
VIII. EXISITNG PLATFORMS AND THEIR LIMITATIONS robust community engagement features, such as
forums, mentorship programs, and collaboration
There are several existing e-sport player
tools. A dedicated networking platform that offers
networking platforms, each with its own set of
more community engagement features could help
features. We explored few of these platforms and
players and clubs build stronger relationships and
found great features however felt that there are
better support each other in meaningful ways.
some gaps that needs to be addressed.
There is no central platform where both clubs
Most of the platforms offer a range of basic
and players can interact with each other. Most of
features to their users, such as chat, cross platform
the networking platforms are focused on LFG and
support, and the option to filter profiles based on
are only limited to player-to-player networking.
one's individual needs.
No platform hosts both the players and clubs under
However, a significant number of platforms are the same roof, which can make it difficult to find
still missing some crucial features that would the right player for a club.
enhance user experience even further. For
E-sport players and clubs often use a variety of
instance, feed on the platform is a vital feature for
different tools and platforms to manage their
users as it allows them to share their experiences,
teams, track performance, and communicate with
opinions, and interests with others. Another
each other. A dedicated networking platform that
feature that is often missing in many platforms is
offers better integration with other e-sport tools,
the content upload feature. Users may not be able
such as team management software and
to showcase their creativity, which can result in a
communication apps, could help streamline the e-
lack of engagement and activity on the platform. It
sport experience and make it easier for players and
is concerning that many platforms lack adequate
clubs to manage their teams.
privacy controls, which is an essential aspect of
any online platform. After exploring several networking platforms
and finding them to be lacking in various ways, we
One of the networking platforms, GameTree [6]
realized that there was a need for a more
has an innovative approach for suggesting user
comprehensive and user-friendly platform. With
profiles. They designed different kinds personality
this realization, we set out to build a new platform
assessment which would improve the
that addressed the gaps we experienced with the
recommendations showed to the user. Apart from
existing platforms. The proposed platform aims to
that, the platform has option to filter profiles based
provide a seamless and intuitive user experience,
on game, gender, age, geographical location and
with features that cater to the needs of e-sport
language. This platform offers feed feature, chat
players. We believe that proposed platform will
rooms, personalized game recommendations based
bridge the gaps that we encountered, and provide a
on a user's preferences and play history.
one-stop solution for e-sport players to connect,
Despite the existence of several e-sport player collaborate and grow.
networking platforms, there are still gaps in the
market for a dedicated networking platform that IX. PROPOSED NETWORKING PLATFORM
could better meet the needs of players and clubs. The proposed e-sport networking platform is a
One of the gaps is the lack of support for a wider comprehensive online platform designed to
range of games. Many existing platforms focus on connect e-sport players, teams, clubs and
popular games like CS:GO, Dota 2, and League of organizations with each other, creating a space for
Legends, leaving players and clubs of less popular collaboration, competition, and career
or niche games struggling to find adequate advancement. The platform is designed to provide
support. A dedicated networking platform that users with the tools they need to showcase their
supports a wider range of games could fill this gap skills, connect with others, and potentially advance
their careers in the gaming industry. The proposed
56
platform would serve as a centralized hub for the e-sport community to connect and engage with each other, helping to drive the growth and development of the e-sports industry.
... that users are aware of their rights and obligations when participating in competitions and events.
Additionally, the platform uses content matching and collaborative filtering to recommend profiles to players and clubs. This feature enables users to discover and connect with other players who share their interests or skill levels, creating a space for collaboration and competition.
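The profile-recommendation idea described above can be illustrated with a small collaborative-filtering sketch. This is a minimal illustration, not the platform's actual implementation: the profile names, the game list, and the hour-based interest vectors are all hypothetical, and a real system would read stored user data rather than a hard-coded dictionary.

```python
from math import sqrt

# Hypothetical interest vectors: weekly hours spent on
# [CS:GO, Dota 2, League of Legends, Valorant].
profiles = {
    "alice":   [10, 0, 2, 8],
    "bob":     [9, 1, 0, 7],
    "charlie": [0, 12, 6, 0],
}

def cosine(u, v):
    """Cosine similarity between two preference vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(user, k=1):
    """Return the k profiles most similar to `user`."""
    ranked = sorted(
        ((other, cosine(profiles[user], vec))
         for other, vec in profiles.items() if other != user),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

print(recommend("alice"))  # alice and bob favour the same games
```

Content matching works the same way, except that the vectors describe profile attributes (games played, role, region) rather than observed interest overlap.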
A. Architecture

The proposed platform is built with the MERN stack. The frontend layer is built using React.js. It also communicates with Cloudinary via API calls to store media assets; the URLs of the media assets are stored in the database. The server-side logic layer is built using Node.js and Express.js, and MongoDB Atlas is used for storing and managing data.
C. Potential Benefits
... factor in facilitating second language (L2) learning through gaming, with extended online gaming communities providing support for language learning through paratexts and advice. The study suggests that organizing L2 gaming practices can reflect a gamer's L2 learning trajectory and that game-related paratexts in both L1 and L2 form the funds of knowledge for many L2 gamers. Additionally, the study emphasizes the importance of providing structures and guidance for young L2 learners on how to use L2 games to learn autonomously. While the study has certain limitations, its findings have important research and pedagogical implications.
According to L. Connor [3], online gaming communities have a significant impact on gamers and are just as essential as any real-world community. Online gaming has grown in popularity, with millions of individuals across the world actively participating in online games and building interactions and connections with other gamers. These communities can range in size from a few people to thousands, and can exist both within and outside of the game. Some gaming communities emerge around gamers who like playing together, and these communities can outlive particular games. The paper argues that these communities provide opportunities for gamers to connect with others who share their interests and can lead to lifelong friendships. It provides examples of popular online games and their associated communities, demonstrating the impact of these communities on the gaming world.

D. Feasibility and Viability

The proposed e-sport networking platform appears to be feasible and viable given the growing popularity of e-sports and the increasing demand for a platform that provides a centralized location for gamers to showcase their skills and connect with others in the gaming community. There is a clear need for a platform that helps gamers easily manage and showcase their content. By offering a platform for organizations to find and hire talented players, the platform can provide a valuable service for both players and organizations.
However, the success of the platform will depend on several factors, such as effective marketing, user adoption, and the ability to attract and retain users.

X. CONCLUSION

In conclusion, the proposed e-sport networking platform offers a comprehensive solution to the growing demand for a centralized platform that caters to the needs of gamers, teams, and organizations. The platform can benefit gamers by providing them with a place to showcase their skills and achievements and to potentially connect with gaming organizations and sponsors. The ability to upload, edit, and manage content in one place, as well as the inclusion of legal documents such as contracts, can promote transparency, fairness, and cooperation within the community. Additionally, the platform's matchmaking system can help connect players with similar skill levels and provide opportunities for advancement. While there may be some challenges in implementing the platform, the potential benefits and demand for such a platform make it a feasible and viable project. Overall, the e-sport networking platform has the potential to revolutionize the e-sport industry and provide a valuable resource for gamers, teams, and organizations alike.

REFERENCES

[1] C. Gough, "Global Esports Market Revenue 2025," Statista, 22-Sep-2022. [Online]. Available: https://www.statista.com/statistics/490522/global-esports-market-revenue/
[2] C. Gough, "Global eSports audience size by Viewer Type 2025," Statista, 27-Jul-2022. [Online]. Available: https://www.statista.com/statistics/490480/global-esports-audience-size-viewer-type/
[3] L. Connor, "Online gaming and the communities that it creates," Debating Communities and Networks 11, 01-May-2020. [Online]. Available: https://networkconference.netstudies.org/2020OUA/2020/05/01/online-gaming-and-the-communities-that-it-creates/
[4] C.-C. Hsiao and J.-S. Chiou, "The impact of online community position on online game continuance intention: Do game knowledge and community size matter?," Information & Management, vol. 49, no. 6, pp. 292–300, 2012.
[5] C. Won Jung, "Role of gamers' communicative ecology on game community involvement and self-identification of gamer," Computers in Human Behavior, vol. 104, p. 106164, 2020.
[6] "GameTree – LFG Find Gamer Friends," GameTree. [Online]. Available: https://gametree.me/
[7] A. Chik, "Digital gaming and language learning: Autonomy and community," 01-Jun-2014. [Online]. Available: https://scholarspace.manoa.hawaii.edu/bitstream/10125/44371/1/18_02_chik.pdf
Optimizing Website Performance: How Google Analytics Can Improve User
Experience
This study looked at the effect of website quality on prospective internet buyers. The findings showed that website quality significantly increased the likelihood of making an online purchase, which suggests that website optimization and analysis can help companies enhance the quality of their websites to boost online sales. [7]
In the context of Saudi Arabian online shopping behavior, this study looked at the effect of website quality on customer loyalty. The results showed that consumer loyalty was significantly positively impacted by website quality, underlining how crucial website analysis and optimization are for firms looking to increase client retention and loyalty. [8]
This essay evaluates the literature on many facets of digital marketing and provides a framework for it. It emphasizes the significance of website optimization and analysis as a crucial element of digital marketing to boost website performance and customer engagement. [9]
In the context of the online retail industry, this study looked into the effect of website quality on client loyalty. The findings showed that consumer loyalty was significantly positively impacted by website quality, underlining the significance of website optimization and analysis for online retailers looking to increase client retention. [10]
This study looked at how user satisfaction relates to website quality in the context of online ticketing systems. The findings demonstrated that user satisfaction was significantly positively impacted by website quality, emphasizing how crucial website optimization and analysis are for online ticketing systems to improve customer satisfaction and sales. [11]
An overview of APM (application performance monitoring) methods for microservice-based applications is provided in this paper. The authors discuss distributed tracing, monitoring, and analytic problems associated with APM in the context of microservices, and emphasize the value of APM in assuring service quality and user experience in applications that use microservices. The authors also give an overview of several APM frameworks and tools that can be applied to the microservices environment. [12]

IV. METHODOLOGIES

Real-Time Analytics is one of Google Analytics' most important tools for tracking and improving website performance. With its help, website owners can monitor and assess visitor activity in real time. By providing data on user interactions such as page load times, bounce rates, and user behavior, real-time analytics helps website administrators identify potential performance issues and improve the user experience.
Google Analytics' Real-Time Analytics offers insightful data on how website visitors behave. By tracking user behavior in real time, website owners can determine which pages are doing well and which ones are causing problems for users. This data can guide data-driven decisions about the functionality and design of websites, which can ultimately result in increased user engagement and conversion rates.

1) Real-Time Analytics: Real-Time Analytics offers in-the-moment data on user activity on a website. This makes it possible for website administrators to keep an eye on visitor behavior, track conversions, and spot any performance problems that might be degrading the user experience.

2) Behavior Flow Analysis: This method follows a user's journey through a website, from the first landing page to the successful conversion. By analyzing the user behavior flow, website owners can discover portions of the site that may be generating user drop-offs and optimize those pages to enhance the user experience.

3) Conversion Tracking: Conversion tracking enables website owners to monitor particular user behaviors, like making a purchase or submitting a form. This data can be used to find the website's most effective conversion-driving locations, as well as any areas that might benefit from optimization.

4) Funnel Visualization: This technique shows the steps a website's users must follow to finish a certain action, such as placing an order. Website owners can enhance
conversion rates by identifying the points in the funnel where users lose interest and optimizing those processes.

5) Site Speed Analysis: Site speed analysis gauges how long a website takes to load and offers information on how website speed affects user behavior. This information can be utilized to enhance the user experience and increase website speed.

Google Analytics provides a full range of tools for monitoring and enhancing website performance to guarantee peak speed and an excellent user experience. With the help of these strategies, website owners can learn a great deal about user behavior, spot performance problems, and take action to enhance the user experience and improve commercial results.

V. GOOGLE ANALYTICS INSIGHTS

Summarization: In this technique, results from various dimensions or metrics are added together. For instance, website owners can compute the total number of pageviews, sessions, or clicks on a certain website component.
Average: This method takes the average value of a metric across multiple dimensions. For instance, website owners can determine the average time spent on a page, the average session length, or the average number of pages per session.
Count: In this approach, the quantity of a specific dimension or metric is counted. For instance, website owners might track the number of sessions, unique visitors, or clicks on a particular component of their websites.
Min Max: Finding the lowest or highest value of a measure over many dimensions is the goal of the min and max procedures. Website owners can use these techniques, for instance, to determine the minimum and maximum time spent on a page, or the minimum and maximum number of pages per session.
Percentiles: Using this technique, data is divided into equal portions based on a predetermined metric. Website owners can use it, for instance, to figure out the 90th percentile of page load times, which is the time within which 90% of users can load a specific page.

VI. DASHBOARD

A user interface known as a "Google Analytics dashboard" offers a summary of important performance indicators and analytics pertaining to website traffic and user behavior. It is a configurable platform that enables website owners to track and examine data in real time to learn more about the effectiveness of their website. The following terms can be found in the Google Analytics dashboard:

1) Audience Overview: This section gives an overview of the website's audience, including the number of visitors, sessions, and pageviews. It also includes demographic information on the website's visitors, such as their age, gender, and location.

2) Acquisition Overview: This part contains details about how visitors arrived at the website, such as whether the traffic came via direct traffic, social media, paid search, or organic search.

3) User Behavior: This section gives information on user behavior, such as pageviews, average session length, bounce rate, and the most popular web pages.

4) Conversion Overview: This section provides information about the website's conversion objectives, including the number of transactions, revenue, and conversion rate.

5) Real-Time: This area gives real-time information about the number of users who are actively using the website, where they are located, and the pages they are currently viewing.

6) Custom Reports: This feature enables website owners to produce tailored reports that offer specific information and performance measures.
7) Goals: This tool lets website owners specify and track particular conversion targets, such as form submissions or sales, and measure how well they are doing over time.

8) Events: This function enables website administrators to monitor individual user actions, such as button clicks or file downloads.

Fig 4: Overview of page views.
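The aggregation techniques listed under Google Analytics Insights (summarization, average, count, min/max, and percentiles) can also be reproduced offline over exported data. The sketch below applies them to a hypothetical set of per-session records; the sample values are invented for illustration, and the nearest-rank percentile shown is only one of several common percentile definitions.

```python
# Hypothetical exported records: page load times (seconds) and pageviews.
load_times = [0.8, 1.2, 0.9, 0.9, 1.0, 1.1, 0.7, 1.4, 2.5, 3.0]
pageviews = [3, 5, 2, 8, 4, 1, 6, 3, 2, 4]

total_pageviews = sum(pageviews)                     # summarization
avg_pages = total_pageviews / len(pageviews)         # average
session_count = len(pageviews)                       # count
fastest, slowest = min(load_times), max(load_times)  # min / max

def percentile(data, p):
    """Nearest-rank percentile: the value below which p% of samples fall."""
    ordered = sorted(data)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

p90 = percentile(load_times, 90)  # 90% of pages load within this time
```

The same handful of reductions underlies most of the dashboard widgets described above; the dashboard simply recomputes them per dimension (page, channel, device) and renders the results.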
... website and their behavior in real time, thanks to real-time monitoring. With goal tracking, you can keep track of particular actions that site users take, such as submitting forms or making purchases. Google Analytics also provides custom reports and dashboards, which let you design unique views of the data from your website. Businesses that want to measure particular KPIs (key performance indicators) pertinent to their objectives may find this especially helpful.
In general, Google Analytics is a crucial tool for monitoring and optimizing websites. It helps guide decision-making about how to improve the user experience and website performance by giving you useful insight into traffic and user behavior.
Along with the interface, Google Analytics features a sizable vocabulary exclusive to its platform. The following are some of the key terms to comprehend:
Sessions: A session is a collection of interactions that happen on your website over the course of a specific period of time. Multiple pageviews, interactions, and events can occur during a single session.
Pageviews: Each time a person accesses a page on your website, a pageview is logged.
Bounce rate: The bounce rate is the percentage of visitors to your website who leave after viewing only one page.
Conversion rate: The conversion rate is the proportion of visitors to your website who carry out a desired action, like making a purchase or completing a form.
We have created an e-commerce website using HTML and CSS, with PHP for the backend. Mentioned below are the results and insights generated from our website.

VIII. OUR RESULTS

Fig 6: The Gadget House Website for monitoring.

9. Acquisition overview shows the number of users and new users.

Fig 7: Acquisition overview.

10. Shows direct and indirect user count. (We do not have indirect traffic.)

Fig 8: Traffic overview.

11. First user analytics: shows how many new users visited the site.
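The bounce-rate and conversion-rate definitions above reduce to simple ratios over session records. The sketch below computes both from a hypothetical session log; the field names and values are invented, and a real site would derive them from its analytics export rather than a literal list.

```python
# Hypothetical session log: pages viewed and whether the session converted.
sessions = [
    {"pages": 1, "converted": False},
    {"pages": 4, "converted": True},
    {"pages": 1, "converted": False},
    {"pages": 6, "converted": True},
    {"pages": 2, "converted": False},
]

# Bounce rate: share of sessions that viewed only one page.
bounce_rate = 100 * sum(s["pages"] == 1 for s in sessions) / len(sessions)

# Conversion rate: share of sessions that completed the desired action.
conversion_rate = 100 * sum(s["converted"] for s in sessions) / len(sessions)
```

With this toy log, two of the five sessions bounce and two convert, so both rates come out to 40%; on a real site the two numbers are of course independent.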
Fig 9: New user data.

16. Detailed analysis of the pages.

Increased use of machine learning and AI: Machine learning and AI can be used to automate the analysis of website data, making it easier for
website owners to identify performance issues and optimize their website's performance.
Integration with other tools and technologies: APM and Google Analytics can be integrated with other tools and technologies such as containerization, serverless computing, and edge computing to provide more comprehensive insights into website performance.
Real-time monitoring: Real-time monitoring capabilities can be enhanced to provide instant alerts and notifications for any performance issues.
Deeper insights into user behavior: Google Analytics can be enhanced to provide deeper insights into user behavior, such as clickstreams, session replay, and heat maps, which can help website owners identify areas for improvement.
Mobile app monitoring: With the increasing popularity of mobile apps, APM and Google Analytics can be enhanced to provide more comprehensive monitoring and analytics capabilities for mobile applications.

X. CONCLUSION

Google Analytics is a potent web analytics tool that gives businesses the ability to track and improve the functionality of their websites. It provides a wide range of capabilities, such as website traffic tracking, user behavior monitoring, and data analysis, to support organizations in making defensible judgements about their website strategy.
Organizations can use Google Analytics to monitor crucial performance indicators like page views, bounce rates, and conversion rates. This data can be utilized to enhance the user experience, boost website traffic, and optimize website design. Additionally, the tool offers information on user behavior, such as how visitors use the website, which pages they visit most frequently, and how much time they spend on each page.
Creating custom reports and dashboards is one of the main advantages of utilizing Google Analytics. This enables businesses to monitor particular indicators and learn more about how consumers engage with their website. Additionally, Google Analytics delivers real-time monitoring, enabling businesses to recognize and handle any issues as they emerge.
To provide a thorough perspective of website performance and advertising campaigns, Google Analytics also connects with a wide number of other programs and platforms, such as Google Ads. Based on information about user behavior and performance, this integration helps firms optimize their online advertising strategy.
In general, Google Analytics is an effective tool for businesses trying to track and improve the performance of their websites. Businesses can use it to make data-driven decisions and enhance their online presence because it offers insightful data on user behavior and performance indicators.

REFERENCES

[1] Gao, H., Wang, Y., & Chen, Y. (2017). Application performance monitoring: A review and taxonomy. Journal of Network and Computer Applications, 83, 73-89. doi: 10.1016/j.jnca.2017.01.003
[2] Salsbury, C., & Beck, J. (2018). Using Google Analytics as a web application performance monitoring tool. International Journal of Web Information Systems, 14(3), 280-289. doi: 10.1108/IJWIS-02-2018-0012
[3] Wang, Y., Gao, H., & Chen, Y. (2018). Performance evaluation and prediction of web applications: A review. Journal of Systems and Software, 140, 10-26. doi: 10.1016/j.jss.2018.02.024
[4] Cheng, Y., Liu, X., & Li, J. (2019). Real-time APM in microservices architectures: A review. Journal of Systems and Software, 157, 110392. doi: 10.1016/j.jss.2019.110392
[5] Fang, J., Wang, W., & Li, X. (2020). Web-based application performance monitoring with Google Analytics. International Journal of Grid and Utility Computing, 11(2), 111-119. doi: 10.1504/IJGUC.2020.107168
[6] Kim, J., & Koo, C. (2014). The impact of website quality on customer satisfaction and purchase intentions: Evidence from B2C e-commerce in Korea. International Journal of Electronic Commerce, 18(1), 69-97. doi: 10.2753/JEC1086-4415180103
[7] Hu, X., Lin, Z., & Huang, L. (2016). A study on the impact of website quality on online purchase intention. Journal of Electronic Commerce Research, 17(1), 1-13.
[8] Ali, H., & Alkibsi, A. (2017). The impact of website quality on customer loyalty: A study of online shopping behavior in Saudi Arabia. Journal of Theoretical and Applied Electronic Commerce Research, 12(3), 37-52.
[9] Kannan, P. K., & Li, H. (2017). Digital marketing: A framework, review and research agenda. International Journal of Research in Marketing, 34(1), 22-45. doi: 10.1016/j.ijresmar.2016.11.006
[10] Chen, X., & Chen, Y. (2018). Research on the impact of website quality on customer loyalty in the online retail industry. Journal of Theoretical and Applied Electronic Commerce Research, 13(1), 1-16.
[11] Huang, J. T., & Chou, H. Y. (2019). The impact of website quality on user satisfaction in the context of online ticketing services. Journal of Business Research, 100, 169-179. doi: 10.1016/j.jbusres.2019.02.012
[12] Zhai, Y., Dong, C., Zhang, X., & Yang, W. (2020). A survey of microservice-based application performance monitoring. Journal of Internet Technology, 21(5), 1435-1448.
[13] Alam, A., Rashid, I., & Raza, K. (2021). Data mining techniques' use, functionality, and security issues in healthcare informatics. In Translational Bioinformatics in Healthcare and Medicine (pp. 149-156). Academic Press.
[14] Alam, A., & Muqeem, M. (2022, March). k-means clustering and a nature-inspired optimization technique combined for disease prediction on large-scale data. In 2022 International Conference on Electronics and Renewable Systems (ICEARS) (pp. 1556-1561). IEEE.
[15] Alam, A., & Muqeem, M. (2022, October). K-Means Integrated with Enhanced Firefly Algorithms for Automatic Clustering to Select the Optimal Number of Clusters. In 2022 2nd International Conference on Technological Advancements in Computational Sciences (ICTACS) (pp. 343-347). IEEE.
Predicting The Onset Of Lifestyle Diseases

Madhura H C S, Information Science and Engineering, Presidency University, Bangalore, India, 201910101781@presidencyuniversity.in
Manvita M, Information Science and Engineering, Presidency University, Bangalore, India, 201910101629@presidencyuniversity.in
Parikshith N, Information Science and Engineering, Presidency University, Bangalore, India, 201910100312@presidencyuniversity.in
... nearest neighbour, XGBoost, random forest, logistic regression, and a 1D convolutional neural network. All analyses were carried out by successively entering the characteristics in three stages according to their properties. After using the synthetic minority oversampling technique (SMOTE) to address the data imbalance, the models' results were compared. The study concluded that tree-based machine learning models could accurately identify MetS in middle-aged Koreans. Early MetS diagnosis is crucial and necessitates a multifaceted strategy that includes self-administered questionnaires, anthropometric measurements, and metabolic tests.
In a different study [3], the authors created a model that examines the information given by the user and forecasts the diseases that he or she may be likely to suffer from. In addition to providing forecasts, the model also teaches users how to avoid common lifestyle diseases and offers management strategies in the event that they experience moderate symptoms. This project educates individuals about their health so that, if necessary, they can receive treatment promptly and thus save countless lives. The processes involved in this study include identifying lifestyle diseases at an early stage, preventing these diseases, and managing them. The diseases focused on in this study are heart disease, breast cancer, diabetes, and hypertension. Different algorithms are implemented to identify the respective diseases: clustering is used to detect heart disease and diabetes, while Naïve Bayes, a backpropagation neural network, and a decision tree are used to predict the survivability rate of breast cancer patients. By using a typical machine learning technique that evaluates on a portion of the whole dataset entirely distinct from the training set, the model can be tested and confirmed; this helps predict how accurate the model will be. If the algorithm is unable to reach the desired accuracy level, the techniques employed must be changed, the algorithms must be trained on a new dataset, and the entire procedure must be redone.

XIII. METHODS USED

A. Support Vector Machine

SVM, a commonly employed supervised machine learning method for classification and regression analysis, works by determining the optimal hyperplane that maximizes the separation between classes. This algorithm partitions data points into distinct classes. The data points closest to the hyperplane, which determine where it lies, are the support vectors.
Working: Input data - SVM uses numerical values as its input data. For instance, the input data for a binary classification problem (two classes) would consist of a set of numerical features and labels designating the two classes. Determine the separating hyperplane - SVM seeks the ideal hyperplane that divides the classes with the greatest margin. The width of the margin is the separation between the nearest points in both classes and the hyperplane; the hyperplane that maximizes this distance is the ideal one. Mapping to higher dimensions - In some cases, the data's original feature space cannot be divided by a straight line. In these situations, SVM transforms the data into a higher dimension that allows for linear separation. Calculation of support vectors - The data points that lie closest to the hyperplane, or those on the margin, are the support vectors; the hyperplane's location is determined by the position of these support vectors. Classification of new data - Once the hyperplane has been identified, SVM can categorize new data points by mapping them into the same feature space and determining which side of the hyperplane they lie on.
SVM is used frequently in applications like text analysis and image classification because it can handle both linear and non-linear data. Due to its strength in handling high-dimensional data, robustness to outliers, adaptability in kernel functions, good generalization efficiency, and capability to deal with both binary and multi-class classification problems, SVM is an effective machine learning model that is appropriate for the prediction of lifestyle diseases.
A decision tree refers to a supervised learning
technique employed for the purposes of
classification or regression analysis. To make
predictions or decisions, the process involves
splitting the data into subsets according to the
input feature values and iteratively making decisions based on these subsets. Each internal node represents a feature, and each leaf node represents a class label or regression value. It is visualized as a tree-like structure.

Working: The decision tree algorithm divides the data into subsets recursively according to the values of the input features. In this procedure, the optimal feature to divide the data into classes at each stage is chosen based on a criterion that maximises class separation or reduces variation within each subset. The algorithm chooses the optimal characteristic to divide the data into two or more subsets, starting with the complete dataset at the root node. The process is repeated for each subset until a stopping condition is met, such as reaching a maximum depth or a minimum number of occurrences in each leaf node. A decision is made based on the value of the chosen feature at each internal node of the tree, and the data is partitioned as a result. The procedure is repeated down to the leaf nodes, which represent the ultimate predictions or judgements. By navigating the tree from the root node to the appropriate leaf node based on the values of the input characteristics, the resulting decision tree can be used to predict the class label or regression value of future instances. Decision trees are used to forecast lifestyle disorders for a variety of reasons: they are non-parametric, meaning they make no assumptions regarding the data's underlying distribution, and they are resistant to noise and missing data values.

C. Gaussian Naïve Bayes

A probabilistic classification approach called Gaussian Naive Bayes is grounded in the Bayes theorem and assumes that the characteristics are independent and have a normal distribution. By evaluating the product of the probabilities of all the feature values given the class, it assesses the probability of a new observation being assigned to each class. The forecast is then given to the class with the highest likelihood.

Working: Training- The algorithm calculates the mean and variance of each feature for each class during the training phase using the training data. Probability Calculation- When a new observation is given to it, the algorithm first determines the prior likelihood of each class based on the proportion of instances of each class in the training data. The estimated mean and variance of each feature for that class are then used to compute the probability of each feature value given the class. Prediction- The procedure multiplies the probabilities for all features and all classes after calculating the likelihood of each feature value given the class. The projected class for the new observation is given to the class with the highest probability. Model tuning- The algorithm may employ many methods, including regularisation, feature selection, and hyperparameter tuning, to increase the model's accuracy. Test and evaluation- The model's effectiveness is assessed using a different test dataset as the last stage. The correctness of the model is assessed using evaluation metrics such as precision, recall, and F1-score on the test data set. It is adaptable in various application domains since it can handle both continuous and discrete data.

D. Random Forest

A group of decision trees is used in the supervised learning technique known as random forest to produce predictions. By selecting random subsets of attributes and instances from the training data, it builds numerous decision trees, combining the predictions of these trees to provide a final prediction. This method aids in lowering overfitting and enhancing the model's precision and generalizability.

Working- By using bootstrapping, which involves selecting instances at random with replacement, random subsets of the training data are produced. Using these bootstrapped subsets and a random subset of features at each node, several decision trees are built. A greedy technique is used to build the trees, which recursively splits the data based on the chosen features. The target variable for new instances is predicted using each decision tree. The forecast is based on either the average (for regression) or the majority vote (for classification) of all the predictions made by the forest's trees. A different validation set is used to assess the model's performance. To increase the precision and robustness of the model, the process of building the trees and making predictions is iterated many times with various subsets of the data. The random forest approach uses methods like feature subsampling and bagging to avoid overfitting and increase the model's generalizability. The ensemble of decision trees that is produced can be utilised for classification and regression applications and can handle
categorical and numerical input. Its high reliability and precision thanks to the ensemble of decision trees, its capability of handling datasets with many features and large dimensions, and its resistance to overfitting brought on by bagging and feature subsampling make it a suitable algorithm for the prediction of lifestyle diseases.
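The Gaussian Naïve Bayes procedure described in section C (per-class priors, feature means and variances, and a product of Gaussian likelihoods) can be sketched in plain Python; the toy data, feature values, and class labels below are illustrative only, not the paper's dataset:

```python
import math

def train_gnb(X, y):
    """Training step: per-class prior, and mean/variance of each feature."""
    stats = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows)
                     for col, m in zip(zip(*rows), means)]
        stats[c] = (len(rows) / len(y), means, variances)
    return stats

def gaussian(x, mean, var):
    """Normal-distribution likelihood of a single feature value."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def predict_gnb(stats, x):
    """Prediction step: prior times the product of per-feature likelihoods."""
    best, best_p = None, -1.0
    for c, (prior, means, variances) in stats.items():
        p = prior
        for xi, m, v in zip(x, means, variances):
            p *= gaussian(xi, m, v)
        if p > best_p:
            best, best_p = c, p
    return best

# Illustrative two-feature toy data with two well-separated classes
X = [[1.0, 2.0], [1.2, 1.8], [4.0, 5.0], [4.2, 5.1]]
y = ["healthy", "healthy", "at-risk", "at-risk"]
model = train_gnb(X, y)
print(predict_gnb(model, [1.1, 1.9]))  # healthy
```

The same fit/predict split applies to the other classifiers discussed above; only the per-class statistics change.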
values are taken from the confusion matrix in Figure.2.

Stage 6: The model chooses the disease that has the highest probability among all the classifiers and displays it.

XV. RESULT

After implementing our methodology, we have successfully trained our models to give accurate predictions of diseases caused by lifestyle activities. In Fig.3 we have mentioned the accuracy score of each algorithm in particular.

REFERENCES

[1]. ... of Chronic Diseases Using Machine Learning Approach. J Health Eng. 2022 Feb 15. doi: 10.1155/2022/2826127.

[2]. Junho Kim, Sujeong Mun, Siwoo Lee, Kyoungsik Jeong and Younghwa Baek. Prediction of metabolic and premetabolic syndromes using machine learning models with anthropometric, lifestyle, and biochemical factors from a middle-aged population in Korea. BMC Public Health 2022; 22:664. doi: 10.1186/s12889-022-13131-x.

[3]. Sakshi Gaur, Sarvesh Sharma, Ayush Tripathi. Easy Prediction of Lifestyle Diseases. 4 June 2021. EasyChair Preprint no. 5702.
[9]. Chih-Hung Jen, Chien-Chih Wang, Bernard C. Jiang, Yan-Hua Chu, Ming-Shu Chen. Application of classification techniques on development of an early warning system for chronic illnesses. Expert Systems with Applications, Volume 39, Issue 10, August 2012. doi.org/10.1016/j.eswa.2012.02.004.

[10]. Patekari S.A. and Parveen A., 2012. Prediction system for heart disease using Naïve Bayes. International Journal of Advanced Computer and Mathematical Sciences, pp. 290–294.

[11]. Suzuki A, Lindor K, St Saver J, Lymp J, Mendes F, Muto A, Okada T and Angulo P, 2005. Effect of changes in body weight and lifestyle in nonalcoholic fatty liver disease. Journal of Hepatology, 43(6), pp. 1060–1066.

[12]. Anand A and Shakti D, 2015. Prediction of diabetes based on personal lifestyle indicators. In Next Generation Computing Technologies (NGCT), 2015 1st International Conference on (pp. 673–676). IEEE.

[13]. Sharma M and Majumdar P.K, 2009. Occupational lifestyle diseases: An emerging issue. Indian Journal of Occupational and Environmental Medicine, 13(3), pp. 109–112.

[14]. P. Prabhu, S. Selvabharathi. Deep Belief Neural Network Model for Prediction of Diabetes Mellitus. In 2019 3rd International Conference on Imaging, Signal Processing and Communication, 2019 (pp. 138–142). Institute of Electrical and Electronics Engineers Inc. ISBN: 9781728136639. 2019.
Predictive Analysis of Heart Rate Using OpenCV

III. METHODOLOGY
The research paper mentioned aims to predict heart rate using OpenCV and the Haar cascade classifier for facial detection. The Fourier transform is used to analyze the frequency components of the heart rate signal.

The researchers used a video camera to capture facial images of participants while they performed a series of activities that increased their heart rate, such as jogging or jumping. They then applied the Haar cascade classifier to detect the face in each frame of the video, and extracted the region of interest around the forehead, where the pulsation of the blood vessels can be measured.

Next, the researchers used the Fourier transform to analyze the frequency components of the pulsation signal, which allows them to identify the heart rate. They compared their results with a reference heart rate obtained from a pulse oximeter, a non-invasive medical device that measures the oxygen saturation in the blood.
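The frequency-analysis step can be sketched as follows. This is a minimal stdlib-only illustration that assumes the forehead-intensity signal has already been extracted (the Haar-cascade face detection is omitted); the sampling rate, band limits, and synthetic signal are illustrative, not the paper's actual values:

```python
import math

def estimate_heart_rate(signal, fps, lo=0.7, hi=4.0):
    """Estimate heart rate (bpm) from a forehead-intensity signal.

    A naive discrete Fourier transform scans the plausible heart-rate
    band (lo..hi Hz) and returns the dominant frequency in beats/min.
    """
    n = len(signal)
    mean = sum(signal) / n
    xs = [s - mean for s in signal]  # remove the DC component
    best_k, best_mag = None, -1.0
    for k in range(1, n // 2):
        freq = k * fps / n
        if not (lo <= freq <= hi):
            continue
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(xs))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(xs))
        mag = re * re + im * im
        if mag > best_mag:
            best_mag, best_k = mag, k
    return best_k * fps / n * 60.0

# Synthetic pulsation: a 1.2 Hz sine sampled at 30 fps for 10 s -> 72 bpm
fps = 30.0
pulse = [100.0 + 0.5 * math.sin(2 * math.pi * 1.2 * (i / fps)) for i in range(300)]
print(round(estimate_heart_rate(pulse, fps)))  # 72
```

In practice an FFT (e.g. `numpy.fft.rfft`) would replace the naive DFT loop for speed; the band restriction filters out illumination drift and camera noise outside physiological heart rates.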
Overall, this research shows that it is possible to
use computer vision techniques to predict heart rate
from facial images, which has potential applications
in remote health monitoring and wellness tracking.
We would like to express our sincere gratitude to all the individuals who have contributed to the completion of our research paper titled "Predictive Analysis of Heart Rate Using OpenCV".

Firstly, we would like to extend our thanks to our supervisor for providing us with invaluable knowledge and insights throughout the entire research process.

We would also like to acknowledge everyone who helped, for their significant role in the data collection and analysis process. Their expertise has been instrumental in making this project a success.
REFERENCES
Md. Shahjalal, Md. Morshed Alam, Yeong Min Jang, 2020.
10. D. J. McDuff, J. R. Estepp, A. M. Piasecki, and E. B. Blackford, "A survey of remote optical photoplethysmographic imaging methods," in Proc. 37th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), Aug. 2015.
11. P. V. Rouast, M. T. P. Adam, R. Chiong, D. Cornforth, and E. Lux, "Remote heart rate measurement using low-cost RGB face video: A technical literature review," Front. Comput. Sci., pp. 1–15, Dec. 2017.
12. M.-Z. Poh, D. J. McDuff, and R. W. Picard, "Non-contact, automated cardiac pulse measurements using video imaging and blind source separation," Opt. Exp., vol. 18, no. 10, pp. 10762–10774, 2010.
Yashaswini M
Department of Computer Science and Engineering
Presidency University
Bangalore, India
YASHASWINI.20201LCS0005@presidencyuniversity.in

Swapnadeep Banik
Department of Computer Science and Engineering
Presidency University
Bangalore, India
202011100016@presidencyuniversity.in
Abstract— For user authentication and security, personal identification numbers (PINs) are frequently employed. Users must enter a physical PIN for password verification, which makes PINs susceptible to password cracking or hacking by thermal tracking or shoulder surfing. On the other hand, eye blink-based PIN entry methods leave no physical traces and provide a secure password entry choice. Eye blink-based authentication is the process of establishing a PIN by identifying eye blinks in a series of picture frames. To prevent shoulder surfing and thermal tracking assaults, this project offers a real-time application that combines face detection and eye blink-based PIN entering.

Keywords— Webcam, Authentication, Password, Real-time Systems

I. INTRODUCTION

People encounter authentication mechanisms every day and must verify themselves using knowledge-based methods like passwords, which need to be simple, quick, and safe. It is important to maintain a delicate balance between simplicity and security in terminal authentication systems, ensuring that they are user-friendly while also providing robust protection against potential threats and vulnerabilities. However, these methods are not secure, since nefarious observers can use surveillance methods like shoulder surfing (watching the user type the password while using the keyboard) to record user authentication information.

Security problems are sometimes brought on by insufficient communication between people and systems. The authors suggested a security framework consisting of three layers to protect PIN digits. To mitigate the risk of shoulder surfing, individuals can utilize eye blinking as a means of inputting their password by selecting the appropriate symbols in the
correct order. Eye blinks are a common form of communication, and security systems that track blinks present a promising option to increase system security and usability. This paper will examine various approaches and remedies for handling eye blinking in security systems.

Personal identification numbers (PINs) are frequently used as a form of user verification for a variety of purposes, including managing cash at ATMs, approving electronic transactions, unlocking mobile devices, and accessing doors. Even with PIN authentication, such as in financial systems and gateway management, authentication remains a constant challenge.

European ATM Security claims that, compared to 2015, ATM fraud attacks rose by 26% in 2016. Because the code must be entered by an authorised user in a public or open location, PIN entry is susceptible to password assaults such as shoulder surfing and thermal monitoring.

Purpose

The main objectives of this project are to detect eye blinks in consecutive image frames and generate a PIN based on the selected symbols. To prevent shoulder surfing and thermal tracking assaults, this research shows a real-time application that combines eye blink-based PIN entering and facial identification.

Scope Of The Project

This project's sole objective is to generate the PIN and identify eye blinks in a series of image frames. This project offers a real-time application that integrates facial identification, eye blink-based PIN entry, and shoulder surfing prevention to prevent attacks using thermal imaging and shoulder surfing.

II. ALGORITHM SPECIFICATION

A. Haar cascade classifier

Object detection using Haar features is an effective object discovery technique proposed by Paul Viola and Michael Jones in their paper "Rapid Object Detection using a Boosted Cascade of Simple Features". In this machine-learning technique, the cascade is trained using a large number of both positive and negative photos.
After that, it is used to identify objects in other images. To effectively train the classifier, the approach initially requires a significant amount of positive images (images containing faces) and negative images (images not containing faces). Extraction of features is the next stage, in particular using the Haar features shown in the image beneath.

The current method involves computing multiple features for each kernel using various sizes and positions. To accurately calculate each element, it is necessary to determine the total number of pixels under both the white and black rectangles. To address this, the researchers introduced the integral image in their approach: however large the image is, it reduces the sum over any rectangle to an operation involving only four values. However, many of the computed features are not useful. The feature initially selected appears to emphasize the unique attributes of the eye region, which is typically darker in comparison to other facial areas such as the nose and cheeks. For instance, the image below showcases two prominent features on the top row that exhibit this distinction.

If a window passes one stage of the cascade, the next stage is applied; a window that passes every stage is reported as a face. The detector developed by the authors included 38 stages and more than 6000 features. The first five stages of the detector included 1, 10, 25, 25, and 50 features, respectively. The top two attributes from Adaboost were really the two features depicted in the aforementioned graphic. According to the authors, 10 features, on average, are evaluated for each sub-window out of the total of 6000. This is a simple overview of how the Viola-Jones algorithm for face detection functions. For more in-depth information, it is recommended to read the original paper or investigate the sources referenced within it.
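The integral image mentioned above can be sketched in a few lines: each table entry holds the sum of all pixels above and to the left, so any rectangle sum needs at most four lookups. The array values below are illustrative only:

```python
def integral_image(img):
    """ii[r][c] = sum of img[0..r][0..c] (the integral image)."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r][c] = row_sum + (ii[r - 1][c] if r > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of any rectangle using at most four lookups into the table."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total

img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 9 + 10 = 30
```

A Haar feature is then just the difference of two or three such rectangle sums, which is why the cascade can evaluate thousands of features cheaply.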
B. CNN Model
The most crucial phase in the entire process is the stage where we construct the CNN to which we will feed our features. The CNN is created by combining multiple distinct functions, each of which we will discuss individually. This step is crucial to train and test the model using the appropriate features.
V. IMPLEMENTATION
Localization Of Face
VII. CONCLUSION
REFERENCES
QR as Key implemented in an App using OAuth 2.0
Rahul Antony J
Computer Science and Engineering
Presidency University
Bengaluru, India
rahulantony003@gmail.com

Mohsin Khansab
Computer Science and Engineering
Presidency University
Bengaluru, India
khansabmohsin@gmail.com
Abstract— The recent advancements in QR codes allow every business field to adapt its functionalities. The integration of OAuth 2.0 in applications is now seen widely because of its ease of use and its security feature that does not allow third-party applications to access resources which are out of scope. This article discusses a use case of QR codes and OAuth 2.0: an ease-of-access service provided to users. When a user reserves a room in a hotel, he is given access to an application service that uses OAuth 2.0 authentication; after that, a QR code is generated using a simple logic (the first name of the user and the room number), and this QR code can be used at the door to unlock it. This use case can be implemented in hotels, corporate check-ins, etc. The approach provides a hassle-free check-in to a hotel, or any check-in for that matter.

Keywords: OAuth 2.0, QR codes, IoT door locks, Authentication, Authorization, Mobile applications, User privacy, Security, Access control, User experience, OpenID Connect

I. INTRODUCTION

In the current era, individuals are utilizing modern technologies to perform their daily tasks. One such task that has seen significant growth with advancements in technology is the process of authentication and access control. In recent years, OAuth 2.0 has emerged as the de facto standard for enabling authorization and access control in web applications. Now, there is a growing interest in utilizing OAuth 2.0 for authentication and access control in IoT systems as well. Several research studies have explored the use of OAuth 2.0 in IoT systems for access control and authentication.

However, one challenge in implementing OAuth 2.0 in IoT systems is dealing with constrained devices that have limited computing power and memory capacity. We have developed a solution to this challenge in the form of the QR-Key App with OAuth 2.0.

a) Background and motivation

Modern technologies such as IoT have become an integral part of our lives, and with their integration come challenges related to authentication and access control. To address this challenge, several solutions have been proposed to enable secure authentication and access control in IoT systems. We chose OAuth 2.0 as the base standard for our solution due to its widespread usage in web applications and its ability to enable decoupling between authentication and authorization.

A major motivation for our research is to enable authentication and access control in IoT systems using OAuth 2.0 while keeping the login process simple.

b) Research objectives

The main objective of our research is to develop the QR-Key App with OAuth 2.0 for secure authentication and access control in IoT systems. Specifically, our research objectives are as follows:

● Create a mobile app that uses OAuth for login; the mobile app in turn uses credentials provided by the resource server to create a QR key.
● Implement an OAuth 2.0 protocol server for authentication and access control.
● Build an IoT lock that can be unlocked using QR keys generated by the mobile app.
● Provide a front-desk program to manage and monitor access permissions for users and devices in the IoT system.

c) Proposed Solution

Our proposed solution, the QR-Key App with OAuth 2.0, aims to simplify the process of authentication
and access control in IoT systems while also enhancing their security. The solution involves the use of a mobile app that uses OAuth 2.0 for authentication with the resource server, which in turn generates a QR key that can unlock IoT locks. The OAuth 2.0 protocol server acts as the central gateway for managing access control and authentication requests from the system, while the front-desk program provides an interface for managing permissions and monitoring the IoT lock.

d) Research questions

The research questions for our proposed solution are as follows:
1. How can OAuth 2.0 be effectively integrated into an IoT authentication and access control system?
2. How can a mobile app using OAuth 2.0 be utilized to generate secure QR keys for unlocking IoT locks?
3. How effective is the QR-Key App with OAuth 2.0 in simplifying the process of authentication and access control while enhancing security in IoT systems?

II. LITERATURE REVIEW

● Introduction to OAuth 2.0
OAuth 2.0 is an open standard for authorization that allows users to grant access to their resources on one site (called the "resource server") to another site (called the "client") without sharing their credentials. OAuth 2.0 has gained popularity due to its simplicity, scalability, and support for various authentication protocols. Several studies have evaluated the security and usability of OAuth 2.0 in various contexts, including mobile devices and social media.

● Authentication and Access Control
Authentication and access control are essential components of security in any system. Traditional authentication methods such as passwords, PINs, and tokens are vulnerable to various attacks, including brute-force attacks, phishing, and man-in-the-middle attacks. Several studies have proposed alternative authentication methods, such as biometrics, behavioral authentication, and context-based authentication. However, these methods are not foolproof and have their limitations.

● QR Codes as a Means of Access Control
QR codes are two-dimensional barcodes that can be scanned using a smartphone camera. QR codes are increasingly used as a means of access control in various settings, including event tickets, payments, and loyalty programs. Several studies have evaluated the security and usability of QR codes as a means of access control, and proposed various enhancements, such as encryption and dynamic QR codes.

● Related Works
Several studies have explored the use of OAuth 2.0 and QR codes for access control in various settings, including mobile devices, smart homes, and industrial IoT. For example, a study by NIT Tiruchirappalli (Department of CSE) proposed an OAuth 2.0-based authentication mechanism for IoT devices that uses a unique device identifier and a shared secret. Another study by Lee et al. (2019) proposed a QR code-based access control system for smart homes that uses a dynamic QR code and a one-time password.

III. METHODOLOGY

A) Webpage Registration
1) User Registration: The user registers for the service by providing their email, first name, and last name. The service generates a unique user ID and stores it in the database.
2) OAuth 2.0 Authorization: The user logs in to the service using their OAuth 2.0 credentials. The service verifies the user's identity with the OAuth 2.0 provider and retrieves the user's profile information, including the user ID.
3) Room ID Generation: The user selects a room ID from a list of available rooms. The room ID is stored in the database along with the user ID.

B) QR Code Generation
1) QR Code Creation: The service generates a unique QR code for the user, which is a combination of the user's first name and room ID. The QR code is generated using a QR code generator library.
2) QR Code Display: The QR code is displayed on the user's smartphone screen.

C) Door Lock Control
1) QR Code Scanning: The user positions their smartphone in front of the QR code reader on the door lock. The door lock reads the QR code using its built-in camera.
2) QR Code Verification: The door lock verifies the QR code by checking if it matches the room ID stored in the database for the user. If the QR code is valid, the door lock unlocks.

D) Server for Resources
1) Database Management: The back-end server manages the user and room databases, including user registration, OAuth 2.0 authorization, and room ID generation.
2) API: The back-end server provides an API for managing the resources and controlling access to them. The API is secured using OAuth 2.0 authentication.

E) Python Program for Lock Monitoring
1. Lock Status Monitoring: The front-end program monitors the status of the door lock, including whether it is locked or unlocked.
2. Combination Generation: When the door lock is unlocked, the front-end program generates the
combination of the user's first name and room ID and sends it to the door lock for display on the lock screen.

... easy management of users and room IDs, simplifying the process of adding and removing users from the system.
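The QR-key logic described in sections B and C (payload = first name + room ID, verified by the lock against the reservation database) can be sketched in Python. This is a hypothetical illustration only: the names, payload format, and in-memory "database" are not taken from the paper, whose actual app uses Flutter and a QR generator library.

```python
# Hypothetical sketch of the QR-key logic: the payload combines the user's
# first name and room ID, and the door lock checks it against the
# reservation database. All names and data here are illustrative.

RESERVATIONS = {"101": "Rahul"}  # room ID -> first name of the current guest

def make_qr_payload(first_name, room_id):
    """String that would be encoded into the QR code by the mobile app."""
    return f"{first_name}:{room_id}"

def verify_qr_payload(payload):
    """Door-lock side: unlock only if the scanned payload matches the database."""
    try:
        first_name, room_id = payload.split(":", 1)
    except ValueError:
        return False
    return RESERVATIONS.get(room_id) == first_name

key = make_qr_payload("Rahul", "101")
print(verify_qr_payload(key))        # True  -> unlock
print(verify_qr_payload("Eve:101"))  # False -> stay locked
```

In the real system the payload would be rendered as a QR image and the lookup would hit the back-end database over the OAuth-protected API rather than a local dictionary.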
Model Details

A. Description of the App: The proposed access control system is implemented as a mobile app that allows users to access a door lock in an IoT environment. The app utilizes OAuth 2.0 for authentication and generates a unique QR code for each user that can be scanned by the door lock's camera to grant access. The app also includes a back-end server for resource management and an authentication server for OAuth.

B. Implementation details: The app was implemented using Flutter. The OAuth 2.0 authentication was implemented using the JWT library, and the QR code generation was implemented using the flutter_qr library. The back-end server was implemented using Node.js and Express, and MongoDB is used for the database. To test the app, a local instance of the back-end server and authentication server were set up, and a mock door lock was created using a Raspberry Pi and a camera module.

VII. CONCLUSION

A) Summary of findings: In this research, we proposed an access control system for IoT door locks that utilizes OAuth 2.0 for authentication and generates a unique QR code for each user to grant access. The system includes a mobile app, a back-end server for resource management, and an authentication server for OAuth. Through our implementation and testing, we found that the proposed access control system is a viable solution for accessing a door lock in an IoT environment. The system is user-friendly, efficient, and secure, providing a valuable solution for small to medium-sized organizations that require a cost-effective and easy-to-use access control system. However, further user testing is required to fully evaluate the effectiveness and usability of the system.

B) Contributions and significance of the research: The proposed access control system using OAuth 2.0 and QR codes makes a significant contribution to the field of access control systems for IoT devices. The use of OAuth 2.0 provides a secure authentication process that is widely adopted in the industry, while the use of QR codes provides an easy-to-use and efficient method for granting access to the door lock. The back-end server and authentication server further enhance the system's security, making it a reliable and robust solution for organizations that require a secure access control system.

C) Future work: Future work for this research includes conducting user testing to fully evaluate the effectiveness and usability of the proposed access control system. The user testing results can provide valuable feedback for improving the app's design and functionality, ensuring that the system is user-friendly and effective. Additionally, future work could focus on integrating additional security measures such as biometric authentication or multi-factor authentication to further enhance the system's security. Finally, the proposed access control system can be extended to other IoT devices such as smart homes or industrial machinery, providing a scalable solution for secure access control in various environments.

REFERENCES
1. N. Nasurudeen Ahamed, Karthikeyan P, S. P. Anandaraj, Vignesh R. "Sea Food Supply Chain Management Using Blockchain", 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), 2020.
2. Takamichi Saito. "A privacy-enhanced access control", Systems and Computers in Japan, 05/2006.
3. st.fbk.eu (Internet Source)
4. Submitted to Roehampton University (Student Paper)
5. Submitted to Nanyang Technological University (Student Paper)
6. Submitted to Coventry University (Student Paper)
7. Submitted to The University of the West of Scotland (Student Paper)
8. theses.hal.science (Internet Source)
9. Se-Ra Oh, Jahoon Koo, Young-Gab Kim. "Security interoperability in heterogeneous IoT platforms", Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, 2022.
10. apps.dtic.mil (Internet Source)
11. The Electronic Library, Volume 30, Issue 5 (2012-09-29).
12. Xing Liu, Jiqiang Liu, Wei Wang, Sencun Zhu. "Android single sign-on security: Issues, taxonomy and directions", Future Generation Computer Systems, 2018.
Leaf Disease Detection & Classification Using ML Algorithms
[5] A different approach was taken by Singh et al. (2019), who combined machine learning with Internet of Things (IoT) technology for plant disease detection. They developed a sensor-based system that monitored environmental parameters and used machine learning algorithms to analyze the data for disease identification.

[6] Ghosal et al. (2020) introduced a deep learning-based approach for detecting multiple plant diseases. They employed transfer learning techniques with pre-trained CNN models to leverage large-scale image datasets and achieve high accuracy in disease classification.

[7] Wang et al. (2021) proposed a comprehensive framework for plant disease detection and classification using a combination of image processing, feature extraction, and machine learning algorithms. Their system achieved reliable disease identification results and demonstrated the potential for practical implementation.

A. Objectives
• To investigate the interactions between disease-causing agents and host plants, considering their overall relationship.
• To identify different diseases affecting plants in various environments.
• To develop a methodology for disease prevention and management, aiming to reduce losses and damages caused by diseases.

Scope:
• Prevention of diseases in plants for farmers, assisting them in maintaining healthy crops.
• Collaboration with pesticide companies to predict and provide new pesticide solutions for effective disease control.

B. EXISTING METHODS - DRAWBACKS
1. They require long training times.
2. The learned function is difficult to interpret.
3. A large number of support vectors are used for training in the classification task.
XIX. PROPOSED METHOD
In this proposed algorithm, the main objective is to detect plant diseases by analyzing leaf images. The methodology involves identifying the specific disease affecting the leaf and highlighting the affected region using image processing techniques. The algorithm aims to provide fast and accurate results, indicating the percentage of the affected area. A dataset of leaf images containing different plant diseases such as Alternaria alternata, Bacterial Blight, and Cercospora leaf spot has been collected for evaluation.
Architecture Diagram
Fig. 1. Flow Chart of Proposed Work

A. Contrast Enhancement:
The image's contrast and brightness are adjusted to improve its visibility and distinguishability. This involves scaling the intensity values of the image by a constant factor.

B. Image Segmentation:

The SVM classifier, a supervised learning technique, is used for categorizing data. It works by finding a hyperplane that separates the data points based on the distances between support vectors. SVM is commonly used in applications such as facial expression recognition, speech recognition, and texture classification. It can handle both binary and multiclass classification problems and offers robustness in various scenarios.

By implementing these components in the algorithm, plant diseases can be detected and classified accurately from leaf images. The SVM classifier provides efficient and effective classification, contributing to the overall success of the proposed methodology.
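The contrast-scaling step described above can be sketched as follows. This is a minimal illustration using NumPy; the scaling factor, offset, and pixel values are hypothetical, not taken from the paper:

```python
import numpy as np

def enhance_contrast(image, factor=1.5, offset=0.0):
    """Scale pixel intensities by a constant factor (plus an optional
    brightness offset), clipping back to the valid 8-bit range."""
    out = image.astype(np.float32) * factor + offset
    return np.clip(out, 0, 255).astype(np.uint8)

# Example: a tiny 2x2 grayscale "image"
img = np.array([[50, 100], [150, 200]], dtype=np.uint8)
print(enhance_contrast(img, factor=1.5))   # [[ 75 150] [225 255]]
```

Note that clipping after scaling is what keeps the brightened image inside the displayable 0-255 range.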
Accuracy (%) = 100 × (No. of correctly classified leaves / Total no. of leaves in the dataset)
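As a hedged sketch of the classification-plus-accuracy step, the snippet below trains scikit-learn's SVC on synthetic feature vectors (the paper's actual leaf features and dataset are not reproduced here) and computes accuracy with the formula above:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for leaf feature vectors: two well-separated classes
X = np.vstack([rng.normal(0, 0.3, (40, 4)), rng.normal(2, 0.3, (40, 4))])
y = np.array([0] * 40 + [1] * 40)       # 0 = healthy, 1 = diseased

clf = SVC(kernel="rbf").fit(X, y)       # hyperplane-based supervised classifier
pred = clf.predict(X)

# Accuracy (%) = 100 * correctly classified / total leaves in dataset
accuracy = 100.0 * (pred == y).sum() / len(y)
print(f"Accuracy: {accuracy:.2f}%")
```

In practice the accuracy would be reported on a held-out test split rather than the training data shown here.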
Fig. 4. Flow chart of GLCM
Fig. 6. Accuracy on Proposed Work (per-class accuracy chart; classes: healthy leaf, cercospora, bacterial spot, curl virus, late blight, scab spot; reported accuracies range from 73.98% to 100%, including 100, 100, 99.55, 93.77, 90.31 and 73.98%)

V. CONCLUSION

This work gives an efficient and accurate plant disease detection and classification technique using image processing. CNN and image processing techniques are used for plant leaf disease detection. This automated system reduces detection time and labour cost. It can help farmers to diagnose the disease and take remedial action accordingly. In future work, we will extend our database for more leaf disease identification.

ACKNOWLEDGMENT

REFERENCES

… "using Image Processing and Genetic Algorithm", 205, ICACEA, India.
[3]. Sujatha R., Y. Sravan Kumar and Garine Uma Akhil, "Leaf Disease Detection using Image Processing", Journal of Chemical and Pharmaceutical Sciences, March 2017, pp. 670–672.
[4]. Gautam Kaushal, Rajni Bala, "GLCM and KNN Based Algorithm for Plant Disease Detection", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 6, Issue 7, July 2017, pp. 5845–5852.
[5]. Mrunalani R. Badnakhe, Prashant R. Deshmukh, "Infected Leaf Analysis and Comparison by OTSU Threshold and K-Means Clustering", International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 2, Issue 3, March 2012.
[6]. Abdolvahab Ehsanirad, Sharath Kumar Y.H., "Leaf Recognition for Plant Classification Using GLCM and PCA Methods", Oriental Journal of Computer Science & Technology, Vol. 3(1), 2010, pp. 31–36.
[7]. Namrata K.P., Nikitha S., Saira Banu B., Wajiha Khanum, Prasanna Kulkarni, "Leaf Based Disease Detection using GLCM and SVM", International Journal of Science, Engineering and Technology, 2017.
[8]. Vijai Singh, A.K. Misra, "Detection of Plant Leaf Diseases using Image Segmentation and Soft Computing Techniques", Information Processing in Agriculture 4 (2017), pp. 41–49.
Precision Agriculture And Crop Suggestion System Using AI And ML
Ms. Chandrakala H L, School of Computer Science and Engineering, Presidency University, Bangalore, India, chandrakala.hl@presidencyuniversity.in
Leander Nathan, School of Computer Science and Engineering, Presidency University, Bangalore, India, 201910101286@presidencyuniversity.in
Kavya Sharma, School of Computer Science and Engineering, Presidency University, Bangalore, India, 201910100777@presidencyuniversity.in
Abstract: Farming and agriculture have been the backbone of our country and a major source of livelihood for a huge chunk of the population, especially in the rural sector. However, a tremendous problem exists due to the unorganized ways of farmers, who do not make calculative decisions based on climate, soil, demand, and supply requirements. We therefore propose an interactive solution using precision agriculture: the use of modern techniques built on Artificial Intelligence (AI) and Machine Learning (ML) models. Using machine learning algorithms like Random Forest, KNN or SVM, we can choose the most profitable crop list.

Keywords: Precision agriculture, profitable crop.

I. INTRODUCTION

In India, agriculture is the primary source of income for 70% of rural households and plays a significant role in the nation's economy. It is one of the major industries that contributes significantly to the country's GDP. GDP from agriculture in India increased to 6934.75 billion INR in the fourth quarter of 2022 from 4297.55 billion INR in the third quarter of 2022. It is estimated that India's agriculture sector accounts for only around 14 percent of the country's economy but for 42 percent of total employment. As the technology curve starts to peak in the 21st century, the necessity to revolutionize the agriculture industry in India with the use of AI and ML arises. With the fourth industrial revolution, technology has drastically evolved, offering a wide variety of methods and tools to increase crop productivity and improve weather prediction and recommendation systems. AI/ML can be used to correctly predict the weather at a local level, create guidance modules for farmers to use sustainable techniques to help manage pests through ecology, and design AI for demand prediction based on available stocks, exports, and local needs.

However, building solutions that are affordable, locally viable, and easily accessible
is necessary, since the majority of farmers are dependent on others for the produce of their land and lack skilled labour. Although AI-powered harvesting robots, driverless tractors, and crop monitoring using image processing exist [1], they are far from affordable for farmers with small landholdings. Nearly 65-70% of Indian farmers have small to marginal landholdings [2], and due to a lack of skilled labour, these tools may turn out to be hard to use.

However, using available parameters such as soil requirements, temperature, rainfall, and available data, it is possible to build a crop recommendation system that can accurately predict what crop will be feasible for profitable growth [3]. To achieve a good harvest, certain soil parameters, such as humidity, temperature, soil pH, sunlight, and soil moisture levels, must be satisfied. These are fed into the model as datasets collected from verified statistical surveys and government domains. The initial datasets can be used to train the crop recommendation model to achieve better accuracy. KNN, Random Forest, Decision Trees, Logistic Regression, Naïve Bayes and Support Vector Machine are some of the algorithms that can be used to select the best crop type.

II. RELATED WORK

Agriculture has been an integral part of India as a source of livelihood and dependency for most rural communities, which makes it important for our farmers to have access to the technology provided; this will not only help them increase their profit but also accurately guide them on what crop should be grown.

The current objective is to use a crop recommendation system that uses multiple parameters. As observed, the research papers surveyed use different parameters and draw conclusions using an algorithm best suited to those particular parameters. But, due to the lack of multiple parameters that have not been included, we have, through thorough research and compilation, generated a dataset that satisfies the objectives and has been run through multiple algorithms to draw an accurate conclusion. An important factor is to include market and consumer trends in the recommendation, apart from also taking in soil parameters [4].

Rohit Kumar Rajak et al. [5] use pH, depth, water-holding capacity, drainage and erosion as their set of parameters to derive their desired results. Thus, we see that including multiple classifiers helps increase the accuracy and robustness of the model.

Deepti Dighe et al. [6] have explored the use of multiple algorithms, including KNN, K-means, LAD, CHAID, Neural Networks and Naive Bayes, that were used to generate rules for the crop recommendation. Apart from the general parameters, they also made sure to include temperature, regional weather and month of cultivation.

Abhinav Sharma et al. [7] emphasize how ML and IoT are used in each cycle of smart agriculture, as well as their benefits, drawbacks and potential future developments. Their paper focuses on the inclusion of soil parameters such as organic carbon and moisture content, disease and weed detection on crops, and species detection. Methods included were Artificial Neural Networks (MLP NN), ELM-based regression, KNN and Random Forest.

Elumalai Kannan et al. project the growth performance of major crops at the national level. They present data on the compound annual growth rates of area, production and yield of major crops in India. The study includes trends and patterns in the development of the nation's crop sector and a projected agricultural output growth model across India. These parameters help us better train the model and assist farmers to practice efficient farming and stay flexible with market prices.

III. METHODOLOGY

After gathering the data, the next step is to preprocess it before training the model. Data preprocessing may be done in a variety of ways, beginning with reading the collected dataset and progressing through data purification [8]. Some dataset properties are redundant and are not taken into consideration for cropping; as a result, undesirable attributes and records with incomplete data must be removed. For greater accuracy, we must either drop these missing values or fill them.

B. Feature Selection

The features are only weakly correlated with each other; therefore it makes sense not to eliminate any of them, and we will use all of them when predicting the sort of crop to produce.

Fig. 1 Correlation Matrix for crop recommendation dataset
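The drop-or-fill handling of missing values described in the methodology might look like this in pandas. The column names and records here are illustrative, based on the parameters the paper lists, not the actual dataset:

```python
import numpy as np
import pandas as pd

# Illustrative raw records with gaps, mimicking a crop dataset
df = pd.DataFrame({
    "temperature": [25.1, np.nan, 31.4, 28.0],
    "humidity":    [80.0, 65.0, np.nan, 72.0],
    "soil_ph":     [6.5, 7.0, 6.8, np.nan],
    "label":       ["rice", "maize", "rice", None],
})

# Rows with no crop label are useless for supervised training: drop them
df = df.dropna(subset=["label"])

# Numeric gaps can instead be filled, e.g. with the column mean
num_cols = ["temperature", "humidity", "soil_ph"]
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

print(df.isna().sum().sum())   # 0 missing values remain
```

Whether to drop or impute depends on how much data would be lost; dropping whole rows is safest for missing labels, while mean-imputation preserves rows with only sparse numeric gaps.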
C. Machine Learning Algorithm

Prediction methods based on machine learning require exceptionally precise estimation based on previously learned data. The application of historical data, statistical methodologies, and machine learning technologies to estimate future results is known as predictive analytics. The objective is to provide the greatest possible solution and a prediction of what will happen next, rather than merely knowing what happened.

Naive Bayes, Decision Tree, Logistic Regression, KNN and Random Forest are used in the crop recommendation models.

1) K-Nearest Neighbor: KNN is a sort of supervised machine learning that may be used for a variety of problems; classification and regression are two instances of problems that it can solve. The symbol K represents the number of nearest neighbors to a newly forecasted unknown variable. The Euclidean distance formula is used to compute the distance between the data points [9]:

Euclidean distance between A and B = √((x₂ − x₁)² + (y₂ − y₁)²)

2) Random Forest: Random Forest is a method of ensemble learning that generates a large number of different models to tackle classification, regression, and other problems. Decision trees are utilized during training: the random forest algorithm generates decision trees based on numerous data samples, predicts data from each subset, and then votes on it to provide the system with a better option. For data training, RF employs the bagging approach, which increases the outcome's accuracy [10].

Gini Index = 1 − Σᵢ (Pᵢ)² = 1 − [(P₊)² + (P₋)²]

3) Naive Bayes: The theorem used to develop a basic probabilistic classifier is known as Naive Bayes. Naive Bayes classifiers assume that the value of one feature is independent of the value of any other feature given the class variable [11].

P(A|B) = (P(B|A) × P(A)) / P(B)

4) Decision Tree: For classification and regression, Decision Trees (DTs) are part of supervised learning. A tree representation is utilized, with each leaf node representing a class label and the interior nodes representing attributes.

Entropy: H(S) = −Σ Pᵢ(S) log₂ Pᵢ(S)
Information Gain: IG(S, A) = H(S) − Σ_{v ∈ Values(A)} (|Sᵥ| / |S|) H(Sᵥ)

5) Logistic Regression: This is one of the most basic machine learning algorithms and is employed in the solution of classification problems. It uses a sigmoid function to determine the likelihood of an observation, and the observation is then assigned to the appropriate class. A threshold value is chosen; observations with probabilities above the threshold are assigned the value 1, while those with probabilities below the threshold are assigned the value 0.
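The formulas above can be checked numerically. A small pure-Python sketch of the Gini index, entropy, and sigmoid used by the respective classifiers:

```python
import math

def gini_index(probs):
    """Gini Index = 1 - sum(p_i^2) over the class probabilities."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """H(S) = -sum(p_i * log2(p_i)); zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def sigmoid(x):
    """Logistic function P = 1 / (1 + e^-x), as in logistic regression."""
    return 1.0 / (1.0 + math.exp(-x))

# A perfectly balanced binary split is maximally impure:
print(gini_index([0.5, 0.5]))   # 0.5
print(entropy([0.5, 0.5]))      # 1.0
# A pure node has zero impurity, and the sigmoid is 0.5 at x = 0:
print(gini_index([1.0, 0.0]))   # 0.0
print(sigmoid(0.0))             # 0.5
```

Decision trees pick the split that most reduces impurity (entropy or Gini), which is exactly what the information-gain formula above measures.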
P = 1 / (1 + e^(−(a + bX)))

D. Crop Recommendation:
The model will propose the best crop to grow on the given soil based on the N, P, K values, temperature, humidity, and pH.

E. Performance Analysis:

IV. RESULT

This concludes that, to provide farmers with a simple, portable solution, a model produced by machine learning using the random forest classifier, with 99.32% accuracy, is the best option. It calculates the optimal crop to plant based on many factors. Through the crop recommendation system, individuals will be able to make better decisions while sustaining crop and soil quality.
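A sketch of the crop-recommendation step on the parameters named above (N, P, K, temperature, humidity, pH), using scikit-learn's random forest. The feature distributions and crop labels below are synthetic stand-ins, not the paper's actual survey dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic (N, P, K, temperature, humidity, pH) rows for two crops
rice  = rng.normal([80, 45, 40, 27, 82, 6.5], 2.0, (50, 6))
maize = rng.normal([70, 50, 20, 22, 65, 6.0], 2.0, (50, 6))
X = np.vstack([rice, maize])
y = ["rice"] * 50 + ["maize"] * 50

# Bagged ensemble of decision trees, voting on the predicted crop
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Recommend a crop for one new soil/weather reading
sample = [[79, 44, 41, 26, 80, 6.4]]
print(model.predict(sample)[0])   # rice
```

On a real dataset, the reported 99.32% accuracy would be estimated on a held-out test split rather than on handcrafted clusters like these.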
[5] Rohit Kumar Rajak, Ankit Pawar, Mitalee Pendke, Pooja Shinde, Suresh Rathod, Avinash Devare, "Crop Recommendation System to Maximize Crop Yield using Machine Learning Technique", Vol. 4, Issue 12, 12/2017.
[10] Ali, Jehad; Khan, Rehanullah; Ahmad, Nasir; Maqsood, Imran (2012), "Random Forests and Decision Trees", International Journal of Computer Science Issues (IJCSI), 9.
Rental Battery Management System
Abstract - The many advantages of electric bicycles (e-bikes), such as shortened commute times, environmental sustainability, and health benefits, make them increasingly popular. However, managing battery life is one of the major difficulties experienced by e-bike users. In this research article, we look into the possibility of using e-bike rental batteries to address this problem. We assess the current status of e-bike rental battery services and battery technology, weigh the benefits and drawbacks of rental batteries, and investigate whether rental battery systems can be implemented in various locales. According to our research, renting out e-bike batteries can help promote the use of e-bikes as an affordable and environmentally friendly form of transportation while also addressing the issue of battery management.

Keywords—environmental sustainability, E-bike, rental battery, battery technology, affordable

I. INTRODUCTION

Electric bicycles, or e-bikes, have arisen as a popular alternative to traditional bicycles and vehicles for commuting and mobility. E-bikes provide several benefits, such as shorter commute times, increased environmental sustainability, and positive health effects. However, since e-bikes run on batteries, maintaining battery life can be difficult for e-bike users. The usage of rental batteries for e-bikes is one remedy for this problem, and in this research article we look into that possibility.

A. Abbreviations and Acronyms
• E-bikes – Electric Bikes
• LIBs – Lithium-ion Batteries

II. CURRENT STATE OF E-BIKE RENTAL SERVICE

E-bike rental services are becoming increasingly well liked as a convenient and economical mode of mobility. They are frequently found in metropolitan areas and provide riders with a practical method to travel across cities. E-bike rental services may be split between docked and dockless systems: docked systems require customers to pick up and return their bikes at specific docking stations, whereas dockless systems enable users to rent and return bikes from any location within a specified service area.
B. Battery Technology

Battery technology for e-bikes has substantially advanced with the invention of lithium-ion batteries (LIBs). LIBs are portable, have a long lifespan, and are easily rechargeable. Even though e-bikes are now more dependable and effective thanks to battery technology developments, battery management is still a problem for e-bike users.

C. Advantages and Disadvantages of Rental Batteries

E-bike rental batteries have several benefits over conventional batteries. First, renting batteries removes the inconvenient and time-consuming necessity for e-bike owners to maintain and charge their batteries. Second, renting batteries can be an affordable option for e-bike users who cannot afford to buy a brand-new battery or replace an old one. Third, since rental batteries are simply exchangeable, customers do not have to worry about running out of battery life when traveling for an extended period.

However, renting batteries has certain drawbacks as well. First, rental batteries might not be easily accessible in all locations, which would prevent certain users from using them. Second, renting a battery can be more expensive in the long run than buying and maintaining one. Finally, temporary batteries might not be appropriate for customers who need a consistent and dependable power supply or for long-distance journeys.

III. FEASIBILITY OF RENTAL BATTERY SYSTEMS

The viability of adopting rental batteries for e-bikes depends on several variables, including the cost of renting batteries, the availability of rental services, and the demand for e-bikes in a certain location. In metropolitan regions with large population densities and a strong demand for e-bike transportation, rental battery systems are more likely to be successful. Additionally, consumers who only make short-distance journeys, or who do not need a consistent and dependable power supply, may find rental battery systems more practical.

A. Architecture
1) The system's main component, the battery management system, controls the batteries using both hardware and software. It performs tasks including keeping track of the battery's condition, managing the frequency of charging and discharging, and controlling voltage and current levels.
2) Batteries are needed for the rental battery system so that clients may hire out a set of batteries. Depending on what the clients need, the batteries come in a variety of sizes and capacities.
3) Infrastructure for charging batteries: The batteries' ability to be charged and made ready for rental depends on the charging infrastructure. This can include charging devices like chargers, cables, and adapters.
4) Program for rental management: A program for rental management is needed to oversee the renting of batteries. Battery inventory management, rental history tracking, and the creation of invoices and receipts are all possible with this program.
5) User interface: To allow users to rent and return batteries, a user interface is necessary. Customers may use this, which can be a mobile or web-based application, to look for available batteries, make a reservation, and start the renting process.
6) Security and monitoring: To avoid theft, damage, or improper usage of the batteries, the rental battery system has to be safe and under constant observation. This can include GPS tracking devices, alerts, and security cameras.
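The inventory-tracking and rental-management components sketched in the architecture (items 1, 2 and 4) could look like the following. The class and method names are our own illustration, not a specification from the paper; the 80% readiness threshold is an assumed policy:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Battery:
    battery_id: str
    capacity_wh: int                    # rental batteries come in several capacities
    charge_pct: float = 100.0
    rented_to: Optional[str] = None

@dataclass
class RentalStation:
    """Tracks inventory, rental history, and readiness of batteries."""
    batteries: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def add(self, b: Battery) -> None:
        self.batteries[b.battery_id] = b

    def available(self, min_charge: float = 80.0) -> list:
        # Only sufficiently charged, un-rented batteries are offered
        return [b for b in self.batteries.values()
                if b.rented_to is None and b.charge_pct >= min_charge]

    def rent(self, battery_id: str, customer: str) -> None:
        b = self.batteries[battery_id]
        if b.rented_to is not None:
            raise ValueError("battery already rented")
        b.rented_to = customer
        self.history.append(("rent", battery_id, customer))

    def return_battery(self, battery_id: str, charge_pct: float) -> None:
        b = self.batteries[battery_id]
        self.history.append(("return", battery_id, b.rented_to))
        b.rented_to, b.charge_pct = None, charge_pct

station = RentalStation()
station.add(Battery("B1", 500))
station.add(Battery("B2", 750, charge_pct=40.0))  # still charging
station.rent("B1", "alice")
print(len(station.available()))   # 0: B1 is rented, B2 is not charged enough
```

A production system would back this state with a database and tie it to the invoicing, user-interface, and monitoring components listed above.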
Fig 1. Architecture Diagram

CONCLUSION

Rental batteries for e-bikes may be able to help e-bike owners handle one of their main problems: managing battery life. Battery technology for e-bikes has substantially improved with the invention of lithium-ion batteries, yet battery management is still a problem for e-bike users.

Renting batteries can be a viable, affordable method of managing batteries and can promote the use of e-bikes as a transportation alternative.

ACKNOWLEDGMENT

This study report on Rental Batteries for E-bikes has been completed, and we would like to thank everyone who helped. First and foremost, we would like to express our gratitude to our academic advisor, whose advice and assistance were crucial during the whole study process.

We also want to express our gratitude to the subject-matter experts who shared their knowledge and comments with us, helping us to refine our study. Their expertise and knowledge significantly improved our comprehension of the subject and gave us fresh viewpoints.

REFERENCES

[1]. Supriya M, Sangeetha V S, Subhasini A and Vaishnav M, "Retraction: Mobile Application in Rental Batteries for Electronic Vehicles", ICCCEBS 2021, Journal of Physics: Conference Series.
[2]. Molla Shahadat Hossain Lipu, Md. Sazal Miah, Shaheer Ansari, Safat B. Wali, Taskin Jamal, Rajvikram Madurai Elavarasan, Sachin Kumar, M. M. Naushad Ali, Mahidur R. Sarker, A. Aljanad and Nadia M. L. Tan, "Smart Battery Management Technology in Electric Vehicle Applications: Analytical and Technical Assessment toward Emerging Future Directions", Batteries 2022, 8, 219. https://doi.org/10.3390/batteries8110219
[3]. Hayder Ali, Hassan Abbas Khan and Michael G. Pecht, "Evaluation of Li-Based Battery Current, Voltage, and Temperature Profiles for In-Service Mobile Phones", IEEE, 2020.
[4]. Luiz Eduardo Cotta Monteiro, Hugo Miguel Varela Repolho, Rodrigo Flora Calili, Daniel Ramos Louzada, Rafael Saadi Dantas Teixeira and Rodrigo Santos Vieira, "Optimization of a Mobile Energy Storage Network", Energies 2022, 15, 186. https://doi.org/10.3390/en15010186
[5]. Kevin Hendersen, Novando Santosa, Sally Septia Halim, Aswin Wibisurya, "Mobile-Based Application Development For Car And Motor Rentals", Journal of Critical Reviews, ISSN 2394-5125, Vol. 7, Issue 8, 2020.
[6]. Lagadec M F, Zahn R, Wood V, "Characterization and performance evaluation of Li-ion battery separators", Nat. Energy 2019.
[7]. Lipu, M.H.; Hannan, M.; Karim, T.F.; Hussain, A.; Saad, M.H.M.; Ayob, A.; Miah, S.; Mahlia, T.I., "Intelligent algorithms and control strategies for battery management system in electric vehicles: Progress, challenges and future outlook", J. Clean. Prod. 2021, 292, 126044.
Index-Based Search Enabler For Movie NoSQL DBs Using MongoDB
Sumanth Kumar Mohapatra, Department of Information Science and Engineering, Presidency University, Bengaluru, India, 201910100549@presidencyuniversity.in
Yenimetla Venkata Krishna Chaitanya, Department of Computer Engineering, Presidency University, Bengaluru, India, 201910101153@presidencyuniversity.in
B Bharath Kumar Reddy, Department of Computer Science and Engineering, Presidency University, Bengaluru, India, 201910100917@presidencyuniversity.in
Ajith Kumar M, Department of Computer Science and Engineering, Presidency University, Bengaluru, India, 201910100285@presidencyuniversity.in
search functionality. To enhance query performance, this can entail adjusting the indexes or reassessing the data model.

• Adapt And Overcome: Last but not least, it is critical to keep an eye on and maintain the search functionality to make sure it keeps up with consumer demands. As the requirements change over time, this can entail expanding the data model with new indexes or fields.

IV. IMPLEMENTATION

• Data Preparation And Gathering: Start by importing the database into the MongoDB shell. Compile information on movies and the details related to them, such as cast, crew, release dates, genres, ratings, and reviews.

• Prior Testing: Try using the "find()" method without implementing any index, and use "explain("executionStats")" to evaluate its efficiency. Let us use the find query shown in Figure 1. Based on the report, we can see that it used a collection scan, which tells us that the search operation scanned documents that were not related to the filter; from this we can also conclude that time will be lost, especially while processing a large corpus of data, as shown in Figure 2.

• Design Of Indexes: Create indexes for the movie database's fields that are frequently searched for and retrieved, like movie title, director, and actor names. Create indexes on these fields using MongoDB's "createIndex()" method. (Note: we can also merge fields and create a compound index.) Let us make an index based on the previous filter query, as shown in Figure 3. As people commonly prefer well-rated movies, we can sort the rating in decreasing order ("-1").

Figure 3: Creating an index on a field (here it is based on the average votes)
Figure 4: Running the previous search filter, but with the presence of an index (compare the "executionTimeMillis" with Figure 2)

• Implementing Search Functionality As Well As Performing A Performance Evaluation: Based upon the previous search query, we now again try the
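The collection-scan versus index trade-off measured above with explain("executionStats") can be mimicked in miniature: a linear scan touches every document, while a field index (here just a Python dict, as a rough analogy for MongoDB's B-tree index built by createIndex()) jumps straight to the matching documents. The sample documents are invented for illustration:

```python
from collections import defaultdict

movies = [
    {"title": "Movie A", "director": "Lee", "avg_votes": 8.1},
    {"title": "Movie B", "director": "Rao", "avg_votes": 6.4},
    {"title": "Movie C", "director": "Lee", "avg_votes": 7.7},
]

# Collection scan: every document is examined for each query
def find_scan(coll, field, value):
    return [d for d in coll if d[field] == value]

# Build a simple index on "director" (analogous to createIndex())
index = defaultdict(list)
for doc in movies:
    index[doc["director"]].append(doc)

# Indexed lookup examines only the matching bucket
def find_indexed(idx, value):
    return idx.get(value, [])

assert find_scan(movies, "director", "Lee") == find_indexed(index, "Lee")
print(len(find_indexed(index, "Lee")))   # 2
```

The scan costs time proportional to the whole collection on every query, while the indexed lookup pays the build cost once; that is the difference the "executionTimeMillis" comparison between Figures 2 and 4 exposes.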
MACHINE LEARNING ALGORITHM FOR STROKE DISEASE CLASSIFICATION AND ALERT SYSTEM
DEEKSHITHA L, School of Information Science and Engineering, Presidency University, Bengaluru, Karnataka, India, deekshithal2002@gmail.com
Dr. SAMPATH A K, Associate Professor, Presidency University, Bengaluru, Karnataka, India, sampath.ak@presidencyuniversity.in
For both classification and regression issues, decision trees, a popular machine learning method, are used. They work by splitting the data into subsets based on the values of independent variables and then building a decision tree from the resulting subsets.

In Random Forest, an ensemble learning method, many decision trees are joined to improve the robustness and precision of the model. It is very useful when dealing with noisy or complex datasets.

Machine learning algorithms show promise in diagnosing and classifying strokes, enabling faster and more accurate decision-making. Using various well-known machine learning techniques, including Logistic Regression, Random Forest, SVM, and Decision Tree, we provide a novel method for classifying stroke illness in this study. The proposed model can efficiently and accurately classify stroke disease, enabling faster and more accurate decision-making for medical professionals.

Furthermore, our model includes a user-friendly graphical user interface (GUI) that can be used to alert patients about their stroke status via email. The GUI presents the results of the stroke classification to patients in a clear and concise manner, allowing them to take appropriate actions and seek medical attention if necessary.

By using machine learning algorithms to classify stroke disease, our proposed system offers several advantages over traditional methods. These include increased accuracy, faster diagnosis, and reduced costs associated with stroke treatment and care.

Overall, this research presents a significant contribution to the field of stroke diagnosis and classification, providing a powerful tool for medical professionals to make informed decisions and improve patient outcomes.

I. EXISTING WORK

The application of machine learning algorithms for stroke disease classification and early detection has been studied in a number of published papers. In one such study, D. D. Kim et al., with encouraging findings, used a machine learning model to predict the occurrence of stroke, classifying stroke risk factors using the Support Vector Machine (SVM) technique. Similarly, S. K. Roy et al. created an automated method for diagnosing strokes utilizing a variety of machine learning algorithms, such as SVM, Decision Tree, and Random Forest, with the Random Forest approach achieving the highest accuracy. A machine learning-based alert system for early stroke detection was also created by Y. Han et al., and it was successful in detecting stroke at an early stage.

The study titled "Classification of stroke disease using machine learning algorithms" is one of the other studies currently available on the topic; it proposes a prototype for classifying stroke using text mining tools and machine learning techniques. The work titled "Using machine learning models to improve stroke risk level classification" uses data from the 2017 National Stroke Screening Program to create models for stroke risk classification using machine learning techniques. In addition, a stroke prediction system that employs artificial intelligence to detect stroke using real-time bio-signals is proposed in "AI-Based Stroke Disease Prediction System Utilizing Real-Time Bio-Signals." These pieces can be incorporated into the research paper to provide a thorough analysis of the body of work on utilizing machine learning algorithms to classify stroke diseases.

Machine learning techniques are increasingly being used in research on stroke illness classification and early diagnosis. A machine learning model was created in one such study by M. Asadi et al. to predict the occurrence of stroke using various imaging biomarkers, attaining an accuracy of 86.5%. A Convolutional Neural Network (CNN) was used in a different study by G. Lee et al. to detect ischemic stroke in computed tomography (CT) images with an accuracy of 96.7%. In terms of early detection, D. Kim et al.'s study created an early warning system for stroke using machine learning algorithms, which was able to predict the onset of stroke with an accuracy of 96.5%. Similarly, A. T. M. Faisal et al. created a smart system for early stroke.

II. DATASET PREPARATION

The dataset includes 12 stroke prediction-related factors and 23,036 observations. This dataset was taken from Kaggle, and the variables it consists of are:

1. id: unique identifier for each observation
2. gender: gender of the patient (male or female)
3. age: age of the patient (in years)
4. hypertension: binary variable indicating whether the patient has hypertension (1 = yes, 0 = no)
5. heart_disease: binary variable indicating whether the patient has heart disease (1 = yes, 0 = no)
6. ever_married: binary variable indicating whether the patient has ever been married (Yes or No)
7. work_type: type of work of the patient (Private, Self-employed, Govt_job, Never_worked)
8. Residence_type: type of residence of the patient (Urban or Rural)
9. avg_glucose_level: average glucose level in the patient's blood (in mg/dL)
10. bmi: body mass index of the patient (in kg/m^2)
11. smoking_status: smoking status of the patient (formerly smoked, never smoked, smokes, Unknown)
12. stroke: binary variable indicating whether the patient had a stroke (1 = yes, 0 = no)

Several procedures would be involved in preparing this dataset for the study, including:
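The cleaning and encoding steps for the fields listed above might look like this in pandas. A tiny hand-made sample stands in for the actual Kaggle dataset, and mean-imputation plus simple categorical encoding are just one of the options the paper mentions:

```python
import numpy as np
import pandas as pd

# Miniature stand-in using the stroke dataset's column names
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "gender": ["Male", "Female", "Female", "Male"],
    "age": [67, 54, 80, 49],
    "hypertension": [0, 1, 0, 0],
    "heart_disease": [1, 0, 0, 0],
    "avg_glucose_level": [228.7, 105.9, np.nan, 171.2],
    "bmi": [36.6, np.nan, 32.5, 34.4],
    "smoking_status": ["formerly smoked", "never smoked", "Unknown", "smokes"],
    "stroke": [1, 0, 1, 0],
})

# Impute numeric gaps with the column mean (one cleaning option)
for col in ["avg_glucose_level", "bmi"]:
    df[col] = df[col].fillna(df[col].mean())

# Encode categorical fields numerically for the classifiers
df["gender"] = df["gender"].map({"Male": 0, "Female": 1})
df["smoking_status"] = df["smoking_status"].astype("category").cat.codes

print(df.isna().sum().sum())   # 0
```

The alternative to imputation, dropping the affected rows, trades dataset size for avoiding the bias a filled-in value can introduce.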
1) Data cleaning entails identifying any incorrect or missing data and determining how to deal with it (for example, impute missing values or remove observations with missing data). It could also entail looking for outliers and making a decision about how to deal with them.

2) Data transformation could entail scaling, normalizing, or establishing new variables based on existing ones in order to make the data more analytically useful.

5) Model training and evaluation: Using the training set as the basis, several machine learning models might be developed, assessed, and their performance compared with that of the testing set.

6) Reporting the findings: The study paper would present the analysis' findings, along with any conclusions and suggestions based on them. To guarantee that the right credit is given to the source of the data, the dataset would also need to be correctly referenced in the study.

In conclusion, there are several critical steps in the dataset preparation process for this dataset on stroke prediction, including data cleaning, transformation, feature selection, data splitting, model training and evaluation, and reporting of the results. The dataset must be prepared correctly in order to produce accurate and trustworthy results and to guarantee the validity of any conclusions or suggestions made as a result of the study.

III. ALGORITHM DETAILS

The algorithm details of the project are as follows:

1) Data Gathering and Preparation:
Gather data on the prevalence of stroke and the factors that may contribute to it, such as age, gender, smoking status, blood pressure, cholesterol level, etc. As part of the preprocessing, missing values are removed from the data, it is scaled and normalized, and categorical variables are transformed into numerical values.

2) Selection and Extraction of Features:
Determine which features are most crucial for the purpose of classifying strokes by using feature selection algorithms. Extract characteristics like age, gender, smoking status, blood pressure, cholesterol level, etc. from the preprocessed data.

3) Model Training:
Separate the training and test sets from the preprocessed data. Train the Decision Tree, Support Vector Machine (SVM), Random Forest, and Logistic Regression machine learning models. Each model's hyper-parameters should be tuned to increase performance.

4) Model Assessment:
Utilize criteria like accuracy, precision, recall, F1-score, and AUC-ROC to assess each model's performance. Choose the model with the best performance after comparing the four.

5) Validation and Testing:
Utilizing both simulated and actual stroke data, test the alert system. Use measurements like sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) to verify the alert system's performance.

IV. PROPOSED WORK

Medical emergencies like strokes can cause instantaneous death. Machine learning methods can be very helpful in preventing or minimizing the damage caused by this condition by detecting stroke early. In the proposed work, we attempt to forecast the incidence of stroke based on the existing causal factors using four different machine learning techniques: Decision Tree, Support Vector Machine (SVM), Random Forest, and Logistic Regression. The data is first cleaned and preprocessed, and then it is visualized using several graphs to reveal information about the dataset. We next use the prepared data to train the machine learning models, and we employ a graphical user interface (GUI) program to predict stroke for fresh input values.

The benefits of this planned effort include the potential for early stroke identification, which can stop a stroke from happening or lessen its severity, thereby improving patient outcomes. Additionally, once a patient is determined to be at risk for a stroke, drugs can be recommended and promptly given to limit the possibility of harm taking place.

The proposed technique, including the dataset preparation, machine learning algorithms employed, and the GUI application for prediction, will be thoroughly described in the study article. We will also explore the possible ramifications of our findings and show the study's outcomes, including accuracy and precision measures. Furthermore, we will contrast our strategy with other efforts on stroke detection, and discuss the drawbacks and potential future directions of our proposed system. Overall, the proposed effort has the potential to help create a system for early stroke identification that is more precise and successful, which would improve patient outcomes and lessen the burden of stroke on healthcare systems.

The proposed effort may also aid in the creation of a stroke detection system that is both affordable and effective. We can lessen the reliance on expensive diagnostic techniques, like MRI scans or CT scans, which are frequently unavailable to people in low-resource settings, by utilizing machine learning algorithms to identify stroke risk factors.

2. Training Dataset:
- Split the preprocessed data into a training dataset and a test dataset.
- The training dataset will be used to train the machine learning model.

3. Test Dataset:
- The test dataset will be used to evaluate the performance of the trained model.
REFERENCES
Network Intrusion Detection System with PCA
Using Machine Learning Classifiers
1st Peddi Prashanth Kumar, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101035@presidencyuniversity.in
2nd Priyanka N, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101360@presidencyuniversity.in
3rd Pathakamuri Bharath Kumar, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910100969@presidencyuniversity.in
Abstract—Utilising machine learning techniques, the major goal of this research is to find any network intrusions in any network system. In order to automatically identify attacks on computer networks and systems, we create a Network Intrusion Detection System (NIDS). This system makes use of a variety of machine learning techniques. Principal component analysis (PCA) is used in conjunction with several classification methods, including Support Vector Machines, Random Forest, and XgBoost, to construct an effective NIDS. An intrusion detection system's job is to find attacks. However, in order to lessen the severity of attacks, it is also crucial to identify them quickly.

Index Terms—Intrusion Detection System, Network Anomaly Detection, Features Selection, Dimensionality Reduction, NSL-KDD, Swarm Intelligence

I. INTRODUCTION

The evolution of telecommunications networks in the twenty-first century has moved swiftly away from circuit and packet switched networks and towards all-IP based networks. This progress has produced a unified environment where IP-based voice and data connectivity across apps and services is possible. Although communication network expansion has improved the sustainability of technologies, it has also opened up new unwelcome possibilities. The radio access networks are now susceptible to threats that were previously only applicable to fixed networks. The need for more intelligent security systems arises from the fact that threats are evolving to become more sophisticated.

Basic security measures like firewalls and antivirus scanners are reaching their capacity in dealing with the exponential increase in sophisticated Internet threats. Adding intrusion detection systems to the security layers can help raise the networks' overall security. Various attacks are observed against the network or system. The network system is subject to attacks like wormholes, black holes, and grey holes, among others. The purpose of these attacks is to steal data from the system. The intrusion detection system was thus introduced to protect the system from such attempts. IDS monitor system attacks and work to protect the system from them.

II. LITERATURE REVIEW

We conducted a literature review on network intrusion detection, and the following publications came out as being particularly important for understanding the previous research on the issue we were seeking to address as well as for understanding different solutions.

A. S. Waskle, L. Parashar, and U. Singh, "Intrusion Detection System Using PCA with Random Forest Approach," 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), 2020

Due to the advancement of wireless communication, there are several online security risks. The intrusion detection system (IDS) assists in identifying system attacks and identifies attackers. In the past, several machine learning (ML) techniques have been applied to IDS in an effort to improve intruder detection outcomes and boost IDS accuracy. In this paper, a method for creating an effective IDS that makes use of the random forest classification algorithm and principal component analysis (PCA) is proposed: the random forest aids in classification, while PCA helps organise the dataset by lowering its dimensionality. According to the results, the suggested strategy performs more accurately and efficiently than other methods like SVM, Naive Bayes, and
Decision Tree. The performance time for the suggested approach is 3.24 minutes, and the accuracy rate (…)

B. K. Park, Y. Song, and Y. Cheong, "Classification of Attack Types for Intrusion Detection Systems Using a Machine Learning Algorithm," 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService), 2018

In this article, we show the findings from our studies to assess the effectiveness of identifying various attack types, such as IDS, Malware, and Shellcode. We apply the Random Forest method to the numerous datasets created from the Kyoto 2006+ dataset, the most recent network packet data gathered for creating intrusion detection systems, in order to analyse the recognition performance. We conclude with discussions and plans for additional research.

C. A. Tesfahun and D. L. Bhaskari, "Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction," 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies, 2013

The importance of intrusion detection systems (IDS) in computer and network security cannot be overstated. The experiment dataset in this research was the NSL-KDD intrusion detection dataset, an improved version of the KDDCUP'99 dataset. Due to intrusion detection's fundamental properties, there is still a significant imbalance between the classes in the NSL-KDD dataset, which makes it more difficult to apply machine learning to intrusion detection efficiently. Synthetic Minority Over-sampling Technique (SMOTE) is used in this study to address class imbalance by applying it to the training dataset. A reduced feature subset of the NSL-KDD dataset is created using a feature selection method based on Information Gain. The suggested intrusion detection framework employs Random Forests as a classifier. According to empirical findings, building an IDS that is efficient and effective for network intrusion detection performs better when using the Random Forests classifier with SMOTE and information gain-based feature selection.

III. PROPOSED SYSTEM AND ADVANTAGES

We suggest this system, which detects intrusions using machine learning algorithms like SVM, Random Forests, and XgBoost. These methods provide a quicker reaction to the threat since they can identify the probability of an assault more quickly than the current methods. The high cardinality in this system is reduced via principal component analysis.

A. Advantages
• High accuracy
• Time saving
• Low complexity
• Easy to scale

IV. METHODOLOGY

A. Requirements

1) Functional Requirements: The fundamental prerequisites to operate the programme are the same as those needed to run the PyCharm IDE because the entire application is network-based.
• Python v3.6+
• PyCharm IDE
• RAM: 4 GB minimum
• Hard Disk: 128 GB+
• OS: Windows
• Libraries: Pandas, NumPy, scikit-learn, XGBoost

2) Applications:
• Employed by banks and other financial institutions to stop unauthorised access. Such a mechanism is necessary for governments and intelligence agencies to protect their sensitive information. Such a mechanism is necessary for B2C companies and tech firms to increase the security of their users' personal data.
• Usability: The importance of intrusion detection systems (IDS) in computer and network security cannot be overstated. This study's experiment dataset was the NSL-KDD intrusion detection dataset, an improved version of the KDDCUP'99 dataset.
• Purpose: A detective tool called an intrusion detection system (IDS) is used to find hostile (including policy-violating) activities. A preventive tool, an intrusion prevention system (IPS), is primarily made to both identify and prevent hostile activity. IDS and IPS can be divided into two categories, network-based and host-based, depending on where they are physically located in the infrastructure and the level of security needed. The precise type used depends on strategic considerations, although both serve the same purpose.
• Efficiency: The effectiveness and quality of a NIDS, particularly its classification accuracy, detection speed, and processing complexity, are adversely affected by redundant and irrelevant network properties. In order to maximise the effectiveness of the NIDS, numerous feature selection strategies are used in this paper. The filter, wrapper, and hybrid feature selection approach categories are used. As a detection model, Support Vector Machine (SVM) is used to categorise the behaviour of network connections into normal and abnormal traffic.
• Reliability: The users of this application must all identify and detect the network system, which increases its dependability. Utilising an intrusion detection system makes it very simple to identify unauthorised networks. The network data source is the NSL-KDD dataset, and the network traffic classification is done using the SVM method.

B. System Design

1) UML DIAGRAM: Unified Modelling Language is known as UML. A general-purpose modelling language with standards, UML is used in the field of object-oriented software
engineering. The Object Management Group oversees and developed the standard. The objective is for UML to establish itself as a standard language for modelling object-oriented computer programmes. UML now consists of a meta-model and a notation as its two main parts. In the future, UML might also be coupled with or added to in the form of a method or process. The Unified Modelling Language is a standard language for business modelling, non-software systems, and describing, visualising, building, and documenting the artefacts of software systems. The UML is an amalgamation of best engineering practises that have been effective in simulating huge, complicated systems. The UML is a crucial component of the software development process and the creation of object-oriented software. The UML primarily employs graphical notations to convey software project design.

GOALS: The primary goals in the design of the UML are as follows:
1. To enable users to create and exchange meaningful models, offer them a ready-to-use, expressive visual modelling language.
2. Provide tools for specialisation and extendibility of the key principles.
3. Not depend on a certain development process or programming language.
4. Establish a formal framework for comprehending the modelling language.
5. Promote the market expansion of OO tools.
6. Support more advanced development ideas including partnerships, frameworks, patterns, and components.
7. Embrace the best practises.

2) USE CASE DIAGRAM: In the Unified Modelling Language (UML), a use case diagram is a specific kind of behavioural diagram that results from and is defined by a use-case analysis. Its objective is to provide a graphical picture of a system's functionality in terms of actors, their objectives (expressed as use cases), and any dependencies among those use cases. A use case diagram's primary objective is to identify which system functions are carried out for which actor. The system's actors can be represented by their roles.

C. SEQUENCE DIAGRAM: In the Unified Modelling Language (UML), a sequence diagram is a type of interaction diagram that demonstrates how and in what order processes interact with one another. It is a Message Sequence Chart construct. Event diagrams, event scenarios, and timing diagrams are other names for sequence diagrams.

1) COLLABORATION DIAGRAM: The following collaboration diagram uses a numbering scheme to show the order in which the methods are called. The collaboration diagram is described using the same order management system, and the method calls are comparable to those in a sequence diagram. Nevertheless, the collaboration diagram illustrates the object organisation, whereas the sequence diagram only describes it.
3) ACTIVITY DIAGRAM: Activity diagrams are visual depictions of workflows with choice, iteration, and concurrency supported by activities and actions. Activity diagrams can be used to depict the operational and business workflows of system components in the Unified Modelling Language. An activity diagram demonstrates the total control flow.

tables and their attributes. Let's look at a straightforward ER diagram.

6) DFD DIAGRAM: A Data Flow Diagram (DFD) is a common tool for illustrating how information moves through a system. A good deal of the system requirements can be graphically represented by a tidy and understandable DFD. It can be done manually, automatically, or both. It demonstrates how data enters and exits the system, what modifies data, and where it is stored. A DFD is used to illustrate the scope and bounds of a system as a whole. It can be applied as a method for communication between a systems analyst and any participant in the system that serves as the foundation for system redesign.

The machine learning techniques used in this system are:
• Support Vector Machines (SVM)
• Principal Component Analysis
• Decision Tree
• K-Nearest Neighbour

Random Forest Regression: as an ensemble learning technique for classification, regression, and other tasks, random forests or random decision forests build a large number of
decision trees during the training phase and output the class that represents the mode of the classes (classification) or the mean/average prediction (regression) of the individual trees. The tendency of decision trees to overfit their training set is corrected by random decision forests. Although random forests frequently outperform decision trees, gradient boosted trees can be more accurate than random forests; however, their effectiveness may be impacted by data peculiarities.
• Every decision tree has a big variance, but the final variance is modest when we combine them all in parallel.
• When a classification problem arises, the majority voting classifier is used to determine the output. The final output in a regression problem is the mean of every output. An ensemble method that can handle both classification and regression problems is a random forest.

Support Vector Machines (SVMs): Finding a hyperplane in an N-dimensional space (where N is the number of features) that clearly classifies the data points is the goal of the support vector machine algorithm.
• Support vector machines, often known as SVMs, are useful for both classification and regression applications. However, they are frequently employed for classification goals.
• Decision boundaries known as hyperplanes assist in categorising the data points. Different classes can be given to the data points that fall on each side of the hyperplane.
• Support vectors are data points that are closer to the hyperplane and have an impact on the hyperplane's position and orientation. By utilising these support vectors, we increase the classifier's margin. The hyperplane's location will vary if the support vectors are deleted. These are the ideas that guide the development of our SVM.

A. XgBoost

The gradient boosting framework is used by the ensemble machine learning method XgBoost, which is decision-tree based. With new enhancements like regularisation, the model's implementation provides the features of the scikit-learn and R implementations. Three primary types of gradient boosting are supported:
• The gradient boosting algorithm, commonly known as the gradient boosting machine, which includes the learning rate.
• Stochastic gradient boosting with sub-sampling at the row, column, and column-per-split levels.
• Gradient boosting with regularisation at both the L1 and L2 levels.

On the whole, XgBoost is quick: incredibly quick compared to other gradient boosting solutions. Structured or tabular datasets for classification and regression predictive modelling problems are dominated by XGBoost.

B. Principal Component Analysis (PCA)

• In order to reduce the dimensionality of huge data sets, a technique known as principal component analysis, or PCA, is frequently utilised. PCA works by condensing a large collection of variables into a smaller set that still retains the majority of the information in the larger set.
• Accuracy naturally suffers as a data set's number of variables is reduced, but the secret to dimensionality reduction is to sacrifice some accuracy for simplicity, because smaller data sets are easier to examine and visualise, and machine learning algorithms can analyse the data much more quickly and easily.
• In conclusion, the principle of PCA is straightforward: minimise the number of variables in a data set while maintaining as much information as possible.

VI. FLOW CHART

ACKNOWLEDGMENT

We owe a great deal to our mentor Ms. Bhavya B, Assistant Professor, School of Computer Science Engineering, Presidency University, for her motivational leadership, insightful suggestions, and giving us the opportunity to fully express our technical prowess for the completion of the project work. We express our gratitude to our family and friends for their great support and inspiration in helping us complete this project.

REFERENCES

1 Jafar Abo Nada; Mohammad Rasmi Al-Mosa, "A Proposed Wireless Intrusion Detection Prevention and Attack System," 2018 International Arab Conference on Information Technology (ACIT)
2 Kinam Park; Youngrok Song; Yun-Gyung Cheong, "Classification of Attack Types for Intrusion Detection Systems Using a Machine Learning Algorithm," 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService)
3 S. Bernard, L. Heutte and S. Adam, "On the Selection of Decision Trees in Random Forests," Proceedings of International Joint Conference on Neural Networks, Atlanta, Georgia, USA, June 14-19, 2009, © 2009 IEEE
4 A. Tesfahun, D. Lalitha Bhaskari, "Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction," 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies, © 2013 IEEE
Implementing a CMS-Enabled Question and Answer Site for Departmental Use
stores the content in its online real-time datastore called Content Lake, so we do not need to worry about managing a separate database for the CMS. The data for a Sanity project can be queried from a frontend with Sanity's open-source query language called GROQ through their HTTP API.

Sanity also provides Sanity Studio, a React application that allows developers and editors to manage content. Schemas for the different types of content can be created quickly with plain JavaScript objects and Sanity Studio will recognize that and create an editing environment for managing those types of content. Since it's a React application it's also very easy to integrate it with a Next.js application with the help of some packages.

XXII. ARCHITECTURE

A. Site map

updated/regenerated with Incremental Static Regeneration (ISR).

• New question page at /questions/new: This page just contains a simple form for posting a new question and does not have any other dynamic content. Thus, this page can be statically generated using SSG.

• API route for posting a question at /api/question: The new question form makes a POST request to this route with the question details as the request body to create a new question in the Content Lake. Having the logic to create the question in the API route instead of client side ensures that no sensitive API keys that will give full read-write access to the Content Lake are exposed to the client.

• Sanity Studio at /admin: Requests to this route get handled by the embedded Sanity Studio which shows a content editing interface for the admin users or editors (verified experts).

XXIII. SYSTEM ARCHITECTURE
that sensitive configuration information and API tokens are never exposed to the client.

XXIV. APPLICATION WALKTHROUGH

At build time the Next.js application fetches data from Sanity and generates pages using SSG for all the questions available at that point of time. The questions list page gets rendered with SSR on each request. Thus, the questions list page contains links for all questions, even for those questions which have been created after the last build.

The question details page for a particular question does not get generated immediately after it has been posted by a user. It gets generated on demand when a user tries to visit the details page for such a question by clicking its link from the always up-to-date questions list page.

Similarly, the question details for a particular question page do not update immediately after an answer is posted for it. The first request to the question details page after an answer has been posted results in the same old generated static page loading quickly without the answer. But, in the background Next.js checks if the data on the page is out-of-date and rebuilds the page in the background. Then the next user that visits the same details page sees the updated page with the answer. This way question details pages can get updated when data changes while still having the performance benefits of static generation. This technique is called ISR, where the entire application does not need to be rebuilt to update all the pages generated through SSG.

XXV. IMPLEMENTATION DETAILS

This section will give a high-level view of how we built a Q&A platform called Questo using the architecture described in the previous section. This section will contain excerpts of code from the public repository hosted on GitHub [5]. Since the focus of this paper is not on the user interface of the site, it will not show excerpts about code relating to layout and styling of the user interface.

A. Creating Next.js and Sanity Projects

First, we created a Sanity project and noted down the project ID and dataset name. Next, we created a Next.js 13 application with the pages directory configuration, where routing is decided by the file structure inside the pages directory. E.g., the pages/index.tsx file determines what content shows up at the path /, pages/questions/index.tsx determines what content shows up at /questions and pages/questions/new.tsx determines what content shows up at /questions/new.

Note the .tsx extension, which is the TypeScript equivalent for .jsx files usually used in React applications, as this Next.js project was configured to use TypeScript during initialization for improving developer experience and ensuring type safety.

B. Integrating Sanity Studio with the Next.js Application

We used the official next-sanity package from Sanity.io to integrate Sanity tools like the client SDK as well as Sanity Studio with the Next.js application. First, we configured public runtime and server runtime configurations using environment variables in next.config.js as seen in Figure 5.

Figure 5: next.config.js

The SANITY_TOKEN refers to the API token we generated in the Sanity project so that our application can have write access to the Content Lake, which is necessary for creating new questions. Since the API token gives full read-write access to the Content Lake, it should never be exposed to the public and instead should only be accessed server side, and hence it is defined inside serverRuntimeConfig.

After configuring the environment variables we set up the configuration for Sanity Studio in sanity.config.ts as seen in Figure 6.

Figure 6: sanity.config.ts
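The configuration figure does not survive in this text-only version, so the following is a minimal, self-contained sketch of the kind of settings such a studio configuration holds. The name, title, projectId, and dataset values here are placeholder assumptions rather than the real project's settings, and a local interface stands in for the defineConfig helper that the actual file would import from the sanity package.

```typescript
// Hypothetical sketch of a Sanity Studio configuration object.
// All concrete values are placeholders; the real sanity.config.ts
// wraps a similar object with defineConfig from the "sanity" package
// and pulls projectId/dataset from the public runtime configuration.
interface StudioConfig {
  name: string;      // internal name of the studio workspace
  title: string;     // title shown in the Sanity Studio UI
  projectId: string; // Sanity project ID (placeholder here)
  dataset: string;   // dataset the studio reads from and writes to
  basePath: string;  // route where the embedded studio is mounted
}

export const studioConfig: StudioConfig = {
  name: "default",
  title: "Questo Studio",          // assumed title
  projectId: "your-project-id",    // assumption: injected from env vars
  dataset: "production",           // assumed dataset name
  basePath: "/admin",              // matches the /admin route described above
};
```

In the real project these values come from environment variables via the runtime configuration rather than being hard-coded, so the same source works for local development and the deployed site.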
This configuration is used in pages/admin/[[...index]].tsx where the Sanity Studio is set up. The [[...index]] part of the file name indicates that this is a catch-all route, meaning this file is responsible for responding to any requests to paths starting with /admin. Figure 7 shows an excerpt from the file showing a simple component that just configures the Sanity Studio using components from the next-sanity package and returns it to handle all requests to the admin routes.
Figure 9: schemas/question.ts
Figure 10 shows an excerpt from schemas/answer.ts that shows how the schema was defined for the answer content type.
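Since the schema figures are not reproduced in this text-only version, the following is a rough, self-contained sketch of how such question and answer content types could be declared as plain JavaScript objects. The field names and the reference from answer to question are illustrative assumptions; the exact fields in the repository may differ.

```typescript
// Hypothetical sketch of Sanity document schemas for the question and
// answer content types; field names are assumptions, not the repo's code.
interface SchemaField {
  name: string;
  title: string;
  type: string;            // e.g. "string", "text", "datetime", "reference"
  to?: { type: string }[]; // target document types for reference fields
}

interface DocumentSchema {
  name: string;
  title: string;
  type: "document";
  fields: SchemaField[];
}

export const question: DocumentSchema = {
  name: "question",
  title: "Question",
  type: "document",
  fields: [
    { name: "title", title: "Title", type: "string" },
    { name: "body", title: "Body", type: "text" },
    { name: "createdAt", title: "Created at", type: "datetime" },
  ],
};

export const answer: DocumentSchema = {
  name: "answer",
  title: "Answer",
  type: "document",
  fields: [
    { name: "body", title: "Body", type: "text" },
    // assumption: each answer holds a reference to the question it answers
    { name: "question", title: "Question", type: "reference", to: [{ type: "question" }] },
  ],
};

// Exported together, mirroring the schemas/index.ts export described below.
export const schemaTypes = [question, answer];
```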
These schemas are exported as schemaTypes
from schemas/index.ts for use in Sanity
related configurations like in
sanity.config.ts.
D. Configuring the Sanity client

Next, we created a Sanity client which the application uses to fetch and create data in the Content Lake. Figure 12 shows an excerpt from sanity-client.ts that shows the configuration for the created client.

Figure 12: sanity-client.ts

Note the useCdn option which has been set to false. Sanity is able to provide data through Content Delivery Networks (CDNs) quickly by means of caching and other techniques, because of which the data may not always be up to date. But our questions list page needs to be always up to date; that's why we are not using the CDN with this client. Since this client has the API token in it, we use it only in server-side contexts.

E. Fetching data for the questions list page

Figure 13 shows an excerpt from pages/questions/index.tsx that shows how the data is fetched for the page at /questions by defining a function called getServerSideProps, which fetches the data on each request to that page and passes it as props for the React component that will render the user interface for the questions list page.

Figure 14: Questions list page

F. Fetching data and generating question details pages

During each build of the Next.js application we want to statically generate HTML pages for the questions that already exist in the Content Lake. Since /questions/[id] is a dynamic route, Next.js needs to know for which possible values of [id] it needs to generate pages. Figure 15 shows an excerpt from pages/questions/[id].tsx that shows how to do this by defining a function called getStaticPaths.

Figure 15: getStaticPaths in pages/questions/[id].tsx

All this piece of code is doing is fetching the list of all questions and telling Next.js the possible values for the dynamic route parameter [id] in the route /questions/[id]. The fallback option in the returned object ensures that when a question is created through the application after build time and a user visits the details page for that question, Next.js will not show a 404 Not Found page, and will instead show the user some fallback content like a loading status till the page is built by fetching data. For all possible routes determined by the paths variable in the returned object, Next.js statically generates the pages by fetching data for each route by calling the getStaticProps function defined in the same file as seen in Figure 16.
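Because the code figures are not reproduced in this text-only version, here is a simplified, self-contained sketch of the getStaticPaths/getStaticProps pattern described above. The Content Lake query is replaced by a hard-coded stub, and the question shape and helper names are assumptions; the real implementation queries Sanity with GROQ and exports async functions for Next.js to call.

```typescript
// Self-contained sketch of the getStaticPaths / getStaticProps pattern
// for the dynamic route /questions/[id]. The Sanity client is replaced
// by an in-memory stub; shapes and helper names are assumptions.
interface Question {
  id: string;
  title: string;
}

// Stub standing in for a GROQ query against the Content Lake.
function fetchAllQuestions(): Question[] {
  return [
    { id: "q1", title: "What is ISR?" },
    { id: "q2", title: "How does SSG work?" },
  ];
}

// Tell Next.js which [id] values to pre-render at build time.
export function buildStaticPaths(questions: Question[]) {
  return {
    paths: questions.map((q) => ({ params: { id: q.id } })),
    // A non-false fallback lets pages for questions created after build
    // time be generated on demand instead of returning a 404.
    fallback: true,
  };
}

// Fetch the data for one question page; a revalidate interval enables
// ISR, so the page is rebuilt in the background when its data changes.
export function buildStaticProps(id: string) {
  const question = fetchAllQuestions().find((q) => q.id === id) ?? null;
  return { props: { question }, revalidate: 60 };
}

export const staticPaths = buildStaticPaths(fetchAllQuestions());
```

The split into small pure helpers is only for illustration; in the actual file the same logic lives directly inside the exported getStaticPaths and getStaticProps functions.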
make a POST request to the /api/questions API route with the question data as the request body. The handler for this API route is defined in pages/api/questions.ts as seen in Figure 18.
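The handler figure itself is not reproduced here, so the sketch below captures the idea of the route: accept only POST requests, validate the body, and perform the write server side so that the token with read-write access never reaches the client. The request/response shapes and helper names are simplified assumptions, not the repository's actual code.

```typescript
// Simplified sketch of a handler for POST /api/questions. The Sanity
// write is replaced by an in-memory store; the types are local
// stand-ins for Next.js's NextApiRequest/NextApiResponse.
interface ApiRequest {
  method: string;
  body: { title?: string; body?: string };
}

interface ApiResult {
  status: number;
  json: unknown;
}

// In-memory stand-in for creating a document in the Content Lake via a
// server-side client holding the write token.
const store: { title: string; body: string }[] = [];

export function handleCreateQuestion(req: ApiRequest): ApiResult {
  if (req.method !== "POST") {
    return { status: 405, json: { error: "Method not allowed" } };
  }
  const { title, body } = req.body;
  if (!title || !body) {
    return { status: 400, json: { error: "Missing title or body" } };
  }
  store.push({ title, body });
  return { status: 201, json: { ok: true } };
}
```

Keeping this logic in an API route means the browser only ever sends the question fields; the token defined in serverRuntimeConfig stays on the server.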
Figure 20: New question page

I. Configuring Cross-Origin Resource Sharing (CORS) and deployment

The last thing to configure was the CORS origins in the Sanity project. Since the Next.js project runs locally on http://localhost:3000, we added it as a CORS origin with credentials allowed in the Sanity project, so that Sanity responds to queries from the application running locally. We also deployed the application with Vercel at https://questo-coral.vercel.app, so we added that URL as a CORS origin as well.

XXVI. PERFORMANCE AND SCALABILITY

Since the question details pages are built and rebuilt with SSG and ISR, users will face very small loading times for those pages. But the questions list page fetches data on each request, so it will take a bit more time to load compared to the other pages in this application. One solution to this is to show only questions that have been answered; since the list would then not have to be up to date on every request, it could be generated with SSG and updated with ISR. The downside of this approach is that a user would not be able to tell whether a particular question has already been asked if it has not been answered yet. Hence, this approach was not used in our implementation.

Regarding the execution of serverless functions such as API routes and SSR pages, those depend on the infrastructure of the platform on which the application is deployed. In our case, we deployed the application on Vercel, which is dynamically scalable. Its edge network ensures that serverless logic is executed as close as possible to the user, which improves the performance of the website.

XXVII. CONCLUSION

In this paper, we discussed the issues with regard to participation in large-scale Q&A platforms, such as fear of judgement for people asking questions and the uncertain quality of answers given by users, and how those issues can be avoided if the platform is designed and intended to be used at the departmental level. We discussed the architecture of a CMS-enabled Q&A platform for departmental use. Then we showed a high-level view of the implementation of a Q&A platform called Questo that we created, built with Next.js and the Sanity headless CMS. Finally, we discussed the performance and scalability of the application. Thus, a Q&A platform that allows anyone to ask questions without signing in, while letting only verified experts answer them, is a good solution for small-scale departmental use.

REFERENCES

[29] Ma, Haiwei, Hao-Fei Cheng, Bowen Yu, and Haiyi Zhu. "Effects of Anonymity, Ephemerality, and System Routing on Cost in Social Question Asking." Proceedings of the ACM on Human-Computer Interaction 3, no. GROUP (2019): 1-21.
[30] "Next.js by Vercel - The React Framework for the Web", nextjs.org, https://nextjs.org/ (accessed Mar. 3, 2023)
[31] "Tailwind CSS - Rapidly build modern websites without ever leaving your HTML.", tailwindcss.com, https://tailwindcss.com/ (accessed Mar. 10, 2023)
[32] "The Composable Content Cloud – Sanity.io", sanity.io, https://www.sanity.io/ (accessed Mar. 17, 2023)
[33] "srijan-nayak/Questo: A Q&A site built with Next.js and Sanity", github.com, https://github.com/srijan-nayak/Questo (accessed Apr. 20, 2023)
Logging Library for APM on a Microservice-Based Web Application
1st Mr. Sunil Kumar Sahoo, Department of CSE, Presidency University, Bengaluru, Karnataka. sunilkumarsahoo@presidencyuniversity.in
2nd Nihal G, Department of CCE, Presidency University, Bengaluru, Karnataka. 201910100683@presidencyuniversity.in
3rd Rohan Muthanna MM, Department of IST, Presidency University, Bengaluru, Karnataka. 201910100079@presidencyuniversity.in
Abstract - This project aims to create a logging library for Application Performance Monitoring (APM) within a microservice architecture. The four primary components of the application are User-Management, Policy-Management, Claims-Management, and Billing-Management. Additional components have been implemented to support the microservices, such as the Service Registry microservice, the API Gateway microservice, and the Config Server microservice. Logging is an important element of this project, and the SLF4J logging library is used to generate logs within the microservices. These logs are exported to a separate file for easier separation, monitoring, and analysis.

This project demonstrates a well-structured microservice-based architecture that makes use of industry-standard tools and technologies. The logging library, in conjunction with the interface to APM tools, offers thorough monitoring and analysis of the performance of the insurance online application. The system achieves scalability, resilience, and maintainability by utilizing efficient microservice communication and configuration management. APM tools such as Dynatrace and New Relic provide useful insights into the behavior of the application, performance indicators, and potential bottlenecks. Postman is used for API testing and managing the application's frontend.

Keywords – Logging Library, APM, Microservice Architecture, Spring Boot, Insurance Domain, MySQL, User-Management, Policy-Management, Claims-Management, Billing-Management, Service Registry, API Gateway, Config Server, SLF4J, Dynatrace, New Relic.

XXVIII. INTRODUCTION

Microservice architectures have grown in favor in today's digital landscape due to their scalability, flexibility, and ease of maintenance. The goal of this project is to provide a logging library for Application Performance Monitoring (APM) within a microservice-based insurance online application. The logging library is essential for tracking and analyzing application behavior, performance, and potential problems.

Because of its complexity and the necessity for effective management of user information, policy details, claims, and billing, the insurance domain was chosen as the environment for this project. The application's backend is created with the Spring Boot framework, which is noted for its simplicity and broad ecosystem, while MySQL acts as the database for storing and retrieving data.

User-Management, Policy-Management, Claims-Management, and Billing-Management are the four primary components, or microservices, of the project. Each microservice focuses on a certain function and manages the associated data and procedures. This modular strategy improves the system's overall agility by allowing for greater organization, scalability, and independent deployment of microservices.

Several additional components have been added to support the microservices. The Netflix Eureka Server-powered Service Registry microservice provides a centralized mechanism for registering and discovering microservices. This enables efficient communication between multiple microservices, allowing them to locate and communicate with one another in real time.

The API Gateway microservice serves as a single point of entry for the various microservices. It streamlines system interaction by providing a consistent base URL and directing requests to the appropriate microservice based on established rules. This abstraction layer improves the overall user experience and simplifies the management of the various endpoints.

The Config Server microservice retrieves configuration information from a Git repository, providing central management and quick configuration modifications for the microservices. This decouples configuration details from the code, making it easier to change and maintain configuration settings without requiring microservices to be redeployed.

The logging library, which makes use of the SLF4J logging framework, is a critical component of this project. The library collects log data from the microservices and outputs it to a separate file for easier monitoring and analysis. This logging technique allows developers and system administrators to effectively follow application behavior, diagnose errors, and optimize performance.

The project is coupled with APM technologies such as Dynatrace and New Relic to improve monitoring capabilities even further. These tools provide detailed insights into the performance indicators of the application, such as response times, error rates, and resource utilization. This connectivity makes it easy to discover bottlenecks, solve difficulties, and optimize the overall performance of the system.

Postman is used on the front end for API testing and management. This sophisticated tool enables developers to test and evaluate the microservices' APIs, ensuring their functionality and correctness. Postman also promotes seamless integration and speedier development cycles by facilitating efficient cooperation between the frontend and backend teams.

Overall, this project provides a complete solution for creating a microservice-based insurance online application. The system provides scalability, performance, and maintainability by exploiting the benefits of microservice design, Spring Boot, logging libraries, APM tools, and other supporting components. The logging library, in particular, is critical to enabling effective monitoring and analysis, assuring the application's seamless operation in the dynamic insurance environment.

XXIX. CURRENT STATE OF THE MICROSERVICE-BASED INSURANCE WEB APPLICATION

Insurance websites provide customers with tools to compare insurance options, track and submit claims online, and use customer support chatbots to get assistance. Customers may need to speak with an agent to finalize their policy.

Insurance-based microservices are gaining popularity in the industry. Insurance companies can use microservices to create smaller, more focused applications that can be easily integrated with other systems. For example, an insurance company may create a microservice that handles claims processing, while another microservice manages policy administration. These services can then be combined to create a larger insurance application. Microservices also enable insurance companies to quickly develop new products and services, test them in a controlled environment, and scale them up as needed.

A. Abbreviations and Acronyms
• APM – Application Performance Monitoring
• SLF4J – Simple Logging Facade for Java

B. Logging Library
A logging library is a framework for creating, structuring, formatting, and publishing log events.
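As a rough analogy to this setup (the project itself uses SLF4J in Java, not Python), the export-to-a-separate-file idea can be sketched with Python's standard logging module; the service and file names here are illustrative:

```python
import logging

# One named logger per service, mirroring how each microservice
# obtains its own SLF4J logger in the project.
logger = logging.getLogger("policy-management")
logger.setLevel(logging.INFO)

# Export log events to a separate file for later monitoring and analysis.
handler = logging.FileHandler("policy-management.log", mode="w")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(handler)

logger.info("policy 42 created")
logger.error("billing lookup failed")
handler.flush()
```

An APM agent or log shipper can then tail `policy-management.log` without the services themselves knowing anything about the monitoring backend.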
It offers APIs to send events from applications to destinations, similar to agents.

C. SLF4J [Logging Technology]
SLF4J is a logging framework that offers a straightforward and adaptable layer over several logging frameworks. It acts as a front, or abstraction, for several logging systems, making it easy for developers to move between them without having to change the code. It offers a consistent and user-friendly interface that can be adjusted for different logging levels, message formats, and output locations. SLF4J is widely used in Java projects and is regarded as a standard logging API.

D. Advantages and Disadvantages of Microservices
Scalability, resilience, and agility are benefits, while complexity, operational overhead, and distributed-system difficulties are drawbacks.

E. Advantages and Disadvantages of APM
By locating and fixing bottlenecks and problems in real time, application performance monitoring (APM) can help enhance the overall performance and user experience of an application. APM tools may need specialized knowledge to be used properly, and they can be difficult and expensive to adopt. Additionally, they might increase the monitored application's overhead.

F. Advantages and Disadvantages of Logs
Logs can offer useful information for performance analysis, security audits, and troubleshooting. Logs have a few drawbacks, including the fact that they can be challenging to interpret and analyze, that they can take up a lot of storage, and that they might contain sensitive data that needs to be secured.

Dynatrace APM collects data such as log files, network traffic, and infrastructure measurements, and analyzes it to find trends, abnormalities, and performance problems. It also offers root cause analysis, proactive alerts, and deep code-level diagnostics. Additionally, Kubernetes, containers, and cloud infrastructure services can be monitored by Dynatrace APM to get a holistic picture of application performance.

XXX. OVERVIEW OF INSURANCE MICROSERVICE WEBSITES

A. Architecture
1) The architecture of the insurance website in this project is based on a microservice approach, in which various application components are developed and deployed independently as small, decoupled services.
2) The microservices' primary data storage solution is the H2 database, which enables effective and scalable data management.
3) The service registry is used to manage the complexity of the overall application by keeping track of all the accessible microservices and their instances.
4) To provide accurate logging and debugging of the microservices, SLF4J is used as the logging library.
5) Docker is used to containerize the microservices, which streamlines deployment and makes it simpler to handle many microservice iterations.
6) To track and analyze the performance of the microservices and the entire application, the log details are finally forwarded to an APM (Application Performance Monitoring) tool like Dynatrace or New Relic.

B. Figures
Fig. 1 Project architecture.
Fig. 2 Microservice communication design.
Fig. 3 Database architecture.

CONCLUSION

The use of microservices in the building of an insurance website offers advantages over a traditional monolithic architecture, such as independent development and deployment of the various application components, the H2 database solution, a service registry, Docker containerization, and APM tools. These tools help to manage the complexity of the microservices, simplify the deployment process, and improve optimization and performance. H2, a service registry, and Docker containerization are all essential for creating intricate online applications like insurance websites.

ACKNOWLEDGMENT

We thank everyone who contributed to this study on Logging Library for APM on Insurance - A Microservice Based Web Application. We would first like to express our gratitude to our academic advisor, whose suggestions and help were essential during the entire study period. We also wish to thank the subject-matter experts who shared their insights and criticism with us, allowing us to improve our research. Their knowledge and skill considerably increased our understanding of the subject and provided us with new perspectives.

REFERENCES

[34] Doshi, P., & Kulkarni, P. (2018, May). A review on microservices architecture. In 2018 International Conference on Communication, Computing, and Internet of Things (IC3IoT) (pp. 1-6). IEEE.
[35] Thangavel, P., & Kalaiselvi, M. (2021). Application Performance Management Tools: An Overview. International Journal of Engineering and Advanced Technology, 10(4), 1844-1849.
[36] Amorim, E. R., Leão, R. S., da Silva, J. S., & de Oliveira, J. P. (2020). An Exploratory Study on the Use of Logging and Monitoring Techniques for Microservices-Based Applications. IEEE Latin America Transactions, 18(3), 520-527.
[37] Zhou, X., Li, Z., Wang, H., & Zhao, S. (2020, December). Microservice-Based Performance Monitoring Framework for Web Applications. In 2020 20th International Conference on Computational Science and Its Applications (ICCSA) (pp. 229-242). IEEE.
[38] Logback - https://logback.qos.ch/
[39] SLF4J - https://www.slf4j.org/
[40] ELK Stack (Elasticsearch, Logstash, and Kibana) - https://www.elastic.co/what-is/elk-stack
[41] Dynatrace APM documentation: https://www.dynatrace.com/support/help/
[42] Gartner Magic Quadrant for APM: https://www.dynatrace.com/gartner-magic-quadrant-apm/
[43] New Relic APM documentation: https://docs.newrelic.com/docs/apm/
Augmented Reality-based 3D Build and Assembly Instructions
App for Cardboard Model
Muniswamy A, School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India. 201910100012@presidencyuniversity.in
Niranjan G, School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India. 201910101587@presidencyuniversity.in
Nikhil U Shet, School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India. 201910100042@presidencyuniversity.in
In 2017, Gabriel Evans, Jack Miller, Mariangely Iglesias Pena, Anastacia MacAllister, and Eliot Winer created a prototype using the Unity game engine for AR HMDs such as the Microsoft HoloLens. The application included features like a user interface, interactive 3D assembly instructions, and the ability to place content in a spatially registered manner. The study demonstrated that although the HoloLens shows potential, areas such as tracking accuracy still need improvement before it can be used in an assembly setting in a factory.

T. Haritos and N. D. Macchiarella developed a mobile application in 2005 to train Aircraft Maintenance Technicians (AMTs). The app was designed to help AMTs with task training and job tasks. When they analyzed the outcomes, they discovered that technicians' training and retention costs were reduced. The app eliminated the need to retrieve information from maintenance manuals for inspection and repair procedures, which could otherwise require leaving the aircraft.

In 2017, Jonas Blattgerste, Benjamin Strenge, Patrick Renner, Thies Pfeiffer, and Kai Essig conducted a study comparing traditional paper-based assembly instructions with augmented reality (AR) instructions for manual assembly tasks. The results of the study showed that AR instructions could greatly enhance performance in manual assembly tasks when compared to traditional paper-based instructions. Participants who used AR instructions completed the task more quickly and with fewer errors than those who used paper-based instructions. Additionally, the study revealed that participants found AR instructions to be more user-friendly and helpful than paper-based instructions.

Our AR application, ModelAR, is designed for widespread accessibility, as it does not require expensive AR glasses and can be downloaded on any Android smartphone. While certain smartphones may have better AR capabilities than others, our application offers a convenient and cost-effective way for anyone to experience AR technology without the need for specialized hardware such as AR glasses. So, we used the Unity game engine to develop our application. Each cardboard model comes with a marker card that must be scanned to load and superimpose a 3D version of that cardboard model on top of it. To achieve this, we used the Vuforia SDK in Unity, which handles image recognition, tracking, and model superimposition over tracked images. In addition, Blender 3D software was used for the 3D modeling of a physical cardboard box. The user interface of ModelAR includes forward, backward, and loop buttons to traverse the assembly steps. Also, users can scale up or down and rotate the superimposed model on the marker at their convenience using slider buttons.

Process of the proposed method:
A. Model Design
The initial phase in model design entails obtaining precise measurements of the physical cardboard prototype, particularly for a packaging box. Subsequently, a 3D rendition is crafted using software specifically designed for this purpose. In our study, Blender was the preferred tool due to its user-friendly interface, its ability to deliver precise modeling outcomes, and its being open source.

C. Generating AR Marker
Our research employed fiducial (AR) markers to accurately position our model within a given space. By utilizing these markers, we were able to establish a reliable and robust tracking system that facilitated the seamless integration of our model into a real-world environment. This was implemented through the use of the Python programming language, wherein we leveraged an algorithm to generate AR markers that were easily recognizable by our application. The resultant system demonstrated superior accuracy and precision, thereby underscoring the efficacy of our approach in the context of AR applications.
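The paper does not reproduce the marker-generation algorithm itself; as a hedged illustration of the underlying idea, a fiducial-style marker can be produced in Python by encoding an integer ID into a binary grid surrounded by a solid black border (production systems such as ArUco add error-correcting bit layouts on top of this):

```python
def make_marker(marker_id: int, grid: int = 4) -> list:
    """Encode marker_id into a (grid+2) x (grid+2) binary image:
    1 = black cell, 0 = white cell, with a solid black border ring
    that lets the tracker find and orient the marker."""
    size = grid + 2
    marker = [[1] * size for _ in range(size)]  # all black: the border ring
    for i in range(grid * grid):                # interior cells from the ID's bits
        row, col = divmod(i, grid)
        marker[row + 1][col + 1] = (marker_id >> i) & 1
    return marker

def render(marker) -> str:
    """ASCII preview: '##' for a black cell, '..' for a white cell."""
    return "\n".join("".join("##" if cell else ".." for cell in row)
                     for row in marker)

print(render(make_marker(0b1010011001011001)))  # arbitrary 16-bit ID, 4x4 interior
```

In practice the rendered grid would be exported as an image and registered with the tracking SDK as a target.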
Fig. 3.2.1. An image of an animation controller for assembly steps.

D. Scripting
In the context of scripting in Unity for an AR app, it is imperative to incorporate user interface (UI) elements such as buttons that enable seamless navigation through various animation states. This can be achieved by creating an animation controller
with an integer parameter that corresponds to specific animation states. To achieve this, a script is written that binds the UI buttons to their respective functions. The script includes the logic that increments or decrements the integer parameter based on the button pressed, which in turn plays the linked animation state. Additionally, the script contains a function that enables the playing of the current animation state.

IV. Implementation
For implementing our ModelAR, a minimum API level of 10 was considered for Android devices when determining the necessary specifications for the augmented reality (AR) system. To facilitate model tracking and recognition, the Vuforia SDK framework was utilized, and the resulting packages were subsequently integrated into the Unity Game Engine, which offers a built-in XR foundation for cross-platform compatibility with Android devices. The 3D virtual model, complete with assembly instructions, was constructed using Blender 3D software before being imported into Unity. User interface (UI) buttons were employed to enable users to navigate through the model. Programming in C# was employed for both Unity software development and Android app development.

V. Results Achieved
✓ Generated a marker that is trackable with 60% accuracy.
✓ High-precision 3D modelling and assembly step animation.
✓ Able to track and superimpose the virtual model on the AR marker.
✓ Created UI buttons to navigate the model.

Conclusion and Future Work
From this research work, we developed an augmented reality-based assembly instructions application for cardboard models called ModelAR. We designed a 3D model of a physical packaging box in Blender software and used the Unity game engine for AR development and deployment onto Android platforms with a minimum API level of 10. Furthermore, we developed an algorithm to generate AR markers in Python, for our app to track and superimpose a 3D model of a packaging box over them, which was made possible by employing the Vuforia SDK. The results from Vuforia's rating system showed that our AR marker could be tracked with 60% accuracy. 3D modelling and assembly instruction steps were built in Blender 3D software. Both the virtual model and the AR marker were imported into Unity for development. UI buttons are used to traverse the model, with forward and backward steps to assemble the model. Scripts are written in Unity, in C#, for virtual button events to traverse between the assembly steps of a cardboard model.

Future Work: Our ModelAR version 2.0 can provide improved features by adding a slider button to scale up or down and rotate the model for the user's convenience, and improved UI quality for the navigation buttons. This ModelAR app can also be built for iOS devices, smart glasses, and tablets.

References
[1] M. Dalle Mura and G. Dini, Department of Civil and Industrial Engineering, University of Pisa, 56122 Pisa, Italy, "Augmented Reality in Assembly Systems: State of the Art and Future Perspectives". Published at Springer.
[3] Jeff K.T. Tang, Tin-Yung Au Duong, Yui-Wang Ng, Hoi-Kit Luk, Hong Kong – "Learning to Create 3D Models via an Augmented Reality Smartphone Interface". Published at ResearchGate.
[4] Jonas Blattgerste, Benjamin Strenge, Patrick Renner, Thies Pfeiffer, Kai Essig – "Comparing
Conventional and Augmented Reality Instructions for Manual Assembly Tasks". Published at PETRA '17: Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments.
[5] Gabriel Evans, Jack Miller, Mariangely Iglesias Pena, Anastacia MacAllister, and Eliot Winer, "Evaluating the Microsoft HoloLens through an augmented reality assembly application", Proc. SPIE 10197, Degraded Environments: Sensing, Processing, and Display 2017, 101970V (5 May 2017).
[6] T. Haritos and N. D. Macchiarella, "A mobile application of augmented reality for aerospace maintenance training," 24th Digital Avionics Systems Conference, Washington, DC, USA, 2005, pp. 5.B.3-5.1, doi: 10.1109/DASC.2005.1563376.
[7] Kolla, Sri Sudha Vijay Keshav and Sanchez, Andre and Plapper, Peter, Comparing Effectiveness of Paper Based and Augmented Reality Instructions for Manual Assembly and Training Tasks (June 4, 2021). Proceedings of the Conference on Learning Factories (CLF) 2021, Available at SSRN: http://dx.doi.org/10.2139/ssrn.38599
[8] Gabriella Sosa, Faculty of Computing, Blekinge Institute of Technology, Sweden, "Enhance user experience when displaying 3D models and animations on mobile platforms: an augmented reality approach".
[9] Wei Yan, Texas A&M University, College Station, Texas 77843, USA. Published at Springer – "Augmented reality instructions for construction toys enabled by accurate model registration and realistic object/hand occlusions".
[10] Dieter Schmalstieg, Tobias Höllerer, "Augmented Reality – Principles and Practice (usability)".
My City Info: A Comprehensive Guide To Finding What You Need

Mr. Jerrin Joe Francis, Asst. Prof., Computer Science Engineering, Presidency University, Bengaluru, India. jerrin.francis@presidencyuniversity.in
Sanika Saigaonker, Computer Science Engineering, Presidency University, Bengaluru, India. 201910100399@presidencyuniversity.in
Ramya C, Information Science Engineering, Presidency University, Bengaluru, India. 201910101579@presidencyuniversity.in
Abstract— Visiting a city without an acquaintance or a proper plan can be a daunting task. The effort put into planning such visits also takes a tedious amount of time, as there is a lot of information on the internet. The proposed system helps the user reduce these efforts and make an efficient plan to visit any place. In the era of the internet, multiple websites help with booking a room and suggesting a place; i.e., the user must go through various websites before making an informed decision about the place. The project focuses on making this process easier for the user. The website will be a one-stop destination for users to gain complete knowledge of the city. The website will have features that will help them look for a place to stay and suggest places to visit. With the help of the review system, the user can easily decide if they want to visit a particular place or not. The users will also be informed about the weather and presented with various facts about the city, which will help them understand the place even better. They will also be provided emergency assistance, which aids in times of distress.

Keywords—internet, destination, distress, knowledge, efforts.

XXXI. INTRODUCTION

Although technology is advancing quickly, people's lives are becoming considerably more limited. Everyone is caught between their obligations and their employment. People like to take breaks to give themselves the much-needed rest from this cycle. For this same reason, tourism is increasing. Technology has a huge potential to make this procedure easier.

There are several facets to tourism. Planning takes up a lot of people's time in order to have a pleasant experience. People must decide where they will stay and must sift through a lot of information to discover the tourist attraction of their choice. Even with all of this, there may still be missing or irrelevant details. Much time is spent on this procedure. There may still be locations that are known solely to the locals. Locals are far more knowledgeable about the food and activities that are unique to their area.

The article places focus on a website that promises to streamline and improve this procedure. The website incorporates a number of elements, including choosing an appropriate place to stay, proposing tourist attraction areas based on a review system, exploring local food and events, and also giving users access to information and news about the city that will help them discover relevance to the location.
XXXII. RELATED WORKS

In [1], places are recommended to the user based on the user's current GPS location. The system works on category and time-span fields for showing the places.

In [2], the system should find a path that fulfills given criteria, show it on screen, and show names of objects, some short descriptions and photos of them, and possible entrance costs. It should also be able to estimate the time needed to travel from one object to the next and, if possible, advise which bus line or other public means of transport may be used.

In [3], collaborative filtering is used to compare other users' reviews of a place with the user's review. Cosine similarity is used to measure how many similar keywords are present in the descriptions and reviews.

The website also provides information about nearby police stations, NGOs, hospitals, and fire brigades. The website uses GIS to provide an affected place for the user. Admins have access to add or remove places and can also change the details of the information provided on the website. Users can contact the admin in case they know of a new attraction (hotel/restaurant/historic place) which the website does not have. Users can also contact us through the mail ID and mobile number that are available on the website, or can send a message to the admin by providing their name, mail ID, and subject on the website.

The architecture diagram of 'My City Info' is shown in the figure.
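The keyword comparison described in [3] can be sketched as follows — a generic illustration of the technique, not the cited system's code — by reducing each review to a word-count vector and taking the cosine of the angle between the two vectors:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts:
    1.0 for identical keyword profiles, 0.0 for no words in common."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in set(a) & set(b))
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

print(cosine_similarity("great food and friendly staff",
                        "friendly staff and great prices"))
```

A real system would normally drop stop words and weight terms (e.g., TF-IDF) before comparing, but the similarity computation itself is the same.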
XXXV. IMPLEMENTATION

This website typically requires users to log in to access the system's features and services. Here is a brief overview of how a user can log in by entering their name and password:
• The user navigates to the login page of the tourist guide system.
• On the login screen, the user enters their username or email address in the relevant field.
• On the login screen, the user enters their password in the corresponding field.
• The user's username and password are checked by the system to make sure they match an already-existing user account.
• Phone number: Some city information systems may use a phone number as the primary login credential. Users will need to enter their phone number to log in. This method is commonly used for mobile applications that require authentication through SMS verification.
• Address: In some cases, a city information system may use a user's address as the primary login credential. Users will need to enter their street address, city, and state to log in. This method is commonly used for systems that provide hyper-local information, such as property tax or utility billing portals.
• If the username and password combination is correct, the system grants the user access to the system's features and services. If the credentials are incorrect, the user is notified and prompted to try again or recover their account through the password recovery process.

To ensure the security of the user's credentials, the system should use encryption to protect the login credentials during transmission and storage. Additionally, the system should include measures to protect against brute-force attacks, where an attacker attempts to gain access to an account by repeatedly guessing username and password combinations.

It is important for users to keep their login credentials secure and not share them with anyone. Additionally, they should choose a strong password that is not easily guessable and use different passwords for different accounts to prevent unauthorized access.

XXXVI. EXPERIMENTAL RESULT

After executing this website on Windows, a new user signing up for the website and using the website is shown below.

Fig. 2 This is the signup page for a new user of the website.
Fig. 3 This is the login page for the admin.
Fig. 4 This is the login page for the user.

After logging in to the website, you will find the homepage, where you can select cities based on where you want to travel.

The website also shows the climate at the current location of the travelers/users, and the users can suggest places/upcoming events to add by sending the information of that place as a request to the admin. Thus, this website makes it easier for travelers to find what they need in Bangalore city by presenting information on city attractions, nearby places to visit, local events, a news feed, and the weather report in a single website.
REFERENCES
[51] "A Web-Based Tourist Guide System for Promoting Local Tourism" by R. H. Goudar and K. V. Hunagund. This paper presents a web-based tourist guide system that utilizes multimedia and interactive features to promote local tourism.
[52] Jian Meng, Neng Xu, "A Mobile Tourist Guide System Based on Mashup Technology", ISBN 978-1-4244-7618-3/10, ©2010 IEEE.
[53] Xiaoyun Shi, "Tour-Guide: Providing Location-Based Tourist Information on Mobile Phones", ISBN 978-1-4244-7547-6/10, ©2010 IEEE.
[54] A. Suryawanshi, V. C. Patil, G. Dudhane, D. Joshi, and P. Ganapule, "Smart Tourist Guide For Pune City," Journal of Emerging Technologies and Innovative Research, 2018, Accessed: Apr. 04, 2023. [Online]. Available:
[55] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.
46
UNCOVERING INCOME TAX FRAUD: A LOGISTIC REGRESSION APPROACH FOR DETECTION AND PREVENTION

Rafeeda Fatima1, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, Rafeeda28@gmail.com
Fancy Angeline U2, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, angelfancy890@gmail.com
Santosh Kumar3, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, kumarsantosh25056@gmail.com
Ravuludiki Hire Matam Prathibha4, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, prathibharhm1@gmail.com
Shivam Narayan5, (B.Tech - ISE) Department of Computer Science and Engineering, Presidency University, Bangalore, India, sshivam6495@gmail.com
Dr. P Sudha6, (Assistant Professor - SG) Department of Computer Science and Engineering, Presidency University, Bangalore, India, sudha.p@presidencyuniversity.in
Abstract— The compulsory tax levied by the government on individuals and businesses based on their income is known as income tax. Tax fraud involves the intentional manipulation of information on a tax return to reduce tax liability. Our project focuses on developing a machine learning model to identify income tax fraud by analyzing taxpayers' financial data. Six machine learning algorithms, namely Logistic Regression, Decision Tree, Random Forest, Naive Bayes, k-Nearest Neighbors, and Feed-Forward Neural Network, were compared, and logistic regression was found to be the most effective in detecting tax fraud. Compared to existing methods, the proposed model captures both linear and non-linear relationships among variables, making it more accurate in detecting complex patterns. The model was developed by training it on an OpenML dataset and evaluated on a test dataset. The research aim is to develop a model that can accurately detect tax fraud, and the objectives include comparing the effectiveness of various machine learning algorithms, identifying significant factors contributing to tax fraud, and providing insights for policymakers. The proposed model has significant potential in detecting tax fraud, which can reduce revenue losses and promote fairness in the tax system while remaining an affordable solution.

Keywords— Income tax fraud detection, Logistic Regression, Decision Tree, Random Forest, Naive Bayes, k-Nearest Neighbors, Feed-Forward Neural Network.

XXXVIII. INTRODUCTION

Income tax is crucial for the functioning of our society, as it provides countries with the necessary revenue to make vital investments in infrastructure, health, and education. However, despite its importance, many people are averse to paying taxes, making governments lose millions of dollars every year. There are various strategies to evade taxes, such as underreporting income, which reduces the tax liability. Criminals who commit fraud are becoming increasingly sophisticated in their methods, making it difficult to identify them. In many cases, they try to blend in with their environment, much like military units that use camouflage or chameleons that use their coloring to hide from predators. These tactics are not random, but rather carefully planned and executed. As a result, new techniques are needed to detect and address patterns that appear to be normal but are actually part of fraudulent activities. Tax authorities are given the task of finding these fraudsters and usually rely on experts' intuition. Random auditing is a way of discouraging tax fraud. Unfortunately, this approach is not cost-effective, and auditing some types of taxes can take up to six months or even a year, which puts a significant burden on the already overloaded tax auditors. Traditional methods, such as manual audits or statistical analysis, are time-consuming, expensive, and often ineffective. Therefore, the use of artificial intelligence or machine learning techniques has gained popularity in recent years, as they can analyze large datasets and detect patterns that humans may miss.

Machine learning (ML) is a branch of artificial intelligence (AI) that utilizes statistical models and algorithms to enable computer systems to learn from data and improve their performance on specific tasks. Essentially, the algorithms used in machine learning enable computers to learn and make decisions based on patterns and trends discovered in large datasets. There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves providing the machine with labeled data, which it can then use to make predictions or classifications based on that data. Unsupervised learning, on the other hand, involves feeding the machine with unlabeled data, allowing it to identify patterns and relationships on its own. Reinforcement learning is a type of machine learning where the machine learns by taking actions in an environment and receiving feedback in the form of rewards or penalties. Machine learning has proven to be a powerful tool for a wide range of applications, including healthcare, finance, marketing, and cybersecurity. Its ability to learn from data and adapt to new circumstances without being explicitly programmed has made it an indispensable tool in modern data analysis.

Fig-1 Classification of machine learning

In this article, we propose a comprehensive method for detecting and preventing income tax fraud by utilizing multiple machine learning algorithms, namely decision tree, random forest, naive Bayes, k-nearest neighbors, feed-forward neural network, and logistic regression. To accurately identify fraudulent tax returns, we have trained and tested these models on the OpenML Income Dataset. Our approach emphasizes the use of behavioral and demographic factors in logistic regression, and our results demonstrate its effectiveness in improving fraud detection and prevention. This article presents a unique solution to address the issue of income tax fraud.

XXXIX. LITERATURE REVIEW

The article uses neural networks for fraud detection, which is a popular and effective technique in machine learning. It provides a case study of income tax fraud detection and claims to achieve a high level of accuracy. However, the problem statement provides limited data, and the proposed method relies on accessing sensitive personal data, raising concerns about data privacy [1]. Neural networks are a powerful machine learning tool that can identify patterns in large and complex data sets, achieving 95%+ accuracy. However, it is difficult to identify tax evasion due to lack of transparency, potential false positives, and lack of information [2]. Data mining techniques can be used to automate the process of analysing large amounts of data to identify high-risk taxpayers. This is a time-saving and efficient approach compared to manual analysis and can lead to improved accuracy and flexibility. Additionally, the techniques can be applied to a variety of data sources, such as financial transactions, tax returns, and other sources of information, and can be scaled to large datasets. But data quality, algorithm selection, model interpretation, and privacy concerns all affect the accuracy of data mining techniques for tax fraud detection [3]. The Improved Particle Swarm Optimization Algorithm has been applied to better detect tax evasion, with an accuracy rate of 95%. It is time- and cost-effective but must be validated on a larger dataset to ensure it is robust and accurate [4].

The proposed method uses a clustering technique to identify groups of taxpayers who have similar income profiles and then identifies those who are reporting significantly lower incomes. It relies heavily on a specific set of features and requires manual investigation to confirm whether tax fraud has occurred [5]. Milos Savić developed a novel method for detecting tax evasion risks called Hybrid Unsupervised Outlier Detection. This approach combines the strengths of both unsupervised and supervised techniques to enhance the accuracy of the detection process. Although the method can identify tax evasion in a particular case involving a grocery shop owner, its applicability is limited, and it fails to address the ethical and legal issues associated with detecting tax evasion. Therefore, it is essential to test its reliability and effectiveness by applying it to different datasets and scenarios [6]. González and Velásquez's article "Characterization and detection of taxpayers with false invoices using data mining techniques" focuses on identifying and characterizing taxpayers who use false invoices to evade taxes in Colombia. The authors used decision trees, neural networks, and logistic regression to identify patterns in tax data that can be used to identify fraudulent behaviour. The results of the study have important implications for tax authorities and policymakers seeking to improve tax compliance and reduce tax evasion [7]. This paper uses a neural network to detect credit card fraud. Neural networks are difficult to understand and require a lot of information to train, making them less effective on smaller datasets. Additionally, they are expensive to train and deploy and do not address issues related to data privacy and security. Appropriate measures need to be taken to ensure data is protected and used ethically [8].

AI and ML algorithms are useful tools for detecting fraudulent tax returns during income tax audits. In Taiwan, examples of successful use of these algorithms for both profit-seeking enterprise income tax and individual income tax have been demonstrated in this study. This research provides valuable insights into the factors contributing to tax fraud, which can aid in the development of effective tax policies and regulations. However, it is important to note that the findings are specific to Taiwan's tax system and may not be applicable to India's tax system. Additionally, further research is necessary to address how the system can handle missing or unreliable data [9]. The book covers various techniques for fraud detection, including descriptive, predictive, and social network analysis. It provides practical examples and case studies of fraud detection in various industries and emphasizes the use of data mining tools for fraud detection. It provides a general approach to fraud detection, with limited focus on tax fraud, lack of emphasis on regulatory compliance, and dependence on data
availability. The effectiveness of the techniques may depend on the availability and quality of data, which may be a limitation in the context of the given problem statement [10]. The research article provides a comparative analysis of supervised and unsupervised neural networks. It uses a large sample size of 1,700 Korean firms over a 10-year period and uses various financial ratios and non-financial factors as input variables. The study found that both supervised and unsupervised neural networks can effectively predict bankruptcy, which highlights the usefulness of machine learning techniques in financial analysis. Legal and ethical considerations should be taken into account when using such a system [11].

The main challenge in detecting financial fraud is the use of repeated and unlawful techniques. To tackle this problem, researchers analysed 32 documents discussing the growth of neural network algorithms for fraud detection from 2015 to 2020. The study focused on deep neural network algorithms (DNN), convolutional neural networks (CNN), and neural networks with SMOTE, as well as other ANN complementing methodologies. The experiments aimed to identify credit card fraud and facilitate online transactions. The comparative analysis revealed that the convolutional ANN based on functional sequencing, the ANN with Gradient Boosting Decision Tree (XGBoost), and the ANN with automatic ontology learning all met the requirements for theoretical background, mathematical development, experimental study, and accuracy of the results. However, future research should consider the time, cost, and data characterization required for neural network training, as these factors significantly impact algorithm effectiveness [12]. Neural networks are an affordable and straightforward way to simplify analysis by avoiding the need to consider many statistical assumptions, such as matrix homogeneity, normality, and data processing. These models can automatically adjust connection weights and are fault-tolerant. They can also include all accessible variables in model estimation and enable quick revisions. A study found that the Multilayer Perceptron is effective for identifying fraudulent taxpayers and determining the likelihood of tax evasion, with an efficacy of 84.3%. The ROC curve-based sensitivity analysis demonstrated the model's excellent ability to distinguish between fraudulent and non-fraudulent taxpayers. The Multilayer Perceptron network appears to be a highly effective way to classify taxpayers, and this study's results offer opportunities for improving tax fraud detection by predicting fraud tendencies through sensitivity analysis. It would be interesting to explore the use of this concept in other taxes in the future [13].

XL. PROPOSED METHOD

A. Logistic Regression (LR)

Logistic regression is a popular approach in machine learning that is used to solve binary classification problems. To determine the likelihood of a specific outcome, the logistic function is utilized to express the relationship between the input features and the output variable.

The logistic function can be expressed as follows:

P(y = 1 | x) = 1 / (1 + e^-(b0 + b1·x1 + b2·x2 + … + bn·xn))

Where:
• x1, x2, …, xn are the input features.
• b0, b1, b2, …, bn are the coefficients of the input features.
• e is the base of the natural logarithm.

An optimization procedure, such as gradient descent or Newton-Raphson, is applied to calculate the coefficients. Once the coefficients are determined, the logistic function can be used to predict the probability of the outcome for new observations.

Fig-2 Logistic curve

The threshold for identifying observations as belonging to one of the binary outcomes can be set to 0.5 (i.e., if P(y = 1 | x) is larger than 0.5, the observation belongs to the positive outcome; otherwise, it belongs to the negative outcome).

B. Decision Tree (DT)

The DT algorithm is a popular technique used in machine learning for classification and regression tasks. It represents each internal node as a feature, each branch as a decision based on that feature, and each leaf node as a class or value. The approach recursively divides the data into subsets based on the most informative features until it meets a stopping requirement, such as maximum depth or minimum number of samples per leaf.
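As a minimal illustration of the logistic model in Section A (not code from the paper; the coefficients below are made up), the probability and the 0.5 decision threshold can be computed as:

```python
import math


def sigmoid(z):
    # Logistic function: maps any real z to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, b0, b):
    # P(y=1|x) = sigmoid(b0 + b1*x1 + ... + bn*xn)
    z = b0 + sum(bi * xi for bi, xi in zip(b, x))
    return sigmoid(z)


def predict(x, b0, b, threshold=0.5):
    # Label the return as fraudulent (1) when the probability exceeds the threshold.
    return 1 if predict_proba(x, b0, b) > threshold else 0
```

In practice the coefficients b0, …, bn would be fitted by gradient descent or Newton-Raphson, as described above, rather than chosen by hand.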
The prediction of a decision tree can be written as:

f(x) = Σ_i y_i · I(x_i ∈ R_i)

Where:
• f(x) denotes the predicted output for a new input x.

C. Random Forest (RF)

d) To make a prediction for a new data point x, the algorithm passes it through all T trees and calculates the average prediction as:

Y(x) = (1/T) · Σ_{t=1}^{T} w_t · h_t(x)
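The averaging rule above can be sketched with the trees represented as plain callables standing in for the trained decision trees h_t (an illustration, not the paper's implementation):

```python
def forest_predict(x, trees, weights=None):
    # Y(x) = (1/T) * sum_t w_t * h_t(x): pass x through every tree and average.
    weights = weights or [1.0] * len(trees)
    total = sum(w * tree(x) for w, tree in zip(weights, trees))
    return total / len(trees)
```

With all weights equal to 1, this reduces to the ordinary unweighted average of the individual tree predictions.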
P(x1, x2, …, xn) = Σ P(y) · P(x1 | y) · P(x2 | y) · … · P(xn | y) for all classes y.

E. k-Nearest Neighbors (k-NN)

The k-NN algorithm is a supervised learning method that can be used for both classification and regression tasks. The main objective of this algorithm is to predict the label or value of a test data point by identifying the k data points in the training set that are closest to it.
To determine the distance between two data points, various metrics like the Manhattan distance, cosine similarity, or Euclidean distance can be employed. Based on the values or labels of the k nearest neighbors, the algorithm can then predict the label or value of the test data point.

The equation for the k-NN algorithm can be expressed as follows:

For classification:
• Let D represent the training dataset.
• Let x be the test data point.
• The number of neighbors to take into consideration is k.
• The distance metric between the test point x and any point y in the dataset D is dist(x, y).
• Let neighbors(x) be the set of k nearest neighbors to x in D.
• Let class(y) be the class label of y.

The following is the predicted class label for x:

predicted_class(x) = argmax(class(y)) for y in neighbors(x)

In this equation, argmax returns the class label that occurs most frequently among the k nearest neighbors.

For regression:
• Let D be the training dataset.
• Let x be the test data point.
• The number of neighbors to take into consideration is k.
• The distance metric between the test point x and any point y in the dataset D is dist(x, y).
• Let neighbors(x) be the set of k nearest neighbors to x in D.
• Let value(y) be the value associated with point y.

F. Feed-Forward Neural Network (FFNN)

A defining characteristic of the FFNN is the absence of feedback loops during the propagation of input across the network until it reaches the output layer. The FFNN is composed of basic units called neurons or perceptrons, which linearly transform input signals, apply an activation function, and transmit the output to the next layer. The layers of neurons in the FFNN are connected to the preceding and succeeding layers.
The first layer of the FFNN is called the input layer, which receives the input data and sends it to the first hidden layer. Each hidden layer receives inputs from the previous layer, performs a linear transformation, applies an activation function, and sends the output to the next layer. Finally, the output layer of the network generates the final output of the FFNN.

The equation for the output of a single neuron in an FFNN is as follows:

y = f(Σ_{i=1}^{n} w_i x_i + b)

Where:
• x_i is the input to the neuron from the previous layer or the input layer.
• w_i is the weight of the connection between the input x_i and the neuron.
• b is the bias term, which is added to shift the output of the neuron.
• Σ_{i=1}^{n} w_i x_i + b represents a weighted sum of the inputs and bias.
• f is the activation function, which introduces non-linearity into the output of the neuron.

In a feed-forward neural network, the activation function f is used to introduce non-linearity. The most commonly used activation functions are the sigmoid, ReLU (Rectified Linear Unit), and softmax functions. The selection of the activation function depends on the problem and the desired output.
The output of the FFNN can be computed by combining the equations for each neuron in the network. During the training phase, the weights and biases of the neurons are learned by applying an optimization algorithm such as backpropagation.
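The single-neuron equation and the layer structure described above can be sketched as follows (a minimal pure-Python illustration using ReLU as the activation f):

```python
def relu(z):
    # ReLU activation: passes positive values through, clamps negatives to zero.
    return max(0.0, z)


def neuron(x, w, b, f=relu):
    # y = f(sum_i w_i * x_i + b): weighted sum, bias shift, then activation.
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)


def layer(x, W, bs, f=relu):
    # A layer is one neuron per output unit, all fed the same input vector x.
    return [neuron(x, w, b, f) for w, b in zip(W, bs)]
```

Stacking several such layers, with each layer's output serving as the next layer's input, yields the full forward pass of the FFNN.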
XLI. METHODOLOGY

6. Evaluation: The trained models were evaluated using metrics like accuracy, precision, recall, and F1 score to measure how well each model was performing in detecting income tax fraud.
7. Visualization: Visualizations were created to compare the performance of the different algorithms. Bar plots were used to compare the accuracy scores of the models, ROC curves to see which model performs better by its area under the curve (AUC), and precision-recall curves to compare precision and recall.
8. Model Comparison: The performance of each algorithm was compared to determine which algorithm performs the best.
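The evaluation metrics listed in the steps above can be computed directly from a confusion matrix; this is a generic sketch, not the paper's code:

```python
def confusion(y_true, y_pred):
    # Counts for the positive class (1 = fraudulent, 0 = legitimate).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Computing this dictionary for each of the six models gives the numbers behind the bar plots and the model-comparison step.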
Fig-6 ROC curve
The performance of the models is consistent across folds, with
the logistic regression model achieving the highest mean
cross-validation score. In conclusion, our project provides a
framework for building and comparing various machine
learning models for detecting income tax fraud. The logistic
regression model with optimized hyperparameters was
found to be the best-performing model.
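The cross-validation referred to above can be sketched as a plain k-fold split; `model_score` here is a placeholder callable that trains on the train indices and scores on the test indices:

```python
def kfold_indices(n, k):
    # Split range(n) into k contiguous folds; each fold serves once as the test set.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return [(sorted(set(range(n)) - set(test)), test) for test in folds]


def cross_val_mean(model_score, n, k=5):
    # Average the per-fold score of a model over all k train/test splits.
    splits = kfold_indices(n, k)
    return sum(model_score(train, test) for train, test in splits) / k
```

In a real pipeline the folds would typically be shuffled (and stratified by the fraud label) before splitting, and `model_score` would fit the classifier on the training fold and return its test-fold accuracy.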
REFERENCES
Online Bus Booking System With User Authentication and Authorization
Abstract—The use of buses for traveling across the country is increasing day by day, so the traditional method of booking tickets leads to long waiting hours. So, an easier method is proposed in this paper by creating an online bus booking system using the MERN stack technology, which includes MongoDB, Express, React, and Node.js, with user authentication and authorization features. The system provides a user-friendly interface for customers to search for available bus routes, view schedules, and book their tickets securely. The user authentication and authorization features ensure that only authorized users can access the system's functionalities, providing an added layer of security. The system's architecture was designed with a focus on scalability, allowing it to handle a high volume of users while maintaining its performance. This system is suitable for implementation by bus companies to improve their customer experience and streamline their operations while ensuring data privacy and security.

Keywords—Booking System, User Authentication, Authorization.

Customers can securely book their tickets using this system's user-friendly interface. To further assure data security and privacy, the system includes user login and authorization mechanisms. The system can support large numbers of users while retaining performance because of the scalability built into its architecture. This study offers a useful technique for bus operators to enhance customer satisfaction and streamline their operations. Online bus reservation systems are becoming more and more widespread, raising serious questions about data security and customer privacy. This research study focuses on the creation of a bus reservation website with a strong emphasis on user identification and authorization to allay these worries. The MERN stack and other cutting-edge web technologies were used in the design of the website to give users a simple and safe booking experience. To ensure that only authorized users can use the system's capabilities and that their data is safeguarded from unauthorized access, the system uses user authentication and authorization mechanisms. The technical components of website creation, including the usage of secure authentication protocols and the application of access control policies, will be covered in detail in this article.
The client would then display the confirmation to the user after receiving a confirmation from the server, which would reserve the chosen seats in the MongoDB database. After that, the user would log off.

• The JWT token is sent in the request header for each future request that the client submits to the server. The JWT token is checked by the server, and if it is found to be valid, the request is approved.
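The system itself is built on the MERN stack; purely as a language-neutral illustration of the token flow described above, the following Python sketch implements HS256 JWT signing and verification with only the standard library (a production Node.js system would use a maintained library such as jsonwebtoken):

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    # JWTs use unpadded base64url encoding for each segment.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_jwt(payload, secret):
    # Token layout: header.payload.signature, signed with HMAC-SHA256.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"


def verify_jwt(token, secret):
    # Recompute the signature over header.payload and compare in constant time.
    try:
        header, body, sig = token.split(".")
    except ValueError:
        return False
    expected = b64url(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

Because only the server knows the signing secret, any tampering with the header or payload invalidates the signature and the request is rejected, which is exactly the check described in the bullet above.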
MONITORING WATER QUALITY SYSTEM USING SMART SENSORS
ABSTRACT: In recent times, Internet of Things (IoT) and Remote Sensing (RS) techniques have been utilized for tracking, gathering, and evaluating records from remote locations. Because of the significant growth in worldwide industrial output, rural-to-urban migration, and the over-utilization of land and sea resources, the quality of water available to human beings has deteriorated substantially. The excessive use of fertilizers on farms and of other chemicals in sectors including mining and construction has contributed immensely to the decline of water quality globally. Water is a vital need for human survival, and consequently there need to be mechanisms put in place to vigorously check the quality of water that is made available for consumption in town and city supplies, as well as in the rivers, creeks, and coastline that surround our towns and cities. The supply of good-quality water is paramount in preventing outbreaks of water-borne diseases as well as in enhancing the quality of life. The development of a surface water tracking network is a crucial element in the assessment and protection of water quality. In this paper, we developed a prototype system for measuring different parameters of water bodies such as rivers, lakes, etc. The proposed smart water quality tracking system consists of smart sensors and a Wi-Fi module. The Wi-Fi module connects the device to the cloud and to an Android application. Reading data from the pH sensor and the TDS sensor helps to analyze the pH value of the water and its dissolved minerals. These values help in determining the quality of the water.

Keywords – pH sensor, TDS sensor, ESP32 module, piezo buzzer, LED display, jumper wires.

INTRODUCTION

There were numerous inventions in the twenty-first century, but at the same time, pollution, global warming, and other issues arose, and as a result, there is no safe drinking water for much of the world's population. Real-time water quality monitoring currently faces challenges due to global warming, limited water resources, a growing population, and so on. As a result, better methodologies for monitoring water quality parameters in real time are much needed [1]. The pH level of hydrogen ions is used as a parameter to measure the quality of water. It indicates whether the water is acidic or alkaline. Pure water has a pH of 7; lower than 7 is acidic, and higher than 7 is alkaline. The pH scale runs from 0 to 14. It should be between 6.5 and 8.5 for drinking purposes. Turbidity is a measurement of the large number of invisible suspended particles in the water. The lower the turbidity, the lower the threat of diarrhea and cholera. When the turbidity is low, the water is clean. The temperature sensor determines whether the water is hot or cold. A flow sensor is a device that measures the flow of water. Traditional water quality monitoring methods involve the manual collection of water samples from local places. Monitoring water quality in an efficient manner plays a vital role in every part of the earth.

LITERATURE SURVEY

Water quality monitoring is a vital aspect of ensuring the safety and health of communities and ecosystems [2]. Many studies have been conducted to evaluate the effectiveness of water quality monitoring programs and identify key challenges and requirements. The literature highlights the importance of consistent and accurate sampling and testing methods to ensure reliable data. Standardization of monitoring programs is also crucial for comparing data across different regions and over time [3]. Funding is a significant challenge, particularly for smaller communities and organizations, which may struggle to secure the resources needed to conduct monitoring programs. Emerging contaminants, such as microplastics, pharmaceuticals, and personal care products, are also a growing concern, and monitoring programs must be adaptable to address these issues. Collaboration between various stakeholders, including government agencies, non-governmental organizations, and community members, is necessary for effective water quality monitoring [4].

SYSTEM MODEL

Fig.1: Block Diagram

The sensors that have been used are pH and TDS sensors. The IoT platform used is ThingSpeak, and we also used the Blynk Android application. The sensors need to be configured and connected to the ESP32 microcontroller. This involves setting up the sensors' input and output pins and programming the ESP32 to read data from the sensors. The sensors need to be calibrated to ensure accurate readings. The calibration process involves comparing the sensor readings with known reference values and adjusting the sensor output to match the reference values. The ESP32 microcontroller continuously reads data from the sensors and stores the readings in its memory. The collected data needs to be processed to extract useful information. This involves converting the raw sensor readings into meaningful values such as the pH level and TDS value. The processed data can be transmitted wirelessly to a central server or a cloud-based platform using Wi-Fi or Bluetooth connectivity. The data can also be shown on a local display for real-time monitoring. The collected data can be analyzed using data analytics tools to identify values and patterns. This can help in identifying water quality issues
and taking corrective actions. This helps us to know the quality of the drinking water.
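The calibration step described in the System Model can be illustrated with a simple two-point linear fit; the raw ADC readings and buffer values below are hypothetical:

```python
def two_point_calibration(raw_low, raw_high, ref_low, ref_high):
    # Fit a line mapping raw sensor readings to reference values, e.g.
    # readings taken in standard pH 4.0 and pH 7.0 buffer solutions.
    slope = (ref_high - ref_low) / (raw_high - raw_low)
    offset = ref_low - slope * raw_low
    return lambda raw: slope * raw + offset
```

The returned function is then applied to every raw ADC sample on the ESP32 to convert it into a meaningful pH (or TDS) value before display and upload.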
HARDWARE
A. ESP32 microcontroller
RESULTS

OUTCOMES

The exact outcomes of a water quality monitoring system using IoT technology will depend on the specific goals and objectives of the system, as well as the methodology and technologies used. However, some potential outcomes could include:

• By monitoring key indicators of water quality in real-time, such as pH, temperature, and TDS, water treatment facilities can identify and address issues that may impact the safety and quality of the water supply. This can lead to improved water quality and reduced risk of waterborne illnesses.
• IoT-based water quality monitoring systems can help water treatment facilities improve their operational efficiency by providing real-time data on key indicators of water quality [9]. This can help facilities optimize their treatment processes and reduce waste, leading to cost savings and improved sustainability.
• The insights gained from real-time monitoring and data analysis can inform better decision-making around water treatment processes and resource allocation. This can help water treatment facilities prioritize their efforts and resources, leading to more effective and efficient water management.
• By monitoring water quality in real-time, IoT-based systems can provide early warning of potential contamination events, enabling water treatment facilities to respond quickly and effectively to mitigate the impact of the event [10].

Fig.5: TDS and pH values in Blynk

The TDS (Total Dissolved Solids) value of 240 indicates the amount of dissolved substances in the water, while the pH value of 7 signifies a neutral pH level. This information can help monitor water quality and ensure it is within acceptable parameters.

Fig.6: pH values in ThingSpeak

Fig.7: TDS values in ThingSpeak
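Assuming the standard ThingSpeak HTTP update API is used to push the readings shown in Figs. 6 and 7 (the write API key below is a placeholder, and mapping pH to field1 and TDS to field2 is our own choice), the update request URL can be built as:

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE = "https://api.thingspeak.com/update"


def update_url(api_key, ph, tds):
    # ThingSpeak's update endpoint takes the channel's write API key plus
    # numbered field parameters; here field1 carries pH and field2 carries TDS.
    query = urlencode({"api_key": api_key, "field1": ph, "field2": tds})
    return f"{THINGSPEAK_UPDATE}?{query}"
```

The gateway or device code can then fetch this URL (e.g. with `urllib.request.urlopen`) each time a new calibrated reading is available.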
CONCLUSION

Implementing water quality monitoring using IoT can be a very useful and efficient way to continuously monitor and improve the quality of water in various settings.

With the use of IoT sensors and devices, water quality parameters such as pH, temperature, dissolved oxygen, turbidity, and conductivity can be easily measured and transmitted to a central system for analysis and interpretation. This can help detect any abnormalities or changes in water quality, which can then be addressed promptly to ensure the safety and health of both humans and aquatic life.

Furthermore, the use of IoT in water quality monitoring can also lead to cost savings and increased efficiency, as it eliminates the need for manual measurements and reduces the risk of errors or discrepancies. Real-time data can be accessed and analyzed remotely, allowing for faster decision-making and timely interventions.

Water quality monitoring using IoT can play a significant role in ensuring the sustainability of our water resources and protecting the environment. It is a promising technology that can benefit various sectors such as agriculture, industry, and public health.

REFERENCES

[1] R. S. Pandey and N. N. Pandey, Water Quality Monitoring and Management: Basis, Technology and Case Studies, Springer Nature, Feb. 2020.
[2] A. Prasad and P. Singh, Monitoring and Modeling of Global Environmental Change, Springer Nature, Nov. 2019.
[3] A. M. Abdel-Shafy and S. M. Mansour, Water Quality – Monitoring and Assessment, Elsevier, Oct. 2019.
[4] J. Zhu, Y. Zhang, and H. Liu, Environmental Monitoring and Modeling with Remote Sensing and GIS, Elsevier, Jan. 2019.
[5] R. S. S. Dubey, S. Gupta, A. Tripathi, and P. Pandey, "Wireless sensor network-based smart water quality monitoring system," Wireless Personal Communications, vol. 111, no. 1, pp. 225-242, Jun. 2020.
[6] R. Prasad and N. K. Agrawal, "Smart water quality monitoring system using internet of things and cloud computing," in 2019 International Conference on Electrical, Electronics and Computer Engineering (UPCON), Gorakhpur, Mar. 2019, pp. 1-6.
[7] M. U. Rehman, M. U. Farooq, S. U. Haq, and S. Q. Hasan, "Smart Water Quality Monitoring System using IoT-based Sensors," IEEE Access, vol. 8, pp. 199067-199082, Jan. 2020.
[8] S. K. Kar, S. K. Sahu, and S. S. Padhi, "Smart water quality monitoring system using wireless sensor network and IoT," in 2019 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), Kannur, Feb. 2019, pp. 680-685.
[9] A. R. Abdullah, S. A. Mahdi, and S. A. Aziz, "Smart water quality monitoring system based on internet of things," in 2018 IEEE 14th International Colloquium on Signal Processing & Its Applications (CSPA), Penang, Apr. 2018, pp. 167-171.
[10] N. Zidan, M. Mohammed, and S. Subhi, "An IoT based monitoring and controlling system for water chlorination treatment," Proc. Int. Conf. Future Networks and Distributed Systems, p. 31, Jun. 2018.
[11] R. Ramakala, S. Thayammal, A. Ramprakash, and V. Muneeswaran, "Impact of ICT and IOT Strategies for Water Sustainability: A Case study in Rajapalayam-India," Int. Conf. Computational Intelligence and Computing Research (ICCIC 18), pp. 1-4, Dec. 2017.
Hastenure : Recruitment Management System
The process is initiated when candidates submit applications for open positions on the e-recruitment portal. E-recruitment makes it possible for job seekers to connect with more opportunities and gather more information. During e-recruitment, candidates upload their resumes to the system, which are then reviewed by an organization's HR specialist. The candidate can also access information from the employer regarding the application process via a web-based recruiting portal or the company website.

Fig.1 Block Diagram For Automated Hiring
The cost of hiring employees varies by region and industry, but the actual cost can be high. Salary is not the only cost of hiring: it also includes recruiter salaries, time and effort, and training and onboarding costs, all of which increase the overall cost of new hires. According to Deloitte, the average cost per hire is $4,000, and finding a direct replacement requires 50-60% of an employee's salary.
V. Recruitment Process
IV. Conclusion
BRAIN TUMOR DETECTION USING DEEP LEARNING
ABSTRACT: A brain tumor is an abnormal growth of cells inside the brain or skull. Some brain tumors are cancerous (malignant), while others are not (non-malignant). A primary tumor grows from the brain tissue itself; a secondary tumor arises when cancer cells from other parts of the body spread to the brain. Studies in developed countries show that many people with brain tumors have died because of inaccurate detection. Generally, a CT scan or MRI directed into the intracranial cavity produces a complete image of the brain, which is visually examined by the physician for the detection and diagnosis of a brain tumor. However, this method of detection resists accurate determination of the type and size of the tumor. In recent times, the introduction of information technology and e-health care systems in the medical field has helped clinical experts provide better health care to patients. This study addresses the segmentation of abnormal brain tissues and of normal tissues such as gray matter, white matter, and cerebrospinal fluid from Magnetic Resonance Imaging (MRI) images. Previously proposed models have high computational time and segment with a low-complexity network. In this paper, a Convolutional Neural Network (CNN) has been used to detect tumors in MRI images. The proposed system detects brain tumors with good computational time.

Keywords – Brain Tumor, Magnetic Resonance Imaging, Convolutional Neural Network, Deep Learning
I. INTRODUCTION

A brain tumor is a type of abnormal growth or mass that occurs in the brain tissue, and it can be benign (non-cancerous) or malignant (cancerous). The early and accurate detection of brain tumors is crucial for timely medical intervention and improved patient outcomes. Conventional methods for brain tumor detection, such as magnetic resonance imaging (MRI) and computed tomography (CT) scans, are widely used, but they often require skilled radiologists for interpretation and may have limitations in terms of accuracy and efficiency.

In recent years, deep learning, a subfield of machine learning, has shown promising results in various medical imaging tasks, including brain tumor detection. Deep learning models, such as convolutional neural networks (CNNs), have demonstrated the ability to automatically learn complex patterns and features from medical images, leading to improved accuracy in detecting brain tumors. These models can analyze large amounts of data and extract relevant features, enabling them to detect brain tumors with high precision and recall.

In this paper, we propose a deep learning-based approach for brain tumor detection using CNNs. We aim to leverage the capabilities of CNNs to automatically detect brain tumors from medical images with high accuracy. We present our methodology, including the architecture of the CNN model, the dataset used for training and evaluation, and the evaluation metrics used for performance assessment. We also discuss the experimental results and analyze the performance of our proposed approach in terms of precision, recall, and other relevant metrics. Finally, we conclude with the potential applications and future directions of deep learning-based brain tumor detection.

II. Literature Survey

• Gumaei et al. introduced an automated approach to assist radiologists and physicians in identifying different types of brain tumors. The study was conducted in three steps: brain image preprocessing, brain feature extraction, and brain tumor classification. In the preprocessing step, brain images were converted into intensity images in the range [0, 1] using a min-max normalization rule. In the next step, the PCA-NGIST method (a combination of the normalized GIST descriptor with PCA) was adopted to extract features from the MRI images.

• Y. Bhanothu, A. Kamalakannan, and G. Rajamanickam: This paper discusses automatic brain tumor detection and classification of MR images using a deep learning algorithm. The Faster R-CNN algorithm was chosen for detecting the tumor regions and classifying them into three categories, namely glioma, meningioma, and pituitary tumor. For the Faster R-CNN implementation, a deep convolutional network architecture called VGG-16 was used as the base network. The proposed algorithm efficiently identifies the brain tumor regions by choosing the optimal bounding box generated by the RPN.

• P. K. Ramtekkar, A. Pandey, and M. K. Pawar: This paper studies various image classification methods, compares them, and concludes that each method has its own advantages and disadvantages. DBN is more accurate than the other methods but time-consuming, whereas SVM, DT, and K-NN are simple to implement but not always accurate. Since the accuracy of the result is what matters most, a deep neural network is preferable over other methods for brain tumor detection and classification.

• Z. Jia and D. Chen: This paper presents a Fully Automatic Heterogeneous Segmentation using Support Vector Machine (FAHS-SVM) for brain tumor identification and segmentation. The accuracy of the automated approach is similar to the inter-observer variability of manual segmentation. Tumor regions are identified by combining intrinsic image structure hierarchy with statistical classification information. The tumor areas obtained are spatially small and consistent with the image content, providing an appropriate and robust guide for the subsequent segmentation.

• K. Venu, P. Natesan, N. Sasipriyaa, and S. Poorani: Segmenting brain tumors automatically for cancer diagnosis is a challenging task. This paper provides a review of state-of-the-art methods based on deep learning. A Convolutional Neural Network can automatically extract complex features from the images; further improvements and modifications in CNN architectures, and the addition of complementary information, can improve the efficiency of segmentation.

• N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, and M. Shoaib: This paper discusses the application of deep learning models to the identification of brain tumors. Two different scenarios were assessed. First, a pre-trained DenseNet201 deep learning model was used, and features were extracted from various DenseNet blocks; these features were then concatenated and passed to a softmax classifier to classify the brain tumor. Second, features from different Inception modules were extracted from the pre-trained InceptionV3 model, concatenated, and then passed to the softmax for the classification of brain tumors.

In order to evaluate the performance of the CNN, other classifiers, such as the RBF classifier and the decision tree classifier, have also been used in the CNN architecture.

III. Proposed Work
Max Pooling Layer (MaxPooling2D): This layer is used
to reduce the spatial dimensions (width and height) of the
feature map, while retaining important features. Max
pooling takes the maximum value from a group of values in
a local region of the feature map, which helps in reducing
the computational complexity and the risk of overfitting.
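As an illustration of this operation, here is a minimal NumPy sketch of 2×2 max pooling. It is a simplified stand-in for Keras's MaxPooling2D, handling only a single-channel feature map with stride equal to the pool size:

```python
import numpy as np

def max_pool_2d(feature_map, pool_size=2):
    """Non-overlapping max pooling on a 2D single-channel feature map."""
    h, w = feature_map.shape
    ph, pw = h // pool_size, w // pool_size
    # Reshape each pooling window into its own axes, then take the max.
    trimmed = feature_map[:ph * pool_size, :pw * pool_size]
    return trimmed.reshape(ph, pool_size, pw, pool_size).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [7, 8, 0, 1],
               [2, 6, 3, 4]])
print(max_pool_2d(fm))  # → [[4 5] [8 4]]
```

Each 2×2 window of the 4×4 input is collapsed to its maximum, halving both spatial dimensions while keeping the strongest activations.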
Dropout Layer (Dropout): This is a regularization technique used in CNNs to prevent overfitting. During training, dropout randomly sets a fraction of input units to 0 at each update, which helps prevent the model from relying too heavily on any particular feature or neuron.

Flatten Layer (Flatten): This layer is used to convert the 3D feature map into a 1D vector, which can be fed into a fully connected (dense) layer for further processing.

Fully Connected Layer (Dense): This is a traditional neural network layer that connects each neuron to every neuron in the previous and subsequent layers. It performs the final classification or regression task based on the extracted features from the convolutional layers.

These are some of the common CNN layers used in deep learning models for image processing tasks.

Fig.2: CNN architecture

The accuracy graphs and loss graphs are shown in Figure 3 and Figure 4.
IV. Implementation Details

Model Definition: A sequential CNN model is defined using Keras as a linear stack of layers. The model consists of multiple convolutional layers (Conv2D) with different filter sizes, ReLU activation functions, and dropout regularization (Dropout) to prevent overfitting. The model also includes max pooling layers (MaxPooling2D) to downsample the feature maps and a flatten layer (Flatten) to convert the 2D feature maps into 1D feature vectors. Finally, fully connected layers (Dense) with ReLU activation are added, followed by an output layer with softmax activation for multi-class classification.

Model Compilation: The model is compiled with a categorical cross-entropy loss function (categorical_crossentropy), the Adam optimizer, and accuracy as the evaluation metric.

Model Training: The model is trained on the training data (X_train and y_train) using the fit function with a specified number of epochs (20) and a validation split of 0.1 (10% of the training data used for validation). The training progress is stored in history for later analysis.

Fig.3: Accuracy graph

Fig.4: Loss graph

Evaluation Parameters

Accuracy: Accuracy is the ratio of correctly predicted samples to the total number of samples. It provides an overall measure of how well the model is performing; higher accuracy indicates better performance.

accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision: Precision is the ratio of correctly predicted positive samples to the total number of predicted positive samples. It measures the accuracy of positive predictions; higher precision indicates fewer false positive predictions.

precision = TP / (TP + FP)
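The accuracy and precision formulas above, together with recall and F1 (defined in the next section), can be collected into one small helper. The counts used below are made-up illustrative numbers, not results from the paper:

```python
def metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for a binary tumor / no-tumor split.
acc, prec, rec, f1 = metrics(tp=90, tn=85, fp=10, fn=15)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

For a multi-class problem such as the four tumor classes, these quantities would be computed per class from the full confusion matrix and then averaged.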
Recall (also known as Sensitivity or True Positive Rate): Recall is the ratio of correctly predicted positive samples to the total number of actual positive samples. It measures the ability of the model to identify positive samples. A higher recall indicates fewer false negative predictions.

recall = TP / (TP + FN)

F1 score: The F1 score is the harmonic mean of precision and recall and provides a balanced measure of model performance. It takes both precision and recall into account and provides a single value that balances the two. A higher F1 score indicates a better balance between precision and recall.

F1 score = 2 * (precision * recall) / (precision + recall)

Confusion Matrix: A confusion matrix is a table that displays the true positive, true negative, false positive, and false negative predictions of a model. It provides a detailed breakdown of model performance, allowing for a deeper analysis of performance in different categories.

V. CONCLUSIONS

The proposed algorithm is performed on the collected dataset, which has four classes: glioma, meningioma, pituitary, and no tumor. We divided the dataset into training and testing sets, and real patient data is used for evaluating the model. A CNN (Convolutional Neural Network) can automatically extract complex features from the images, which makes it very useful for automatic feature selection in medical images. Images collected at the centers were labeled by clinicians, and tumor screenings were then categorized into three classes. A total of 1000 images were selected as training data and 1% of the images were taken as test data. The model gives an accuracy of 95.04%.

REFERENCES

Science, Engineering and Applications (ICCSEA), Gunupur, India, 2020, pp. 1-4, doi: 10.1109/ICCSEA49143.2020.9132874.

[4] Madhupriya, N. M. Guru, S. Praveen, and B. Nivetha, "Brain Tumor Segmentation with Deep Learning Technique," 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2019, pp. 758-763, doi: 10.1109/ICOEI.2019.8862575.

[5] G. Hemanth, M. Janardhan and L. Sujihelen, "Design and Implementing Brain Tumor Detection Using Machine Learning Approach," 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2019, pp. 1289-1294, doi: 10.1109/ICOEI.2019.8862553.

[6] M. Siar and M. Teshnehlab, "Brain Tumor Detection Using Deep Neural Network and Machine Learning Algorithm," 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 2019, pp. 363-368, doi: 10.1109/ICCKE48569.2019.8964846.

[7] Y. Bhanothu, A. Kamalakannan, and G. Rajamanickam, "Detection and Classification of Brain Tumor in MRI Images using Deep Convolutional Network," 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2020, pp. 248-252, doi: 10.1109/ICACCS48705.2020.9074375.

[8] N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, and M. Shoaib, "A Deep Learning Model Based on Concatenation Approach for the Diagnosis of Brain Tumor," IEEE Access, vol. 8, pp. 55135-55144, 2020, doi: 10.1109/ACCESS.2020.2978629.
2020, pp. 1-4, doi:
10.1109/INCET49848.2020.9154030.
[13] Zhe Xiao et al., "A deep learning-based
segmentation method for brain tumor in MR images,"
2016 IEEE 6th International Conference on
Computational Advances in Bio and Medical Sciences
(ICCABS), Atlanta, GA, 2016, pp. 1-6, doi:
10.1109/ICCABS.2016.7802771
AI CHATBOT FOR DIAGNOSING ACUTE DISEASES
METHODS AND TECHNOLOGIES

Decision Tree classifier

A decision tree is a supervised learning technique that can be used for both classification and regression problems, though it is more useful for classification problems. It has a tree structure: the internal nodes represent features of the dataset, and the branches represent the decision rules. In a decision tree there are two types of nodes, decision nodes and leaf nodes.

Scikit-learn

• Preprocessing the data using LabelEncoder and train_test_split.
• Creating a Decision Tree Classifier for the prediction of the disease based on the symptoms.
• Creating a Support Vector Machine Classifier (SVC), which is not used in the code and can be removed.
• Calculating cross-validation scores for the created Decision Tree Classifier.
• Implementing the predict function to predict the disease based on the input symptoms.

Flask Framework

Proposed Method

The AI chatbot is purpose-built for diagnosing acute diseases. It has been trained using standard datasets and a decision tree algorithm. The chatbot is hosted using Flask and HTML, making it accessible as a web chat. Upon initiating a conversation with the user, the chatbot first asks for the user's name and stores it for reference. It then proceeds to inquire about the primary symptom of the patient. Following this, the chatbot asks about related symptoms, to which the user can respond with "Yes/No" answers. Based on the collected symptoms, the chatbot predicts the disease and provides suitable precautions and measures to be followed by the patient. The chatbot's user-friendly interface and interactive conversation flow make it a valuable tool for diagnosing acute diseases and providing relevant guidance to users.

System Architecture
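The scikit-learn steps listed above can be sketched as follows. The symptom vectors and disease labels here are a made-up toy dataset, not the paper's training data, and the unused SVC mentioned in the bullets is intentionally omitted:

```python
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy data: each row marks which of four symptoms are present
# (fever, cough, headache, nausea); labels are hypothetical diseases.
X = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 1, 0],
     [1, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]] * 5
y = ["flu", "flu", "cold", "cold", "migraine", "migraine"] * 5

# Preprocessing: encode the string labels and hold out a test split.
le = LabelEncoder()
y_enc = le.fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(
    X, y_enc, test_size=0.3, random_state=42)

# Train the decision tree and check it with cross-validation.
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
scores = cross_val_score(clf, X_train, y_train, cv=3)

def predict(symptoms):
    """Map a symptom vector back to a disease name."""
    return le.inverse_transform(clf.predict([symptoms]))[0]

print(predict([1, 1, 0, 0]), round(scores.mean(), 2))
```

In the actual chatbot, the symptom vector would be built from the user's "Yes/No" answers before being passed to predict.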
Work Flow Of Chatbot
Result/Outputs:
CONCLUSION

This paper presents a medical chatbot that can diagnose and provide information about a disease before consulting a doctor. The chatbot uses natural language processing techniques and a third-party expert program to handle questions it doesn't understand. The system aims to reduce healthcare costs and improve access to medical knowledge. The authors conducted experiments and obtained promising results, demonstrating the potential of the chatbot in assisting patients with self-diagnosis and symptom checking.

References

on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 2020, pp. 619-622, doi: 10.1109/ICRITO48877.2020.9197833.

A. S, N. R. Rajalakshmi, V. P. P and J. L, "Dynamic NLP Enabled Chatbot for Rural Health Care in India," 2022 Second International Conference on Computer Science, Engineering and Applications (ICCSEA), Gunupur, India, 2022, pp. 1-6, doi: 10.1109/ICCSEA54677.2022.9936389.

E. Amer, A. Hazem, O. Farouk, A. Louca, Y. Mohamed and M. Ashraf, "A Proposed Chatbot Framework for COVID-19," 2021 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 2021, pp. 263-268, doi: 10.1109/MIUCC52538.2021.9447652.
U. Bharti, D. Bajaj, H. Batra, S. Lalit, S. Lalit and
A. Gangwani, "Medbot: Conversational Artificial
Intelligence Powered Chatbot for Delivering Tele-
Health after COVID-19," 2020 5th International
Conference on Communication and Electronics
Systems (ICCES), Coimbatore, India, 2020, pp.
870-875, doi:
10.1109/ICCES48766.2020.9137944.
M. M. Rahman, R. Amin, M. N. Khan Liton and N. Hossain, "Disha: An Implementation of Machine Learning Based Bangla Healthcare Chatbot," 2019 22nd International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, 2019, pp. 1-6, doi: 10.1109/ICCIT48885.2019.9038579.
HOME AUTOMATION USING ALEXA & GOOGLE HOME
3. LITERATURE REVIEW

Review Of Related Literature:

When people think about home automation, most of them may imagine living in a smart home: one remote controller for every household appliance, cooking rice automatically, starting the air conditioner automatically, heating water for the bath automatically, and shading the windows automatically when night comes. To some extent, home automation equals a smart home. Both bring about smart living conditions and make our lives more convenient and fast.

4. PROPOSED METHOD

The goal of this project is to develop a home automation system that gives the user complete control over all remotely controllable aspects of his or her home. Making houses easier, better, or more accessible is the goal of home automation. If you can think of it, it may be possible to automate just about every part of the home. Home automation is the combination of several technologies into a single system.

5. METHODOLOGY

A) To operate the appliances, you request Google Assistant's help.
B) Google Assistant sends the signal to the Sinric server.
C) The ESP-01 receives the signal from the Sinric server.
D) Through serial communication, the ESP-01 sends the same signal to the Arduino. The Arduino UNO then processes that signal and operates the relays accordingly.
E) The Arduino then sends the feedback back to the ESP-01, again through serial communication.
F) The ESP-01 then sends the feedback to the Sinric server once again, so that we can track the real-time feedback in the Google Home and Amazon Alexa apps.

6. ADVANTAGE OF THE PROPOSED SYSTEM

I. Safety
II. Convenience
III. Energy-saving potential
IV. Remote Access
V. Customization

7. FUTURE ENHANCEMENT

The work may be expanded to new heights because IoT has already taken the market and adoption rates are rising quickly. The creation and use of a far more sophisticated system was made possible by the confluence of these two factors.

These technologies limit human involvement, since the majority of activities can be carried out successfully and efficiently only by these sophisticated machines, and real-time statistics save considerable human time.

8. RESULTS ACHIEVED

Home automation makes life more convenient and can even save you money on heating, cooling and electricity bills. Home automation can also lead to greater safety with Internet of Things devices like security cameras and systems.

9. CONCLUSION

As a result, an IoT-based system was developed that makes use of the author's IoT platform to control hardware appliances through Alexa for home automation purposes. The voice-enabled smart board system is extremely responsive in accepting commands and taking the appropriate actions.

Future versions of the system could incorporate other AI principles to improve usability and boost automation. Support for languages other than English may be introduced as a further feature.

In conclusion, our solution offers a method for IoT home security that takes future changes of
the Bluetooth protocol into careful consideration. The Alexa application was integrated into a larger home automation system.

10. REFERENCES

[1] Mr. Sunil S. Khatal, Mr. B. S. Chundhire, Mr. K. S. Kahate, "Survey on Key Aggregation System for Secure Sharing of Cloud Data."

[2] A. Karmen, Crime Victims: An Introduction to Victimology, Cengage Learning, 2012.

[3] https://www.safewise.com/faq/home-automation/home-automation-benefits/

[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8198920/

[5] https://www.security.org/home-automation/

[6] https://www.cornerstoneprotection.com/blog/home-automation-benefits/

[7] https://rcciit.org/students_projects/projects/ece/2018/GR30.pdf
House Price Prediction Using Machine learning
C. Lasso Regression

Lasso regression is a regularization technique. It is used over plain regression methods for more accurate prediction. This model uses shrinkage: shrinkage is where data values are shrunk towards a central point, such as the mean. The lasso procedure encourages simple, sparse models.

IMPLEMENTATION

Here we use a Python Jupyter notebook for the implementation of house price prediction. First we import the libraries. Then we load the Bangalore home prices into a data frame. We do the data cleaning and remove any null values from the data. Then we apply feature engineering: we add a new feature for BHK and a new feature for price per square foot. After applying the algorithms on the data, we found that linear regression is the best algorithm for the data.

D. Figures and Tables

TABLE-I

MODEL | BEST_SCORE | BEST_PARAMS
LINEAR_REGRESSION | -4.478168E+15 | {'normalize': False}
LASSO | 7.508086E-01 | {'alpha': …, 'selection': 'cyclic'}

Here we first import the data, then perform data analysis and feature engineering, and then apply the ML algorithms.

IV. Advantages of the proposed system

Here we intend to base our evaluation on every criterion that is taken into account when establishing the pricing. We choose our best model only after applying multiple machine learning algorithms and keeping the one that gives better accuracy. Data is the heart of machine learning: without data we cannot train our models. Here the data is thoroughly examined, cleaned and preprocessed, since any missing values in the data produce erroneous results.

VI. CONCLUSION

From the historical development of machine learning and its applications in the real estate sector, it can be seen that systems and methodologies have emerged that enable sophisticated data analysis through simple and straightforward use of machine learning algorithms. The suggested approach forecasts the price of real estate in Bangalore based on a number of characteristics. To find the best model, we propose to test a variety of machine learning algorithms.

Flask Integration: We will deploy our machine learning model into a Flask web app. Flask provides tools, libraries, and technologies that allow you to build a web application; this framework is used for integrating the Python models.
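The model-selection step summarized in Table I can be sketched as follows. The data here is synthetic (a single hypothetical square-footage feature), not the Bangalore dataset, and the parameter grids are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in: price grows roughly linearly with area.
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(200, 1))           # sqft (hypothetical feature)
y = 50 * X[:, 0] + rng.normal(0, 1000, size=200)    # price with noise

candidates = {
    "linear_regression": (LinearRegression(), {"fit_intercept": [True, False]}),
    "lasso": (Lasso(max_iter=10000), {"alpha": [0.1, 1.0],
                                      "selection": ["cyclic", "random"]}),
}

# Grid-search each candidate and record its best cross-validated R^2 score.
best = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5).fit(X, y)
    best[name] = (search.best_score_, search.best_params_)

winner = max(best, key=lambda name: best[name][0])
print(winner, best[winner])
```

The resulting best score and best parameters per model correspond to the BEST_SCORE and BEST_PARAMS columns of Table I.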
VII. REFERENCES

[1] A. Varma et al., "House Price Prediction Using Machine Learning and Neural Networks," 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), 2018, pp. 1936-1939.

[2] T. D. Phan, "Housing Price Prediction Using Machine Learning Algorithms: The Case of Melbourne City, Australia," 2018 International Conference on Machine Learning and Data Engineering (iCMLDE), Sydney, NSW, Australia, 2018, pp. 35-42, doi: 10.1109/iCMLDE.2018.00017.

[3] P. Wang, C. Chen, J. Su, T. Wang, and S. Huang, "Deep Learning Model for House Price Prediction Using Heterogeneous Data Analysis Along With Joint Self-Attention Mechanism," IEEE Access, vol. 9, pp. 55244-55259, 2021.
Building a Secure JSON Web Token (JWT) Library
LII. RELATED WORK

A. Formal Definitions
JSON Web Token (JWT) is an openly available standard that describes a method for securely exchanging information between two parties as a JSON object. This information is trustworthy because it is digitally signed[1].
JWTs are compact and can be easily shared through various channels such as URLs, POST parameters, or HTTP headers. Their compactness makes transmission quick, and their self-contained nature means that all the necessary information about the user is carried in the payload, avoiding multiple database queries.
JWTs are well suited to authentication[3]: a user logs in once and presents the JWT with each subsequent request, gaining access to the routes, services, and resources permitted for that token. JWTs are widely used in Single Sign-On (SSO) solutions because of their minimal overhead and ease of use across various domains.
The second part of the token, the payload, contains the claims, which include reserved, public, and private claims. The payload is also Base64Url encoded to form the second part of the JWT.

Fig 2. JWT payload

The final part is the signature, which is created by signing the encoded header and payload with the secret, using the algorithm specified in the header. The signature is used to verify the sender of the JWT and to ensure that the message has not been altered.
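The three-part structure described above can be made concrete with a short sketch. This is an illustrative standard-library example of HS256 token construction, not the paper's TypeScript implementation; the secret and claim values are invented for the example.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # Base64Url: URL-safe alphabet with the '=' padding stripped
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict, secret: str) -> str:
    # First part: the Base64Url-encoded header naming the algorithm
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    # Second part: the Base64Url-encoded payload carrying the claims
    body = b64url(json.dumps(payload).encode())
    # Final part: HMAC-SHA256 over "header.payload" keyed with the secret
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(),
                   hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

token = make_jwt({"sub": "user-42", "admin": False}, "example-secret")
print(token)  # three dot-separated Base64Url segments
```

Because the payload is only encoded, anyone holding the token can decode and read the claims; only the third segment ties them to the secret.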
Fig 3. JWT Signature

The result is a compact and easily transferable token that can be used in HTML and HTTP environments.

Fig 4. Resulting JWT

C. Security challenges associated with JWTs
While JSON Web Tokens (JWTs) offer many benefits, their use also carries security risks. Some of the potential risks include:
1) Replay attacks: Attackers can intercept and replay JWTs, causing the server to believe that the request is coming from a legitimate source.
2) Tampering: If an attacker can gain access to the JWT, they could modify the claims within it, potentially gaining access to resources they should not have.
3) No built-in revocation: Once a JWT has been issued, there is no way to revoke it. If an attacker gains access to a JWT, they could use it indefinitely.
4) Disclosure of sensitive information: Because the claims within a JWT are encoded, not encrypted, an attacker who intercepts a JWT can decode and view the information within it.
5) Weak signing algorithms: If the algorithm used to sign the JWT is weak, an attacker could potentially recover the signing key and modify the claims within the JWT.
6) Weak signatures and insufficient signature validation: Several attacks are possible due to design flaws in some libraries and applications, including changing the algorithm to "none," modifying the RSA parameter value, and using weak symmetric keys.
7) Plaintext leakage through analysis of ciphertext length: Some encryption algorithms leak information about the length of the plaintext, and compression attacks are powerful when attacker-controlled data shares a compression space with secret data.
To mitigate these risks, it is crucial to design and implement the JWT-based authentication and authorization system carefully, to validate and verify JWTs properly, to use strong signing algorithms, and to limit the sensitive information included in the JWTs. Additional security measures, such as using refresh tokens, limiting the JWT's lifespan, and implementing rate limiting to counter replay attacks, should also be considered.

D. Analysis of existing literature
Although research papers on JSON Web Tokens (JWTs) offer valuable insights into different aspects of JWTs, these papers have certain limitations that should be acknowledged.
One limitation is that many of these papers concentrate on specific programming languages or libraries for JWTs. While this could be beneficial
for developers working with those languages, it may not be applicable to those working with other languages or libraries; the research findings and recommendations may therefore not be broadly applicable.
Another limitation is that some papers only address particular use cases for JWTs, such as web or mobile applications. Although these use cases are significant, JWTs are also employed in other contexts, such as IoT devices, and research on these applications is scarce. This restricts the scope of the research and may not provide a comprehensive understanding of the advantages and limitations of JWTs across different contexts.
Moreover, many papers concentrate on the technical implementation of JWTs and overlook broader issues such as security considerations or best practices for handling token expiration and revocation. Although technical implementation is important, these broader issues are equally critical for ensuring the security and reliability of JWTs in practice[2].
In conclusion, while research papers on JWTs provide valuable insights into various aspects of JWTs, readers should be mindful of their limitations and carefully evaluate the applicability of the findings and recommendations to their specific use case.

LIII. DESIGN AND IMPLEMENTATION

Determining the architecture and modules necessary to construct a secure library is the first step in building the JWT library. Building a JWT library requires the following modules:
JWT Generation module: creates a JWT from a payload and a secret key.
JWT Validation module: verifies a JWT by examining its contents and signature.
SHA256 module: provides SHA256-based cryptographic hashing.
Base64URL module: encodes and decodes data in Base64URL format.

Implementation:
First, we used npm to install the TSdx and Jest packages, enabling us to build and test the library. The "tsdx create" command was then used to start a fresh TSdx project.
To develop the JWT Generation and Validation modules, we then installed the relevant packages, including "jsonwebtoken" and "crypto-js."
The JWT Generation module produces a JWT from a payload and a secret key. This module used the Base64URL module to encode the JWT and the SHA256 module to create the signature.
The JWT Validation module validates a JWT by checking the signature and payload. The Base64URL module was used to decode the JWT, and the SHA256 module was used to verify the signature.
To enable cryptographic hashing with SHA256, we developed a SHA256 module. Both the JWT Generation and Validation modules used it to create and verify the signature.
Finally, to support encoding and decoding of data in Base64URL format, we constructed a Base64URL module. The JWT Generation and Validation modules used this module to encode and decode the JWT.
We used Jest to write tests for the JWT Generation and Validation modules in order to guarantee the dependability and security of the library. These tests verified the validity of the produced JWT and the accuracy with which the validation function verified a valid JWT.
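The division of labour among the four modules can be sketched as follows. This is an illustrative Python sketch rather than the paper's TypeScript code; the function names are invented, and the validator also shows rejection of the "none" algorithm listed among the security risks.

```python
import base64
import hashlib
import hmac
import json

# Base64URL module: URL-safe encoding without padding
def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

# SHA256 module: HMAC-SHA256 over the signing input
def sign_sha256(signing_input: bytes, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()

# JWT Generation module: payload + secret -> token
def generate(payload: dict, secret: str) -> str:
    head = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url_encode(json.dumps(payload).encode())
    sig = sign_sha256(f"{head}.{body}".encode(), secret)
    return f"{head}.{body}.{b64url_encode(sig)}"

# JWT Validation module: check algorithm and signature before trusting claims
def validate(token: str, secret: str):
    head, body, sig = token.split(".")
    header = json.loads(b64url_decode(head))
    if header.get("alg") != "HS256":   # reject "none" and unexpected algorithms
        return None
    expected = sign_sha256(f"{head}.{body}".encode(), secret)
    if not hmac.compare_digest(expected, b64url_decode(sig)):
        return None
    return json.loads(b64url_decode(body))

token = generate({"sub": "alice"}, "demo-secret")
print(validate(token, "demo-secret"))   # round-trip succeeds
print(validate(token, "wrong-secret"))  # signature mismatch -> None
```

A constant-time comparison (`hmac.compare_digest`) is used for the signature check, which is the usual guard against timing side channels.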
Fig 5. JWT Process

Tools and Technology used:
TSdx: TSdx is a development environment for creating and testing TypeScript libraries. The JWT library's initial project structure was built using it.
Jest: Jest is a popular testing framework for JavaScript applications. Unit tests for the JWT Generation and Validation modules were written with it.
SHA256: The SHA256 algorithm is a well-known cryptographic hashing method used when creating digital signatures for secure communication. It was used to produce and validate the JWT's digital signature in the JWT Generation and Validation modules[4].
Base64URL: Base64URL encoding, a variant of Base64 encoding, encodes data in a URL-friendly manner. It was used to encode and decode the JWT in the JWT Generation and Validation modules.
Node.js: Node.js is a JavaScript runtime frequently used to build server-side applications. It was used to run the Jest tests and the JWT library.
npm: npm is a popular package manager for Node.js, used to publish packages and manage dependencies. It was used to install the JWT library's required packages, including "jsonwebtoken" and "crypto-js."

LIV. RESULTS AND ANALYSIS

Our work on developing our own JWT library has resulted in a secure and flexible solution for generating, signing, and verifying JWTs in web applications.
We have conducted extensive testing of the library to ensure its security and performance, and have provided thorough documentation and support for developers who wish to use the library in their own projects. Overall, our work is a valuable contribution to the field of web application security, providing developers with a practical and reliable solution for implementing JWTs in their applications[5].
Furthermore, our work has demonstrated the importance of open-source libraries and community-driven development in the field of web application security. By sharing our library with the wider community, we hope to contribute to the ongoing development of secure and efficient web applications.
As we worked through the project, we recognized several strengths and limitations in currently existing JWTs.
Firstly, one of the strengths of a secure JWT library is its strong authentication feature: JWT tokens provide a secure and efficient way to authenticate users and devices.
Another strength is token management. JWT libraries can help prevent unauthorized access or misuse by managing token expiration, revocation, and other security features.
Additionally, JWT tokens are platform-independent, meaning they can be used across different platforms and technologies.
However, there are limitations to a secure JWT library. While JWT tokens offer a secure way to authenticate and transmit data, they can still be vulnerable to certain security risks if not correctly implemented or secured[6]. In particular, unauthorized access becomes possible if the cryptographic keys used to sign the tokens are compromised.
Finally, JWT tokens have limited scalability. While they can improve the efficiency and scalability of applications, they may not be suitable for extremely large or high-traffic applications that require more complex security measures.
In conclusion, a secure JWT library can offer many benefits for authentication, token management, and efficiency, but it is important to consider the security risks and limitations of JWT tokens. Proper implementation and management of the library can ensure that JWT tokens are used in a secure and well-tested manner[7].
Through a series of experiments and evaluations, we demonstrated that the library is robust against common attacks and provides high performance for generating and verifying tokens. Furthermore, we compared our library to existing JWT libraries and found that it offers comparable performance and security.
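One mitigation named earlier, limiting a token's lifespan, comes down to checking an expiry claim at validation time. A minimal sketch follows; the `exp` claim name is the registered one from RFC 7519, while the helper function is invented for illustration.

```python
import time

def is_expired(claims: dict, now=None) -> bool:
    # RFC 7519 "exp": seconds since the Unix epoch after which the
    # token must be rejected. A claim set without "exp" never expires here.
    if "exp" not in claims:
        return False
    current = now if now is not None else time.time()
    return current >= claims["exp"]

claims = {"sub": "alice", "exp": 1_700_000_000}
print(is_expired(claims, now=1_600_000_000))  # False: before expiry
print(is_expired(claims, now=1_800_000_000))  # True: after expiry
```

A validator would run this check after verifying the signature, never before, since unverified claims cannot be trusted.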
LV. DISCUSSION AND IMPLICATIONS

Our analysis of JWTs has highlighted both their strengths and weaknesses, and has identified several areas for future research and development. One of the key implications of our work is the importance of careful consideration when choosing and implementing a JWT solution in web applications.
Moving forward, we believe that there is still significant potential for further research and development in the area of JWTs, particularly in areas such as token revocation and support for multiple signature algorithms. By continuing to explore and refine JWTs, we can ensure that they remain a valuable tool for web application security for years to come[8].
Furthermore, our work has demonstrated the importance of continual evaluation and refinement of security solutions. As new vulnerabilities and threats emerge, it is crucial to continually evaluate and improve existing security solutions, including JWTs. By conducting ongoing testing and evaluation of our JWT library and other security solutions, we can ensure that they remain effective and up-to-date with the latest security practices and technologies. Ultimately, we believe that ongoing evaluation and refinement are essential for ensuring that web applications remain secure and reliable in the face of evolving security threats.

LVI. CONCLUSION

In conclusion, this research paper presents a new JSON Web Token (JWT) library that provides a secure and efficient method for authentication and authorization in web and mobile applications. The library offers a straightforward and flexible API for generating, parsing, and verifying JWTs, and includes support for a variety of signing and encryption algorithms.
Through the process of designing and implementing the library, we gained valuable insights into the workings of JWTs[9] and their applications in modern web development. The library we have created is designed with a focus on security, efficiency, and ease of use.
Overall, this paper contributes to the existing body of research on JWTs by providing a new library that can be used by developers to implement secure authentication and authorization mechanisms in their applications. We hope that this work will inspire further research and development in the field of web security and contribute to the creation of more secure and reliable systems.

REFERENCES

[74] Jones, M., Bradley, J., and Sakimura, N. (2015). "JSON Web Token (JWT)." Published as an Internet Engineering Task Force (IETF) RFC 7519. Retrieved from https://tools.ietf.org/html/rfc7519
[75] Akanksha and A. Chaturvedi, "Comparison of Different Authentication Techniques and Steps to Implement Robust JWT Authentication," 2022 7th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 2022, pp. 772-779, doi: 10.1109/ICCES54183.2022.9835796.
[76] S. Ahmed and Q. Mahmood, "An authentication based scheme for applications using JSON web token," 2019 22nd International Multitopic Conference (INMIC), Islamabad, Pakistan, 2019, pp. 1-6, doi: 10.1109/INMIC48123.2019.9022766.
[77] Ficry Cahya Ramdani, Alam Rahmatulloh (2023). "Implementation of JSON Web Token on Authentication with HMAC SHA-256 Algorithm." Retrieved from https://www.semanticscholar.org/paper/Implementation-of-JSON-Web-Token-on-Authentication-Ramdani-Alam-Rahmatulloh/b3d611ca6b7b7e9b2f6f22f1bb4bde0211dc7f51 Published as a conference paper in the 2018 International Conference on Informatics, Multimedia, Cyber, and Information System (ICIM-CIS).
[78] Salman Ahmed, Qamar Mahmood. (2018). "An authentication based scheme for applications using JSON web token." Retrieved from https://ieeexplore.ieee.org/document/9022766/ Published as a conference paper in the 2018 3rd International Conference on Computer and Communication Systems (ICCCS).
[79] A Rahmatulloh, R Gunawan, and F M S Nursuwars. (2020). "Performance comparison of signed algorithms on JSON Web Token." Retrieved from https://iopscience.iop.org/article/10.1088/1757-899X/550/1/012023/meta Published in the proceedings of the 2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS) on October 2020-21.
[80] Abhishek Aadi (2018). "Secure JWT in a Nutshell." Published on January 22, 2018. Retrieved from https://medium.com/swlh/secure-jwt-in-a-nutshell-e59a0139096d
[81] The App Solutions. (2020). "How to Build a Secure JWT Authentication: Best Practices." Retrieved from https://medium.com/@abhishekaadi/jwt-in-a-nutshell-part-1-84bf7c7018d
[82] Teniola Fatunmbi (2022). "JSON Web Tokens (JWT) vs. Session Cookies: Authentication Comparison." Okta. Retrieved from https://developer.okta.com/blog/2022/02/08/cookies-vs-tokens
Home Automation Using WhatsApp Chatbot

1st Pavan Penugonda, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101463@presidencyuniversity.in
2nd Patan Ashraf Ali Khan, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101027@presidencyuniversity.in
3rd P V Praneeth Reddy, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101091@presidencyuniversity.in
4th M Nithin Kumar Reddy, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101115@presidencyuniversity.in
5th P Subhash Reddy, dept. of CSE, Presidency University, Bengaluru, Karnataka, 201910101116@presidencyuniversity.in
B. System Design
System design refers to the process that was followed while designing the entire system, including all of its components, flows, tasks, and the affordances available to a user of the application. It is a conceptual plan that is followed while actually building the system, ensuring that all of the parts are successfully finished. The system design provides a complete overview of how the system is implemented, even in the technical sense; it is used as a guideline and a road map for all aspects of the build. System design helps us understand the complexities and the processes of how to build and what to build first, and helps us manage our time and resources so that the maximum is spent on the essential parts of development. System design is the phase where ideation ends and the actual development of the product begins.
1) Application Flow: The following is how the system we designed works. The primary form of communication with the automation system is through WhatsApp; this is where users send commands to operate the appliances in the house. These appliances are connected to an ESP32 module, an IoT device connected to the internet.
• The user sends a message from WhatsApp to turn the bulb on or off.
• This message is relayed to the ESP module via a Twilio connection.
• The module interprets the command and acts accordingly.
• The appliance is switched on or off based on the command issued.
• A response is sent back to the user, showing either confirmation of the action or an error message that needs to be corrected.

Fig. 2. UML Sequence Diagram

C. Technical Requirements
The technical requirements of the project signify the various technologies, tools, and procedures used in the development of the application.
1) WhatsApp Messenger: WhatsApp is a popular messaging application that allows users to send messages, make voice and video calls, share media, and conduct group chats. It was founded in 2009 by two former Yahoo employees, Brian Acton and Jan Koum. Initially, WhatsApp was designed as an alternative to traditional SMS messaging, and it quickly gained popularity due to its ease of use and low cost. In our application, users use WhatsApp to send the messages that control the automation, turning the bulb on or off and receiving the respective responses.
2) Twilio: Twilio is a cloud communications platform that enables businesses to communicate with their customers through various channels such as voice, SMS, email, and messaging applications like WhatsApp. Founded in 2008, Twilio has become a leading player in the cloud communications industry, providing reliable and scalable communication solutions to businesses of all sizes. In our application, we have used Twilio to relay messages between WhatsApp and the ESP32 module. The Twilio integration receives messages from WhatsApp and relays them to the endpoint it has established with the ESP32 module; similarly, it picks up messages sent by the ESP32 module and relays them back to WhatsApp for the users to see. It acts as an intermediary communicator here.
3) ESP32 Module: The ESP32 module is a powerful and versatile microcontroller unit (MCU) that is widely used in a range of embedded systems and Internet of Things (IoT) applications. It is based on the Espressif ESP32 system-on-chip (SoC), which combines a dual-core processor, WiFi and Bluetooth connectivity, and a range of peripheral interfaces into a single chip. The ESP32 module is also designed with power efficiency in mind, with a range of power-saving modes and low-power operation options to maximize battery life in battery-powered devices. It also includes a range of security features, including secure boot and flash encryption, to protect against unauthorized access and tampering. In our application, the ESP32 module acts as the controller of the electrical appliances: all the bulbs are connected to this module, which is connected to the internet and a power source, and it communicates with the user with Twilio as an intermediary.
4) ThingESP Library: The ThingESP library is an open-source software library designed to simplify the development of Internet of Things (IoT) applications using the ESP8266 and ESP32 microcontroller units (MCUs). It provides a range of functions and utilities for connecting to WiFi networks, interfacing with sensors and other peripherals, and sending and receiving data over the internet. The library includes a built-in MQTT client that enables easy and efficient communication between IoT devices and servers or other devices. This allows developers to easily create scalable and flexible IoT applications that can communicate with a wide range of other devices and systems. In addition to its support for MQTT and sensor interfacing, the ThingESP library also includes a range of functions for managing WiFi connections, including automatic reconnection and error handling.
5) Arduino IDE: The Arduino software is an open-source Integrated Development Environment (IDE) that provides a user-friendly platform for programming and developing microcontroller-based systems. It is designed to be simple and easy to use, making it accessible to beginners and experts alike. The Arduino software is based on the Wiring language, a simplified version of C and C++. This makes it easy for developers to write and understand code, even if they are not experienced programmers. In addition to its support for the Wiring language, the Arduino software also includes a range of libraries and examples to help developers get started with common tasks and functions. In our project, we used Arduino to write the code that handles the ESP32 module and to program it into the module.
6) Relay: A relay is an electronic switch that is used to control high-power devices, such as lights, motors, or heaters, using a low-power signal from a microcontroller or other IoT device. Relays are commonly used in IoT applications where the devices being controlled require more power than the microcontroller or other low-power device can provide. A relay consists of an electromagnetic coil and a set of contacts. When a current is applied to the coil, it generates a magnetic field that causes the contacts to close or open, depending on the type of relay. This allows the relay to switch power on or off to a connected device or circuit.

V. IMPLEMENTATION

Implementation is the stage where we carry out all the planning and designs achieved in the previous stages. Although development was present in all those stages too, here we give a comprehensive overview of how the development procedure occurred and how the application was developed so that its objectives were achieved properly. Every application has a few crucial components that make up its bulk along with several other standard conventional features; it is these crucial components that make up the application and show its functionality. Let us look at the crucial components that make our application into the solution to the problem we have been discussing from the beginning. As we have built two different applications, the construction of each application and the development of each individual feature are described below.
• Setting up a project on ThingESP
• WhatsApp Integration with Twilio
• Programming the ESP32 Module
• Circuit Design
• Final Integration

A. Setting up a project on ThingESP
As we have seen above, ThingESP is an open-source library that allows us to communicate with and program ESP32 modules with ease. The first thing we need to do is create a project in the library and get the communication endpoint.
• Create an account or log in to an existing account on ThingESP.
• Add a new project and provide the name and credentials. The credentials can be anything, but we must note them down for further usage.
• We will then be taken to the project page, where on the right side we will see a URL, which is our communication endpoint.
This communication endpoint is where all our messages from Twilio will go, and the ThingESP library ensures that the messages sent to this endpoint reach the ESP32 module we intend to communicate with.

B. WhatsApp Integration with Twilio
WhatsApp integration with Twilio is used to send messages from WhatsApp to the ESP32 module via the ThingESP library. Twilio acts as the secure intermediary between the user and the ESP module.
• The first step is to create an account on Twilio.
• Then we need to create a new project and verify our credentials.
• Once the project is created, go to Messaging, then choose Settings, and then choose WhatsApp sandbox settings.
• Agree to everything; this will activate the WhatsApp sandbox developer environment.
Now that we have activated the WhatsApp sandbox, we are given two fields; in the top one, place the URL we received from ThingESP. This is the endpoint: it denotes that all messages received by this Twilio sandbox must be relayed to this endpoint. When we look below these fields, we also see a phone number
with a 3-word unique identifier separated by hyphens. This is the number we must message to control the light bulbs.
• The first thing to do is save the number in our mobile phone.
• Then we need to send the three-word identifier as a message to the number. This verifies the number and lets us communicate further.
This brings us to the end of the Twilio integration; so far we have successfully set up all the message senders and relays to ensure the message properly reaches its destination. Now we move on to configuring the hardware.

C. Programming the ESP32 Module
Programming the ESP32 module means writing the code that decides what to do when a message is received, and pushing that code onto the microcontroller.
• We first write the code in the Arduino IDE.
• Then, to upload it, we select the board from Tools: ESP32, then ESP32 Dev Module.
• Then we select the port and click the arrow to upload it to the module.
This uploads the program to the ESP32 module.

D. Circuit Design
Circuit design covers the way we connect the relay, the ESP module, and the bulb to the power supply so that the system functions as intended. The following are the instructions to properly configure the circuit.
• First, connect one end of the two-pin plug to the first pin of the relay.
• Then connect one end of the bulb to the second pin of the relay.
• Now connect the negative pin of the relay to the ground pin of the ESP32.
• Then connect the positive pin of the relay to the Vin pin of the ESP32.
• Connect the signal pin to pin D23 of the ESP32.
• Finally, connect the remaining pin of the two-pin plug to the remaining pin of the bulb.
This finishes the circuit design, which is also depicted in the following image.
• Now, the messages are handed over to ThingESP which, based on the identifiers, identifies the module and sends the messages there.
• The microcontroller interprets the messages, accordingly turns the light on or off, and sends back an appropriate response.
This is how the system was implemented.

VI. TESTING
The process of testing is as crucial as the development itself, and some say it is far more crucial. It is in this phase that the application we have developed is tested to ensure that it runs as we intend it to. Here we check against the various documents we have prepared along the way, such as the preliminary designs, and ensure that the final outcome is what we have been aiming at. As this application is a combination of hardware and software components, it is essential that testing focuses on both aspects and ensures that everything works in synchronization.

A. Unit Testing
Any application is made up of numerous small units, and only when they are combined properly do they form a fully functional application. These small units must be tested before moving to the next stage so that errors do not accumulate; if an erroneous component is integrated with a non-erroneous one, it could lead to the malfunctioning of both components. Our process of unit testing the application involved testing the following units:
• Testing that the circuits were connected correctly and fitted into their slots.
• Testing that the connections worked.
• Testing that messages from WhatsApp were reaching the Twilio interface.
• Testing that the URL endpoint given to Twilio was right.
• Testing that messages from Twilio were being relayed to the ESP32 module.
• Testing that ThingESP was communicating with the ESP32 module.
• Testing that the ESP32 module was properly programmed.
• Testing that the right keywords were used for verification in the ESP module.
• Testing that the ESP module behaved according to the inputs being sent.
• Testing that the responses sent back to the user were appropriate.
This is how we performed the unit testing of our application, ensuring that its basic building units work correctly.
ACKNOWLEDGEMENT
We are greatly indebted to our guide Dr. Medikonda Swapna,
Associate Professor, School of Computer Science & Engineering,
Presidency University for her inspirational guidance, valuable
suggestions and for providing us a chance to express our technical
capabilities in every respect for the completion of the project work.
REFERENCES

[1] Soni, "Design and Implementation of Home Automation System using Raspberry Pi," California State Polytechnic University, Pomona, 2021.
[2] P. Mathivanan, G. Anbarasan, A. Sakthivel and G. Selvam, "Home Automation Using Smart Mirror," 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), Pondicherry, India, 2019, pp. 1-4, doi: 10.1109/ICSCAN.2019.8878799.
[3] T. Parthornratt, D. Kitsawat, P. Putthapipat and P. Koronjaruwat, "A Smart Home Automation Via Facebook Chatbot and Raspberry Pi," 2018 2nd International Conference on Engineering Innovation (ICEI), Bangkok, Thailand, 2018, pp. 52-56, doi: 10.1109/ICEI18.2018.8448761.
[4] K. L. Raju, V. Chandrani, S. S. Begum and M. P. Devi, "Home Automation and Security System with Node MCU using Internet of Things," 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), Vellore, India, 2019, pp. 1-5, doi: 10.1109/ViTECoN.2019.8899540.
[5] C. J. Baby, F. A. Khan and J. N. Swathi, "Home automation using IoT and a chatbot using natural language processing," 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), Vellore, India, 2017, pp. 1-6, doi: 10.1109/IPACT.2017.8245185.
[6] Pavithra, D., Balakrishnan, R. (2015, April). IoT based monitoring and control system for home automation. In 2015 Global Conference on Communication Technologies (GCCT) (pp. 169-173).
[7] Mandula, Kumar, et al. "Mobile based home automation using Internet of Things (IoT)." 2015 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT). IEEE, 2015.
[8] Abdulraheem, A. S., Salih, A. A., Abdulla, A. I., Sadeeq, M. A., Salim, N. O., Abdullah, H., Saeed, R. A. (2020). Home automation system based on IoT. Technology Reports of Kansai University, 62(5), 2453-64.
[9] Patchava, V., Kandala, H. B., Babu, P. R. (2015, December). A smart home automation technique with Raspberry Pi using IoT. In 2015 International Conference on Smart Sensors and Systems (IC-SSS) (pp. 1-4). IEEE.
Fitpulse – A Fitness and Gym Website

CONCLUSION
FitPulse is a platform that offers comprehensive, secure, and customized fitness solutions, revolutionizing the way consumers approach fitness. It provides users with a user-friendly login and signup process, a gym booking feature, exercise challenges, and information on their fitness development. Users' private information is protected, and trustworthy user authentication is provided for the login and signup processes.

ACKNOWLEDGMENT
We thank everyone who contributed to this study on Fitpulse - A gym booking and fitness
Flavour Fetch - An Authenticated and Authorized Food Delivery Website

REFERENCES

[83] A. Shersingh Chauhan, S. Bhardwaj, R. Shaikh, A. Mishra and S. Nandgave, "Food Ordering website "Cooked with care" developed using MERN stack," 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2022, pp. 1690-1695, doi: 10.1109/ICICCS53718.2022.9788224.
[84] Joshi, Umesh, Shubham Mathur, Priyal Soni, Vikas Sharma, Ayushi Ghill, and Yogesh Suthar. "Online Food Ordering System." International Journal of Advanced Research in Computer Science 13 (2022).
[85] Serhat Murat Alagoza, Haluk Hekimoglub, "A study on TAM: analysis of customer attitudes in online food ordering system," Elsevier Ltd., 2012.
[86] Resham Shinde, Priyanka Thakare, Neha Dhomne, Sushmita Sarkar, "Design and Implementation of Digital dining in Restaurants using Android," International Journal of Advance Research in Computer Science and Management Studies, 2014.
[87] Suthar, Pradeep, Amrita Agrawal, Kinal Kukda, and Kajal Joshi. "FOOD MAGIC: ONLINE FOOD ORDERING AND DELIVERING SYSTEM." (2020).
[88] Bhargave, Ashutosh, Niranjan Jadhav, Apurva Joshi, Prachi Oke, and S. R. Lahane. "Digital ordering system for restaurant using Android." International Journal of Scientific and Research Publications 3, no. 4 (2013): 1-7.
[89] Chavan, Varsha, Priya Jadhav, Snehal Korade, and Priyanka Teli. "Implementing customizable online food ordering system using web based application." International Journal of Innovative Science, Engineering & Technology 2, no. 4 (2015): 722-727.
[90] Cheong, Soon Nyean, Wei Wing Chiew, and Wen Jiun Yap. "Design and development of
Fetal Distress Classification Based on Cardiotocography
Abstract: The classification of fetal distress is a critical task in obstetrics, as it allows clinicians to intervene and prevent adverse outcomes for both the mother and the baby. Support Vector Machines (SVM) is a machine learning algorithm that has shown promising results in the classification of fetal distress. In this study, SVM was utilized to develop a model for the classification of fetal distress based on fetal heart rate (FHR) and uterine contractions (UC).

Keywords: Cardiotocography, Fetal distress, Support Vector Machines, Uterine Contractions, Fetal Heart Rate.

I. INTRODUCTION

Delivering a baby poses several challenges to doctors, and one of the most significant is ensuring the well-being of the unborn child during delivery. One indication of fetal distress, which can lead to hypoxia [1], a condition in which there is an insufficient oxygen supply to the body or a specific body part, is a lack of oxygen reaching the fetus before and during delivery. To monitor the fetus's condition continuously, doctors rely on a tool called a cardiotocograph (CTG) that produces continuous time-series signals. CTG measures two metrics: uterine contractions (UC) and fetal heart rate (FHR). Healthcare professionals analyze these signals in graphical form to identify any instances of fetal distress. This process is called cardiotocography.

Fetal Heart Rate (FHR) refers to the rate at which the fetal heart beats per minute. It is an essential metric monitored during pregnancy and childbirth, as it provides insight into the fetus's overall health and well-being. Uterine contractions (UC) are the rhythmic and involuntary tightening of the uterine muscles during pregnancy, which helps to prepare the body for labor and delivery.

In summary, this study employed machine learning algorithms to address the challenges associated with pregnancy and reduce potential complications.
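As a hedged illustration of the SVM classification step the abstract describes, the sketch below uses scikit-learn (an assumption; this excerpt does not name the study's tooling) on synthetic FHR/UC values rather than the study's CTG data:

```python
# Hypothetical sketch of SVM-based fetal-distress classification.
# The feature values are synthetic toy data, not the paper's dataset.
from sklearn.svm import SVC

# Toy training data: [FHR (beats per minute), UC (contractions per 10 min)]
X_train = [[140, 2], [135, 3], [150, 2],   # label 0: normal
           [100, 6], [95, 7], [105, 6]]    # label 1: distress
y_train = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="rbf", gamma="scale")  # RBF-kernel support vector classifier
clf.fit(X_train, y_train)

# Low FHR with frequent contractions falls near the "distress" cluster.
print(clf.predict([[98, 7]]))
```

In practice the CTG features would first be standardized (see the pre-processing steps below), which matters for distance-based kernels like the RBF.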
1) Creation of Dataframe: This step is carried out to convert the data from the CSV file to a usable format, i.e., Pandas dataframes. We can perform multiple preprocessing steps over these dataframes.

2) Dropping the null values: Firstly, the dataset needs to be checked for any NaN or missing values, and these values should be eliminated from the dataset. Subsequently, the feature selection process should begin, where the relevant features are identified for use in training the model.

3) Standardizing the data: Standardization refers to the process of transforming input data so that it has zero mean and unit variance, improving the effectiveness and precision of a machine learning model.
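The three pre-processing steps above can be sketched with pandas; the column names and values here are hypothetical stand-ins, not the study's dataset:

```python
# 1) Creation of Dataframe: toy CTG-style table (hypothetical values).
import pandas as pd

df = pd.DataFrame({
    "FHR": [120.0, 135.0, None, 150.0],
    "UC":  [2.0, 4.0, 3.0, None],
})

# 2) Dropping the null values: remove rows containing NaN.
df = df.dropna()

# 3) Standardizing the data: zero mean, unit variance per column.
df = (df - df.mean()) / df.std(ddof=0)

print(round(df["FHR"].mean(), 6))  # mean is 0 after standardization
```

Scikit-learn's `StandardScaler` performs the same transformation; the manual form is shown to make the zero-mean/unit-variance definition explicit.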
V. CONCLUSION
Abstract— This paper presents the development and implementation of a customer support chatbot for airlines using Dialogflow, a powerful conversational AI platform. With the rapid growth of the airline industry, efficient customer support systems have become imperative to ensure customer satisfaction and streamline operations. Leveraging natural language processing and machine learning techniques, this chatbot aims to provide personalized and timely assistance to airline customers, enhancing their experience and reducing the workload of human customer support agents. This paper outlines the design, architecture, and key features of the chatbot, highlighting its potential to revolutionize the airline industry's customer service domain.

Keywords— Natural language processing, Machine learning, Chatbots, Dialogflow, Agents.

I. INTRODUCTION

In today's highly competitive airline industry, providing exceptional customer service is crucial for maintaining a loyal customer base. Prompt and accurate responses to customer queries, issues, and requests play a pivotal role in shaping the overall customer experience. However, the traditional customer support systems employed by

To address this challenge, the use of conversational AI technologies has gained significant attention. Chatbots, powered by natural language processing and machine learning algorithms, have emerged as viable solutions to augment customer support operations in various industries. In the airline domain, chatbots offer a scalable and efficient means of engaging with customers, providing instant assistance, and facilitating self-service options.

This paper focuses on the development and implementation of a customer support chatbot specifically designed for airlines, utilizing Dialogflow, a leading platform for building conversational agents. The chatbot acts as a virtual assistant, capable of understanding and responding to a wide range of customer inquiries and requests, including flight information, ticket booking, baggage policies, flight status, and more. The motivation behind this research lies in addressing the increasing demand for personalized, efficient, and accessible customer support in the airline industry. By deploying an intelligent chatbot, airlines can enhance their customer service capabilities, reduce response times, and provide 24/7 assistance to passengers across various communication channels, such as websites, mobile apps, and messaging platforms.
The main objectives of this study are as follows:
1. To design and develop a robust and scalable customer support chatbot using Dialogflow.
2. To employ natural language understanding and processing techniques to enable the chatbot to accurately comprehend user queries.
3. To integrate the chatbot with airline databases and systems, allowing it to fetch real-time flight information and provide personalized responses.
4. To evaluate the performance and effectiveness of the chatbot through user tests and analysis of user feedback.
5. To demonstrate the potential of the chatbot in improving customer satisfaction, reducing operational costs, and optimizing human customer support agent workflows.

By achieving these objectives, this research aims to contribute to the advancement of customer support systems in the airline industry, providing valuable insights into the capabilities and potential impact of chatbots developed with Dialogflow.

II. LITERATURE REVIEW

check-in procedures, freeing up human agents to focus on more complex issues. They found that chatbots were particularly valuable in reducing response times and increasing customer satisfaction.

B. Natural Language Processing Techniques:

To enable chatbots to understand and respond to user queries, natural language processing (NLP) techniques are employed. Chen et al. (2019) examined the application of NLP algorithms in airline chatbots and emphasized the importance of accurate intent recognition, entity extraction, and context awareness. They found that advanced NLP techniques, including deep learning models, improved the accuracy and efficiency of chatbot interactions.

C. Frameworks and Platforms for Chatbot Development:

Various frameworks and platforms have been used to develop chatbots in the airline industry. Dialogflow, a widely adopted conversational AI platform, offers pre-built NLP models, intuitive interfaces, and integration capabilities. Research by Brown et al. (2020) compared different chatbot development platforms, including Dialogflow, and highlighted the ease of use and flexibility it provides for building sophisticated conversational agents.
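As a sketch of how such a Dialogflow agent could fetch real-time data, the handler below follows the Dialogflow ES webhook request/response shape. The intent name ("FlightStatus"), the parameter name ("flight_number"), and the in-memory flight table are hypothetical stand-ins for a real airline database:

```python
# Minimal webhook-fulfillment sketch for a Dialogflow ES agent.
# Intent/parameter names and the flight table are hypothetical.
import json

FLIGHT_DB = {"AI101": "On time, departs 14:30 from Gate 12"}  # stand-in DB

def handle_webhook(request_body: str) -> str:
    req = json.loads(request_body)
    query = req["queryResult"]                      # Dialogflow ES field
    intent = query["intent"]["displayName"]
    if intent == "FlightStatus":
        number = query["parameters"].get("flight_number", "")
        status = FLIGHT_DB.get(number, "not found")
        text = f"Flight {number}: {status}"
    else:
        text = "Sorry, I can't help with that yet."
    return json.dumps({"fulfillmentText": text})    # ES response field

# Example request, as Dialogflow would POST it to the webhook:
body = json.dumps({"queryResult": {
    "intent": {"displayName": "FlightStatus"},
    "parameters": {"flight_number": "AI101"},
}})
print(handle_webhook(body))
```

In a deployment this function would sit behind an HTTPS endpoint registered as the agent's fulfillment URL; the JSON shapes above are the part Dialogflow fixes, while everything else is application code.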
REFERENCES
addressing the problem of deteriorating soil fertility. Additionally, it can aid farmers in more efficient pest and disease detection and management, lowering the need for hazardous pesticides and enhancing agriculture's environmental sustainability. Precision agriculture has great promise, and the Indian government has started a number of measures to encourage its use.

By giving farmers access to real-time information on soil moisture, weather patterns, and crop health, machine learning and the Internet of Things have the potential to completely change the Indian agriculture industry. Utilizing this information will improve crop productivity and boost profitability. Sensors can track soil moisture levels, for instance, and give farmers real-time information on when to water their crops. Following this data analysis, machine learning algorithms can offer ideas on how to optimize irrigation schedules to increase agricultural yields. Utilizing technology in farming, precision agriculture maximizes [8] crop yield while minimizing waste. To track crop health, soil moisture, and weather patterns, sensors and machine learning algorithms are used. The application of fertilizer and pesticides is then optimized using this data, which also helps to save water and increase crop yields. For smallholder farmers in India, who frequently lack access to the most recent farming technologies and encounter substantial difficulties in producing high crop yields, precision agriculture might be very helpful. Farmers may enhance their livelihoods, boost output, and save expenses by utilizing precision agricultural technology.

II. LITERATURE SURVEY

Plenty of research has gone into finding the problems in Indian agriculture, and much more research continues over time to predict solutions for these issues.

A technique for determining which crop is most suited for harvesting is suggested in the study [1]. The authors utilized many algorithms, including Decision Tree, Random Forest, KNN, and neural networks, on the Indian Agricultural and Climate Data set in order to obtain the highest level of accuracy.

Using data acquired from the Madurai area, the Crop Suggestions System for Precision Agriculture [2] was created to assist farmers in planting the proper seed according to soil conditions. The main goal is to find a solution for the classifier selection issue in ensemble learning and to achieve the greatest accuracy possible.

The authors of [3] have suggested a model that uses data from the Government of India's repository website, data.govt.in. The dataset primarily includes 4 crops, totalling 9000 samples, of which 6750 are utilized for training and the remaining 2250 for testing. Following pre-processing, ensemble-based learners such as Random Forest, Naive Bayes, and Linear SVM are utilized, and the majority voting technique is used to get the greatest accuracy.

By utilizing multiple machine learning algorithms, the research in [4] primarily focuses on estimating the crop's production. Logistic Regression, Naive Bayes, and Random Forest are the classifier models employed, with Random Forest offering the highest level of accuracy. By considering variables like temperature, rainfall, area, etc., the forecast provided by the machine learning algorithms will assist farmers in choosing which crop to cultivate to induce the greatest yield.

Sandhya Tara and Sonal Agrawal, the authors of [5], present a framework that uses machine learning and deep learning techniques to suggest the best crop based on soil and climate parameters. Area, Relative Humidity, pH, Temperature, and Rainfall are the predictive variables in the dataset. Once the dataset has been pre-processed, the information is divided into a training set and a test set. The response is then depicted graphically for each of the parameters, including fertilizer use, pesticide use, area, UV exposure, and water, using the above-mentioned algorithms, and the yield is forecasted using the data for these parameters. Thus, with little loss and a high yield, the results can assist farmers in growing suitable crops.

The authors of [6] proposed a model that uses previous farmland data as the dataset. It consists of various attributes such as county name, state, humidity, temperature, NDVI, wind speed, and yield. The model is trained to identify the soil requirements necessary for yield prediction. The algorithms applied to the dataset are Random Forest, Decision Tree, and polynomial regression. Among the three, Random Forest provides better yield prediction than the other algorithms.

In the paper [7], the factors used by the proposed system include soil pH, temperature, humidity, rainfall, nitrogen, potassium, and phosphorus. Various crops are also included in the dataset. After utilizing the dataset to train and test the model, a variety of algorithms, including Decision Tree, Random Forest, XGBoost, Naive Bayes, and LR, are used to forecast a specific crop under specific environmental conditions and parameter values that aid in growing the best crop. Thus, evaluating the accuracy of the algorithms and selecting the one with the greatest accuracy will assist farmers in selecting the appropriate seed and aid in boosting agricultural yield.

The authors of [8] implemented precision farming, where a variety of Internet of Things (IoT) sensors and devices are used to collect data on environmental conditions for farming, the amount of fertilizer to be used, the amount of water needed, and the levels of soil nutrients. Through wired or wireless connectivity, the data gathered by the numerous IoT sensors at the end node is then saved in the cloud or on remote servers. Afterward, relevant meanings and interpretations are inferred from the data using a variety of data analytic techniques, which are then applied to make precise and correct decisions. Then, several algorithms are used to select crops, and the analysed data can be used to understand agricultural conditions, whether they are favourable, and to forecast the crops with the highest yield.

In [9], the right crop is advised using the proposed approach based on details like soil pH, temperature, humidity, rainfall, nitrogen, potassium, and phosphorus. Historical data with the above-mentioned parameters is included in the dataset. To eliminate outliers and missing values, the gathered data is pre-processed. The model is subsequently trained and tested. The method utilizes a variety of machine learning classifiers, including a Deep Sequential Model, KNN, XGB, Decision Tree, and Random Forest, to select a crop accurately and effectively for site-specific factors. This research will assist farmers in growing appropriate crops with the highest yield.

choices and proposes a district-by-district forecasting model for the Tamil Nadu state. To raise the quality of incoming data, the paper suggests employing pre-processing and clustering techniques. Furthermore, it recommends employing artificial neural networks (ANN) to predict agricultural productivity and daily precipitation using meteorological data. In order to improve the system's success rate, the study suggests a hybrid recommender system that makes use of Case-Based Reasoning (CBR). The effectiveness of the proposed hybrid technique is evaluated against conventional collaborative filtering.

III. PROPOSED WORK

3.1 Data Description

The dataset was collected from the website of Smart AI Technologies. It consists of 18 years of crop data (1997–2014) for 35 districts of Maharashtra State. The crop data includes Season Name, Crop Name, Area, Temperature, Wind_Speed, Pressure, Humidity, Soil_Type, NPK_Nutrients, Production, and Yield. A small snippet of the data is shown below [Fig 1], where Area is measured in hectares, crop Production is measured in tonnes per hectare, and crop Yield is measured as crop production weight (in kg) per area of land harvested or planted (in hectares). This dataset was previously used for crop yield prediction, but we are using it for crop recommendation using machine learning.
3.2 Data Pre-Processing

Before building a model on any data, it is essential to perform a data pre-processing step, in which raw data is cleaned and transformed to provide quality data for further analysis. This CSV dataset has a total of 17 attribute columns, of which 12 are numerical and the remaining 5 are categorical; across these 17 attribute columns, the dataset contains 12,628 records of crop data. Because the earlier records had not stabilized, we considered data from the year 2000 onward; this lets us get rid of the high variance and low bias that may cause overfitting and lead to wrong predictions by the model.
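The five categorical columns mentioned here are encoded during feature engineering (Section 3.4) with label and one-hot encoding; a minimal pandas sketch, using hypothetical column names rather than the actual CSV schema:

```python
# Hypothetical sketch of encoding categorical crop-data columns.
# Column names and values are illustrative, not the real dataset.
import pandas as pd

df = pd.DataFrame({
    "Season":    ["Kharif", "Rabi", "Kharif"],
    "Soil_Type": ["Loamy", "Clay", "Sandy"],
    "Area":      [120.0, 80.5, 95.0],
})

# Label encoding: map each category to an integer code.
df["Season_code"] = df["Season"].astype("category").cat.codes

# One-hot encoding: one binary column per soil type.
df = pd.get_dummies(df, columns=["Soil_Type"])

print(sorted(c for c in df.columns if c.startswith("Soil_Type_")))
```

Label encoding suits ordinal or tree-based inputs, while one-hot encoding avoids imposing a spurious ordering on nominal categories such as soil type.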
3.3 EDA
3.4 Feature Engineering

Sometimes it is very difficult to draw conclusions and build methods from raw data; this is where feature engineering comes into the picture and makes that easier. In this data we also had categorical columns from which to draw conclusions. We used feature engineering techniques such as one-hot encoding and label encoding so that the machine can understand the data properly and provide useful insights for drawing conclusions.

3.5 Algorithms

3.5.1 Decision Tree:

This algorithm can be used in a crop recommendation machine learning project to provide farmers with valuable insights and recommendations. To make informed predictions about the best crop choice, the algorithm can learn from historical data such as previous crop yields and agricultural practices. Decision tree [7] models can also take multiple decision paths into account, allowing for complex decision-making based on a variety of factors. The interpretability of decision trees makes them especially useful in explaining the reasoning behind crop recommendations, which can help farmers make decisions. Furthermore, decision tree models can be easily updated with new data, allowing for continuous crop recommendation system refinement and improvement. Overall, the decision tree algorithm can be a useful tool in crop recommendation projects by providing data-driven insights to optimize agricultural practices and increase crop yields.

3.5.2 Random Forest:

Because it can handle complex datasets and produce precise predictions, the Random Forest algorithm can be very helpful in a crop recommendation project. Random Forest can identify a variety of patterns and connections between various factors, including soil characteristics, climatic conditions, and crop attributes, by using a number of decision trees. With the help of this ensemble method, recommendations can be generated that are strong and trustworthy despite noise and overfitting. Additionally, Random Forest [1] provides feature importance rankings that can be used to pinpoint the crop selection variables that have the greatest influence. Random Forest is also effective at processing large amounts of data in parallel, allowing for real-time recommendations. Overall, Random Forest can improve crop recommendations' precision and interpretability, helping farmers make wise decisions and maximizing their crop selection tactics.

3.5.3 Naive Bayes:

Because of its ease of use, effectiveness, and capacity for both categorical and discrete data, the Naive Bayes algorithm can be helpful in a crop recommendation project. The conditional probability that a crop will be suitable for a given set of features, such as soil type, weather conditions, and crop attributes, is determined by the probabilistic classifier Naive Bayes [3]. Given that it only needs a small amount of training data to produce predictions, it is especially well suited for projects with little available data. Naive Bayes is quick and effective for real-time recommendations because it has low computational requirements. Additionally, Naive Bayes offers results that are easy to interpret, enabling farmers to comprehend the rationale behind the suggestions. Overall, Naive Bayes can be a useful tool in crop recommendation projects because it provides precise and comprehensible predictions for the best crop choice.

3.5.4 XGBoost:

Due to its ability to handle complex and non-linear data relationships, XGBoost, an advanced gradient boosting algorithm, can be extremely useful in a crop recommendation project. XGBoost [7] is well-known for its high accuracy and predictive power, which makes it ideal for making precise crop recommendations based on a variety of factors such as soil quality, weather conditions, historical crop data, and more. It can handle large datasets efficiently and automatically handle missing data, making it suitable for real-world agricultural scenarios. XGBoost also provides feature importance rankings, which help farmers understand which features influence crop recommendations, and it supports parallel processing, making it suitable for large-scale crop recommendation applications. Overall, XGBoost has the potential to be a powerful tool for crop recommendation projects, providing accurate predictions as well as valuable insights for optimal crop selection.

3.5.5 KNN:

The K-nearest neighbours (KNN) algorithm can be useful in a crop recommendation project due to its simplicity and ability to handle both numerical and
categorical data. KNN [9] is a lazy learner, which means it does not need to be trained in advance and can be used to make real-time recommendations. Based on the similarity of neighbouring data points, KNN can make crop recommendations using historical data on crop performance, soil quality, weather conditions, and other relevant factors. It can also adapt to changing environmental conditions, making it ideal for fast-paced agricultural settings. KNN is interpretable, allowing farmers to comprehend the reasoning behind the recommendations. It is simple to implement and has a low computational overhead, making it appropriate for resource-constrained environments. KNN, on the other hand, may necessitate the careful tuning of hyperparameters such as the number of neighbours (K) and the distance metric. Overall, KNN has the potential to be a useful and interpretable method for crop recommendation projects, providing real-time recommendations based on the local similarity of data points.

3.6 Metrics

Accuracy score [Equation 1] is defined as the total number of correct predictions out of the total predictions made on the testing data. Given below is the accuracy score formula in terms of the confusion matrix [Fig 5].

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (Equation 1: Accuracy)

Precision [Equation 2] is defined as the ratio of true positives, i.e. the number of crops that were accurately forecast, to all the predicted crops. A high precision score means the model can correctly determine the best crops to produce in a certain area.

Precision = TP / (TP + FP)    (Equation 2: Precision)

Recall [Equation 3] quantifies the ratio of true positives to the overall number of crops that should have been advised. A high recall score means that the model can accurately identify a significant fraction of the crops that should be grown in a specific area.

Recall = TP / (TP + FN)    (Equation 3: Recall)

The accuracy, recall, and AUC scores of a successful crop recommendation model should be high. The model should be able to correctly identify the right crops to grow in a certain place while minimizing the number of false positives (crops that are projected to be acceptable but are not suited for that site). The model should also be highly confident in its ability to discriminate between the proper crops and the incorrect ones. It is crucial to assess a crop recommendation model's performance using these measures on a sample dataset before recommending it. If the model does well on the assessment dataset, it may be a feasible alternative for advising suitable crops in a specific area.
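The three metrics above can be computed directly from confusion-matrix counts; the counts below are hypothetical, not the paper's results:

```python
# Worked sketch of accuracy, precision, and recall from raw
# confusion-matrix counts (hypothetical values).
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

# Example: 80 crops correctly recommended (TP), 10 wrongly
# recommended (FP), 5 suitable crops missed (FN), 105 correctly
# rejected (TN).
tp, fp, fn, tn = 80, 10, 5, 105
print(accuracy(tp, tn, fp, fn))  # 0.925
print(precision(tp, fp))         # ~0.889
print(recall(tp, fn))            # ~0.941
```

Note how precision penalizes false positives (unsuitable crops recommended) while recall penalizes false negatives (suitable crops missed), matching the trade-off discussed above.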
goal of providing farmers with reliable and
effective crop recommendation solutions.
V. FUTURE WORK
Recommendation Platform for Machine Learning-Driven Precision Farming," Sensors, vol. 22, no. 16, p. 6299, Aug. 2022.
[9] "Crop Recommendation System to Maximize Crop Yield Using Deep Neural Network," vol. 12, issue 11, Nov. 2021, ISSN No: 0377-9254.
[10] Dighe, Deepti, Harsh H. Joshi, Aishwarya Katkar, Snehal S. Patil and Shrikant Kokate, "Survey of Crop Recommendation Systems," 2018.
[11] Bangaru Kamatchi and R. Parvathi, "Improvement of Crop Production Using Recommender System by Weather Forecasts."
[12] D. Ramesh and B. Vishnu Vardhan, "Data Mining Techniques and Applications to Agricultural Yield Data," International Journal of Advanced Research in Computer and Communication Engineering, vol. 2, issue 9, September 2013.
[13] N. N. Jambhulkar, "Modeling of Rice Production in West Bengal," International Journal of Scientific Research, vol. 2, issue 7, July 2013.
[14] Li Hong-ying, Hou Yan-lin, Zhou Yong-juan and Zhao Hui-ming, "Crop Yield Forecasted Model Based on Time Series Techniques," Journal of Northeast Agricultural University (English edition), vol. 19, issue 1, 2012, pp. 73-77, ISSN 1006-8104, https://doi.org/10.1016/S1006-8104(12)60042-7.
[15] Masood, M. A., Raza, I. and Abid, S. (2019), "Forecasting Wheat Production Using Time Series Models in Pakistan," Asian Journal of Agriculture and Rural Development, 8(2), pp. 172-177.
[16] R. Kingsy Grace, K. Induja and M. Lincy, "Enrichment of Crop Yield Prophecy Using Machine Learning Algorithms."
[17] Thomas van Klompenburg, Ayalew Kassahun and Cagatay Catal, "Crop yield prediction using machine learning: A systematic literature review," Computers and Electronics in Agriculture, vol. 177, 2020, 105709, ISSN 0168-1699.
[18] Nabila Chergui and Mohand Tahar Kechadi, "Data analytics for crop management: a big data view."
[19] "Crop Yield Prediction in Agriculture Using Data Mining Predictive Analytic Techniques," IJRAR - International Journal of Research and Analytical Reviews, E-ISSN 2348-1269, P-ISSN 2349-5138, vol. 5, issue 4, pp. 783-787, December 2018.
[20] Champaneri, Mayank, Chachpara, Darpan, Chandvidkar, Chaitanya and Rathod, Mansing (2020), "Crop Yield Prediction Using Machine Learning," International Journal of Science and Research (IJSR), 9, 2.
[21] G. Vishwa, J. Venkatesh and C. Geetha, "Crop Variety Selection Method using Machine Learning," http://dx.doi.org/10.21172/ijiet.124.05.
[22] N. L. Chourasiya, P. Modi, N. Shaikh, D. Khandagale and S. Pawar, "Crop Prediction using Machine Learning," IOSR Journal of Engineering (IOSRJEN), ISSN (e): 2250-3021, ISSN (p): 2278-8719, pp. 06-10.
BIKE CRASH DETECTION SYSTEM
RUHMA FATIMA, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 20201LCS0017
G RAKESH, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 2019CSE0179
SPANDAN MANDAL, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 2020LCS0004
SALEM PAUL, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 20201LCS0020
B A KEERTHI, Computer Science and Engineering, PRESIDENCY UNIVERSITY, BENGALURU, INDIA, 20201LCS0021
Abstract— Bike accidents are leading cause of road providing emergency services. If the delay is often reduced
accident-related deaths all over the world especially in Asian the person may get saved. For associate in nursing accident
countries. A lot of deaths around the world occur due to road victims, it is terribly tough to alert the police room or the
accidents but Asian countries face the highest amount of bike relations concerning the accidents. The projected system is
accident-related deaths which can be seen in the government
road transport survey where the quantitative relation of road
employed to scale back the time delay between the accident
accidents in 2018 was 4.61 lakhs out of which 1.47 lakh that and providing emergency services. The vehicle pursuit and
were due to bike accidents. This means that almost 402 people accident detection device are often put in in any
die every day in Asian countries due to road accidents
especially bike accidents. Most of these accidents occur due to vehicle. Whenever a vehicle is taken, or an associate
rush driving such as speeding, drunken driving and not accident happens to the vehicle the coordinates are taken
following the proper road rules. According to a survey the through international positioning system (GPS) module and
leading cause of death due to road accidents is the delay in are regenerated into Google map link through the formula
providing emergency services. If the time to deliver emergency within the microcontroller. The formula is preinstalled
services can be reduced the person might get saved. Usually
due to these accidents, it is exceedingly difficult for the person
within the microcontroller. In the event of associate
in an accident to alert the emergency services such as police accident, the traveler should
and medical. The proposed system is going to solve this very
issue. Whenever a vehicle the that is equipped with the system receive facilitate promptly and the folks related to the
gets into an accident the coordinates of the vehicle are taken person should be notified immediately proposes a system
through global positioning system (GPS) and a Google maps wherever label sensors mounted on the vehicle will observe
link is generated by the microcontroller through a formula. a crash and signal the small controller that successively
Through this, the affected person will get the emergency passes the information containing the
services promptly and send an emergency message about the
accident to the people related to the person, such as their
family members and friends. The process through which this
coordinate location of the crash beside the identification
will be done is the sensors mounted on the vehicle will detect a details to the cloud server. The google map link is
crash and signal the microcontroller which will pass the distributed through International System of Units for mobile
information containing the coordinates of the location of the communication GSM module to a predefined mobile sort of
crash and the identification details to the Cloud server. The members of the family and near police headquarters. The
Cloud server will then generate a Google map link which will accident is detected through measuring device and the price
be distributed to the affected person's family members and the compared with the formula's brink price. The friend will get
nearest police headquarters. (Abstract) the exact location of the vehicle by clicking on the google
map link provided among the SMS
Keywords—leading, highest, road, cause, cloud
LXII. INTRODUCTION

Bike accidents are a very big problem in India and in other countries too. Most of the deaths in the world are due to road accidents. India faces the highest death rate in the world: according to the government road transport survey, the number of road accidents in 2018 was 4.61 lakh, in which the number of deaths was 1.47 lakh, i.e., 402 people die per day in India. Reasons for road accidents include over-speeding, drunk driving, and not following traffic rules. According to some surveys, the main reason for deaths in road accidents is the delay in medical help reaching the victims.

LXIII. RELATED WORKS

Design of accident detection and alert system for motorcycles [2013]

The idea of vehicle accident detection is not new, and the automotive companies have made lots of progress in perfecting that technology. Hitherto the same in motorcycles is lying dormant waiting to reach its peak. This paper is an attempt to contribute to that area of technology. Here we are trying to detect accidents through three parameters: acceleration/deceleration, tilt of the vehicle, and the pressure change on the body of the vehicle. Using these minute data values and an apt algorithm, the accident can be detected with a reasonable success rate. The coordinates of the vehicle, found using GPS technology, are then sent to the emergency services for help.

Vehicle Tracking and Locking Based GSM and GPS [2013]

Currently, most of the public have their own vehicles; theft happens in parking lots and sometimes while driving in insecure places. The safety of vehicles is therefore essential. A vehicle tracking and locking system is installed in the vehicle to track its location and lock the engine motor. The location of the vehicle is identified using the Global Positioning System (GPS) and the Global System for Mobile communication (GSM). These systems constantly watch a moving vehicle and report its status on demand. When a theft is identified, the responsible person sends an SMS to the microcontroller, and the microcontroller then issues control signals to stop the engine motor. Authorized people need to send a password to the controller to restart the vehicle and open the door. This is more secure, reliable, and lower in cost.

Incident Detection Algorithm Based on Non-Parameter Regression [2002]

We first describe the traffic congestion problem that many countries are facing. We then propose a traffic incident detection algorithm based on non-parametric regression to solve the congestion problem. Finally, we compare the algorithm with other incident detection algorithms on detection rate, false alarm rate, and mean detection time. A simulation result shows that the proposed algorithm has a higher detection rate, a lower false alarm rate, and a shorter mean detection time. Furthermore, we state the direction of our next study.

Study on the Method of Freeway Incident Detection Using Wireless Positioning Terminal [2008]

Improving the incident detection system's performance is essential to minimize the effect of incidents. A new method of incident detection was brought forward in this paper, based on an in-car terminal consisting of a GPS module, a GSM module, and a control module, as well as some optional parts such as airbag sensors, a mobile phone positioning system (MPPS) module, etc. When a driver or vehicle discovered a freeway incident and initiated an alarm report, the incident location information located by GPS, MPPS, or both would be automatically sent to a transport management center (TMC); the TMC would then confirm the accident with closed-circuit television (CCTV) or other approaches. In this method, detection rate (DR), time to detect (TTD), and false alarm rate (FAR) were the most important performance targets. Finally, some feasible means, such as a management mode, an education mode, and suitable accident-confirming approaches, have been put forward to improve these targets.

Wireless Vehicular Accident Detection and Reporting System [2010]

In this paper, we suggest a method to intelligently detect an accident at any place and any time and report the same to the nearby `service provider'. The service provider arranges for the necessary help. The Accident Detection and Reporting System (ADRS), which can be placed in any vehicle, uses a sensor to detect the accident. The sensor output is monitored and processed by the PIC16F877A microcontroller. The microcontroller takes decisions on traffic accidents based on the input from the sensors. The RF transmitter module, which is interfaced with the microcontroller, transmits the accident information to the nearby Emergency Service Provider (ESP). This information is received by the RF receiver module at the `service provider' control room in the locality. The RF transceiver module used has a range of up to 100 meters (about 328 ft) under ideal conditions. The service provider can use this information to arrange for ambulances and inform the police and hospital. We used low-cost RF modules, a microcontroller by Microchip, an LCD module, and an accelerometer. This system can be installed at accident-prone areas to detect and report accidents. MPLAB IDE and Proteus software are used to simulate part of the system. ADRS also implements an intelligent Accident Detection and Reporting Algorithm (ADRA) for the purpose.

Accident Detection and Reporting System using GPS, GPRS and GSM Technology [2012]

Speed is one of the basic reasons for vehicle accidents. Many lives could have been saved if emergency services could get accident information and reach the spot in time. Nowadays, GPS has become an integral part of vehicle systems. This paper proposes to use a GPS receiver's capability to monitor the speed of a vehicle, detect accidents based on the monitored speed, and send the accident location to an Alert Service Centre. The GPS will monitor the speed of a vehicle and compare it with the previous speed every second through a Microcontroller Unit. Whenever the speed drops below the specified speed, the system will assume that an accident has occurred. It will then send the accident location acquired from the GPS, along with the time and the speed, by utilizing the GSM network. This will help the rescue service reach the spot in time and save valuable human life.

Design and Development of GPS GSM based tracking system with Google map-based monitoring [2013]

GPS is one of the technologies used in many applications today. One of the applications is tracking your vehicle and keeping regular monitoring on it. This tracking system can inform you of the location and route travelled by the vehicle, and that information can be observed from any other remote location. It also includes a web application that provides you with the exact location of the target. This system enables us to track targets in any weather conditions. This system uses GPS and GSM technologies. The paper includes the hardware part, which comprises GPS, GSM, an ATmega microcontroller, MAX 232, and a 16x2 LCD; the software part is used for interfacing all the required modules, and a web application is also developed at the client side. The main objective is to design a system that can be easily installed and to provide a platform for further enhancement. [9]

Design and Implementation Vehicle Tracking System using GPS & GSM/GPRS Technology and Smartphone Application [2014]

An efficient vehicle tracking system is designed and implemented for tracking the movement of any equipped vehicle from any location at any time. The proposed system made beneficial use of a popular technology that combines a Smartphone application with a microcontroller. This makes it easy to build and inexpensive compared to others. The designed in-vehicle device works using Global Positioning System (GPS) and Global System for Mobile communication / General Packet Radio Service (GSM/GPRS) technology, which is one of the most common ways of vehicle tracking. The device is embedded inside a vehicle whose position is to be determined and tracked in real-time. A microcontroller is used to control the GPS and GSM/GPRS modules. The vehicle tracking system uses the GPS module to get geographic coordinates at regular time intervals. The GSM/GPRS module is used to transmit and update the vehicle location to a database. A Smartphone application is also developed for continuously monitoring the vehicle location. The Google Maps API (application programming interface) is used to display the vehicle on the map in the Smartphone application. Thus, users are able to continuously monitor a moving vehicle on demand using the Smartphone application and determine the estimated distance and time for the vehicle to arrive at a given destination. To show the feasibility and effectiveness of the system, this paper presents experimental results of the vehicle tracking system and some experiences from practical implementations.

Automatic road accident detection techniques: A brief survey [2017]

Many precious lives are lost due to road traffic accidents every day. The common reasons are drivers' mistakes and late response from emergency services. An effective road accident detection and information communication system is needed to save injured persons. A system that sends information messages to nearby emergency services about the accident location for a timely response is needed. In the research literature, many automatic accident detection systems have been proposed. These include accident detection using smartphones, GSM and GPS technologies, vehicular ad-hoc networks, and mobile applications. The implementation of an automatic road accident detection and information communication system in every vehicle is very crucial. This paper presents a brief review of automatic road accident detection techniques used to save affected persons. An automatic road accident detection technique based on low-cost ultrasonic sensors is also proposed.

Design and development of GPS/GSM based vehicle tracking and alert system for commercial inter-city buses [2012]

In this paper, we proposed the design, development, and deployment of a GPS (Global Positioning System)/GSM (Global System for Mobile Communications) based Vehicle Tracking and Alert System which allows inter-city transport companies to track their vehicles in real-time and provides an alert system for reporting armed robbery and accident occurrences.

3 PROPOSED SYSTEM

We have avoided the false alarm situation caused by some conditions and increased the accuracy of accident detection by using more than one sensor. To avoid a false alarm, we have a manual switch in the vehicle itself, which must be pressed within a certain amount of time when an accident has been falsely detected, hence avoiding any false intimation. We are using a front bumper sensor, a GPS sensor, and a position encoder along with the MEMS sensor to increase the accuracy of accident detection. The bumper sensor tells the microcontroller how much force/pressure has been applied to it, and obviously the pressure will be higher in the case of an accident. The position encoder is used for calculating the speed of the vehicle, which is expected to change drastically when an accident occurs, adding another layer of reliability. The MEMS sensor, as usual, tells the microcontroller if there is a sudden change in acceleration. The GPS and GSM modules are used to get the accident spot location and send the SMS.

LXIV. PROBLEM STATEMENT

Whenever an accident occurs, the nearby people call the ambulance. The problem associated with this is that the victims depend on the mercy of nearby people. There is a chance that there are no people near the accident spot, or that the people who are around neglect the accident. This is the flaw in the manual system.

According to a statistical projection of traffic fatalities, the most obvious reason for a person's death during accidents is the unavailability of first aid, due to the delay in the information about the accident reaching the ambulance or the hospital.

LXV. OBJECTIVES

Existing System

There are many solutions proposed for the concerned problem, and each one has some advantages over the others. Among the GSM and GPS solutions, some proposed finding the accident condition using only an accelerometer sensor, which may be a problem, as it may lead to false alarms in some cases. Our system uses more than one sensor to increase the accuracy of the system, and we also have a provision to avoid the intimation in case of a false alarm. The existing systems also use Wi-Fi modules, which do not work when there is no network.
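The multi-sensor decision logic of the proposed system above can be sketched in software. This is a minimal illustration only: the threshold values, sensor readings, and vehicle identifier below are hypothetical, the manual cancel switch is omitted, and on the real hardware the readings would come from the microcontroller's sensor interfaces, with the SMS sent through the GSM module.

```python
ACCEL_THRESHOLD_G = 4.0    # MEMS sensor: sudden acceleration change (g), assumed value
BUMPER_THRESHOLD = 60.0    # bumper sensor: force/pressure units, assumed value
SPEED_DROP_KMPH = 40.0     # position encoder: drastic speed drop (km/h), assumed value

def detect_accident(accel_g, bumper_force, speed_drop):
    """Require at least two sensors above threshold, to cut false alarms."""
    hits = [
        accel_g >= ACCEL_THRESHOLD_G,
        bumper_force >= BUMPER_THRESHOLD,
        speed_drop >= SPEED_DROP_KMPH,
    ]
    return sum(hits) >= 2

def maps_link(lat, lon):
    """Google Maps link included in the alert SMS."""
    return f"https://maps.google.com/?q={lat},{lon}"

def alert_message(lat, lon, vehicle_id):
    return (f"Accident detected for vehicle {vehicle_id}. "
            f"Location: {maps_link(lat, lon)}")

# Example reading: hard deceleration plus bumper impact trips two sensors
if detect_accident(accel_g=5.2, bumper_force=75.0, speed_drop=10.0):
    print(alert_message(12.9716, 77.5946, "KA-01-AB-1234"))
```

Requiring two of the three sensors to agree is one simple way to realise the paper's goal of suppressing single-sensor false alarms.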
LXVI. MOTIVATION FOR THE WORK
Road accidents contribute to the majority of accidental deaths in India. Most of these lives could be saved if the victims received medical help quickly, in time. Over 1.51 lakh people died in road accidents in the year 2019.

LXVII. METHODOLOGY
In the event of deaths or severe conditions due to accidents, the GSM technologies are used so that immediate action can be taken by the ambulance/police service, which would reduce the severity.

References

[1] Government of India, Ministry of Road Transport and Highways, Lok Sabha Unstarred Question No. 374, answered on 19-07-2018.

[2] F. B. Basheer, J. J. Alias, C. M. Favas, V. Navas, N. K. Farhan and C. V. Raghu, "Design of accident detection and alert system for motorcycles," 2013 IEEE Global Humanitarian Technology Conference: South Asia Satellite (GHTC-SAS), Trivandrum, 2013, pp. 85-89.

[3] R. Ramani, S. Valarmathy, N. Suthanthira, S. Selvaraju, M. Thiruppathi, R. Thangam, "Vehicle Tracking and Locking Based GSM and GPS," Sept. 2013.

[11] "Cellular networks for massive IoT," Ericsson White Paper, Jan. 2016.

[12] C. Veness, "Calculate distance and bearing between two latitude/longitude points: haversine formula in JavaScript," 2016.

[13] J. White, C. Thompson, H. Turner, B. Dougherty, and D. C. Schmidt, "WreckWatch: Automatic traffic accident detection and notification with smartphones," Mobile Networks and Applications, vol. 16, no. 3, pp. 285-303, 2011.

[14] U. Khalil, T. Javid, and A. Nasir, "Automatic road accident detection techniques: A brief survey," International Symposium on Wireless Systems and Networks (ISWSN), IEEE, 2017, pp. 1-6.

[15] P. B. Fleischer, A. Y. Nelson, R. A. Sowah and A. Bremang, "Design and development of GPS/GSM based vehicle tracking and alert system for commercial inter-city buses," 2012 IEEE 4th International Conference on Adaptive Science & Technology (ICAST), Kumasi, 2012, pp. 1-6.

[16] R. Kannan, R. Nammily, S. Manoj, A. Vishwa, "Wireless Vehicular Accident Detection and Reporting System," International Conference on Mechanical and Electrical Technology (ICMET 2010).
FraudShield: Detection of Fraud in Credit Card based on Machine
Learning Techniques with integration of web-based Framework
1. INTRODUCTION
Consumers and financial organisations are equally impacted by the severe issue of credit card theft. Fraudulent actions can harm the credibility of the financial company as well as result in significant monetary losses for both parties. In order to promptly identify forged transactions and reduce losses, it is necessary to design successful and effective fraud detection systems. Credit card fraud has been successfully identified using machine learning techniques. These methods entail building a model from a dataset of confirmed legitimate and fraudulent transactions, then using the model to forecast the likelihood of fraud for new transactions. In comparison to conventional rule-based systems, the application of techniques based on machine learning to the identification of fraud in credit cards has a number of benefits, such as the ability to spot patterns and irregularities in massive datasets that human analysts could miss. In contrast to rule-based systems, which need manual updates to be successful, they are also able to adjust to new fraud patterns as they appear. In this regard, the project's goal is to look into how well different machine learning approaches work to identify credit card fraud. In order to determine which method is most efficient, the research will examine multiple datasets and use a variety of preprocessing strategies, feature selection techniques, and modelling algorithms [1,2].
2. OBJECTIVES
The objectives of credit card fraud detection are: (a) to identify and foresee the outcome of unauthorised credit card activity; (b) to analyse a few effective machine learning algorithms, identify the one with the best accuracy, and suggest a model; (c) to add the machine learning model to a web-based framework for a better user interface and user experience; (d) to find pertinent dataset features that can aid in the detection of fraud, identifying relevant attributes that capture the patterns and traits of fraudulent transactions by extracting and engineering them; and (e) to create a system that can process incoming credit card transactions in real-time and identify whether they are counterfeit or genuine using the trained machine learning models [2].
3. METHODOLOGY
3.1 Existing methods

In the current system, research on an instance of credit card fraud detection, where data normalisation was applied before cluster analysis and outcomes were obtained through clustering and neural networks, demonstrated that by accumulating attributes, the number of neural-network inputs can be minimised. Additionally, normalised data should be used, and MLP training is recommended [3]. This study was built on unsupervised learning. Finding innovative strategies for identifying fraudulent activity and improving the accuracy of outcomes were the two main purposes of this article. Personal information in the data set used for this study is kept isolated, and it is based on real transactional figures collected by a major European corporation. The algorithm typically has a 50% accuracy rate. Discovering an algorithm and lowering the cost measure were the two main purposes of this paper. The result was 23%, and the chosen algorithm had the lowest risk [2,3].

Disadvantages

1. The gains and losses attributable to fraud detection are adequately represented in this study by a novel collative comparison metric.

2. The suggested cost measure is used to offer a cost-sensitive strategy centred around Bayes minimum risk.

3.2 Proposed method

The model proposed in the system suggested here identifies fraudulent behaviour in credit card transactions. The bulk of the essential characteristics required to distinguish between legitimate and illegal transactions can be offered by this method. With the development of technology, it becomes more difficult to identify the idea and pattern of faked transactions. The advancement of artificial intelligence (AI), machine learning, and other relevant information technology disciplines has made it possible to automate this process and minimise part of the intense labour that is necessary to detect credit card fraud [4]. To identify credit card fraud and discover which machine learning algorithm works best, comparisons are made between several algorithms, including random forests, decision trees, logistic regression, and Naive Bayes, to determine the best algorithm that credit card merchants can use to identify fraudulent transactions. Finally, the machine learning model is integrated with the web-based framework Streamlit for a better user interface and user experience; menus, input fields for prediction, classification reports, and model graphs are then created in the web framework [5].
Fig.1. Architecture
4. MODULES
Data collection is the first stage of the project; the dataset gathered consists of a number of transactions, some of which are genuine and others of which are fraudulent. The credit card dataset was obtained from the Kaggle website, through which a credit card payment information set may be accessed. Data preprocessing: in this module, the selected data is prepared, cleaned up, and sampled. Dataset loading: a variety of library functions can be used to load the dataset; the read_csv function of the Python pandas module was used in this case to load a data collection in CSV or Microsoft Excel format. Model creation: the training data is used to create the model after the data is divided into training and test samples with a 70% and 30% weighting, respectively. Accuracy determination: this stage determines the model's correctness using a variety of algorithms. Streamlit web framework: the web application incorporates the machine learning algorithm graphs, user input, and accuracy results.
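The loading and 70%/30% split steps above can be sketched as follows. The inline CSV and its column names are illustrative assumptions standing in for the real Kaggle credit card dataset.

```python
import io
import pandas as pd

# Toy stand-in for the Kaggle credit card dataset (columns are assumptions)
csv_text = """amount,repeat_retailer,used_chip,fraud
20.5,1,1,0
900.0,0,0,1
35.0,1,1,0
700.0,0,1,1
25.0,1,0,0
15.0,1,1,0
810.0,0,0,1
40.0,1,1,0
650.0,0,0,1
30.0,1,1,0
"""

df = pd.read_csv(io.StringIO(csv_text))       # pandas read_csv, as in the text
train = df.sample(frac=0.7, random_state=42)  # 70% training sample
test = df.drop(train.index)                   # remaining 30% held out for testing
print(len(train), len(test))                  # -> 7 3
```

With a real file, `pd.read_csv("creditcard.csv")` (hypothetical filename) would replace the `StringIO` wrapper.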
A decision tree is a machine learning algorithm for regression analysis and classification. The decision tree paradigm has a tree-like structure, with each internal node indicating a test of an attribute, each branch reflecting the test result, and each leaf node representing a class label or a numerical value [6]. The tree can be "learned" by subdividing the source set depending on the outcome of attribute tests. This method is repeated recursively on each derived subset, which is known as recursive partitioning. Because the development of a classifier that uses decision trees requires no domain expertise or parameter setup, it is suitable for exploratory learning and discovery. High-dimensional data can be handled via decision trees, and in general decision tree classifiers have high accuracy. Decision tree inference is a common inductive way of learning classification information [7]. Decision trees categorise instances by moving them through the tree from the root to a leaf node that provides the instance's classification. Beginning at the root of the tree, an instance is categorised by checking the attribute indicated by that node and then moving along the tree branch according to the value of the attribute.
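As a minimal illustration of the recursive partitioning described above, the following pure-Python sketch learns a tiny decision tree on made-up transaction features; the features, thresholds, and labels are hypothetical, not taken from the paper's dataset.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Find the (feature index, threshold) minimising weighted impurity."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [lab for r, lab in zip(rows, labels) if r[f] <= t]
            right = [lab for r, lab in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels):
    """Recursively partition until no split improves impurity."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, lab) for r, lab in zip(rows, labels) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [lab for _, lab in left]),
            build_tree([r for r, _ in right], [lab for _, lab in right]))

def predict(tree, row):
    """Walk from the root to a leaf, testing one attribute per node."""
    while isinstance(tree, tuple):
        f, t, lo, hi = tree
        tree = lo if row[f] <= t else hi
    return tree

# Toy transactions: [amount, distance_from_home]; 1 = fraud, 0 = genuine
X = [[20, 1], [35, 2], [900, 50], [700, 40], [25, 3], [800, 45]]
y = [0, 0, 1, 1, 0, 1]
tree = build_tree(X, y)
print([predict(tree, r) for r in X])  # reproduces the training labels
```

In practice a library implementation such as scikit-learn's `DecisionTreeClassifier` would be used; this sketch only shows the mechanics of impurity-based splitting.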
5.3 Naive Bayes
Naive Bayes is a common classification technique based on Bayes' probability theory. It is a straightforward yet effective algorithm that is commonly used in text classification, spam filtering, and recommendation systems. The name "naive" comes from the assumption that each of the features is independent of the others. To begin, the algorithm computes the estimated likelihood of each class given a set of features. This is accomplished by the use of Bayes' theorem, which states that the probability of a hypothesis (in this case, the class) given the data (the features) is proportional to the probability of the data given the hypothesis multiplied by the prior probability of the hypothesis [8]. The advantages are: (a) Naive Bayes is a straightforward algorithm that is simple to grasp and apply; it does not necessitate the use of complex iterative algorithms, as many other machine learning techniques do. (b) The Naive Bayes principle is a quick approach that can handle big, high-dimensional datasets. (c) To make accurate predictions, Naive Bayes needs only a small amount of training data. (d) Naive Bayes can deal with insignificant features and is unaffected by them [9]. The disadvantages are: (a) Naive Bayes presupposes that the features are independent of one another, which is not necessarily the case in real-world datasets. (b) The Naive Bayes technique has limited expressive capacity and may be incapable of capturing complicated feature interactions. (c) Naive Bayes presupposes a predefined probability distribution for the attributes, which may or may not be appropriate for the dataset; because it assumes a discrete probability distribution for the features, it is unsuitable for continuous data. (d) Naive Bayes is best suited for categorical information and may struggle with continuous or numerical features [10].
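The Bayes rule described above, P(class | features) proportional to P(features | class) times P(class) under the independence assumption, can be sketched by hand for categorical features. The toy features and labels below are invented for illustration; they are not the paper's dataset.

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Estimate class priors and per-feature conditional value counts."""
    priors = Counter(y)
    cond = defaultdict(Counter)        # (class, feature_idx) -> value counts
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            cond[(c, i)][v] += 1
    return priors, cond, len(y)

def predict_nb(model, row):
    priors, cond, n = model
    scores = {}
    for c, pc in priors.items():
        score = pc / n                 # prior P(class)
        for i, v in enumerate(row):    # likelihoods P(x_i | class), Laplace-smoothed
            score *= (cond[(c, i)][v] + 1) / (pc + 2)  # +2: two values per feature
        scores[c] = score
    return max(scores, key=scores.get)  # maximum a-posteriori class

# Toy features: (used_chip, repeat_retailer); labels: 'fraud' / 'ok'
X = [("no", "no"), ("yes", "yes"), ("yes", "yes"), ("no", "no"), ("no", "yes")]
y = ["fraud", "ok", "ok", "fraud", "fraud"]
model = train_nb(X, y)
print(predict_nb(model, ("yes", "yes")))  # prints: ok
```

The Laplace `+1` smoothing keeps unseen feature values from zeroing out a class score, a standard fix for the discrete Naive Bayes weakness noted above.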
5.4 KNN
In statistics, the k-nearest neighbours technique (k-NN) is a non-parametric supervised learning method. The output of k-NN classification is a class membership: an object is classified by a vote of its neighbours, with the object being assigned the class that is most common among its k (a positive, frequently small, integer) nearest objects. When k is equal to 1, the item is simply assigned the class of its single nearest neighbour [11]. With the k-NN classification approach, all processing is deferred until the function has to be evaluated, and the model is only constructed locally. The accuracy of the method can be greatly improved by normalising the source data if the features represent a variety of physical measurements or arrive at vastly different scales, because the method uses distances for categorisation [12]. Applying weights to the neighbours' contributions, so that the near neighbours contribute more to the average than the distant ones, is an effective approach for both regression and classification. Assigning each neighbour a weight of 1/d, where d represents its distance from the query point, is a common way to weigh objects. In both k-NN classification and k-NN regression, the neighbours are selected from a group of elements for which the class or property value has been established. This is the algorithm's training set; however, no explicit training step is necessary [13,14].
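The 1/d neighbour weighting described above can be sketched in a few lines of pure Python; the toy 2-D points and labels below are illustrative only.

```python
import math

def knn_predict(X, y, query, k=3):
    """Classify `query` by the distance-weighted vote of its k nearest neighbours."""
    nearest = sorted(
        (math.dist(row, query), label) for row, label in zip(X, y)
    )[:k]
    votes = {}
    for d, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (d + 1e-9)  # weight = 1/d
    return max(votes, key=votes.get)

# Toy 2-D points: class 0 clusters near the origin, class 1 far away
X = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
y = [0, 0, 0, 1, 1, 1]
print(knn_predict(X, y, (0.5, 0.5), k=3))    # prints: 0
print(knn_predict(X, y, (10.5, 10.5), k=3))  # prints: 1
```

The small `1e-9` term only guards against division by zero when the query coincides with a training point; a library implementation (e.g. scikit-learn's `KNeighborsClassifier` with `weights='distance'`) handles this case internally.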
5.5 Streamlit
For the detection of fraud in credit cards, with the help of the freely available web application framework Streamlit, programmers may use Python to build interactive data-driven apps. With Streamlit, developers can easily create data visualizations, interactive dashboards, and machine learning models that can be deployed as web applications. Streamlit provides a simple and intuitive interface for creating applications, allowing developers to focus on the content and functionality of their applications rather than the technical details of web development. Streamlit provides a number of features to make building web applications easier, including: a simple and intuitive API for creating user interfaces and data visualizations; automatic reactivity, which enables developers to construct interactive applications that are updated in real time as the user interacts with them; built-in support for popular data science libraries such as Pandas, Matplotlib, and Plotly; and easy deployment to a variety of cloud platforms, including Heroku and Google Cloud [15]. Overall, Streamlit is a powerful tool for creating interactive data-driven applications with Python, and it is well-suited for data scientists and developers who want to quickly prototype and deploy web applications. Streamlit is a versatile web application framework that can be used for a wide variety of applications in data science, machine learning, and beyond. Here are some examples of the uses of Streamlit:
(a) Interactive data exploration: Streamlit makes it easy to create interactive data visualizations and exploration tools, allowing users to explore and analyze data in a more intuitive and engaging way.
(b) Machine learning model development and deployment: Streamlit can be used to develop and deploy machine learning models as web applications, allowing users to interact with and test models in real-time.
(c) Dashboard creation: Streamlit is well-suited for creating interactive dashboards that allow users to explore and analyze data from a variety of sources.
(d) Prototyping and experimentation: Streamlit provides an easy-to-use interface for prototyping and experimenting with new data science ideas and techniques, allowing users to quickly test and iterate on new ideas.
(e) Education and training: Streamlit can be used to create interactive educational tools and tutorials for students and learners [16,17].
6. TECHNIQUES
6.1 Repeat retailer
Credit card fraud detection using the repeat retailer technique utilizes the history of transactions made at a particular retailer to identify potentially fraudulent transactions. The basic idea is that if a cardholder has made several legitimate transactions at a particular retailer in the past, then any future transactions at that retailer are more likely to be legitimate as well. The system maintains a history of transactions made by each cardholder at each retailer. When a new transaction is made, the system checks to see whether the cardholder has made any previous transactions at the same retailer. If so, the system calculates various metrics, such as the average transaction amount, the time between transactions, and the location of the transactions [18]. The system compares the metrics of the new transaction to the historical metrics of the cardholder's previous transactions at the retailer. If the metrics of the new transaction are significantly different from the historical metrics, the system flags the transaction as potentially fraudulent and triggers a review process. Repeat retailer is just one of many techniques used in the detection of credit card fraud, and it is often used in combination with other techniques, such as anomaly detection and machine learning. By leveraging the history of transactions made by each cardholder, repeat retailer analysis can help identify potentially fraudulent transactions and reduce the incidence of credit card fraud. A graph model depicts the analysis of the dataset's 'repeat retailer' column: the predicted percentage of 'yes' is 88.2% and of 'no' is 11.8%.
6.2 Used chip

Credit card fraud detection using used_chip is a technique that utilizes the information stored on the chip of a credit card to identify potentially fraudulent transactions. The basic idea is that the information stored on the chip can provide additional authentication and validation that can help verify the legitimacy of a transaction. The system reads the information stored on the chip of the credit card, including the card number, expiration date, and other information [19]. The system compares this information to the information provided by the merchant, such as the transaction amount, the merchant name, and the location of the transaction. If the information provided by the merchant matches the information stored on the chip, the system assumes that the transaction is valid and approves it. If it does not match, the system flags the transaction as potentially fraudulent and triggers a review process. Used_chip is just one of many techniques used in the detection of credit card fraud, and it is often used in combination with other techniques, such as repeat retailer analysis and machine learning [20]. By utilizing the information stored on the chip of a credit card, used_chip can help verify the legitimacy of a transaction and reduce the incidence of credit card fraud. A graph model depicts the analysis of the dataset's 'used chip' column: the predicted percentage of 'yes' is 65.0% and of 'no' is 35.0%.
7. RESULT
Based on the precision and accuracy scores obtained, it is critical to analyse the fraud-detection application's particular evaluation standards and goals. If accuracy is the priority, Naive Bayes has the highest accuracy score. It is worth mentioning, however, that Naive Bayes had the lowest precision score, indicating a higher false-positive rate. If precision is the priority, Decision Tree obtained the highest precision score, meaning it was more accurate in classifying fraudulent transactions; on the other hand, it had a somewhat lower accuracy score. When accuracy and precision were considered together, logistic regression performed moderately in both measurements, producing a fair balance of the two. K-Nearest Neighbours (KNN) also performed well in both accuracy and precision. Based on these ratings, logistic regression appears to be the model of choice since it achieves a decent mix of accuracy and precision. However, the most appropriate model is ultimately determined by the application's specific requirements and goals; other variables such as computational complexity, interpretability, and scalability must also be considered.
Fig.6. Comparison graphs of four models based on accuracy and precision
8. CONCLUSION
Credit card theft is a serious worry for both financial institutions and customers, and machine learning algorithms have been shown to be useful in the real-time detection of fraudulent transactions. In this paper, we established a system for detecting credit card fraud by combining supervised and unsupervised algorithms to discover patterns that signal fraudulent behaviour, and combined the learned model with a web-based structure to create a straightforward user experience for real-time identification of fraud. The experimental results indicated that the suggested framework detected fraudulent transactions with high accuracy while minimising false positives. The proposed methodology can be used by banking organisations to enhance their fraud-identification abilities and avoid the financial losses caused by credit card theft. The suggested credit card fraud identification framework, based on Streamlit and machine learning, addresses this challenge effectively: to accurately identify fraudulent transactions, the platform includes several machine learning algorithms such as decision tree, XGBoost, random forest, and logistic regression. Future research can concentrate on increasing the framework's accuracy and performance and investigating the use of more sophisticated machine learning approaches.
9. REFERENCES

[1] Raj, S. Benson Edwin, and A. Annie Portia. "Analysis on credit card fraud detection methods." In 2011 International Conference on Computer, Communication and Electrical Technology (ICCCET), pp. 152-156. IEEE, 2011.

[2] Ghosh, Sushmito, and Douglas L. Reilly. "Credit card fraud detection with a neural-network." In System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on, vol. 3, pp. 621-630. IEEE, 1994.

[3] Chaudhary, Khyati, Jyoti Yadav, and Bhawna Mallick. "A review of fraud detection techniques: Credit card." International Journal of Computer Applications 45, no. 1 (2012): 39-44.

[4] Srivastava, Abhinav, Amlan Kundu, Shamik Sural, and Arun Majumdar. "Credit card fraud detection using hidden Markov model." IEEE Transactions on Dependable and Secure Computing 5, no. 1 (2008): 37-48.

[5] Awoyemi, John O., Adebayo O. Adetunmbi, and Samuel A. Oluwadare. "Credit card fraud detection using machine learning techniques: A comparative analysis." In 2017 International Conference on Computing Networking and Informatics (ICCNI), pp. 1-9. IEEE, 2017.

[6] Sahin, Yusuf, and Ekrem Duman. "Detecting credit card fraud by ANN and logistic regression." In 2011 International Symposium on Innovations in Intelligent Systems and Applications, pp. 315-319. IEEE, 2011.

[7] Kiran, Sai, Jyoti Guru, Rishabh Kumar, Naveen Kumar, Deepak Katariya, and Maheshwar Sharma. "Credit card fraud detection using Naïve Bayes model based and KNN classifier." International Journal of Advance Research, Ideas and Innovations in Technology 4, no. 3 (2018): 44.

[8] Husejinovic, Admel. "Credit card fraud detection using naive Bayesian and C4.5 decision tree classifiers." (2020): 1-5.

[9] Saheed, Yakub K., Moshood A. Hambali, Micheal O. Arowolo, and Yinusa A. Olasupo. "Application of GA feature selection on Naive Bayes, random forest and SVM for credit card fraud detection." In 2020 International Conference on Decision Aid Sciences and Application (DASA), pp. 1091-1097. IEEE, 2020.

[10] Varmedja, Dejan, Mirjana Karanovic, Srdjan Sladojevic, Marko Arsenovic, and Andras Anderla. "Credit card fraud detection - machine learning methods." In 2019 18th International Symposium INFOTEH-JAHORINA (INFOTEH), pp. 1-5. IEEE, 2019.

[11] Yee, Ong Shu, Saravanan Sagadevan, and Nurul Hashimah Ahamed Hassain Malim. "Credit card fraud detection using machine learning as data mining technique." Journal of Telecommunication, Electronic and Computer Engineering (JTEC) 10, no. 1-4 (2018): 23-27.

[12] Malini, N., and M. Pushpa. "Analysis on credit card fraud identification techniques based on KNN and outlier detection." In 2017 Third International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), pp. 255-258. IEEE, 2017.

[13] Ganji, Venkata Ratnam, and Siva Naga Prasad Mannem. "Credit card fraud detection using anti-k nearest neighbor algorithm." International Journal on Computer Science and Engineering 4, no. 6 (2012): 1035-1039.

[14] Vengatesan, K., A. Kumar, S. Yuvraj, V. Kumar, and S. Sabnis. "Credit card fraud detection using data analytic techniques." Advances in Mathematics: Scientific Journal 9, no. 3 (2020): 1185-1196.

[15] Zareapoor, Masoumeh, K. R. Seeja, and M. Afshar Alam. "Analysis on credit card fraud detection techniques: based on certain design criteria." International Journal of Computer Applications 52, no. 3 (2012).

[16] Nancy, A. Maria, G. Senthil Kumar, S. Veena, NA S. Vinoth, and Moinak Bandyopadhyay. "Fraud detection in credit card transaction using hybrid model." In AIP Conference Proceedings, vol. 2277, no. 1, p. 130010. AIP Publishing LLC, 2020.

[17] Kaur, Darshan. "Machine Learning Approach for Credit Card Fraud Detection (KNN & Naïve Bayes)." In Proceedings of the International Conference on Innovative Computing & Communications (ICICC), March 30, 2020.

[18] Saheed, Yakub Kayode, Usman Ahmad Baba, and Mustafa Ayobami Raji. "Big Data Analytics for Credit Card Fraud Detection Using Supervised Machine Learning Models." In Big Data Analytics in the Insurance Market, pp. 31-56. Emerald Publishing Limited, 2022.

[19] Adewumi, Aderemi O., and Andronicus A. Akinyelu. "A survey of machine-learning and nature-inspired based credit card fraud detection techniques." International Journal of System Assurance Engineering and Management 8 (2017): 937-953.

[20] Mehbodniya, Abolfazl, Izhar Alam, Sagar Pande, Rahul Neware, Kantilal Pitambar Rane, Mohammad Shabaz, and Mangena Venu Madhavan. "Financial fraud detection in healthcare using machine learning and deep learning techniques." Security and Communication Networks 2021 (2021): 1-8.

Handa, Akansha, Yash Dhawan, and Prabhat Semwal. "Hybrid analysis on credit card fraud detection using machine learning techniques." Handbook of Big Data Analytics and Forensics (2022): 223-238.
[21] Tiwari, Pooja, Simran Mehta, Nishtha
Sakhuja, Ishu Gupta, and Ashutosh Kumar Singh.
"Hybrid method in identifying the fraud detection
in the credit card." In Evolutionary Computing and
Mobile Sustainable Networks: Proceedings of
ICECMSN 2020, pp. 27-35. Springer Singapore,
2021.
[22] Kazemi, Zahra, and Houman Zarrabi. "Using deep networks for fraud detection in the credit card transactions." In 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI), pp. 0630-0633. IEEE, 2017.
[23] Faraji, Zahra. "A Review of Machine
Learning Applications for Credit Card Fraud
Detection with A Case study." SEISENSE Journal
of Management 5, no. 1 (2022): 49-59.
[24] Prusti, Debachudamani, and Santanu Kumar
Rath. "Web service based credit card fraud
detection by applying machine learning
techniques." In TENCON 2019-2019 IEEE Region
10 Conference (TENCON), pp. 492-497. IEEE,
2019.
[25] Ahammad, Jalal, Nazia Hossain, and
Mohammad Shafiul Alam. "Credit card fraud
detection using data pre-processing on imbalanced
data-Both oversampling and undersampling." In
Proceedings of the International Conference on
Computing Advancements, pp. 1-4. 2020
Early Prediction of Lifestyle Diseases Using ML

B. R. VENKATESH, 20191COM0025, Computer Science, COM-G04, Bangalore, India, 201910101142@presidencyuniversity.in
D. ABHIRAM, 20191COM0053, Computer Science, COM-G04, Bangalore, India, 201910101803@presidencyuniversity.in
DIVESH CHANDRABOINA, 20191COM0038, Computer Science, COM-G04, Bangalore, India, 201910101107@presidencyuniversity.in
Abstract— A doctor app is a web application created to make getting medical care, a diagnosis, and treatment as simple as possible for users. Due to their accessibility and convenience, doctor apps are growing in popularity, especially in areas where travelling to medical institutions is difficult or time-consuming. The system analyses real-time patient data, including symptoms, medical history, and test results, to suggest appropriate therapies and identify probable diagnoses. Machine learning algorithms analyse massive amounts of medical data, including patient history, test results, and symptom data, to uncover patterns and trends and to predict likely diagnoses, treatments, and patient outcomes. Numerous uses exist, such as customised and remote consultations, health monitoring, etc. This project focuses on enabling patients to easily schedule appointments with their physicians and specialists while providing professionals with access to patient information and real-time scheduling information via a user interface (UI). Machine learning algorithms are used to examine appointment data, identify scheduling trends and patterns, anticipate appointment length, and predict patient no-show rates, among other things. This enables doctors to schedule their time more effectively, reduce wait times, and boost patient satisfaction. Additionally, classification is used to verify and reschedule appointments if necessary. Overall, the ability to book appointments using a doctor's app has the potential to significantly raise the efficacy and level of healthcare.

Keywords—Machine Learning (ML), User Interface (UI), and Classification.

I. INTRODUCTION

Technology advancement is hastening the transformation of the healthcare sector due to the rise of global influence and evolving societal attitudes. Today, it is easy to identify the areas of healthcare delivery that have failed. Doctors and patients have both experienced stress as a result of cancelled and postponed appointments. One of the options that, in our opinion, can improve communication between a doctor and a patient is the ability to quickly schedule an appointment online. A medical appointment scheduling programme also makes patients and physicians more at ease in situations that are progressively becoming more typical. For instance, even if a
C) Input Symptoms
D) Medicine Bookings
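The symptom-input step (C) feeds a classifier that suggests a likely disease. A minimal rule-based sketch of that flow, with a hypothetical symptom-to-disease mapping (illustrative only, not the trained model described in the paper):

```python
# Hypothetical symptom -> disease knowledge base (illustrative only).
DISEASE_SYMPTOMS = {
    "diabetes": {"frequent thirst", "fatigue", "blurred vision"},
    "hypertension": {"headache", "dizziness", "chest pain"},
    "migraine": {"headache", "nausea", "light sensitivity"},
}

def predict_disease(symptoms: set) -> str:
    """Return the disease whose symptom set overlaps most with the input."""
    scores = {d: len(symptoms & s) for d, s in DISEASE_SYMPTOMS.items()}
    return max(scores, key=scores.get)

print(predict_disease({"headache", "nausea"}))  # migraine
```

The actual system replaces this lookup with the best-performing trained ML model, served through the web framework, but the input/output contract (symptom set in, predicted disease out) is the same.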
Ⅵ. CONCLUSION

In this project, a machine learning based early lifestyle-disease detection system with a medical assistant is created by evaluating a variety of machine learning algorithms against an early lifestyle-disease dataset. The Flask web application framework then uses the most efficient method to forecast disease. This paradigm was used to develop a health portal with a single interface, including scheduling doctor appointments, ordering medications, and illness prediction.

Future Work:
In future, disease data for different diseases can be collected and trained using deep learning methods to obtain more effective results and accuracy. Segmentation of MRI scans can be applied, and the resulting dataset can be integrated into the website.
ABSTRACT:

A record-breaking volume of publicly accessible user-generated data is now available because of social networks' widespread use. This data can be analyzed to learn about people's thoughts and feelings.

On the other hand, text communication over web-based networking media might be a little overwhelming. Due to social media platforms, a significant volume of unstructured data is produced on the Internet every second. To understand human psychology, the data must be analyzed as quickly as it is generated. This can be accomplished with the use of sentiment analysis, which recognizes polarity in texts: it determines whether the user has a negative, positive, or neutral attitude towards a product, administration, person, or place. In some applications sentiment analysis is insufficient, and emotion detection, which identifies a person's emotional or mental state precisely, is necessary.

A Twitter post's text is retrieved and preprocessed to get rid of unnecessary words and useless information. After that, the text is tokenized, which involves breaking it up into separate words or phrases. As the meaning of the text might vary depending on the context in which specific words are used, this stage is crucial for analyzing the sentiment of the post. Several natural language processing techniques are then used to analyse the text for sentiment. This can involve examining the polarity of certain words or phrases as well as the post's general tone and context. After that, a model is trained to identify various emotions in tweets, making use of large collections of tweets with known emotions to train the model and then using it to determine the sentiment of new tweets from their text.

The goal of the research discussed in this paper is to identify and examine the sentiment and emotion people express through text in their tweets, then use that information to generate recommendations. It is possible to identify tweets that indicate negative emotions like rage or worry using Twitter emotion recognition, which enables the early identification of potential crises or public health problems. It can also be used to measure changes in feelings over time, giving information about how certain things or policies affect people's attitudes.
Keywords:- Emotion detection, Natural language processing (NLP), tweets, Twitter, sentiment analysis, emotion.

I. INTRODUCTION:-

Social media has evolved into a global forum for people to express their ideas and feelings in the current digital era. Research has demonstrated that people may express a wide range of emotions through textual communication in addition to nonverbal clues like facial expressions and voice tone. People now frequently express their emotions through written communication, on social media sites in particular. Users can express their ideas and opinions on a variety of subjects on prominent social media sites like Twitter, and these tweets may express a variety of feelings, including joy, sadness, anger, love, fear, and surprise. Grouping such tweets by conventional means is challenging and leads to inaccurate analysis, limited understanding of customer sentiment, difficulty in identifying trends, and inability to personalize content. That is where emotion detection enters the scene. Emotion detection in tweets refers to the process of automatically identifying the emotions expressed in a tweet, and it has several uses, such as customer feedback analysis and sentiment analysis. Businesses and organizations can learn a lot about how their customers feel about their goods or services by examining the emotions portrayed. Among the many techniques used for emotion identification in tweets, we will utilize natural language processing (NLP), which involves training models on huge datasets of labeled tweets to discover patterns in the text that correspond to distinct emotions.

The accuracy of emotion detection in tweets can vary depending on the quality of the datasets used for training, the complexity of the emotions being detected, and the specific techniques and algorithms employed.

The objective of this study is to construct a Twitter emotion recognition platform to find and examine the emotions conveyed in tweets, such as joy, sadness, love, anger, fear, and surprise, in order to provide insights into human behaviour, attitudes, and trends.
Fig 1. Example histogram showing the number of tweets for the different classes.
Sentiment analysis is a technique for identifying the emotional undertone of a text or document. It has a wide range of uses in marketing, politics, social media analysis, and customer feedback analysis, among other fields.

Since Twitter is a well-liked microblogging network that generates copious amounts of data in real-time, it is the perfect source of information for sentiment analysis. Researchers can learn more about the beliefs, attitudes, and feelings of Twitter users by examining the sentiment of tweets about diverse subjects, brands, events, and people.

The project calls for the use of the Python programming language and some of its well-known libraries for NLP, including NLTK, SpaCy, and scikit-learn. The Twitter API is used to gather tweets about a specific subject, company, or event. Stop words, punctuation, and URLs are among the noise and extraneous information that are preprocessed out of the captured tweets. The proper emotion is then assigned to each tweet using an emotion vocabulary, such as the NRC Emotion Lexicon.

The tweets are categorised into positive, negative, and neutral emotions using machine learning methods like Support Vector Machines (SVM), Naive Bayes, or Random Forest. The effectiveness of the emotion detection model is measured using evaluation measures like precision, recall, F1-score, and accuracy.

Finally, the outcomes of the sentiment analysis are visualised using visualisation packages like Matplotlib or Seaborn. The information from the analysis can assist researchers and organisations in making data-driven decisions, such as enhancing brand reputation, resolving client issues, and monitoring public opinion.
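The preprocessing and lexicon-assignment steps described above can be sketched in a few lines. The tiny word-emotion lexicon here is a made-up stand-in for a resource like the NRC Emotion Lexicon:

```python
import re
from collections import Counter

# Tiny stand-in for an emotion lexicon such as NRC (illustrative only).
EMOTION_LEXICON = {
    "happy": "joy", "love": "joy", "great": "joy",
    "sad": "sadness", "cry": "sadness",
    "angry": "anger", "hate": "anger",
    "scared": "fear", "worry": "fear",
}

STOP_WORDS = {"i", "am", "so", "the", "a", "and", "this", "is"}

def tokenize(tweet: str) -> list:
    """Lowercase, strip URLs and punctuation, drop stop words."""
    tweet = re.sub(r"https?://\S+", "", tweet.lower())
    tokens = re.findall(r"[a-z']+", tweet)
    return [t for t in tokens if t not in STOP_WORDS]

def detect_emotion(tweet: str) -> str:
    """Assign the emotion whose lexicon words appear most often."""
    counts = Counter(EMOTION_LEXICON[t] for t in tokenize(tweet)
                     if t in EMOTION_LEXICON)
    return counts.most_common(1)[0][0] if counts else "neutral"

print(detect_emotion("I am so happy, love this! https://t.co/x"))  # joy
```

A trained classifier (SVM, Naive Bayes, Random Forest) replaces the lexicon lookup in the full pipeline, but the tokenization front end stays essentially the same.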
Fig 2. Graph showing accuracy per epoch
Fig 3. Graph showing loss per epoch
V. CONCLUSION:-

In this paper we discussed Twitter sentiment analysis and how it can be applied to other social media platforms. We took data from Hugging Face and processed it using natural language processing (NLP). The experiment in this project was intended to classify texts into categories such as positive, neutral, or negative. Importing the data had three main steps: 1. importing the Tweet Emotion dataset, 2. creating train, validation, and test sets, and 3. extracting tweets and labels from the examples. These steps helped us understand the procedure for sentiment analysis; multiple models were created, trained on the existing data, and used to derive conclusions.

This is a small step towards creating a robust system for sentiment analysis that can derive the actual subjective meaning behind a text. This can be helpful in a day and age where people rely heavily on social media platforms for their daily news, and through our project we could help detect harmful, violent, offensive, and hateful content and report it. This project has limitless capabilities which can be utilized for the betterment of social media platforms and can lead to a safer and more secure internet for all. The project can be updated with time, and we as a team will make sure it is updated and serves the purpose it is intended to; with time there will be more use cases we are not even aware of. So, we consider this project one step in the right direction in the progression of sentiment analysis.
Effective Conversational AI Platform For Tourism Chatbot Using RASA

1st BHAVANA NP, 20191ISE0025, Dept. of ISE, Presidency University, Bengaluru, Karnataka
2nd SABIRA BI, 20191ISE0141, Dept. of ISE, Presidency University, Bengaluru, Karnataka
3rd MADHUSHREE C, 20191ISE0089, Dept. of ISE, Presidency University, Bengaluru, Karnataka

Software Configuration:
• Operating System: Windows 10
• Server-side Script: Python 3.7.9
• IDE: VS Code
A chatbot is an NLP software that can simulate a conversation (or a chat) with a user in natural language through messaging applications, websites, mobile apps, or the telephone.

5.1.1 Turn human language into structured data

Rasa Open Source provides open source natural language processing to turn messages from your users into intents and entities that chatbots understand. Based on lower-level machine learning libraries like TensorFlow and spaCy, Rasa Open Source provides natural language processing software that's approachable and as customizable as you need. Get up and running fast with easy-to-use default configurations, or swap out custom components and fine-tune hyperparameters to get the best possible performance for your dataset.

5.1.2 What is natural language processing?

Natural language processing is a category of machine learning that analyzes freeform text and turns it into structured data. Natural language understanding is a subset of NLP that classifies the intent, or meaning, of text based on the context and content of the message. The difference between NLP and NLU is that natural language understanding goes beyond converting text to its semantic parts and interprets the significance of what the user has said.

Rasa Open Source is a robust platform that includes natural language understanding and open source natural language processing. It's a full toolset for extracting the important keywords, or entities, from user messages, as well as the meaning or intent behind those messages. The output is a standardized, machine-readable version of the user's message, which is used to determine the chatbot's next action.

5.1.3 Why open source NLP?

Rasa Open Source is licensed under the Apache 2.0 license, and the full code for the project is hosted on GitHub. Rasa Open Source is actively maintained by a team of Rasa engineers and machine learning researchers, as well as open source contributors from around the world. This collaboration fosters rapid innovation and software stability through the collective efforts and talents of the community.

Unlike NLP solutions that simply provide an API, Rasa Open Source gives you complete visibility into the underlying systems and machine learning algorithms. NLP APIs can be an unpredictable black box: you can't be sure why the system returned a certain prediction, and you can't troubleshoot or adjust the system parameters. Rasa Open Source is completely transparent. You can see the source code, modify the components, and understand why your models behave the way they do. Open source NLP also offers the most flexible solution for teams building chatbots and AI assistants. The modular architecture and open code base mean you can plug in your own pre-trained models and word embeddings, build custom components, and tune models with precision for your unique data set. Rasa Open Source works out of the box with pre-trained models like BERT, HuggingFace Transformers, GPT, spaCy, and more, and you can incorporate custom modules like spell checkers and sentiment analysis.

VI. Results

Future enhancement

Future enhancements for AI technologies in Indian tourism sectors providing online services include:
Personalized recommendations: Implementing advanced machine learning algorithms can enable personalized recommendations based on user preferences, previous bookings, and browsing history, enhancing the overall customer experience.
Voice-enabled interactions: Integrating voice assistants and natural language processing capabilities can allow users to interact with online tourism services using voice commands, providing a more convenient and hands-free experience.
Augmented reality (AR) experiences: Utilizing AR technologies can offer immersive experiences, allowing users to virtually explore destinations, view hotel rooms, or take virtual tours, enabling them to make more informed decisions.
Sentiment analysis: Employing sentiment analysis techniques on customer reviews and social media data can provide valuable insights into customer satisfaction levels, enabling tourism sectors to address concerns and improve service quality.
Data-driven forecasting: Utilizing AI algorithms to analyse historical data and trends can help in forecasting demand, optimizing pricing strategies, and resource allocation, leading to improved operational efficiency.

Conclusion

Chatbots are a thing of the future which is yet to uncover its potential, but with their rising popularity and craze among companies, they are bound to stay here for long. With new types of chatbots being introduced, it is of great excitement to witness the growth of a new domain in technology while surpassing the previous threshold. We are inventing the system because of the needs of the increasing population of our country. As we know, if we want to travel from one place to another (national/international), we need to go to travel centres to get all the information about the travel arrangements: how the travel is maintained, the availability of travel options (buses, trains, flights), timings, stops, availability of seats, food facilities (inside flights), and travel management (the staff such as driver, conductor, pilot, food providers, and first-aid facilities while traveling), etc. Thus, the tourism chatbot will give this assistance to students and passengers, with no need to visit the travel centres.
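The intent-and-entity extraction that Rasa Open Source performs (Section 5.1.1) is configured through YAML training data. A short sketch with hypothetical tourism intents (the intent and entity names are illustrative, not taken from the paper):

```yaml
version: "3.1"

nlu:
- intent: greet
  examples: |
    - hi
    - hello there
- intent: ask_travel_info
  examples: |
    - which buses go to [Mysore](destination)
    - are there trains to [Chennai](destination) tomorrow
    - show flights to [Delhi](destination)
```

Running `rasa train` on data like this produces the NLU model that maps a free-text message to an intent plus extracted entities, which in turn drives the chatbot's next action.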
Organizations and firms compete with each other based on the productivity of their workforce, which is highly dependent on the working environment. The human resource (HR) department plays a critical role in creating and maintaining a suitable environment that promotes stable and collaborative employees. HR can achieve this by analyzing the employees' database records to improve decision-making and prevent employee attrition [1,2]. Employee attrition occurs when productive employees leave the organization due to reasons such as work pressure, an unsuitable environment, or dissatisfaction with salary, which negatively affects the organization's productivity as it loses productive employees and other resources, such as HR staff efforts in recruiting [3] and training new employees.

To prevent or reduce the impact of employee attrition, predicting it before it occurs is crucial. Studies have shown that happy and motivated employees tend to be more creative, productive, and perform better [4]. Artificial intelligence (AI) has recently been utilized in many different fields, including predicting employee attrition.

II. LITERATURE REVIEW

In the literature, employee attrition has been investigated from various perspectives. Some studies focused on analyzing employees' behavior to identify the reasons behind their decision to leave or stay with the organization [6,7]. Other studies utilized machine learning algorithms to predict employee attrition based on their records. Alduayj and Rajpoot [8] utilized several machine learning models, including random forests, k-nearest neighbors, and support vector machines with different kernel functions, and used different forms of the IBM attrition dataset. However, their system's accuracy with the original class-imbalanced dataset was not satisfactory, despite achieving high accuracy with the synthetic dataset. Usha and Balaji [9] used the same dataset to compare several machine learning algorithms such as decision tree, naive Bayes, and k-means for prediction, but their work lacked the data preprocessing stage, resulting in poor accuracy. Fallucchi et al. [3] studied the reasons that drive an employee to leave the organization and utilized various machine learning techniques, including naive Bayes, logistic regression, k-nearest neighbor, decision tree, random forests, and support vector machines, to select the best classifier. Although they validated their work using cross-validation and train-test split, their results included only the 70%:30% train-test split without discussing cross-validation. The test accuracy was better than the training accuracy, indicating potential improvement. Zangeneh et al. proposed a three-stage framework for attrition prediction, utilizing the "max-out" feature selection method for data reduction, a logistic regression model for prediction, and confidence analysis for prediction model validation. However, their system was highly complex, and the accuracy was unsatisfactory.

Overall, the prediction accuracy of these studies still needs improvement to achieve higher confidence. This work proposes using deep learning and data preprocessing techniques to increase the prediction accuracy and improve upon the state-of-the-art methodologies utilizing the IBM HR dataset.

III. METHODOLOGY

The proposed work analyses the respective dataset to detect the most influential features that affect the prediction and builds a predictive model according to the following phases.

1. Dataset Description

Table 1. IBM dataset features

Feature name             Type      Feature name              Type
Age                      Number    MonthlyIncome             Number
BusinessTravel           Category  MonthlyRate               Number
DailyRate                Number    NumCompaniesWorked        Number
Department               Category  Over18                    Category
DistanceFromHome         Number    OverTime                  Category
Education                Category  PercentSalaryHike         Number
EducationField           Category  PerformanceRating         Number
EmployeeCount            Number    RelationshipSatisfaction  Category
EmployeeNumber           Number    StandardHours             Number
EnvironmentSatisfaction  Category  StockOptionLevel          Category
Gender                   Category  TotalWorkingHours         Number
HourlyRate               Number    TrainingTimesLastYear     Number
JobInvolvement           Category  WorkLifeBalance           Category
JobLevel                 Category  YearsAtCompany            Number
EducationField           Category  YearsInCurrentRole        Number
JobRole                  Category  YearSinceLastPromotion    Number
JobSatisfaction          Category  YearsWithCurrentManager   Number
MaritalStatus            Category  Attrition                 Category
2.3 Rescaling
The standard score of a sample x is calculated as:

z = (x - u) / s

where u is the mean of the training samples and s is their standard deviation.
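As a sketch of this rescaling step (pure Python; using the population standard deviation, i.e. ddof = 0, which matches scikit-learn's StandardScaler default — an assumption, since the paper does not name its implementation):

```python
from statistics import fmean, pstdev

def standardize(values):
    """Rescale a feature column to zero mean and unit variance (z-scores)."""
    u = fmean(values)    # mean of the training samples
    s = pstdev(values)   # population standard deviation (ddof = 0)
    return [(x - u) / s for x in values]

ages = [25, 30, 35, 40, 45]
z = standardize(ages)
# The rescaled column has mean 0 and standard deviation 1;
# the middle sample equals the mean, so its z-score is 0.
```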
Fig 2. Imbalanced and balanced dataset. (a) Original imbalanced dataset. (b)
Synthetic balanced dataset.
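This excerpt does not state which balancing technique produced the synthetic set in Fig. 2(b) (SMOTE is the usual choice for synthetic samples); as a minimal stand-in, simple random oversampling of the minority class already yields equal class counts:

```python
import random
from collections import Counter

def oversample(rows, label_of, seed=0):
    """Balance a dataset by resampling minority-class rows with replacement."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Draw extra copies until this class reaches the majority count.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced

# Toy attrition data: 6 "No" rows vs 2 "Yes" rows.
data = [("No", i) for i in range(6)] + [("Yes", i) for i in range(2)]
balanced = oversample(data, label_of=lambda r: r[0])
print(Counter(r[0] for r in balanced))  # both classes now have 6 rows
```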
Abstract— Centralised exchanges have been the traditional
method of trading cryptocurrencies since their inception.
These exchanges are owned and operated by a single entity
that controls all aspects of the platform, including the
matching engine, wallet management, and asset
custodianship. However, with the advent of blockchain
technology, decentralized exchanges (DEXs) have emerged
as an alternative to centralised exchanges. DEXs operate on
a peer-to-peer network, allowing users to trade
cryptocurrencies without the need for a central authority.
A comparison of DEX platforms reveals issues with
performance, security, privacy, and adoption. While DEXs
allow for trustless, transparent trading, regulatory
compliance and liquidity constraints limit their widespread
adoption. This paper evaluates the state and potential of
DEXs to transform how cryptocurrencies and digital assets
are exchanged by examining DEX mechanisms as an
upgrade over centralised authority and control.

Keywords - Decentralized Exchange, Transparency,
Smart contracts, Automated market makers (AMMs),
Order book models, Cross-chain
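The automated market makers (AMMs) listed in the keywords typically price trades with the constant-product rule x · y = k popularized by Uniswap; the sketch below illustrates that general rule, not any specific mechanism described in this paper:

```python
class ConstantProductAMM:
    """Minimal constant-product market maker: reserves satisfy x * y = k."""

    def __init__(self, reserve_x, reserve_y, fee=0.003):
        self.x, self.y, self.fee = reserve_x, reserve_y, fee

    def swap_x_for_y(self, dx):
        """Trade dx of token X into the pool; return the amount of Y paid out."""
        dx_after_fee = dx * (1 - self.fee)   # fee stays in the pool
        k = self.x * self.y
        dy = self.y - k / (self.x + dx_after_fee)  # preserves x * y = k
        self.x, self.y = self.x + dx, self.y - dy
        return dy

# With no fee, swapping 100 X into a 1000/1000 pool pays out
# 1000 - 1_000_000 / 1100 ≈ 90.91 Y, and the product of reserves is unchanged.
pool = ConstantProductAMM(1_000.0, 1_000.0, fee=0.0)
out = pool.swap_x_for_y(100.0)
```

Note the price impact: the trade receives less than the 100 Y a fixed 1:1 price would give, which is how the pool resists being drained.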
4th Pagidela Venkata Mokshith Reddy, 20191CSE0406, Department of Computer Science and Engineering, Presidency University, Bangalore, India. 201910100960@presidencyuniversity.in

5th Pachipulusu Akash Kumar, 20191CSE0405, Department of Computer Science and Engineering, Presidency University, Bangalore, India. 201910101141@presidencyuniversity.in

6th Sarojini T Habbli, 20191CSE0534, Department of Computer Science and Engineering, Presidency University, Bangalore, India. 201910102074@presidencyuniversity.in
Abstract— Online social networks (OSNs) have grown in
popularity and are now more closely tied to people's social
activities than ever before. People use OSNs to communicate
with one another, exchange news, plan activities, and even
run their own online businesses. Attackers and imposters
have been drawn to OSNs by their explosive growth and the
vast quantity of personal data they collect from their users,
in order to steal personal information, spread malicious
activity, and share false information. In response, academics
have begun to look into effective methods for spotting
suspicious activity and bogus accounts using account
features and classification algorithms. However, some of the
account characteristics that are exploited have an adverse
effect on the results or no effect at all, and using
classification algorithms independently does not always
produce satisfactory results. In this paper, three feature
selection and dimension reduction techniques were used to
create the decision tree, which is suggested to provide
effective detection of fake Instagram accounts. To determine
whether a target account was genuine or fake, three machine
learning classification algorithms were used: Decision Tree,
Random Forest, and Logistic Regression.

Keywords—Decision Tree, Random Forest, Logistic Regression.

Introduction

Online social networks (OSNs) such as Facebook, Twitter,
LinkedIn, and Google+ have become increasingly popular
over the last few years. People use OSNs to stay in contact
with each other, share news, organize events, and even run
their own e-businesses. Between 2014 and 2018, around 2.53
million U.S. dollars were spent by non-profits on sponsoring
political advertisements on Facebook. The open nature of
OSNs and the massive
amount of personal data on their subscribers have made them
vulnerable to Sybil attacks. In 2012, Facebook reported abuse
on its platform, including the publishing of false news, hate
speech, and sensational and polarizing content, among others.
At the same time, OSNs have attracted the interest of
researchers for mining and analysing their massive amounts
of data, exploring and studying user behaviour, and detecting
abnormal activities. Researchers have, for example, studied
how to predict, analyse, and explain customers' loyalty
towards a social-media-based online brand community by
identifying the most effective cognitive features that predict
customer attitude. The Facebook community continues to
grow, with more than 2.2 billion monthly active users and 1.4
billion daily active users, an increase of 11% on a
year-over-year basis. In the second quarter of 2018 alone,
Facebook stated that its total income was $13.2 billion, with
$13.0 billion from advertisements alone. Similarly, in the
second quarter of 2018, Twitter reported reaching about one
billion subscribers, with 335 million monthly active users. In
2017, Twitter reported consistent revenue growth of 2.44
billion U.S. dollars, with profit 108 million U.S. dollars lower
than in the previous year. In 2015, Facebook estimated that
almost 14 million of its monthly active users were in fact
undesirable, representing malicious fake accounts created in
violation of the site's terms of service. In the first quarter of
2018, Facebook for the first time shared a report on the
internal guidelines used to enforce its community standards,
covering its efforts between October 2017 and March 2018.
The report details the amount of undesirable content removed
by Facebook across six categories: graphic violence, adult
nudity and sexual activity, terrorist propaganda, hate speech,
spam, and fake accounts: 837 million spam posts were taken
down, about 583 million fake accounts were disabled, and
around 81 million further pieces of content violating the
remaining categories were removed. However, even after
stopping tens of millions of fake accounts, Facebook
estimated that around 88 million accounts were nevertheless
fake. For such OSNs, the existence of fake accounts leads
advertisers, developers, and investors to mistrust their
reported user metrics, which can negatively affect their
revenues; recently, banks and financial institutions in the
U.S. have begun to analyse the Twitter and Facebook
accounts of mortgage applicants before actually granting a
loan. Attackers follow the idea that OSN user accounts are
"keys to walled gardens", so they pass themselves off as
someone else, using images and profiles that are either taken
from a real person without his or her knowledge or generated
artificially, to spread fake news and steal personal
information. These fake accounts are commonly known as
imposters. In both cases, such fake accounts have a harmful
impact on users, and their motives are rarely good: they
usually flood spam messages or steal private data, and they
are eager to phish naive users into phony relationships that
lead to romance scams, human trafficking, and even political
astroturfing. Statistics show that 40% of parents in the
United States and 18% of teens have great concern about the
use of fake accounts and bots on social media to sell or
influence products. As another example, during the 2012 US
election campaign, the Twitter account of challenger Romney
experienced a sudden jump in the number of followers; the
great majority of them were later claimed to be fake
followers. To increase their effectiveness, these malicious
accounts are frequently armed with stealthy automated
tweeting programs that mimic real users, known as bots. In
December 2015, Adrian Chen, a reporter for the New Yorker,
mentioned that he had seen a lot of the Russian accounts he
was monitoring switch to pro-Trump efforts, though many of
those were better described as troll accounts managed by real
people meant to mimic American social media users.
Similarly, before the general Italian elections of February
2013, online blogs and newspapers reported statistical data
on a supposed percentage of fake followers of the major
candidates. Detecting these threatening accounts in OSNs
has become a must in order to avoid various malicious
activities, ensure the protection of user accounts, and defend
private information. Researchers try to develop automatic
detection tools for identifying fake accounts, a task that
would be labour-intensive and expensive if performed
manually. This line of research may enable an OSN operator
to detect fake accounts efficiently and effectively; it would
enhance the experience of its users by blocking annoying
spam messages and other abusive content. The OSN operator
could additionally strengthen the credibility of its user
metrics and allow third parties to trust its user accounts.
Information security and privacy are among the foremost
requirements of social network users, and maintaining and
providing them increases a network's credibility and, in
consequence, its revenues. OSNs use a variety of detection
algorithms and mitigation methods to tackle the growing
danger of fake and malicious accounts. Researchers focus on
identifying fake accounts by analysing user-level activity,
extracting features from recent users (e.g. number of posts,
number of followers, profile attributes) and applying trained
machine learning methods to classify accounts as real or
fake. Another approach works at the graph level, where the
OSN is modelled as a set of nodes and edges: each node
represents an entity (e.g. an account) and each edge
represents a relationship (e.g. a friendship). Though Sybil
accounts find ways to cloak their behaviour with patterns
resembling real accounts, they still manifest distinctive
profile elements and activity patterns. Even so, automated
Sybil detection is not always robust against adversarial
attacks and does not always yield acceptable accuracy. In
this work, the Random Forest classification algorithm has
been run on the decision values obtained from the support
vector machine (SVM). We also verified the detection
capabilities of our classifiers using two additional sets of
genuine and fake accounts that were unrelated to the initial
training dataset, gave a summary of the studies done on the
Twitter network and earlier work on fake profile detection,
and showed how the data was pre-processed and how the
results were used to categorize accounts as fake or genuine.
The overall accuracy rates have been examined and evaluated
in relation to all the other applied techniques.

LITERATURE SURVEY

Facebook political advertising is the most recent in a long
line of advancements in campaign strategy, and it has been
widely used in elections all over the world. We [1] argue that
existing measures provide little insight into current campaign
trends, raising analytical, methodological, and normative
issues for academics and electoral authorities alike.
Large-scale peer-to-peer systems face security risks from
unreliable or malicious remote computing components. To
counter these dangers, many of these systems employ
redundancy; however, the redundancy can be undermined if
a single flawed entity can assume several [2] identities and
control a sizable chunk of the system. Another paper
discusses numerous anomaly types and their novel [3]
categorization according to distinct traits, along with a
variety of approaches for preventing and identifying
anomalies, the underlying presumptions and causes of such
anomalies, and several data mining techniques for finding
abnormalities. The objective of a further study was to [4]
ascertain how much perceived value, service quality, and
social variables influenced users' inclination to stay with the
social-media-based online brand community of a major
automaker.

PROPOSED METHOD

The proposed application can be considered a useful system,
since it helps to reduce the limitations of traditional and
other existing methods. The system is built around a
powerful classification algorithm in a Python-based
environment.

ADVANTAGES
• Good accuracy.
• No need for skilled personnel.

The overall accuracy rates obtained were:
Logistic regression - 89.95
Decision tree classifier - 89.47
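The three-classifier comparison can be sketched with scikit-learn. The account features below (followers, followings, posts, profile-picture flag) are synthetic assumptions for illustration, not the paper's feature set, so the printed accuracies will not reproduce the 89.95 / 89.47 figures above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Hypothetical profiles: genuine accounts have more followers and a profile
# picture; fake accounts follow many users but have few followers and posts.
genuine = np.column_stack([rng.poisson(300, n), rng.poisson(200, n),
                           rng.poisson(80, n), np.ones(n)])
fake = np.column_stack([rng.poisson(20, n), rng.poisson(900, n),
                        rng.poisson(5, n), rng.integers(0, 2, n)])
X = np.vstack([genuine, fake])
y = np.array([0] * n + [1] * n)   # 0 = genuine, 1 = fake
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

for model in (DecisionTreeClassifier(random_state=1),
              RandomForestClassifier(random_state=1),
              LogisticRegression(max_iter=1000)):
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, round(100 * acc, 2))
```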
Abstract - Prediction of the stock market is an important area of research that has received a lot of
attention from academics and practitioners alike. This paper presents a survey of recent advances in
stock market prediction strategies and proposes a long short-term memory (LSTM) neural network
for the study of deep learning methods. The majority of algorithms fail in practice due to the
market's non-stationarity and high volatility. Consequently, the combination of Elliott Wave Theory
and LSTM neural network models is proposed as a novel approach to predicting stock market
prices. Elliott Wave Theory is a method of technical analysis that studies price patterns and wave
structures in financial markets, while LSTM is a type of neural network model capable of
identifying sequential patterns in time-series data. After normalizing and pre-processing the stock
data, Elliott Wave Theory is applied to the processed data to determine the current market phase.
We then construct a deep LSTM model to anticipate a retracement point that serves as an ideal
entry point for maximising profits. Using the four evaluation criteria RMSE, MAE, MAPE, and
RME, the rationality of the LSTM neural network can be thoroughly examined. Overall, this study
demonstrates the potential of combining conventional technical analysis with deep learning
techniques for stock market price prediction.

Keywords - Elliott Wave Theory, LSTM, RMSE, RME, Neural Network, Deep learning, Stock Market
Prediction.
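The evaluation criteria named in the abstract can be computed directly from the actual and predicted price series; a minimal sketch of RMSE, MAE, and MAPE follows (RME is not defined in this excerpt, so it is omitted):

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error (actual values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative closing prices, not data from this paper.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [99.0, 103.0, 100.0, 107.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```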
F. Model evaluation

Table 2: Expected predicted values of open and close price

The LSTM model may need to be adjusted in light of the
backtesting findings. Changing the model's features or its
hyperparameters may be necessary to achieve this. The
model's resistance to changes in the input data and market
circumstances should also be assessed.

Figure 4: Visualization graph of the actual and predicted values for open price

I. Deployment

At last, we can set up the LSTM model in a real-world
setting and keep track of how it performs over time. To
guarantee that the model keeps performing well, we should
also carry out routine upgrades and maintenance.

IV. Results and Discussion

Stock market prediction is a highly debated topic among
experts and investors alike. Some believe that it is possible
to accurately forecast the market's course, while others argue
that the stock market is inherently unpredictable and that
attempting to make accurate predictions is a fool's errand.

The combination of Elliott Wave Theory with LSTM for
stock market price prediction is an interesting research
direction that has shown promising results. The proposed
method leverages the strengths of both approaches, with
Elliott Wave Theory providing a framework for analysing
the historical data and LSTM providing a powerful tool for
time series analysis and prediction.

One of the key advantages of combining Elliott Wave
Theory with LSTM is the potential for improved prediction
accuracy. Elliott Wave Theory provides a unique perspective
on stock price data, by identifying wave patterns and trends
that can reveal underlying market dynamics. By
incorporating this information into the LSTM model, it is
possible to enhance the model's ability to capture patterns
and trends in the data, leading to potentially more accurate
predictions. Another advantage is the potential for increased
interpretability. Elliott Wave Theory provides a structured
framework for analyzing stock price data, which allows for
better understanding of the market dynamics and trends.
This can help in interpreting the LSTM model's predictions,
as the wave patterns identified by Elliott Wave Theory can
provide meaningful insights into the expected direction of
stock prices.

Furthermore, the combination of Elliott Wave Theory with
LSTM has the potential to capture both short-term and
long-term price movements. Elliott Wave Theory is known for its
ability to identify both impulse waves (short-term trends)
and corrective waves (long-term trends), which can provide
a holistic view of the market dynamics. LSTM, on the other
hand, is capable of capturing both long-term and short-term
dependencies in time series data, making it a suitable tool
for analyzing both types of waves identified by Elliott Wave
Theory.

However, there are also limitations to this combined
approach. One limitation is the subjective nature of Elliott
Wave Theory. The identification and classification of wave
patterns can be subjective and prone to human biases, which
may introduce uncertainties and errors into the analysis.
This can impact the accuracy and reliability of the combined
approach.

Another limitation is the potential for overfitting. LSTM
models are known to be prone to overfitting, especially
when dealing with noisy and complex financial data.
Incorporating additional features from Elliott Wave Theory
may increase the complexity of the model, leading to
potential overfitting issues. Careful feature selection and
regularization techniques may be needed to mitigate this
limitation.

In addition, a variety of factors, including macroeconomic
statistics, market mood, geopolitical developments, and
more, have an impact on the stock market. While Elliott
Wave Theory and LSTM can capture some of the patterns in
stock price data, they may not fully capture all the relevant
factors that impact stock prices. It is important to consider
the limitations and uncertainties associated with using any
predictive model in the complex and dynamic stock market
environment.

V. Conclusion

In conclusion, forecasting the stock market is difficult, and
no one can be certain of their predictions. The stock market
is influenced by numerous factors, and these factors are
often complex and unpredictable. Nevertheless, the
combination of Elliott Wave Theory with LSTM for stock
market price prediction holds great promise for improving
the accuracy of stock price predictions. The proposed
method leverages the wave structure identified by Elliott
Wave Theory to provide additional insights into the
historical data, and then uses LSTM to analyse the time
series data and make predictions for future stock prices.

The results obtained from this research are encouraging, as
they demonstrate that the combined approach can achieve
better prediction accuracy compared to using LSTM alone
or Elliott Wave Theory alone. The accuracy of the
predictions can potentially benefit investors and traders in
making more informed decisions in the stock market,
leading to improved investment outcomes.

It is crucial to keep in mind that stock market forecasting is
a difficult and constantly changing topic, and no strategy
can ensure 100% accuracy. There are inherent risks and
uncertainties associated with stock market investments;
therefore, when making financial decisions based on
projections, caution should always be used.
DETECTION OF BRAIN STROKE USING MACHINE LEARNING
Abstract - In many nations, stroke is the main cause of
mortality and disability. This study preprocesses data to
improve the image quality of CT scans of stroke patients,
optimising image quality to improve results and reduce
noise, and applies machine learning algorithms to classify
the patients' images into two subtypes of stroke disease:
ischemic stroke and haemorrhagic stroke. In this work, the
categorization of brain stroke disease is done using four
machine learning algorithms: K-Nearest Neighbours, Naive
Bayes, Cat Boost, and Random Forest. A doctor may inject
tissue plasminogen activator (TPA) or give blood thinners
like aspirin. TPA is highly effective at breaking up clots;
however, the injection must be administered within 4.5
hours of the onset of stroke symptoms. A haemorrhagic
stroke can be caused by blood spilling into the brain. The
goal of treatment is to stop the bleeding and release the
pressure on the brain. Taking medications to lower brain
pressure, regulate blood pressure overall, stop seizures, and
stop any sudden blood vessel constriction is frequently the
first step in treatment. Patients on blood-thinning
anticoagulant or antiplatelet drugs such as warfarin or
clopidogrel can be given medication to counteract their
effects.

Keywords - TPA, Brain stroke, Machine learning, Deep
learning, Medical imaging, Feature selection, Real-time
monitoring, Interpretability.
1. INTRODUCTION

A stroke happens when the blood supply to the brain is
interrupted or reduced due to a blockage or leak in the blood
vessels. When this occurs, the brain's cells begin to
deteriorate because they are not getting enough nourishment
or oxygen. Stroke is a cerebrovascular disease: it affects the
blood vessels that carry oxygen to the brain, and damage can
begin if the brain does not get enough oxygen. It is a
medical emergency; even though many strokes are treatable,
others can be fatal or leave a person disabled.

An ischemic stroke is caused by blocked or constricted
arteries. The goal of treatment is usually to improve blood
flow to the brain, and the first step is taking medications to
dissolve existing clots and stop new ones from developing.
A doctor may inject tissue plasminogen activator (TPA) or
give blood thinners like aspirin. TPA is highly effective at
breaking up clots; however, the injection must be
administered within 4.5 hours of the onset of stroke
symptoms. A hemorrhagic stroke can be caused by blood
spilling into the brain. The goal of treatment is to stop the
bleeding and release the pressure on the brain. Taking
medications to lower brain pressure, regulate blood pressure
overall, stop seizures, and stop any sudden blood vessel
constriction is frequently the first step in treatment. A person
can receive drugs to counteract the effects of blood thinners
if they are on anticoagulant or antiplatelet medication, such
as warfarin or clopidogrel.

2. LITERATURE REVIEW

A hybrid machine learning approach to cerebral stroke based
on an imbalanced medical dataset, by Tianyu Liu, Wenhui
Fan and Cheng Wu: the method recommended in this study
successfully decreased the false negative rate while retaining
a respectably high overall accuracy, indicating a successful
reduction in the stroke prediction misdiagnosis rate.

Trends and Challenges of Wearable Multimodal
Technologies for Stroke Risk Prediction, by Yun-Hsuan
Chen and Mohamad Sawan: this study looks at
wearable-technology-based tools for tracking stroke-related
physiological markers in real time.

A hybrid feature extraction based optimized random forest
learning model for brain stroke prediction, by G Vijayadeep
and Dr N Naga Malleswara Rao: this paper addresses one of
the biggest concerns created by noise and feature selection
issues in stroke disorders, namely disease prediction in the
vertebral column dataset.

A Machine Learning Approach to Detect the Brain Stroke
Disease, by Bonna Akter and Aditya Raibongsh: regardless
of social or cultural background, reasonably predicting the
risk of a brain stroke could have a considerable impact on
human long-term death rates; early detection is critical to
achieving this goal.

• Zhang et al. (2020) developed a deep learning-based
approach for detecting acute ischemic stroke using CT
perfusion images. The proposed method achieved a high
accuracy of 90.9% and a sensitivity of 93.8% in detecting
stroke.

• Using multimodal MRI data, including diffusion-weighted
imaging, perfusion-weighted imaging, and fluid-attenuated
inversion recovery, Gong et al. (2020) proposed a deep
learning-based technique for stroke identification. The
proposed method identified strokes with a sensitivity of
91.5% and an accuracy of 90.1%.

• A machine learning-based method for estimating the risk
of stroke in people with
atrial fibrillation was created by Shen et al. in 2020. The
area under the curve (AUC) for predicting the probability of
having a stroke was 0.794 using the suggested strategy,
which combined several machine learning models.

• Using clinical and genetic data, Fuentes et al. (2019)
proposed a machine learning-based method for stroke
detection. The proposed method identified strokes with an
accuracy of 85.7% and a sensitivity of 81.4%.

• A machine learning-based method for forecasting the
course of stroke patients using MRI data was developed by
Bhattacharya et al. in 2019. The proposed method achieved
an accuracy of 73.5% and a sensitivity of 81.8% in
predicting the outcome of stroke patients.

• Niu et al. (2018) proposed a deep learning-based approach
for detecting acute ischemic stroke using CT angiography
images. The proposed method achieved an accuracy of
94.8% and a sensitivity of 92.7% in detecting stroke.

• Using clinical and neuroimaging data, Zhao et al. (2018)
created a machine learning-based method for stroke
identification. The proposed method identified strokes with
an accuracy of 94.8% and a sensitivity of 93.6%.

• Kim et al. (2017) proposed a machine learning-based
method for CT image-based stroke detection. The proposed
method achieved an accuracy of 90.8% and a sensitivity of
91.1% in detecting stroke.

3. METHODOLOGY

Textual qualities, link structures, webpage contents, DNS
data, and network traffic are only a few of the discriminative
features that our system makes use of. Many of these
features are innovative and quite powerful. 32,000 malicious
URLs and 40,000 benign URLs from the real Internet were
used in our experimental research. We also discuss the
readability of each group of discriminative traits and present
the results of our experiments on their efficacy.

Many machine learning algorithms are available for the
prediction and diagnosis of a brain stroke, including KNN,
Decision Tree, Random Forest, Multi-layer Perceptron
(MLP), SVC, and Cat Boost. We employed the
recommended Analysing Brain Stroke data. At this step, we
implemented the Cat Boost Classifier algorithm and the
individual algorithms on these datasets, and then applied the
Voting Ensemble method to combine these findings and
compute the final accuracy.

K-Nearest Neighbour:

One of the simplest machine learning techniques based on
supervised learning is K-Nearest Neighbour. The K-NN
algorithm places a new case in the category that is most
similar to the available categories, by assuming that the new
case and the existing cases are comparable. The K-NN
algorithm saves all the information that is accessible and
categorises fresh data based on similarity, which means that
new data can be quickly and accurately sorted into a suitable
category. The K-NN approach can be used for both
classification and regression problems, but it is more
frequently utilised for classification. Because K-NN is a
non-parametric method, it makes no assumptions about the
underlying data. Because it does not immediately learn from
the training set, it is also known as a lazy learner algorithm:
rather than fitting a model, it stores the data set and acts on it
only at classification time.

Random Forest:

A random forest is a machine learning method for tackling
classification and regression issues. It makes use of
ensemble learning, a method for solving complicated
problems by combining a number of classifiers. In a random
forest algorithm there are many different decision trees: the
algorithm creates a "forest" that is trained via bagging
(bootstrap aggregation), an ensemble meta-algorithm that
increases the accuracy of machine learning algorithms. The
random forest determines its result from the predictions of
the decision trees, making predictions by averaging or
voting over the results from the different trees, and the
accuracy of the result grows as the number of trees
increases. It avoids excessive fitting of the dataset, offers
increased precision, and produces predictions without
needing numerous package configurations (unlike
Scikit-learn).

The Random Forest Algorithm's Features:
• Compared to the decision tree algorithm, it is more accurate.
• It offers a practical method for dealing with missing data.
• Without hyper-parameter adjustment, it can generate a fair prediction.
• It addresses the issue of decision trees' overfitting.
• At the node's splitting point in every random forest tree, a subset of features is chosen at random.

Cat Boost:

Cat Boost is a high-performance open-source library for
gradient boosting on decision trees. It was created by
Yandex engineers and researchers, and it is used by Yandex
and many other businesses, such as CERN, Cloudflare, and
Careem taxi, for search, recommendation systems, personal
assistants, self-driving cars, weather forecasting, and many
other jobs. Everyone is welcome to use it because it is
open-source. The new kid on the block, Cat Boost, has been
around for a little over a year and is already posing a threat
to XG Boost. Cat Boost gets the greatest scores on
benchmarks, and the improvement becomes considerable
and obvious on datasets where categorical variables are
heavily weighted.

NAIVE BAYES:

A probabilistic machine learning model called a Naive
Bayes classifier is utilised for classification tasks. The Bayes
theorem serves as the foundation of the classifier:

P(A|B) = P(B|A) P(A) / P(B)

When B has already happened, we may use the Bayes
theorem to calculate the likelihood that A will also occur.
Here, A is the hypothesis and B is the supporting evidence.
It is assumed that the predictors and features are
independent; in other words, the presence of one trait has no
impact on the others, hence the term "naive". Let's use an
illustration to comprehend it: consider a training data set for
the weather, along with the target variable "Play" (which
denotes the possibility of playing). We must now
categorise whether participants will medical images and patient data, the
participate in games based on the accuracy of stroke diagnosis can be
weather. improved.
expensive.
I. A rise in the accuracy of stroke
diagnoses thanks to machine learning
algorithms' ability to examine vast IV. Improved patient outcomes: Early
volumes of data and spot patterns that detection and treatment of stroke can
might be hard for people to see. By improve patient outcomes by reducing
training algorithms on large datasets of the risk of long-term disability and
improving survival rates. Machine 6. References
learning algorithms can help in
[1] V. L. Feigin et al., Update on the global burden
predicting the likelihood of a stroke
occurring in a patient and provide early of ischemic and hemorrhagic stroke in
warning signs. Healthcare professionals 19902013: The GBD 2013 study, vol. 45, no. 3.
can take proactive steps to prevent 2015.
stroke by identifying individuals who https://pubmed.ncbi.nlm.nih.gov/26505981/
are at risk.
[2] N.Venketa subramanian, B.W.Yoon, J.Pandian,
and J.C.Navarro, Stroke Epidemiology,
South,East, and South-East Asia: A Review, vol.
V. Development of new tools and 20, no. 1. 2018.
techniques: Machine learning
algorithms can help in developing new https://pubmed.ncbi.nlm.nih.gov/29037005/
tools and techniques for stroke [3] Gur Amrit Pal Singh, P. K. Gupta Performance
diagnosis and treatment. For example, analysis of various machine learning-based
ML can assist in the development of approaches for detection and classification of
wearable devices or mobile apps that lung cancer in humans, vol. 3456789. Springer
can monitor patients' health and provide London, 2018.
early warning signs of stroke.
http://ir.juit.ac.in:8080/jspui/bitstream/123456789/905
5/1/Performance%20analysis%20of%20various
%20m%20achine%20learning-
Overall, a machine learning project for brain %20%20based%20approaches%20for%20detection%2
stroke detection has the potential to greatly
increase the precision and speed of stroke
diagnosis, lower medical expenses, and
enhance patient outcomes.
5. Conclusion
In this study, stroke data on CT scan image data
is classified using machine learning methods.
picture processing and feature extraction are
done on the picture data before classification.
The classification is then performed using a
comparison of (Four) techniques, namely K-
Nearest Neighbours, Naive Bayes, Random
Forest, and Cat boost. Compared to other
examined classification algorithms, the
algorithm using the Random Forest approach
offers the highest level of accuracy, according to
our testing. The accuracy of the classification
algorithm with the default optimisation
parameter value has not, however, been tested.
From this point forward, the categorization
model may be enhanced to accomplish. The
machine learning algorithm utilised has to have
its parameters tuned in order to improve
accuracy.
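To make the Naive Bayes step of the comparison concrete, the sketch below is a minimal Gaussian Naive Bayes classifier in pure Python. It is for illustration only: the feature values and class labels are invented, and the study itself classifies features extracted from CT images rather than these toy numbers.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature mean/variance."""
    stats = {}
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    n = len(X)
    for label, rows in by_class.items():
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9  # smoothed
            for col, m in zip(zip(*rows), means)
        ]
        stats[label] = (len(rows) / n, means, variances)
    return stats

def predict(stats, x):
    """Pick the class maximizing log P(class) + sum of log P(feature|class)."""
    best, best_score = None, float("-inf")
    for label, (prior, means, variances) in stats.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best, best_score = label, score
    return best

# Toy data: two features per sample, invented labels "stroke" / "normal".
X = [[1.0, 5.0], [1.2, 4.8], [3.0, 1.0], [3.2, 0.8]]
y = ["normal", "normal", "stroke", "stroke"]
model = fit_gaussian_nb(X, y)
print(predict(model, [3.1, 0.9]))  # sample lies in the "stroke" cluster
```

In practice one would use a library implementation (for example scikit-learn's GaussianNB or the CatBoost package), but the log-probability scoring above is the whole idea behind the classifier.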
Smart Agriculture Aid Using Renewable Energy
5. OVERVIEW
Technology and sustainable energy sources improve agricultural practices and increase productivity while reducing the environmental impact of farming. This approach can help farmers confront challenges such as climate change, soil degradation, and water scarcity.
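Before surveying the individual technologies, it helps to see how small the core control logic of such a system is. The snippet below is an illustrative (not the authors') decision rule that an ESP32-class controller could run, mapping a raw soil-moisture reading to a pump on/off decision; the calibration values and thresholds are invented for the example.

```python
def moisture_percent(raw, dry=4095, wet=1500):
    """Convert a raw ADC reading (values are illustrative; 4095 is the
    full-scale value of a 12-bit ADC such as the ESP32's) to 0-100%."""
    raw = max(min(raw, dry), wet)          # clamp to the calibrated range
    return 100.0 * (dry - raw) / (dry - wet)

def pump_command(percent, low=30.0, high=60.0, pump_on=False):
    """Hysteresis control: start irrigating below `low`, stop above `high`,
    and keep the current pump state inside the dead band."""
    if percent < low:
        return True
    if percent > high:
        return False
    return pump_on

reading = 3800                              # simulated raw sensor value (dry soil)
pct = moisture_percent(reading)
print(round(pct, 1), pump_command(pct))
```

On real hardware the `reading` would come from the ADC pin the sensor is wired to, and the boolean would drive a relay; the hysteresis band prevents the pump from chattering on and off around a single threshold.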
There are several technologies and renewable energy sources that can be used in smart agriculture, including:

i. Solar Power: Solar panels can be installed on farms to generate electricity for irrigation pumps, lighting, and other farm equipment. This minimizes the need for fossil fuels and lowers greenhouse gas emissions.
ii. Wind Power: Wind turbines can be deployed to generate electricity on farms. They are especially useful in areas with high wind speeds, where wind power can be a cost-effective alternative to grid electricity.
iii. Biogas: Biogas can be produced from organic waste such as animal manure and crop residues. This can be used as a renewable energy source for cooking and heating on farms.
iv. Precision Agriculture: Precision agriculture uses sensors and data analytics to optimize crop yields and reduce waste. This can include technologies like GPS mapping, drones, and soil sensors.
v. Vertical Farming: Vertical farming involves growing crops in vertically stacked layers using LED lights and hydroponic systems. This approach can increase crop yields and reduce water usage.

Smart agriculture aid using renewable energy can have several benefits, including:

Increased productivity: By using technology and renewable energy, farmers can increase crop yields and reduce waste. [1]

Reduced costs: Renewable energy sources like solar and wind can reduce the cost of electricity for farmers, while precision agriculture can reduce the amount of water and fertilizer needed. [2]

Environmental benefits: Using renewable energy sources and sustainable farming practices can reduce greenhouse gas emissions and help mitigate the impacts of climate change. [3]

Overall, smart agriculture aid using renewable energy is a promising approach to improving agricultural practices and sustainability. By adopting these technologies and practices, farmers can increase productivity while reducing their environmental impact.

A. ESP32 microcontroller

ESP32 is a low-cost, low-power microcontroller with Wi-Fi and Bluetooth connectivity, a dual-core processor, and built-in security features, suitable for a wide range of IoT applications. It is compatible with the Arduino IDE and has a variety of peripheral interfaces. The ESP32 is shown in Fig.2.

Fig.2: ESP32

B. Soil moisture sensor

Soil moisture sensors measure or estimate the amount of water in the soil. These sensors can be stationary or portable, such as handheld probes. Stationary sensors are placed at predetermined locations and depths in the field, whereas portable soil moisture probes can measure soil moisture at several locations.

Fig.3: Soil moisture sensor

Smart agriculture is the use of technology to improve the efficiency and sustainability of farming practices. The application of renewable energy in agriculture can help to reduce the dependence on fossil fuels, decrease carbon emissions, and lower operational costs. Renewable energy sources such as solar, wind, and biomass energy can be used to power irrigation systems, pumps, and other equipment used in agriculture. The use of renewable energy in agriculture can also help to promote sustainable farming practices by reducing the carbon footprint of farming operations. Sustainable agriculture practices aim to reduce the environmental impact of farming while ensuring that food production remains economically viable.

6. MOTIVATION

Agriculture could undergo a revolution as a result of the use of renewable energy, which could also change how we handle our natural resources, raise cattle, and cultivate crops. By harnessing the potential of the sun, wind, and other renewable sources of energy, we can lessen our reliance on fossil fuels, reduce our carbon footprint, and make the farming process more sustainable and environmentally friendly.

The goal of smart agriculture is to maximize resource utilization and increase agricultural yields through the integration of technology and agriculture. Farmers can monitor and manage multiple elements of their farms, including soil moisture, temperature, and nutrient levels, in real time with the use of cameras, drones, and other cutting-edge technologies. They can use this to make data-driven choices and adjust their farming practices as necessary.

7. PROBLEM STATEMENT

The problem statement for employing renewable energy in smart agriculture aid might be the following: the difficulties facing the agricultural sector, such as depletion of resources, global warming, and food security, are becoming more urgent, and the use of non-renewable energy sources in agriculture has considerably raised carbon dioxide emissions and environmental damage.

9. EXPECTED OUTCOMES

The exact outcomes of a water quality monitoring system using IoT technology will depend on the specific goals and objectives of the system, as well as the methodology and technologies used. However, some potential outcomes could include:

• By monitoring key indicators of water quality in real time, such as pH, temperature, and TDS, water treatment facilities can identify and address issues that may impact the safety and quality of the water supply. This can lead to improved water quality and a reduced risk of waterborne illnesses.

• IoT-based water quality monitoring systems can help water treatment facilities improve their operational efficiency by providing real-time data on key indicators of water quality. This can help facilities optimize their treatment processes and reduce waste, leading to cost savings and improved sustainability.

10. EXISTING SYSTEM:
DR. C KOMALAVALLI
Professor, Dept. of CSE
Presidency University
Bengaluru, India
komalavalli@presidencyu
ABSTRACT niveraity.in
For example:

Chatbot: Hi there! Welcome to our restaurant chatbot. I'm here to help you with any questions or assistance you need. How can I assist you today?

2. Reservation and Booking: The chatbot should be able to handle reservation and booking requests. It can ask for the date, time, and number of guests, and check the availability of tables. For example:

Chatbot: Sure! I can help you with a reservation. Please provide me with the date, time, and number of guests for your booking.
User: I'd like to make a reservation for two on May 5th at 7:00 PM.
Chatbot: Great! Let me check our availability for that date and time.

3. Menu and Specials: The chatbot should be able to provide information about the restaurant's menu, including special dishes or promotions. It can also accommodate dietary restrictions and provide recommendations. For example:

Chatbot: Our menu includes a variety of cuisines, such as Italian, Asian, and American. We also have vegetarian and gluten-free options. Would you like me to recommend any dishes?
User: What are your current specials?
Chatbot: Our current special is a 3-course meal with a choice of appetizer, main course, and dessert for Rs 899.

4. Order and Payment: The chatbot should be able to take orders and facilitate payments. It can provide options for delivery or pickup, and handle payment processing securely. For example:

Chatbot: Would you like to place an order for delivery or pickup?
User: I'd like to place an order for delivery.
Chatbot: Sure! What items would you like to order?
User: I'll have a margherita pizza and a Caesar salad.
Chatbot: Great! I'll add that to your order. How would you like to pay? We accept credit cards and online payments.

5. Additional Information: The chatbot should be able to provide general information about the restaurant, such as hours of operation, location, and contact details. It can also answer frequently asked questions (FAQs) about the restaurant's policies, events, or services. For example:

Chatbot: We are located at 5th Cross, Whitefield, and our hours of operation are from 07:00 AM to 11:00 PM, 7 days a week. Is there anything else you would like to know?

6. Personalization and Engagement: The chatbot should be able to engage users in a personalized and interactive manner. It can remember user preferences, offer recommendations based on past orders, and provide a pleasant and engaging conversation experience. For example:

Chatbot: I see that you've dined with us before. Welcome back! Would you like to order your favorite dish, the spaghetti Bolognese?
User: Yes, please! That's my favorite.
Chatbot: Great choice! Anything else I can assist you with today?

7. Error Handling and Escalation: The chatbot should be able to handle errors, misunderstandings, or ambiguous queries gracefully. It can ask clarifying questions, offer suggestions, or escalate to a human agent when necessary. For example:

Chatbot: I'm sorry, I didn't understand your request. Could you please provide more details or rephrase your question?

SCOPE

This model is a small-scale prototype of a chatbot system in the artificial intelligence field. Built on a dataset compiled from research on various institutions, it serves as an exploratory sample of a chatbot system that could be adapted into a large-scale solution for restaurants and deployed widely. The model demonstrates that, with artificial intelligence technologies suitably adapted for large restaurant businesses, the burden on reception staff of resolving routine queries can be reduced to almost nothing, and enterprises can serve users at any time once the system is deployed on platforms such as the restaurant's public website. This also creates job opportunities in the near future.

In the present era, it is expected that a web application can be used for ordering or pre-ordering food. Using a phone, tablet, or PC with an internet connection, consumers can interact with the menu through a secure login, view and place orders, and receive instant updates and invoices on the device itself. This is convenient, productive, and simple, enhancing the work of the restaurant's staff and making the dining experience more engaging. Staff will have access to the specialized hardware and software they require to do their specific duties on schedule. Development depends upon a framework provided by NLTK. The Kaggle dataset, which is used to train and evaluate the chatbot, is a requirement for this model. If the chatbot is unable to respond, it is expected to hand over management of the system to a human assistant who will be able to respond and address issues. While using the website, the user must be connected to the internet, and the system should be able to show various places upon user request.

USER CHARACTERISTICS

1. Users can ask questions regarding table booking.
2. Users can learn about the availability of accommodation.
3. Users can learn the venue of the restaurant.
4. Users can learn about the menu of the restaurant.
5. Prompt response to questions.
6. Detailed answers to queries.
7. Settlement of grievances and disputes.
8. Contacting an available service professional.
9. Consumers can contact the restaurant if not satisfied.
2.1 ASSUMPTION AND DEPENDENCY

FUNCTIONAL REQUIREMENT

SYSTEM FEATURES

A conversational bot utilizes natural language processing (NLP) to process user input and can apply the naive Bayes algorithm to deliver precise outcomes quickly. Chatbots include Natural Language Understanding, a capability that enables them to comprehend user input and provide the appropriate output depending on it. The chatbot can handle a variety of restaurant-related questions, such as reservations, cuisine, the sort of reservation needed, etc. It can also hand off management to a person if necessary. If an enquiry cannot be answered by the chatbot, it can provide a phone number so that a person can assist the user. A user can easily acquire a solution to their question by chatting with the chatbot from anywhere, and the chatbot replies in text form to the inquired question. Moreover, chatbots might be used on the websites of certain eateries.

DATASET

The dataset was acquired from Kaggle.com, which offers a variety of data sets for artificial intelligence models. The data set is divided in half, with the first half used for model training and the other half for testing. The dataset includes intents and entities; on the basis of these, the most likely probability is determined and saved in the context variable. The intent, composed of user inputs and entities, is identified by the chatbot.

1. Conversation service
2. Natural Language Processing
3. Natural Language Generation
4. Natural Language Understanding
5. Creation of an AI model with appropriate language
6. Uploading the model's training information, comprising entities and intents
7. Creating context variables to hold data from user conversations
8. Constructing illustrations for training and comprehension purposes
9. Testing using a dummy discussion for the provided purpose and entities

NON-FUNCTIONAL REQUIREMENT

PERFORMANCE REQUIREMENT

1. The platform must be able to respond promptly.
2. The software UI should be simple to navigate.
3. The platform should be able to function even when multiple people are using it at the same time.
4. If an inquiry cannot be answered by the conversational bot, the system should consult the prime administrator.
5. The platform must provide accurate data.
6. To keep up with changes in the restaurant, the software has to be updated on a regular basis.
7. The platform must be able to give the user the address, email id, and contact information on request.
8. The platform must learn from the numerous inputs provided by users.
9. The platform needs to comprehend multiple entities and motives.

SAFETY REQUIREMENT

1. The system must seek human assistance if a question is asked that is not in scope.
2. The system must be able to safeguard restaurant-related data.

For example:

1. Enhancing customer experience: A restaurant chatbot can offer personalized recommendations, take orders, and provide relevant information, enhancing the overall customer experience and improving customer satisfaction.

2. Automating tasks: A chatbot can handle routine tasks such as taking reservations, providing operating hours, and answering frequently asked questions, freeing up restaurant staff to focus on other important responsibilities.
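The intent-matching core described above (classify the user's utterance, escalate to a human when nothing matches) can be sketched in a few lines. The snippet below is a toy bag-of-words intent scorer, not the paper's NLTK/naive-Bayes implementation; the intent names and keyword sets are invented for the example.

```python
# Hypothetical intents and trigger words; a real system would learn these
# from the training half of the Kaggle dataset instead of hard-coding them.
INTENTS = {
    "reservation": {"book", "reservation", "table", "reserve"},
    "menu": {"menu", "specials", "dishes", "vegetarian"},
    "hours": {"hours", "open", "close", "location"},
}

def classify(utterance, threshold=1):
    """Score each intent by keyword overlap; below threshold, escalate."""
    words = set(utterance.lower().replace("?", "").split())
    scores = {name: len(words & keywords) for name, keywords in INTENTS.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return "escalate_to_human"   # hand off to a human assistant
    return best

print(classify("I'd like to book a table for two"))   # reservation
print(classify("What are your current specials?"))    # menu
print(classify("Can you fix my car?"))                # escalate_to_human
```

A trained classifier (naive Bayes over word counts, as the paper suggests) replaces the keyword overlap with learned per-intent probabilities, but the escalation fallback works the same way: if no intent scores above a confidence threshold, the query goes to a person.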
Abstract— Virtual assistants are now an integral part of our everyday life, and at present we can observe that people are neglecting their physical fitness and health. To monitor a person's health, we came up with the concept of an AI-based diet and exercise plan that depends on the user's BMI and BMR.

The core technology used in this project is artificial intelligence (AI): the system automatically produces a diet plan for the user based on his body weight and provides motivational films to encourage him to lose weight and get in shape. Additionally, users can chat with chatbots to get advice on how to make their diet and exercise regimens more manageable.

What sets this proposal apart from others is the use of artificial intelligence to construct the diet plan and chatbot. Many systems present us with diet plans to choose from, but this project creates them automatically with the aid of AI; the program generates a customized nutrition plan and sends notifications to ensure good adherence.

The system sends reminders to drink water, and if a user is seated in one spot for more than an hour, it sends a warning to change position and walk. We added a few motivating videos for users who want to be inspired to keep up their diet and exercise routine.

Benefits of utilizing this system include its simplicity, user-friendliness, efficiency, and dependability. Compared to maintaining all client information in record books or on spreadsheets, maintaining an entirely secure, maintenance-free database on the server that is available whenever the user needs it is very efficient.

The advantages: the system is accessible from anywhere at any time; users can chat with the system to ask questions about fitness and receive responses; and it is easy to use and access.

Keywords— AI diet plan, chatbot, BMI and BMR, customized nutrition plan, monitoring, notification reminder

I. INTRODUCTION

Exercise and diet are important for maintaining good health and well-being. A balanced diet provides the necessary nutrients for optimal bodily function and can help manage weight. Together, exercise and diet can reduce the risk of chronic diseases and promote longevity.

AI Based Diet with Fitness App is designed to help individuals maintain their health and weight using AI. The system will create an individual diet plan based on the user's BMR. The system will also offer a variety of exercises to users. The overall aim is to maintain the optimal health and weight of the user with the help of Artificial Intelligence.

II. EXISTING SYSTEM

There are a lot of programs available on the market right now, like Simplify, Google Fit, Samsung Health, and others. All these applications focus on tracking physical activity like cycling, walking, and running while also encouraging users to develop healthy eating habits, rather than creating a suitable diet using the user's height and weight.

Drawbacks of the existing system: it was limited to providing exercises to individuals, and it did not provide a chatbot feature to the users.

The diet plans are stored in the database, which can be added to or updated. All the videos based on BMI will be added directly to the database, which can likewise be added to or updated. The questions and answers should be added to the Dialogflow dashboard, which will be trained and uses artificial intelligence. Google Fit will give us the results if the same Google account is used in the band/watch.

For this project, XML is used on the front end and MSSQL on the back end. The programming language is Java. The IDE used is Android Studio.

IV. LITERATURE REVIEW

Interest in using artificial intelligence (AI) in healthcare has risen recently, particularly in the area of diet and nutrition. Applications with AI capabilities can offer tailored suggestions and counsel based on a person's particular health information, enhancing overall health outcomes.
• XML

XML is the markup language used to design the user interface of Android applications. It gives programmers the ability to design a visual representation of the app's layout, complete with views for buttons, text fields, photos, and other elements. The structure, arrangement, and functionality of UI elements in an Android app are specified using XML.

• DIALOGFLOW

Dialogflow is a natural language processing (NLP) platform that enables programmers to create and incorporate conversational user interfaces into bots, mobile apps, and web applications. It responds to user requests in a conversational manner by using machine learning techniques to comprehend them. Chatbots, voice assistants, and other conversational interfaces can be developed with Dialogflow and incorporated into Android Studio applications.
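The BMI and BMR values that drive the diet plan are standard formulas. Below is a small sketch using the Mifflin-St Jeor equation for BMR; this is an assumption for illustration, since the paper does not state which BMR formula the app uses, and the calorie-target heuristic is likewise only a common convention, not the app's actual logic.

```python
def bmi(weight_kg, height_cm):
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    h = height_cm / 100.0
    return weight_kg / (h * h)

def bmr_mifflin_st_jeor(weight_kg, height_cm, age, sex):
    """Resting calories/day via Mifflin-St Jeor (sex: 'male' or 'female')."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age
    return base + (5.0 if sex == "male" else -161.0)

def daily_calorie_target(bmr, activity_factor=1.2, deficit=500):
    """A common weight-loss heuristic: activity-scaled BMR minus a deficit."""
    return bmr * activity_factor - deficit

# Example user: 70 kg, 175 cm, 30 years old, male.
b = bmr_mifflin_st_jeor(70, 175, 30, "male")
print(round(bmi(70, 175), 1), round(b))
```

Values like these (BMI for classifying the user, BMR scaled by activity level for sizing the meal plan) are the numeric inputs a diet-plan generator works from.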
Secure Chat Web Application Using JWT Creation,
Encryption, Decryption And Parsing For User Authentication And Login
Abstract— The paper analyzes the advantages and limitations of each strategy, as well as their suitability for diverse sorts of applications and users. Biometric authentication, for instance, can be highly secure and convenient but may raise privacy concerns. Behavioural authentication, which analyzes a user's patterns of interaction with a system, can provide continuous authentication but may be difficult to implement successfully. Token-based authentication, such as one-time passwords, can provide an extra layer of security but may be awkward for users.

Keywords—user authentication, biometrics, client certificates

LXXXIV. INTRODUCTION

In secure systems like e-commerce, proper authentication of users is crucial. The traditional method of authentication is through a username and password, but it has become inadequate due to users choosing weak passwords, not using password management systems, and reusing passwords across multiple sites. As a result, alternative or additional authentication methods are necessary. It is important to consider different scenarios for authentication, such as authenticating to a device, remote authentication through the web, and other protocols, as the best method varies depending on the situation. The paper focuses on remote authentication through the Internet. It is important to avoid replacing a weak authentication method with one that is equally or more vulnerable. The security arrangements provided by institutions dictate the measures users have to take. The paper discusses various authentication schemes on the internet and provides an example of a complex security system in Korea that uses multiple authentication methods.

LXXXV. PREVIOUS RESEARCH

There are three common types of methods used for user identification and authentication:

1. An authentication method that involves possessing a one-time password generator, certificate, or smart card is referred to as "something the user possesses."

2. To authenticate, the user must provide something that only they know, such as a password or the answer to a security question. The system must then be able to verify the user's response to ensure proper authentication.

3. The user's identity can be confirmed through biometric characteristics, such as a fingerprint or iris scan, which represent something unique to the individual.

Wryly put, the authentication process can also involve three things: something that you have, something that you have forgotten, or something that you used to possess. The traditional approach of authentication, which involves a username and password, falls under the category of "something you have forgotten".
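The "one-time password generator" mentioned in category 1 above is typically an HOTP/TOTP device or app. A compact HOTP implementation following RFC 4226 (HMAC-SHA-1 over a counter, with dynamic truncation) can be written with only the standard library:

```python
import hmac
import hashlib
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA-1 over an 8-byte counter, dynamically truncated."""
    msg = struct.pack(">Q", counter)                   # big-endian 64-bit counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                         # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 4226 Appendix D test secret; the first two codes are 755224 and 287082.
print(hotp(b"12345678901234567890", 0))
print(hotp(b"12345678901234567890", 1))
```

TOTP, used by most authenticator apps, is the same construction with the counter replaced by the current Unix time divided by a 30-second step, which is why a stolen code expires almost immediately.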
Various alternative authentication methods have been suggested, such as biometrics, graphical passwords, and public key authentication. However, each of these methods has its own limitations and disadvantages, and none has fully replaced the traditional username and password combination that is widely used. Some of these alternative methods have been used as secondary authentication measures.

A. Token-Based Authentication

Token-based authentication is a type of authentication method that relies on the possession of an object, such as a code book, a card, a smart card, or a public-key-based certificate. In practice, user PKI certificates are not commonly used due to their complicated deployment and users' lack of comprehension. While this method is more secure than traditional credential-based authentication, it carries the risk of the token being lost or stolen. To mitigate this risk, the system must prevent replay attacks and protect the token with a password.

B. Biometric Authentication

Biometric authentication systems are used to identify and/or authenticate users based on their physical characteristics. Common methods include fingerprint recognition, iris recognition, and facial recognition. Biometric authentication systems suffer from a number of problems:

o Ensuring confidentiality is a desirable attribute for an authentication system, but it is challenging to achieve in biometric systems.
o Biometric systems are vulnerable to mimic attacks unless they are supervised.
o It is generally impractical to utilize biometrics for remote authentication over the internet, since users may not have access to the necessary sensors.
o It should be noted that biometrics, such as fingerprints, are not as exclusive to a person as they are often believed to be.

C. Alternative Knowledge-Based Systems

Several alternatives to traditional text-based passwords have been suggested, including graphical authentication systems. Supporters of this method argue that humans can recall pictures more easily than text, but such claims are often based on overly optimistic estimates of human memory capabilities, and graphical authentication systems can be limited in terms of usability. Additionally, many graphical authentication systems are vulnerable to "shoulder surfing," where unauthorized individuals can observe the user's login credentials. De Angeli, Coventry, and Renaud have categorized graphical authentication systems into three groups:

o Drawmetric schemes are a type of authentication method that requires users to create a unique drawing or pattern; to authenticate, users must then recreate this drawing or pattern. Examples of drawmetric schemes include Pattern Lock, which is used for authentication on Android phones, and Picture Password, which is used for authentication on Microsoft Windows 8.
o Cognometric systems, also referred to as searchmetric, involve a user choosing a familiar image (usually pre-determined by the user) from a group of other images intended to confuse or distract.
o Locimetric systems, which are also referred to as cued-recall-based systems, involve the identification of a sequence of positions within an image.

Graphical authentication systems are frequently used to authenticate personal devices, including smartphones, and are also used for internet authentication. Although they have not replaced traditional text-based passwords, they are often used as an additional authentication method. Another commonly used knowledge-based authentication scheme is the security question, where the user provides an answer to a question assumed to be private, such as their mother's maiden name. However, in practice, this information is often not entirely private and can be easily discovered by others.

LXXXVI. COMMON AUTHENTICATION PRACTICE

Authentication with a username and password is the most common method, with a security question as a backup option to reset the
password. However, security questions can be difficult for users to answer and easy for attackers to guess. Some analysts think that lying makes security questions harder to guess, but research shows that lies are harder to remember. Texting or calling the user's cell phone is another option for password recovery, but this method depends on the user having a cell phone with them. Another email address is also an option, but SMS is more reliable. Data centers can enforce more complex passwords, but they have no control over users who use the same password across multiple websites. Users often have trouble remembering multiple passwords and security questions for different sites, causing them to reuse the same information across sites. A safer option is to use a password manager to create and store strong passwords, security questions, and usernames. There is some controversy over this advice, but it is similar to Warren Buffett's advice to "put all your eggs in one basket, but be careful in that basket." It is also recommended that users consider using separate email addresses for different accounts and sign in via a single-use email message from other sites instead of logging in with a password.

LXXXVII. EXTREME AUTHENTICATION: KOREAN BANKING

In contrast to the low usage of user PKI certificates in other countries, South Korea is a notable outlier with a high adoption rate of around 60 percent of the population. This widespread use of PKI has allowed Korean banks to develop advanced authentication systems. For instance, when transferring a large amount of money to another bank, customers of a Korean bank must follow a series of steps:

1. The user must first install the bank's ActiveX plugins and acquire a digital certificate.
2. To access the bank's system, the user utilizes their digital certificate by providing the certificate password during login.
3. The account PIN must be entered by the user.
4. The bank has provided the user with a card that contains a set of two numbers that the user must enter for authentication.
5. The user receives a number from the bank via SMS on their cell phone, which they must input.
6. The user provides confirmation once more using their certificate.

This authentication method combines two factors that the user knows (PIN and certificate password) with three factors that the user has (certificate, code card, and cell phone) for increased security. However, despite its apparent robustness, there are several weaknesses and drawbacks in practice. The security plugins used for encryption, antivirus, anti-keystroke logging, and firewall are inadequate for the task, and malware can bypass the password protection of certificates stored on hard disks or USB keys by key logging or brute-force attack. Furthermore, the use of ActiveX has resulted in a Microsoft monopoly in Korea, leading to poor web accessibility, and Korean users tend to install ActiveX controls without realizing the potential security risks. As a result, Korean users are conditioned to "Click on O.K. all the time. Never, ever choose No!"

LXXXVIII. FUTURE WORK

Many articles promote alternative authentication methods because traditional ones are often easily compromised. However, this is mainly due to incorrect infrastructure configuration, lack of security measures, and
file. You are now ready to create your unclear policies. It's crucial to have multi-
document; Use the scroll-down window to the faceted keys for authentication that are created,
left of the MS Word formatting toolbar. distributed, and maintained on a different
communication channel to prevent common
channel attacks. Authentication servers are
1. To gain authentication, the user is maintained through directory services like
required to install no less than four LDAPs and Active Directory. Although security
measures are applied to the authentication
server, connections to the directory servers are Summary of Discussions at the 2014
often unsecure, and the database itself may be Raymond and Beverly Sackler U.S.-U.K.
transparent to front-facing services. The Scientific Forum. The National Academies
development of novel authentication
Press, 2015.
mechanisms for legacy systems requires specific
infrastructure development, which could be [118]J.
Bonneau, E. Bursztein, I. Caron, R.
carried out without changing everything in the Jackson, and M. Williamson, “Secrets, Lies,
existing infrastructure. and Account Recovery: Lessons from the
Use of Personal Knowledge Questions at
LXXXIX.CONCLUSION Google,” pp. 141–150, May 2015.
Protecting computer systems is challenging, especially because many users lack knowledge and expertise in this area, and providers often prioritize meeting minimum security requirements. Upgrading from the standard username and password authentication method has been a challenge. Nonetheless, users can take measures to enhance their security.
V. RESULTS

Fig 2: Heart Disease Prediction

Deep Learning: Applying neural network architectures like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to heart disease prediction can potentially provide better performance than traditional machine learning models.
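The note above names CNNs and RNNs, but for tabular clinical features of the kind this paper uses, the simpler neural baseline is a small feed-forward network. The sketch below is illustrative only: it uses scikit-learn's MLPClassifier on synthetic stand-in data, and every array shape and hyperparameter here is an assumption, not a value from the paper.

```python
# Illustrative sketch: a small feed-forward neural network on synthetic
# tabular data standing in for heart-disease features. Not the paper's model.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 rows, 13 numeric predictors (shapes are assumptions).
X = rng.normal(size=(200, 13))
# Make the label depend on two features so the problem is learnable.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Scaling matters for neural networks, mirroring the preprocessing step above.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Two hidden layers; sizes chosen arbitrarily for the sketch.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_scaled, y)

print(round(clf.score(X_scaled, y), 2))
```

A real comparison against the tree-based models would of course use the actual patient dataset and a held-out test split rather than training accuracy.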
Detection of Alzheimer's Disease Using Hybrid CNN Compared With Deep Learning Models: MobileNet Algorithm

KISHOR. B. K (20191CSE0260), School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India, kishorbkgowda36@gmail.com
KEERTHANA T (20191CSE0253), School of Computer Science and Engineering, Presidency University, Bengaluru, Karnataka, India, keerthanat2907@gmail.com
Abstract— This study primarily aims to investigate measures to reduce phishing in online transactions. Phishing is a cybercrime where hackers try to obtain sensitive information such as passwords and credit card details by pretending to be trustworthy. We have developed a website in which we propose a novel approach for reducing phishing in online transactions by applying machine learning algorithms that can be used to analyse a transaction's legitimacy, alert the user if any suspicious activity is detected, and prevent phishing attacks in real time. Our results show that the proposed approach is effective in reducing the number of phishing attacks in online transactions. This study offers valuable insights and practical recommendations that individuals and organizations can leverage to enhance their defences against phishing attacks during online transactions. The developed website also focuses on user experience (UX) and user interface (UI) design, considering elderly users, users from villages, those who are not technically savvy, and kids. It is also cost-effective, efficient, and accurate. We have chosen three algorithms for our study: random forest, support vector classification, and the LGB (LightGBM) classifier. Their advantages are as follows. Random Forest: high accuracy, robustness, feature importance, scalability, non-parametric, resilience to noise, interpretability, and versatility. Support vector classification: regularization capabilities, efficient handling of non-linear data, ability to solve classification and regression problems, and stability. LGB (LightGBM): highly accurate, efficient handling of categorical features, good handling of imbalanced data, flexibility, and customizability. Since these three algorithms have the most advantages and are highly effective, we have decided to use them in our study.

Keywords— Phishing, Machine Learning Algorithms: Random Forest, Support Vector, LGB (LightGBM) classifier.

INTRODUCTION

In recent years, the field of machine learning has experienced tremendous growth. It entails utilizing statistical models and techniques to let computer systems learn from data without being explicitly programmed. The way we analyse data, resolve difficult problems, and make judgments could be completely transformed by machine learning.

The ability of machine learning to process and analyse enormous amounts of data rapidly and effectively is crucial nowadays. Applications for machine learning algorithms can be found in several industries, including social media, marketing, healthcare, and finance. For instance, machine learning provides individualized suggestions in marketing and social media, disease diagnosis in healthcare, and fraud detection in the financial sector. Machine learning is now a vital tool for businesses and organizations to get insights, make wise decisions, and maintain market competitiveness due to the increasing amount of data available today.

Popular machine learning algorithms for classification and regression problems include Support Vector Machines (SVM), Logistic Regression, Decision Trees, and Random Forests.

Support Vector Machine (SVM) is a binary classification algorithm that uses a hyperplane to divide the data points into two categories to the best possible extent. SVM has a high degree of accuracy and is particularly helpful when working with datasets that have high-dimensional feature spaces. Another binary classification approach, which functions by modelling the likelihood that an event will occur, is logistic regression. It is applied when the independent variables are continuous or categorical and the dependent variable is binary. Decision trees are a well-liked machine learning approach utilized for classification and regression problems. They operate by dividing the data into subsets based on the values of the independent variables and then building a decision tree from the resulting subsets. Random Forest is an ensemble learning approach that combines multiple decision trees to increase the model's robustness and accuracy; it is especially helpful when dealing with noisy or complex datasets.

LightGBM (LGB) has proven to be an effective tool for phishing detection, a critical task in cybersecurity. In phishing detection, LGB is employed to analyse a diverse set of features extracted from URLs, email headers, and content. These features encompass domain characteristics, IP addresses, URL length, the presence of suspicious keywords, and more. By training LGB on a carefully labelled dataset comprising both legitimate and phishing instances, the model learns to identify complex patterns and relationships indicative of phishing attacks. During the training process, LGB employs its gradient boosting framework to construct an ensemble of decision trees. Through successive iterations, LGB continuously improves the model's ability to differentiate between legitimate and phishing instances by rectifying errors made by preceding trees. This approach allows LGB to effectively capture the nuances and subtle indicators of phishing attempts. The interpretability of LGB also proves valuable in phishing detection: it provides insights into the importance of different features, allowing analysts to understand the contributions of various indicators in identifying phishing attempts. The integration of LGB into a phishing detection system involves data preparation, feature extraction, model training, and real-time prediction. The trained LGB model becomes a crucial component of a larger system that incorporates real-time data collection, pre-processing, and user notification. By leveraging LGB's advantages, such as its efficient handling of features, accurate predictions, scalability, and interpretability, organizations can enhance their defence against phishing attacks, mitigate risks, and safeguard sensitive information.

Machine learning algorithms show promise in reducing phishing in online transactions, enabling faster and more accurate decision-making. In this paper, a pioneering method is introduced for classifying phishing prevention in online transactions, employing a range of widely adopted machine learning algorithms: Logistic Regression, Random Forest, SVM, and Decision Tree. The proposed model can efficiently and accurately reduce phishing, enabling faster and more accurate user decision-making. Furthermore, our model includes a user-friendly graphical user interface (GUI) that can be used to alert users. Overall, this research presents a significant contribution to cybersecurity, providing a powerful tool for users to make decisions and improve the reduction of phishing outcomes.

LITERATURE SURVEY

[1] Employed NB algorithms to identify malicious websites. NB is a slow learner and does not store previous results in memory; thus, the efficiency of the URL detector may be reduced.
[2] Utilized multiple ML methods for classifying URLs and compared the performance of different types of ML methods. However, there was no discussion of the retrieval capacity of the algorithms.
[3] Applied multiple classification algorithms for detecting malicious URLs. The outcome of the experiments demonstrated that the system's performance was better than other ML methods. However, it lacks the ability to handle a larger volume of data.
[4] Proposed a deep learning-based URL detector. The authors argued that the method could produce insights from URLs. Deep learning methods demand more time to produce an output; in addition, the method processes the URL and matches it against a library to generate an output.
[5] Developed a crawler to extract URLs from data repositories and applied a lexical-features approach to identify phishing websites. The performance evaluation was based on a crawler-built dataset, so there is no assurance of the effectiveness of the URL detector with real-time URLs.
[6] A CNN-based detection system for identifying phishing pages. A sequential pattern is used to find URLs. Existing research shows that the performance of CNN is better for retrieving images rather than text.

I. EXISTING WORK

Email filtering: Machine learning algorithms can be used to analyse the content and metadata of emails to determine whether they are likely to be phishing attempts. Emails identified as phishing attempts can be filtered out or flagged for review.

Website classification: Machine learning algorithms can be trained to classify websites as legitimate or phishing sites based on characteristics such as URL structure, content, and SSL certificate. Users can be warned or prevented from accessing known phishing sites.

User behaviour analysis: Machine learning algorithms have the capability to analyse user behaviour, allowing them to detect potential phishing attacks by identifying patterns such as a sudden increase in visits to unfamiliar websites or frequent input of login credentials, which could serve as indicative signals of a phishing attempt.

Domain analysis: Machine learning algorithms can analyse domain names and identify patterns commonly used in phishing attacks, such as misspellings or variations of well-known domains.

II. DATASET PREPARATION

The dataset includes 22 phishing-prediction-related factors and 651,190 observations. It was taken from Kaggle, and it consists of the following variables:

1) use_of_ip: a function to check whether the given URL contains an IP address (either IPv4 or IPv6).
2) abnormal URL: abnormal URLs may include misspellings or variations of popular websites, such as "g00gle.com" instead of "google.com."
3) google index: a function to see if the URL is indexed on Google.
4) count (.): a function to count the number of dots (.) in the given URL.
5) count (www): a function to count the number of occurrences of "www" in the URL.
6) count (@): a function to count the number of @ characters in the URL.
7) count_dir ("/"): a function to count the number of /'s in the given URL.
8) count_embed_domian ("//"): a function to count the number of //'s in the given URL.
9) short URL: a function to see if the URL is shortened.
10) count (https): a function to count the number of occurrences of "https" in the URL.
11) count (http): a function to count the number of occurrences of "http" in the URL.
12) count (%): a function to count the number of % characters in the URL.
13) count (?): a function to count the number of ? characters in the URL.
14) count (-): a function to count the number of - characters in the URL.
15) count (=): a function to count the number of = characters in the given URL.
16) URL length: a function to get the length of the URL.
17) hostname length: a function to get the hostname length.
18) sus_url: a function to detect suspicious words, if any.
19) count-digits: a function to count the number of digits in the URL.
20) count-letters: a function to count the number of letters in the given URL.
21) fd_length: a function to get the first-directory length.
22) tld_length: a function to get the length of the TLD from the tld column created above.

Several procedures would be involved in creating the dataset for the study, including:

1) Data cleaning: identifying any incorrect or missing data and determining how to deal with it (for example, imputing missing values or removing observations with missing data). It could also entail looking for outliers and deciding how to deal with them.
2) Data transformation: this could entail scaling, normalizing, or establishing new variables based on existing ones to make the data more analytically useful.
3) Feature selection: choosing a portion of the available variables for analysis depending on how well they relate to and predict the research topic.
4) Data division: the data will be divided into distinct training and testing sets, allowing for robust evaluation and verification of the model's performance.
5) Model training and evaluation: using the training set as the basis, several machine learning models might be developed and assessed, and their performance compared on the testing set.
6) Reporting the findings: the study would present the analysis' findings, along with any conclusions and suggestions based on them. The dataset would also need to be correctly referenced in the study to guarantee proper credit to the data source.

In conclusion, there are several critical processes in the preparation of this phishing-prediction dataset, including data cleaning, transformation, feature selection, data splitting, model training and evaluation, and reporting of the results. The dataset must be prepared correctly to produce accurate and trustworthy results and to guarantee the validity of any conclusions or suggestions made from the study.

III. ALGORITHM DETAILS

We are using the following machine learning algorithms.
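As groundwork, a few of the URL feature functions listed above can be sketched with the Python standard library. The original Kaggle-notebook implementations are not reproduced in the paper, so the names, the suspicious-word list, and the exact definitions below are assumptions for illustration only.

```python
# Hedged sketch of a few URL features described above (use_of_ip, counts,
# lengths, suspicious-word flag). Definitions are assumed, not the authors'.
import re
from urllib.parse import urlparse

# Assumed suspicious-word list; the paper does not give the real one.
SUS_WORDS = re.compile(
    r"paypal|login|signin|bank|account|update|free|bonus", re.IGNORECASE
)
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def extract_features(url: str) -> dict:
    # Add a scheme if missing so urlparse fills in the hostname.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.netloc
    return {
        "use_of_ip": int(bool(IPV4.match(host))),
        "count_dot": url.count("."),
        "count_www": url.count("www"),
        "count_at": url.count("@"),
        "count_dir": parsed.path.count("/"),
        "url_length": len(url),
        "hostname_length": len(host),
        "count_digits": sum(c.isdigit() for c in url),
        "count_letters": sum(c.isalpha() for c in url),
        "sus_url": int(bool(SUS_WORDS.search(url))),
    }

feats = extract_features("http://192.168.0.1/paypal-login/update")
print(feats["use_of_ip"], feats["sus_url"], feats["count_dir"])  # 1 1 2
```

Rows of such feature dictionaries, labelled phishing or legitimate, are exactly the kind of tabular input the classifiers below consume.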
• Random Forest

Random forest is a machine learning algorithm that can be used to prevent phishing attacks. It creates many decision trees, each trained on a subset of the available data. The decision trees are then combined to create a single, more robust model that can make accurate predictions about new, unseen data.

In phishing prevention, the random forest can be used to build a classifier that distinguishes between legitimate and phishing websites. The classifier can be trained on a dataset of known phishing and legitimate websites and then used to predict the likelihood that a new website is phishing.

The key advantage of using the random forest in phishing prevention is its ability to handle large and complex datasets and to identify the most important features distinguishing phishing from legitimate websites. This allows the algorithm to generalize well to new and unseen data, making it a powerful tool for detecting and preventing phishing attacks.

• LightGBM (LGB)

LightGBM focuses on instances with larger gradients to prioritize learning from informative examples. One of its key features is computational efficiency, achieved through several optimizations, such as histogram-based binning, which reduces memory usage and speeds up training. LightGBM also supports parallel and GPU learning, allowing it to handle large-scale datasets efficiently. In addition, LightGBM provides built-in support for handling categorical features, which are common in real-world datasets, converting them into numerical representations that can be processed by the algorithm. LightGBM offers a wide range of hyperparameters that can be tuned to optimize the model's performance; these control various aspects of the algorithm, such as tree structure, boosting parameters, regularization, learning rate, and more. Overall, LightGBM is known for its ability to handle large datasets, its speed, and its accuracy, and it has gained popularity in both research and industry due to its efficiency and effectiveness in solving a variety of machine learning tasks.
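The random-forest step described above can be sketched with scikit-learn. This is a stand-in, not the paper's code: the data below is synthetic, the five "feature" columns are hypothetical placeholders for extracted URL features, and all hyperparameters are arbitrary.

```python
# Hedged sketch: train a random forest on synthetic stand-in data and
# inspect feature importances (the "feature importance" advantage above).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Five hypothetical URL features; only columns 0 and 2 drive the label.
X = rng.normal(size=(400, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)  # 1 = phishing, 0 = legitimate

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=42
)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_tr, y_tr)

# The forest should rank the two informative columns (0 and 2) highest.
top_two = np.argsort(forest.feature_importances_)[::-1][:2]
print(sorted(top_two.tolist()), round(forest.score(X_te, y_te), 2))
```

On real data, the `feature_importances_` ranking is what lets analysts see which URL characteristics actually separate phishing from legitimate sites.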
• Support Vector Classification

Support vector classification is another machine learning algorithm that can be used to prevent phishing attacks. Like the random forest, it is a supervised learning algorithm that can be trained on a dataset of known phishing and legitimate websites to build a classifier that distinguishes between them.

Support vector classification works by finding the hyperplane that maximally separates the two data classes. In other words, it identifies the line (in two dimensions) or the plane (in three dimensions) that best separates the phishing and legitimate websites in the feature space. The algorithm then uses this hyperplane to classify new, unseen websites as either phishing or legitimate.

The key advantage of using support vector classification in phishing prevention is its ability to handle complex and non-linear data. Phishing websites can be very sophisticated and use various techniques to mimic legitimate websites, making it difficult to identify them based on simple features. Support vector classification can overcome this challenge by identifying the hyperplane that best separates the two data classes, even when the data is highly non-linear.

LGB's predictions can be used to trigger warning messages or flags when an email or URL is identified as potentially malicious. These warnings can be integrated into email clients, web browsers, or security software, providing users with alerts and advising them to exercise caution or avoid interacting with the flagged content. By integrating LGB's phishing detection capabilities into prevention systems, organizations can enhance their overall defence against phishing attacks. However, it is important to note that prevention efforts involve a combination of techniques, including user education, email and web filtering, multi-factor authentication, and other security practices, to effectively mitigate the risk of falling victim to phishing attacks.

IV. IMPLEMENTATION
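The support-vector step described above can be sketched with scikit-learn. The kernelized SVC handles the non-linear case via the kernel trick rather than a literal hyperplane in the input space; the data, decision rule, and parameters below are all illustrative assumptions.

```python
# Hedged sketch: an RBF-kernel SVC inside a scaling pipeline, trained on
# synthetic, non-linearly separable data standing in for real
# phishing/legitimate feature vectors.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
# Non-linear decision rule: class depends on distance from the origin,
# which no straight line in the input space can separate.
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X, y)

print(round(model.score(X, y), 2))
```

The RBF kernel implicitly maps points into a higher-dimensional space where a separating hyperplane exists, which is exactly the property the text appeals to for sophisticated phishing sites.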
VI. PROPOSED WORK

VII. CONCLUSION

Using machine learning algorithms such as Random Forest, Support Vector Classification, and the LGB classifier can greatly improve the prevention of phishing attacks on websites. By analysing large amounts of data, these algorithms can identify patterns and indicators of phishing attempts, allowing websites to take proactive measures to prevent them.

Random Forest algorithms can analyse features such as email sender information, URL characteristics, and message content to determine the likelihood of a phishing attempt. Support Vector Classification algorithms can classify emails and web pages as either phishing or legitimate based on their features and characteristics.

Using a combination of these algorithms, websites can improve their ability to detect and prevent phishing attempts, protecting their users from potentially harmful scams. Website owners need to prioritize the implementation of these machine-learning techniques in their security systems to maintain their users' safety and trust.

VIII. ACKNOWLEDGEMENT

We would like to express our sincere appreciation to Professor Rama Krishna K, Assistant Professor in the Department of Computer Science at Presidency University, for his invaluable guidance and support throughout our academic journey. His encouragement and insightful feedback have been instrumental in shaping our research work, and we would like to thank him for his mentorship and for providing us with opportunities to engage in meaningful research projects and inspiring lectures, which have contributed to our intellectual growth and development.

IX. REFERENCES

[3] E. Gandotra and D. Gupta, "An Efficient Approach for Phishing Detection using Machine Learning," Algorithms for Intelligent Systems, Springer, Singapore, 2021, https://doi.org/10.1007/978-981-15-8711-5_12.
[4] Hung Le, Quang Pham, Doyen Sahoo, and Steven C.H. Hoi, "URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection," Conference'17, Washington, DC, USA, arXiv:1802.03162, July 2017.
[5] Hong, "Lexical and Blacklisted Domains," Autonomous Secure Cyber Systems, Springer, https://doi.org/10.1007/978-3-030-33432-1_12.
[6] A. Aljofey, Q. Jiang, Q. Qu, M. Huang, and J. P. Niyigena, "An effective phishing detection model based on URL's character-level convolutional neural network," Electronics, 2020 Sep; 9(9):151.
Abstract—Heart disease is a significant cause of death and high-fat diets, can lead to hypertension, which can cause
worldwide, and its early detection and prediction can heart diseases. Heart diseases account for a significant
prevent its fatal consequences. Machine learning number of deaths worldwide, with more than 10 million
techniques have shown promise in predicting heart people succumbing to this condition each year. Early
disease accurately by utilizing patient data. This paper aims to explore the application of various machine learning models, including Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Gradient Boost Classifier, K-nearest neighbor, Naïve Bayes, Stochastic Gradient Descent, Support Vector Machine, and other ensemble methods, to predict heart disease in patients. The study utilizes a publicly available dataset that includes 303 patients with 14 features such as age, sex, chest pain type, blood pressure, and cholesterol levels. Data preprocessing involved handling missing values, encoding categorical features, and scaling numerical features. The models were trained and tested, and their performance was evaluated based on accuracy, precision, recall, and F1 score. The results indicated that the Random Forest Classifier outperformed the other models, achieving an accuracy of 90.0% in predicting heart disease. This study demonstrates that machine learning models can predict heart disease effectively and can be used as an early detection tool in clinical settings.

I. INTRODUCTION

[1] Heart Disease Prediction using Machine Learning Models is an innovative project aimed at developing an accurate and reliable predictive model to identify individuals at risk of developing heart disease. The project uses a variety of machine learning algorithms to create a powerful tool that can assist medical professionals in the early detection and prevention of heart disease.

[2] Any anomaly in the normal functioning of the heart can be categorized as a heart disease, and it can result in disturbances in other parts of the body. [3] Unhealthy lifestyle choices, such as smoking and alcohol consumption, increase this risk; early detection and adopting a healthy lifestyle are essential for preventing heart diseases. Medically, a healthy pulse rate should be between 60 and 100 beats per minute, and blood pressure should range between 120/80 and 140/90. Although heart diseases can affect both men and women of all ages, factors such as gender, diabetes, and BMI can contribute to their development.

The main focus of the healthcare industry today is to provide high-quality services and accurate diagnoses to patients. Although heart diseases have been identified as a leading cause of death worldwide, they can still be effectively managed and controlled. The timely detection of a disease is crucial to ensure its proper management and control. In this regard, our proposed work aims to detect heart diseases at an early stage, thus preventing any severe or fatal consequences.

[4] The primary objective of this project is to design a system that can analyze patient health records and identify the most critical features that contribute to the development of heart disease. By leveraging the power of machine learning models, the system can predict the likelihood of a patient developing heart disease, providing valuable insights to medical professionals on how to manage and treat their patients. This project is highly relevant in today's healthcare landscape, where heart disease is still a leading cause of death worldwide. By using cutting-edge machine learning techniques, this project has the potential to significantly improve patient outcomes and reduce healthcare costs associated with the management and treatment of heart disease.
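The preprocessing described in the abstract (imputing missing values, encoding categorical features, scaling numerical ones) can be sketched with scikit-learn; the toy records and column names below are illustrative stand-ins, not the paper's actual dataset:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy records standing in for the 303-patient dataset (columns are illustrative).
df = pd.DataFrame({
    "age": [63, 45, None, 52],
    "trestbps": [145, 130, 120, None],      # resting blood pressure
    "cp": ["typical", "atypical", "typical", "asymptomatic"],  # chest pain type
})

numeric = ["age", "trestbps"]
categorical = ["cp"]

preprocess = ColumnTransformer([
    # Impute missing numeric values with the median, then standardize.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    # One-hot encode the categorical chest-pain types.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X = preprocess.fit_transform(df)
print(X.shape)  # 4 rows: 2 scaled numeric columns + 3 one-hot columns
```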
II. LITERATURE REVIEW

Numerous studies have been conducted to predict heart diseases using machine learning. These studies have employed various data mining techniques and achieved varying levels of accuracy. The following section elaborates on these techniques:

[1] S. Srinivasan et al. propose the use of decision trees and random forest classifiers for predicting heart disease. The advantages of these methods include their interpretability and their ability to handle missing data. The limitations include their susceptibility to overfitting and their inability to handle non-linear relationships between features.

[2] A. K. Singh et al. compare the performance of naive Bayes and K-nearest neighbor algorithms for heart disease prediction. The advantages of these methods include their simplicity and efficiency. The limitations include their sensitivity to irrelevant features and the need for proper feature scaling.

[3] J. Chen et al. investigate the use of support vector machines and neural networks for heart disease prediction. The advantages of these methods include their ability to handle complex relationships between features and to generalize well to new data. The limitations include their computational complexity and the need for large amounts of training data.

[4] H. Wang et al. propose the use of ensemble methods, such as bagging and boosting, for heart disease prediction. The advantages of these methods include their ability to reduce overfitting and to combine the strengths of multiple models. The limitations include their increased complexity and the need for proper parameter tuning.

[5] A. Esteva et al. explore the use of deep learning methods, such as convolutional neural networks and recurrent neural networks, for heart disease prediction. The advantages of these methods include their ability to learn complex representations of data and to handle sequential data. The limitations include their high computational complexity and the need for large amounts of training data.

[6] S. K. Singh et al. propose the use of genetic programming for heart disease prediction. The advantages of this method include its ability to automatically discover complex relationships between features and to handle non-linear relationships. The limitations include its computational complexity and the need for proper parameter tuning.

[7] H. Raza et al. investigate the use of fuzzy logic for heart disease prediction. The advantages of this method include its ability to handle uncertainty and to incorporate expert knowledge. The limitations include its sensitivity to parameter tuning and its susceptibility to overfitting.

[8] A. Sharma et al. propose the use of principal component analysis for heart disease prediction. The advantages of this method include its ability to reduce dimensionality and to remove redundant features. The limitations include its inability to handle non-linear relationships and its sensitivity to outliers.

III. PROPOSED SYSTEM

1. Data Collection: Collect a dataset containing information about patients with and without heart disease, including demographics, medical history, lifestyle factors, and diagnostic test results.
2. Data Pre-processing: Clean and pre-process the data to remove any missing values, handle outliers, and normalize the data. This step also involves feature selection, where the most relevant features are selected to build the predictive model.
3. Model Selection: Evaluate the performance of different machine learning algorithms, such as KNN, decision trees, random forest, and support vector machines, for heart disease prediction. Select the most appropriate algorithm based on the evaluation metrics and the objectives of the project.
4. Model Development: Build the predictive model using the selected machine learning algorithm. Train the model on a portion of the dataset and evaluate its performance on the remaining portion using evaluation metrics such as accuracy, precision, recall, and F1-score.
5. Model Optimization: Fine-tune the model parameters and hyperparameters to achieve optimal performance. This step may involve using techniques such as grid search or Bayesian optimization to search for the best combination of parameters.
6. Model Validation: Validate the model on a new dataset to ensure its generalizability and robustness. This step may involve splitting the dataset into training, validation, and test sets, or using cross-validation techniques.
7. Model Interpretation: Interpret the results of the model and identify the most critical features associated with heart disease risk. This step may involve using techniques such as feature importance or partial dependence plots.
8. Evaluation: Evaluate the performance of the deployed model in real-world settings and monitor its performance over time. This step may involve collecting feedback from medical professionals and patients to improve the model's accuracy and usability.

Logistic Regression: A supervised learning algorithm that uses the logistic function in a logistic regression model to predict the
probability of an individual developing heart disease.
Support Vector Machine: A supervised learning algorithm
used for classification and regression. It may be used to
classify individuals as having or not having heart disease
based on their medical history, lifestyle, and other
characteristics.
These algorithms can be trained on heart disease datasets
and evaluated using various performance metrics such as
accuracy, sensitivity, and specificity. Based on their
performance, the most effective algorithm can be chosen for
heart disease prediction.
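The model comparison described above can be sketched with scikit-learn. Since the dataset itself is not reproduced here, a synthetic stand-in of the same shape (303 samples, 13 predictors) is generated; the scores it yields are therefore illustrative only, not the paper's reported 90.0%:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the 303-patient, 13-predictor heart dataset.
X, y = make_classification(n_samples=303, n_features=13, n_informative=8,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = (accuracy_score(y_test, pred), f1_score(y_test, pred))

best = max(scores, key=lambda n: scores[n][0])  # pick the best by accuracy
for name, (acc, f1) in scores.items():
    print(f"{name}: accuracy={acc:.3f} f1={f1:.3f}")
print("best:", best)
```

The same loop extends naturally to the other classifiers listed in the abstract (Naïve Bayes, SGD, SVM, gradient boosting) by adding entries to the `models` dictionary.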
V. RESULTS
Abstract— The farming community in India faces numerous challenges, including unpredictable weather, pests, and fluctuations in crop prices. To empower farmers with the necessary information to make informed decisions about crop prices, we have developed a website called "Smart System for Crop Price Prediction using Machine Learning." We use advanced machine learning algorithms like XGBoost, ARIMA, and VAR to analyze historical data and identify trends and patterns that help predict future crop prices accurately. Our website uses a comprehensive approach to determine the best algorithm for predicting crop prices, ensuring farmers have access to the most reliable and up-to-date information. Our ultimate goal is to provide farmers with the tools and knowledge to manage their crops effectively, minimizing financial risks and contributing to the growth of the agricultural sector. By leveraging the power of machine learning, we hope to make crop price predictions more accessible, reliable, and accurate, resulting in better planning and management of crop production.

Keywords—Smart System for Crop Price Prediction using Machine Learning, XGBoost, crop price predictions.

I. INTRODUCTION

Agriculture plays a crucial role in India's economy, with over 58% of the population relying on it for their livelihoods. However, the sector faces challenges such as labor shortages, changing consumer preferences, and price fluctuations, which can significantly impact farmers' incomes and the country's GDP. To address these issues, innovative technologies such as machine learning and farm automation have been adopted in agriculture. Machine learning algorithms can enhance crop productivity and quality by accurately predicting and estimating farming parameters. Accurate forecasting of crop yields and prices can also assist farmers in selling their produce at the right time and for a good price, thereby mitigating the financial risks faced by farmers due to price fluctuations after the harvest.

Our website, "Smart System for Crop Price Prediction Using Machine Learning," provides farmers with accurate crop price forecasts that enable them to plan and manage their crops better, resulting in fewer losses and better price management. By leveraging machine learning algorithms, our platform can improve farmers' decision-making capabilities and help mitigate the risks associated with agriculture, leading to better crop management and improved incomes for farmers. With the adoption of innovative technologies such as machine learning, the agriculture sector in India can become more efficient, productive, and sustainable.

II. LITERATURE REVIEW

[1] This research paper discusses an automated agriculture commodity price prediction system that utilizes machine learning techniques. The system was tested on datasets from the Malaysian agriculture industry, with the random forest algorithm found to be the most accurate and stable. The paper emphasizes the system's potential to assist farmers in decision-making.

[2] The article proposes using supervised machine learning algorithms to predict crop prices. The study compares the performance of six different algorithms and concludes that Random Forest and Support Vector Regression are the most effective in predicting crop prices, based on historical data.

[3] The paper explores the use of predictive analytics in agriculture to forecast the prices of Areca nuts in Kerala, India. The study employs a hybrid model that combines Artificial Neural Network (ANN) and Autoregressive Integrated Moving Average (ARIMA) models. The results suggest that the proposed model provides accurate forecasts for Areca nut prices.

[4] The paper proposes a crop prediction system using machine learning algorithms. The system uses historical data of crop yield and weather conditions to predict the yield of the upcoming crop season. The study compares the performance of different algorithms and concludes that the Random Forest algorithm provides the best accuracy for crop yield prediction.

[5] This review article discusses the application of random forest and decision tree regression for crop price prediction. The authors provide an overview of the importance of crop price prediction in agriculture and review various studies that have used these algorithms. They also suggest areas for future research in this field.

III. OBJECTIVE

A preliminary study will be conducted to investigate the viability of utilizing machine learning for two purposes:
• Firstly, to predict the modal price of specific crops through the application of machine learning algorithms.
• Secondly, to develop and deploy a website that utilizes an appropriate machine learning approach for crop price prediction.

IV. EXISTING METHODS

Decision Tree Regression, LSTM, ARIMA, and Vector Autoregression are popular methods that find use in machine learning and statistical applications. Decision Tree Regression uses recursive splitting of data based on input features to construct a tree-like model for predicting continuous numerical values. LSTM is a recurrent neural network that can handle long-term dependencies in sequential data, making it well-suited for tasks such as language modeling, speech recognition, and sequence prediction. ARIMA is a statistical model used for time series analysis and forecasting, which assumes stationarity of the time series and comprises an autoregressive component, a differencing component, and a moving average component. Vector Autoregression is a statistical model that analyzes the relationship between multiple time series variables, with each variable modeled as a linear function of its own lagged values. These methods have unique features and find applications in various data analysis and prediction tasks.
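The shared idea behind ARIMA's autoregressive component and Vector Autoregression — each value expressed as a linear function of its own lagged values — can be illustrated with a least-squares AR(1) fit in NumPy. The series below is simulated; a real project would typically use a dedicated library such as statsmodels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: x[t] = 0.8 * x[t-1] + noise.
true_phi = 0.8
x = np.zeros(500)
for t in range(1, 500):
    x[t] = true_phi * x[t - 1] + rng.normal()

# Estimate the lag coefficient by least squares on (x[t-1], x[t]) pairs --
# exactly the "linear function of its own lagged values" idea.
X_lag, y = x[:-1], x[1:]
phi_hat = (X_lag @ y) / (X_lag @ X_lag)

print(round(phi_hat, 2))          # should land close to the true 0.8
forecast = phi_hat * x[-1]        # one-step-ahead forecast
```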
A. Drawbacks of the existing methods

Decision Tree Regression can be susceptible to overfitting, especially if the tree is overly complex and deep, leading to poorer generalization performance on new data. Additionally, small changes in the data can significantly impact the tree structure and the predictions made.

LSTMs, on the other hand, while effective in handling long-term dependencies in sequential data, are computationally expensive and require more training time and resources compared to simpler models. Furthermore, LSTMs may encounter issues such as the vanishing gradient problem, especially for longer sequences, which can lead to suboptimal model performance.

V. PROPOSED METHODS

Our proposed solution is a web platform that uses machine learning algorithms to predict crop prices and suggest the best crop to cultivate. The platform aims to help farmers make informed decisions to avoid market fluctuations and maximize profits. By utilizing algorithms such as VAR, XGBoost, and ARIMA, our solution will improve the livelihoods of farmers and contribute to the growth of the agricultural sector in India.

Fig. 11. Proposed Architecture (Collections of Agricultural Datasets → Selection of the parameters → Prediction based on previous datasets → Result and suggestions)

A. Step 1:
The datasets have undergone collection and refinement, with a focus on historical data to identify trends and patterns that can help in predicting future crop prices.

B. Step 2:
Various analyses have been conducted to develop a prediction model; the input datasets provide information on the price and date in particular regions.

C. Step 3:
The prediction model is built using the XGBoost algorithm; crop analysis and prediction have been performed, taking into account various datasets.

D. Step 4:
Through the process of crop analysis and prediction, the price of a particular crop can be predicted, providing better insights to farmers. Armed with this information, farmers can make informed decisions on which crops to sow in order to decrease their losses.
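Steps 1–4 can be sketched end to end by framing a price series as a supervised problem with lagged prices as features and fitting a gradient-boosted tree model. Since the xgboost package and the real agmarknet data are not assumed here, scikit-learn's GradientBoostingRegressor and a synthetic seasonal price series stand in:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic monthly modal prices with trend and seasonality (illustrative only).
t = np.arange(120)
prices = 2000 + 10 * t + 300 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 50, 120)

# Build lag features: predict price[t] from the previous three months.
LAGS = 3
X = np.column_stack([prices[i:len(prices) - LAGS + i] for i in range(LAGS)])
y = prices[LAGS:]

# Chronological split: train on the past, hold out the most recent year.
X_train, X_test = X[:-12], X[-12:]
y_train, y_test = y[:-12], y[-12:]

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

mape = np.mean(np.abs(pred - y_test) / y_test) * 100
print(f"MAPE on held-out year: {mape:.1f}%")
```

Swapping in `xgboost.XGBRegressor` with the same `X`/`y` framing is a drop-in change; the chronological split matters because a random split would leak future prices into training.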
VI. SYSTEM REQUIREMENTS

A. Hardware Requirements
System: Intel i5.
Hard Disk: 512 GB.
RAM: 8 GB.
Network: Wi-Fi/mobile network.
Any desktop/laptop system with the above configuration or a higher level.

B. Software Requirements
Operating system: Windows 7 or higher.
Coding languages: Python, HTML, JavaScript.
Version: Python 3.7.0.
IDE: Python 3.7.0 IDLE.
ML packages: NumPy, Pandas, Sklearn, Flask, PymySql.
ML algorithms: ARIMA, Vector Autoregression, and XGBoost.
Other requirements: a verified resource for gathering the right dataset [6]: www.agmarknet.gov.in

VII. EXPECTED OUTCOMES

Our proposed website is designed to cater to the needs of farmers by incorporating advanced machine learning techniques like Vector Autoregression (VAR), XGBoost, and ARIMA to predict crop prices with a high degree of accuracy. These algorithms will aid farmers in making informed decisions about when to plant and harvest crops, which in turn can help minimize crop wastage and increase profits.

Moreover, our platform aims to provide comprehensive information about different crop types, including their characteristics, growth requirements, and harvest times. By offering insights on the optimal growing conditions for each crop, such as the type of soil, water requirements, and temperature range, farmers can make informed decisions about the most suitable crops to cultivate based on their specific geographic region and climate.

In addition, our website will offer detailed information on the different stages of crop growth, including planting, irrigation, fertilization, and pest control. With this valuable information, farmers can take the necessary measures to ensure optimal crop yield and reduce the risk of crop loss.

Overall, our website's combination of crop price prediction and detailed crop information will provide farmers with the necessary tools to make data-driven decisions about crop cultivation. This, in turn, will lead to higher profitability and reduced wastage.

VIII. CONCLUSION

In recent years, the adoption of machine learning algorithms in crop price forecasting has gained considerable attention due to its potential to provide more accurate predictions based on historical data and other relevant factors. After extensive research on various forecasting techniques, our team chose to utilize a multi-variate time series algorithm, specifically Extreme Gradient Boosting (XGBoost), for our project.

XGBoost is a high-level machine learning algorithm that has gained widespread popularity for its application in statistical projects, including time series forecasting. It is an ensemble algorithm that combines the predictions of multiple decision trees to generate more precise predictions. XGBoost can handle both categorical and continuous variables and is known for its efficiency, scalability, and accuracy.

The resulting web page will enable farmers to input details about their crops and receive real-time price forecasts based on the XGBoost algorithm. The implementation of XGBoost in this project will result in more precise predictions, enabling farmers to make better-informed decisions about the optimal time to sell their produce.

The use of machine learning algorithms, such as XGBoost, in crop price forecasting has the potential to bring a revolutionary change in agriculture in India, providing farmers with the necessary tools to make informed decisions and enhance their income. Our team's project showcases the potential of machine learning in agriculture and highlights the possibility of future innovations in this field.

IX. ACKNOWLEDGMENT

We would like to acknowledge the support and guidance of our project supervisor, Dr. Mohammadi Akheela Khanum, Presidency University, who provided invaluable insights and direction throughout the duration of this work. We also extend our thanks to the academic community, whose research and publications provided the foundation of our project. Special thanks go to the authors of the various papers and articles that we referenced in our work.

REFERENCES

[2]. Ranjani Dhanapal et al., "Crop price prediction using supervised machine learning algorithms," 2021 J. Phys.: Conf. Ser. 1916 012042.
[3]. Kiran M. Sabu, "Predictive analytics in Agriculture: Forecasting prices of Areca nuts in Kerala," Procedia Computer Science 171 (2020) 699–708.
[4]. Pavan Patil, Virendra Panpatil, Prof. Shrikant Kokate, "Crop Prediction System using Machine Learning Algorithms," Volume: 07 Issue: 02, Feb 2020, e-ISSN: 2395-0056.
CHANDANA S (20191COM0040), UZMA FATHIMA SHAIK (20191COM0213)
B.Tech Computer Engineering, Presidency University, Bangalore, India
201910100497@presidencyuniversity.in, 201910101646@presidencyuniversity.in
Abstract— Food quality is a key concern worldwide, and to reduce the rate of deterioration, it is crucial to keep the environment in food storage warehouses at a suitable temperature.

In general, most cooking methods will keep food fresh. Various chemicals or ingredients are added to food to make it look fresh or attractive. Most food is now preserved with chemicals that make it unhealthy. These pollutants can cause many diseases, which leads consumers to crave healthy food.

Today's food inspection methods are limited to weight, volume, color, and detection, so they cannot provide much of the information needed to judge food quality. The quality of the food must be tested and protected against decay and deterioration due to atmospheric factors such as heat, humidity, and darkness.

The Internet of Things (IoT)-based system to detect the quality of food products is an integrated detection and management information system made up of smart devices. It uses sensors to assess the quality and freshness of food and can identify food spoilage early, before symptoms appear. The proposed approach for managing food quality is highlighted in the article. The study improves people's quality of life by using intelligent sensor networks to alert people when food is about to expire or when particular aspects of the food packaging have changed. These methods can make greater use of this information in the future to reduce food spoilage.

Keywords—Food quality, IoT, Detection, Sensors

XCVII. INTRODUCTION

For any type of living thing to sustain the energy necessary for survival, food is a basic requirement. Nutrients and energy from a nutritious diet keep the body strong and active. Pesticides are frequently employed by farmers in agriculture to increase productivity, and these pesticides play a significant role in food contamination. Eating unhealthy food exposed to pesticides is like opening the door to sickness. Unhealthy food leads to disease, obesity, and nutrient deficiency. Young people today are very interested in living healthy lifestyles and taking care of their physical health. The quality of a meal is therefore crucial for maintaining fitness. In today's world, food poisoning is also a serious issue; it becomes the cause of numerous ailments. A thorough investigation is conducted to determine the food's quality. In order to better serve the needs of people, scientists are focusing on the types of bacteria that are present in food. The ability to know about food quality is largely made possible by science and technology. It is clear from the current scenario that we require a device that can
A research paper on a Food Quality Detection and Monitoring System was presented at the 2020 IEEE International Students' Conference on Electrical, Electronics and Computer Science by Atkare Prajwal, Patil Vaishali, Zade Payal, and Dhapudkar Sumit. Food plays a big part in our day-to-day life. With the development of globalization, the quality of food is diminishing day by day. In general, most cooking styles will keep food fresh. Varied chemicals or constituents are added to food to make it look fresh or appealing. Most food is now conserved with chemicals that make it unhealthy. These impurities can cause numerous illnesses, which multiplies the demand for healthy food from consumers. People need organic food for health. Thus, to avoid food problems without human intervention, we need tools like these that help determine food quality. Such tools should be used to guide us in our consumption of clean food.

Fig 1. Block diagram of proposed method

In this project we propose to make an electronic device that is capable of detecting food spoilage and gives an indication of whether the food substance is fit or unfit for human consumption. In this system, we use an ESP32 module as the basis of the system, connecting a gas sensor, a temperature sensor, and an LCD screen to display related information. The sensors calculate the freshness level and quality level of the food, and the interpreted readings are output to us on the LCD so that we can check the quality of the food. This is done with great care and sensor sensitivity.
CI. METHODOLOGY

Hardware Components

Arduino Uno

The Arduino Uno is a microcontroller board built around the ATmega328P. It boasts several features, including 6 analogue inputs, a 16 MHz quartz crystal, a USB connection, a power jack, an ICSP header, and a reset button. In addition, it comes with 14 digital input/output pins, 6 of which can be used as PWM outputs. The board contains everything needed to support the embedded controller, making it an excellent choice for both beginners and experts. Getting started is as simple as plugging the board into a computer using a USB cable, connecting it to an AC-to-DC adapter for power, or using a battery.

The gas sensor module comes equipped with both digital and analogue output pins. The digital pin sends a high signal when the concentration of these gases in the air exceeds a specific threshold level. The threshold can be adjusted using the on-board potentiometer. On the other hand, the analogue output pin generates a voltage signal that can provide an approximate measurement of the gas concentration in the surrounding air.

Gas Sensor

Gas sensors are instrumental in determining the concentration of gas in the surrounding environment and how it changes. By using electrical signals, gas sensors can provide information about the type and quantity of gas present, as well as any changes in gas concentration [91-93].

Pressure Sensor
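The gas sensor's threshold behaviour described above can be mimicked in plain Python; on the actual board this logic runs on the microcontroller, and the ADC values and threshold below are made-up illustrations, not calibrated figures:

```python
# Hypothetical ADC readings from the gas sensor's analogue pin (0-1023 scale);
# higher values correspond to a higher concentration of spoilage gases.
ADC_THRESHOLD = 400  # illustrative threshold, set via the on-board potentiometer

def classify_freshness(adc_reading: int) -> str:
    """Mirror the digital pin: report 'spoiled' once the threshold is exceeded."""
    return "spoiled" if adc_reading > ADC_THRESHOLD else "fresh"

readings = [120, 250, 410, 730]
for r in readings:
    print(r, classify_freshness(r))
```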
Abstract - Number plate detection is an image recognition technique that identifies vehicles by their number (licence) plates. The goal is to create and put into use a reliable vehicle identification system that uses the licence plate to identify the vehicle. This system can be put in place at the entryway of parking areas, toll stations, or any other private location, such as a college, in order to keep track of arriving and departing cars. It can be utilised to limit entrance to the building to authorised cars only. The created system takes a picture of the front of the car, finds the licence plate, and then scans the plate. Using image processing, the car licence plate is retrieved from the picture. Character recognition is accomplished via optical character recognition (OCR).

Python programming is used to identify licence plate numbers. For this project, we'll use Python Pytesseract to extract the letters and numbers from the licence plate and OpenCV to identify the licence number plates. We'll create a Python programme to automatically identify the licence plate.

Key words: Vehicle licence plate images, OpenCV, Pytesseract OCR, licence plate recognition
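Raw Pytesseract output is often noisy (stray punctuation, spaces, lowercase), so a common post-processing step is to clean it and validate it against the expected plate layout. The sketch below uses Python's re module; the Indian plate pattern is an illustrative assumption, not something specified in this paper:

```python
import re
from typing import Optional

# Typical Indian plate layout: 2 letters, 2 digits, 1-2 letters, 4 digits.
PLATE_RE = re.compile(r"[A-Z]{2}\d{2}[A-Z]{1,2}\d{4}")

def normalize_plate(ocr_text: str) -> Optional[str]:
    """Strip non-alphanumerics from raw OCR text and validate the result."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", ocr_text).upper()
    return cleaned if PLATE_RE.fullmatch(cleaned) else None

print(normalize_plate(" ka-01 ab 1234 "))  # KA01AB1234
print(normalize_plate("garbage"))          # None
```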
Summary: The suggested solution utilises the smart parking service (SPANS), a system for locating available parking spaces using computer vision methods. The proposed system makes use of the SPANS camera to collect images and information about parking spaces. The proposed system takes a picture of a vehicle when it is discovered and uses that image to determine the vehicle's number plate. As a result, the device saves the identification number, and public entities like traffic departments can access this information.

• Create a real-time ANPR system using OpenCV that integrates these components to enable reliable and efficient recognition of license plates from images or video streams.

VI. DESIGN CONSIDERATIONS

The design of the system aims at the following five points:
• Affordable: The system must be affordable, as price is one of the main factors kept in mind during the design phase.
• Movable: The system is to be movable and easy to use; this web app can be accessed through a phone as well.
• Accurate: The system must be accurate; thus, the most accurate algorithms have been chosen.

VIII. IMPLEMENTATION

Only the administrator can log in, and only if the username and password are correct; if not, an error message is shown. The user must sign in and have their identity validated before they can check the details and perform operations. Before acquiring access, they must first open the login page and submit the necessary data.

Page for the administrator: By logging into the page, the administrator can perform many operations: adding vehicle details to the database, removing vehicle details, and accessing the details of the gates through which vehicles enter and exit.

Website home page: From the home page, the website admin can start the camera feed of the system, where the detection of the vehicle and of the number plate takes place.

IX. CONCLUSION

This ANPR system developed using OpenCV has shown promising results in accurately recognizing number plates and has successfully achieved the primary objective of automatic number plate recognition.

This vehicle gate management system is fully automated and can be tailored to any commercial or industrial setting with minimal human intervention and programming. This system detects the vehicle without any high-end sensors, making it a cost-effective system. This system, with proper mechanical assistance and design, can be implemented for real-time use in any industrial or institutional parking area. This application can be broadly integrated with parking ticket vending machines, monitoring systems, RFID-enabled boom barriers, and so on.

X. FUTURE WORK

• Improve accuracy: ANPR systems are heavily dependent on the accuracy of the OCR engine. Various techniques can be explored to improve accuracy, such as pre-processing steps like noise reduction, image thresholding, and image enhancement.
• Integration with other systems: It can be integrated with other systems like traffic management, toll collection, and parking management systems to make these systems more efficient to use.
• Smartphone integration: It is possible to develop this project further as a mobile application that can be installed on a phone to make it much easier to use.

XI. REFERENCES

[1] "License plate recognition system using OpenCV in python" by G. Naresh Reddy and M. Veerraju. International Journal of Advanced Research in Computer Science, Volume 8, Issue 4, July-August 2017.
[2] "Automatic Number Plate Recognition System Based on OpenCV" by S. K. Singh and A. K. Singh. Proceedings of the 2017 International Conference on Computing and Communication Technologies (ICCCT), Volume 2, 10 October 2017.
[3] "Automatic Vehicle License Plate Recognition Using OpenCV and SVM" by D. Li, X. Li, and Q. Liu. Proceedings of the 2015 IEEE International Conference on Progress in Informatics and Computing (PIC), Volume 1, 2 December 2015.
[4] "Automatic number plate recognition system using OpenCV and Tesseract OCR" by S. S. Kumar and S. S. S. Sree. 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pages 1-6, 3 December 2017.
[5] "License Plate Recognition System using OpenCV and Convolutional Neural Network" by S. Goyal, V. Singh, and K. Kumar. 2020 IEEE 7th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), pages 1-5, 9 August 2020.
[6] Mahalakshmi, S.; Tejaswini, S. Study of Character Recognition Methods in Automatic License Plate Recognition (ALPR) System. Int. Res. J. Eng. Technol. IRJET 2017, 4, 1420–1426.
[7] Patel, C.I.; Shah, D.; Patel, A. Automatic Number Plate Recognition System (ANPR): A Survey. Int. J. Comput. Appl. 2013, 69, 21–33.
[8] Cheng, G.; Zhou, P.; Han, J. Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7405–7415.
[9] Cheng, G.; Zhou, P.; Han, J.; Xu, D. Learning Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection. IEEE Trans. Image Process. 2019, 28, 265–278.
[10] Saha, S.; Basu, S.; Nasipuri, M. Automatic Localization and Recognition of License Plate Characters for Indian Vehicles. Int. J. Comput. Sci. Emerg. Technol. IJCSET 2011, 2, 520–533.
Bike Crash Detection And Alert System Using IMU

Y Rashmi, Computer Science and Engineering, Presidency University, Bangalore, India, rashmirajuy@gmail.com
Yashaswini R, Computer Science and Engineering, Presidency University, Bangalore, India, yashaswini.r.121@gmail.com
Dr. Neha Singh, Assistant Professor, CSE, Presidency University, Bangalore, India, singhgaur.neha@gmail.com
IV. METHODOLOGY
The system works on 5 V with a 1 A load current. It has an ADXL335 accelerometer, a tilt sensor, a limit switch, and a push button as inputs to the Arduino, and a GSM 800C module, a GPS module, a buzzer, and an LED as outputs.

Fig. 2. Arduino UNO

The Arduino UNO is the brain of the project. It is an open-source microcontroller board based on the Microchip ATmega328P, to which the main project code is uploaded.

B. Tilt Sensor

C. Limit Switch

Fig. 4. Limit Switch

A limit switch is used to control a parameter and stop it from going too far; it does this automatically, without manual intervention. Here it serves as a sensor at the front and back of the bike: when the bike is hit, the switch registers the impact as a crash or accident. The limit switch does not need any power supply to function, as it works like a push-button switch; it simply transfers its state to the Arduino UNO.

E. GPS Module

Fig. 6. GPS Module

The GPS module is used to send the location coordinates of the bike crash site to the registered mobile numbers.

F. GSM 800C Module

Fig. 7. GSM 800C Module

The GSM module automatically sends data such as the GPS coordinates to the registered mobile contacts of the crash victim through an SMS.

EXPECTED RESULT
● The project complied with all of the high-level specifications established at its outset.
● The device will be able to distinguish crashes from drops of the bike or quick controlled stops, and will detect crashes with over 10 g of force with accuracy.
● The device can convey a message swiftly, with the time and location of the accident, to the emergency contact(s) within two minutes of the collision.
● Finally, the device is small enough to enable mounting on the majority of motorcycles.

CONCLUSION
This article discusses the pressing need for an efficient bicycle accident detection and notification system to reduce fatalities and injuries resulting from traffic accidents. To address this issue, we propose the development of a self-contained bicycle crash detection device that can accurately detect accidents and promptly notify relevant authorities with precise location information to expedite emergency response and save lives. The proposed device uses advanced technology and algorithms to detect accidents based on parameters such as impact force, collision angle, and sudden changes in speed or direction, and employs a cellular network to send a notification to emergency contacts. The implementation of this device has the potential to significantly reduce the fatality rate associated with bicycle accidents by ensuring prompt notification and response from emergency services. We conclude that the device represents a crucial step towards mitigating the alarming impact of bicycle accidents and saving lives by enabling timely and targeted intervention.

REFERENCES
[1] Nicky Kattukkaran, Mithun Haridas T. P. & Arun George, "Intelligent Accident Detection and Alert System for Emergency Medical Assistance".
[2] Aboli Ravindra Wakure, Apurva Rajendra Patkar, "Vehicle Accident Detection and Reporting System Using GPS and GSM," IJERGS, April 2014.
[3] Damini S. Patel & Namrata H. Sane, "Real Time Vehicle Accident Detection and Tracking Using GPS and GSM."
[4] N. Srinivasa Gupta, M. Nandini, "Smart System for Rider Safety and Accident Detection," IJERT, Vol. 9, Issue 06, June 2020.
[5] C. Prabha, R. Sunitha, R. Anitha (2014), "Automatic Vehicle Accident Detection and Messaging System Using GSM and GPS Modem," International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering.
[6] "Road Vehicle Alert System Using IOT," 2017 25th International Conference on Systems Engineering (ICSEng).
[7] A. Cismas, I. Matei, V. Ciobanu and G. Casu, "Crash Detection Using IMU Sensors," 2017 21st International Conference on Control Systems and Computer Science (CSCS), Bucharest, 2017, pp. 672-676. doi: 10.1109/CSCS.2017.103.
[8] M. M. Islam, A. E. M. Ridwan, M. M. Mary, M. F. Siam, S. A. Mumu and S. Rana, "Design and Implementation of a Smart Bike Accident Detection System," 2020.
[9] Brian Lin, Dhruv Mathur, and Alex Tam, "Bike Crash Detection," 2019.
[10] Jussi Parviainen, Jussi Collin, Timo Pihlstrom, Jarmo Takala, "Automatic crash detection for motorcycles."
[11] S. Kailasam, Karthiga, Kartheeban, R. M. Priyadarshani, K. Anithadevi, "Accident Alert System using Face Recognition," IEEE, 2019.
[12] Rajvardhan Rishi, Sofiya Yede, Keshav Kunal, Nutan V. Bansode, "Automatic Messaging System for Vehicle Tracking and Accident Detection," Proceedings of the International Conference on Electronics and Sustainable Communication Systems (ICESC), 2020.
[13] Md. Syedul Amin, Mamun Bin Ibne Reaz, Salwa Sheikh Nasir and Mohammad Arif Sobhan Bhuiyan, "Low Cost GPS/IMU Integrated Accident Detection and Location System," 2017.
[14] Jussi Parviainen, Jussi Collin, Timo Pihlstrom, Jarmo Takala, "Automatic crash detection for motorcycles."
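The crash rule the paper describes (an impact above roughly 10 g, or a front/back limit-switch hit, triggers an SMS with the GPS coordinates) can be sketched as pure detection logic. This is an illustrative sketch only, not the authors' firmware; the function names, threshold constant, and message format are assumptions.

```python
import math

# Assumed threshold from the paper's expected results: >10 g counts as a
# crash; smaller magnitudes are treated as drops or hard controlled stops.
CRASH_THRESHOLD_G = 10.0

def total_acceleration(ax, ay, az):
    """Magnitude of the 3-axis accelerometer vector, in g."""
    return math.sqrt(ax**2 + ay**2 + az**2)

def is_crash(ax, ay, az, limit_switch_hit=False):
    """Flag a crash on a >10 g impact or a front/back limit-switch hit."""
    return limit_switch_hit or total_acceleration(ax, ay, az) > CRASH_THRESHOLD_G

def alert_message(lat, lon):
    """SMS body the GSM module would send to registered contacts."""
    return f"Crash detected! Location: https://maps.google.com/?q={lat},{lon}"

# A hard controlled stop (~1.9 g) vs. a genuine impact (~11.2 g).
print(is_crash(1.5, 0.5, 1.0))   # braking -> False
print(is_crash(8.0, 6.0, 5.0))   # impact  -> True
print(alert_message(12.9716, 77.5946))
```

On real hardware the same check would run in the Arduino loop over ADXL335 readings, with the GSM module sending the message.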
Dr Chinnaiyan R
Professor, Department of CSE
Presidency University
Bengaluru, India
chinnayaiyan@presidencyuniversity.in

Ranjith Kumar Tallam (20191CSE0624)
Department of CSE
Presidency University
Bengaluru, India
201910100941@presidencyuniversity.in

Allada Bhanu Sai Subba Rao (20191CSE0027)
Department of CSE
Presidency University
Bengaluru, India
201910101058@presidencyuniversity.in
Introduction:
The traditional process of verifying educational certificates is often lengthy, costly, and prone to fraud. Educational institutions, employers, and other third-party verification agencies have to rely on intermediaries to verify the authenticity of certificates, which can take a long time and result in errors or inconsistencies. Blockchain technology offers a solution to this problem by providing a secure and decentralized platform for verifying educational certificates. This paper presents an educational certificate verification system using blockchain technology, which aims to streamline the verification process and make it more reliable, efficient, and cost-effective. The proposed system utilizes the immutability and transparency of blockchain to provide a tamper-proof and direct verification process that offers a secure and reliable method of verifying educational certificates. The proposed system eliminates the need for intermediaries, such as universities or third-party verification agencies, and allows for a direct verification process that is transparent, efficient, and cost-effective. The system uses smart contracts to automate the verification process, ensuring that all data is accurate and tamper-proof. The use of blockchain technology provides an immutable and transparent record of all verified certificates, enhancing the integrity and trust of the verification process. This paper presents the proposed Educational Certificate Verification System Using Blockchain, highlighting its key features, benefits, and potential impact on the education sector. The proposed system has the potential to transform the way educational certificates are verified, providing a more secure, efficient, and reliable method for all stakeholders involved.

Literature Survey:
[1] Ahmed, S., Yaqoob, I., Hashem, I. A. T., Khan, I., & Ahmed, E. (2019). Blockchain Technology: A Survey on Applications and Challenges. Journal of Network and Computer Applications, 126, 50-70. https://doi.org/10.1016/j.jnca.2018.09.017
Ahmed et al. (2019) wrote a comprehensive review paper on the application of blockchain technology. The authors first introduced the concept of blockchain and its key characteristics, including decentralization, transparency, immutability, and security. They then discussed the history and evolution of blockchain technology, from its inception as the underlying technology of Bitcoin to its current applications in various industries. The paper also delved into the technical aspects of blockchain, including consensus mechanisms, smart contracts, and cryptographic techniques.
Summary: The article explored blockchain's technical underpinnings, such as consensus processes, smart contracts, and cryptography methods. The authors emphasised the benefits of implementing blockchain technology, including improved security, reduced costs, and increased productivity.

[2] Brinkmann, M., & Böhme, R. (2019). Blockchain-Based Certificate Verification with Privacy-Preserving Revocation Checking. Computers & Security, 83, 267-283. https://doi.org/10.1016/j.cose.2019.01.008
Brinkmann and Böhme (2019) proposed a blockchain-based certificate verification system that provides privacy-preserving revocation checking. The authors identified the limitations of traditional certificate revocation mechanisms, which rely on centralized authorities to maintain and distribute revocation lists. These mechanisms can be slow, inefficient, and susceptible to attacks. The proposed system utilizes blockchain technology to create a decentralized, tamper-proof ledger of certificate revocation data.
Summary: The authors noted the drawbacks of conventional certificate revocation procedures, which depend on centralised authorities to maintain and disseminate revocation lists. These systems may be unreliable, ineffective, and vulnerable to intrusions. The suggested method develops a decentralised, tamper-proof ledger of certificate revocation information using blockchain technology.

[3] Lee, J. H., & Kim, T. H. (2019). A Blockchain-Based Certificate Issuance and Verification System for University Diplomas. International Journal of Distributed Sensor Networks, 15(2), 1550147719834755. https://doi.org/10.1177/1550147719834755
Lee and Kim (2019) proposed a blockchain-based certificate issuance and verification system for university diplomas. The authors identified the limitations of traditional paper-based diploma systems, including the risk of counterfeit diplomas and the difficulty of verifying the authenticity of diplomas from different institutions. The proposed system utilizes blockchain technology to create a tamper-proof and decentralized ledger of diploma information. The system also employs smart contracts to automate the diploma issuance and verification process, and incorporates digital signature technology to ensure the authenticity of the diploma issuer. The authors implemented a prototype of the proposed system and conducted experiments to evaluate its performance and effectiveness. The results demonstrated that the system can efficiently issue and verify diplomas while ensuring the authenticity and security of the diploma information.
Summary: The authors mentioned the risk of fake credentials and the difficulty of determining the legitimacy of degrees from various universities as drawbacks of traditional paper-based diploma systems.

Existing Method:
The existing method for issuing education certificates using blockchain is through the use of digital badges. Digital badges are electronic representations of achievements, skills, and knowledge that can be shared online. They are often linked to a blockchain, which serves as a secure and tamper-proof ledger of the badges.

Disadvantages:
1. Technical Complexity: Blockchain technology can be complex to implement and maintain, and may require specialized technical expertise. This can make it difficult for some educational institutions to adopt the technology.
2. Initial Cost: Implementing a blockchain-based system for education certificates can be expensive. There may be significant upfront costs associated with developing the system, purchasing and maintaining hardware and software, and training personnel.

Proposed System:
The proposed system implements education certificates using blockchain technology. By following the steps outlined, institutions can create a secure and tamper-proof certificate system that provides learners with greater control over their educational credentials.
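The tamper-evidence property the survey keeps returning to (an edited certificate no longer matches its on-chain record) can be illustrated with a hash-based sketch. This is not the paper's implementation: the record fields are invented, and a plain Python set stands in for the blockchain ledger.

```python
import hashlib
import json

def certificate_hash(record: dict) -> str:
    """Deterministic SHA-256 digest of a certificate record."""
    canonical = json.dumps(record, sort_keys=True)  # stable field order
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Stand-in for the on-chain store of issued-certificate hashes.
ledger = set()

def issue(record: dict) -> str:
    h = certificate_hash(record)
    ledger.add(h)          # in a real system: a blockchain transaction
    return h

def verify(record: dict) -> bool:
    """A certificate verifies only if its hash matches an issued entry."""
    return certificate_hash(record) in ledger

cert = {"student": "A. Kumar", "degree": "B.Tech CSE", "year": 2023}
issue(cert)
print(verify(cert))                               # True
print(verify({**cert, "degree": "M.Tech CSE"}))   # tampered -> False
```

Because any change to the record changes its digest, a verifier needs only the hash, not the issuing institution, which is the "direct verification" the proposed system relies on.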
Block Diagram:

Figure 21: Block Diagram

METHODOLOGY:
Admin:
Login: First, the Admin logs into the system with a username and password.
Upload details: After login, the Admin uploads the details of the company.
Upload to blockchain: The details are uploaded to the blockchain.
Hash code: A hash code is then generated.
Digital Signature: The system shows the signature of the company.

Company:
Login: The company must log into the system.
Signup: After login, the company should sign up into the system.
Scan: The system then scans the signature of the company.
Upload certificates: Next, the company uploads the certificate details into the system.
Generate signature: The system then generates the signature of the company.
Successful: The system shows the result, i.e., successful or unsuccessful.

Advantages:
1. Increased security: Blockchain technology provides a secure and tamper-proof system for storing and sharing certificates. Each certificate is stored in a block that is cryptographically secured, ensuring that the certificate cannot be altered or duplicated without leaving a trace.
2. Improved trust: With a blockchain-based certificate system, learners can be sure that their credentials are authentic and trustworthy. Employers and other third parties can verify the authenticity of a certificate without relying on the issuing institution.
3. Greater accessibility: With digital certificates stored on the blockchain, learners can access their credentials from anywhere in the world using a computer or mobile device. This makes it easier for learners to share their credentials with employers or educational institutions.

Software And Hardware Requirements:
System Specifications:
H/W Specifications:
1. Processor : Intel i5
2. RAM : 8 GB (min)
3. Hard Disk : 128 GB
S/W Specifications:
• Operating System : Windows 10
• Front end : React JS
• Technology/backend : Python
• IDE : VS Code

Architecture:

Figure 30: Company details

In the above screen, the company logs in; after login, the screen below appears. In the above screen, the admin can view the list of registered companies, and can now log out and sign up a new company to perform verification.
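The Admin/Company flow in the methodology (upload details, compute a hash code, generate a digital signature, then scan and verify) can be sketched as follows. All names are hypothetical, and a shared-key HMAC stands in for the signature step; a production system would use asymmetric signatures rather than a secret shared with verifiers.

```python
import hashlib
import hmac

ADMIN_KEY = b"institution-secret"  # assumption: signing key held by the issuer

def upload_to_chain(details: str):
    """Admin side: hash the uploaded details and sign the hash code."""
    hash_code = hashlib.sha256(details.encode()).hexdigest()
    signature = hmac.new(ADMIN_KEY, hash_code.encode(), hashlib.sha256).hexdigest()
    return hash_code, signature

def scan_and_verify(details: str, signature: str) -> str:
    """Company side: recompute and compare, yielding the paper's
    'successful' / 'unsuccessful' result."""
    hash_code = hashlib.sha256(details.encode()).hexdigest()
    expected = hmac.new(ADMIN_KEY, hash_code.encode(), hashlib.sha256).hexdigest()
    return "successful" if hmac.compare_digest(expected, signature) else "unsuccessful"

h, sig = upload_to_chain("Acme Corp, reg no. 12345")
print(scan_and_verify("Acme Corp, reg no. 12345", sig))   # successful
print(scan_and_verify("Acme Corp, reg no. 99999", sig))   # unsuccessful
```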
REFERENCES
1. Gafurov, I., & Khusanov, R. (2021). Blockchain
technology for securing education certificates: A systematic
literature review. Sustainability, 13(5), 2451.
2. Sari, S., & Celik, E. (2020). An analysis of blockchain
Abstract - The rapid advancement of technology has transformed the way we travel and explore new cities. In today's digital era, having a comprehensive city information guide at your fingertips has become essential for tourists and locals alike. Introducing CityScape, a cutting-edge website and app that provides an all-in-one city information guide, offering a wealth of essential information about any city in India. CityScape is a user-friendly and intuitive platform that caters to the needs of travelers and locals alike, providing comprehensive and up-to-date information on all the must-visit places in the city. One of the standout features of CityScape is its transportation information: the platform provides information on bus and train services, including routes and schedules. This enables users to easily plan their commute and navigate the city using public transportation, saving time and money. In addition to attractions and transportation, CityScape also includes practical information such as local weather forecasts and emergency contact numbers, making it a one-stop shop for all essential city information. Users can also look up attractions, restaurants, and hotels, gaining valuable insights to help them make informed decisions.

City information guide websites have emerged as a valuable resource for providing comprehensive information about a city's various amenities, services, and attractions. These websites aim to serve as a one-stop platform for users to access information about places to visit, hotels, schools, police stations, transportation details, and more. The purpose of this research paper is to explore the development and implementation of a city information guide website, which can be a useful tool for residents and visitors alike.

II. CURRENT STATE OF CITY INFORMATION GUIDE
Currently, there are numerous city information guide websites and apps available, ranging from government-run platforms to third-party options. These platforms offer varying degrees of comprehensiveness, usability, and accuracy. Some platforms rely on user-generated content, while others source information from official databases. In recent years, there has been a trend towards incorporating more interactive features, such as augmented reality and virtual tours, to enhance the user experience. However, there is still room for improvement in terms of standardization, accessibility, and data quality.

A. Abbreviations and Acronyms
• CityScape – website name

A. Architecture
The architecture of a municipal information guide website consists of a number of parts that operate in concert to deliver a thorough and user-friendly platform. Several of the architecture's essential elements include the following:

B. Modules
1) Landing Page:
REFERENCES
3. Vaishnavi Desai; Isha Ghiria; Twinkle Bagdi; Sanjay Pawar — Krishi Bazaar
Description: Krishi Bazaar is a mobile web-app that enables farmers to sell their produce directly to consumers without intermediaries. It allows farmers to post their products and prices, and consumers can browse and purchase them on the web-app.
Advantages: Krishi Bazaar offers a transparent marketplace for farmers to showcase their products, receive fair compensation, connect with potential customers, and save costs by eliminating intermediaries.
Limitations: Krishi Bazaar's impact in promoting fair market access for farmers may be limited due to poor internet connectivity and a lack of access to technology and digital literacy skills in rural areas.

4. Aina Marie Joseph; Nurfauza Jali; Amelia Jati Robert Jupit; Suriati Khartini Jali — eMarket for Local Farmers
Description: The study used the Rapid Web-application Development (RAD) methodology for the development of the eMarket web-application, which provides a solution for local farmers to sell their crops at a proper price.
Advantages: The eMarket web-application provides a solution for local farmers to vend their fresh produce through a mobile web-application, and also helps customers to acquire fresh produce easily and conveniently.
Limitations: Limited technology access or skills may hinder some farmers from using the platform.

However, the web-app has some weaknesses that limit its effectiveness. Firstly, the web-app's reach is limited to farmers who have access to smartphones and the internet, which may exclude those in rural areas with poor internet connectivity. Secondly, some farmers may not be comfortable using technology, which could limit their participation in the platform. Lastly, the information and recommendations provided by the web-app may not be sufficient for farmers with limited knowledge and resources. Overall, the web-app has the potential to be an effective tool in addressing the challenges faced by farmers, but its limitations and weaknesses need to be addressed to ensure it can reach and benefit all farmers.
Fig 7. Relationship between poverty and income in different states

A. Existing system
In the existing system, farmers need to struggle a lot to sell vegetables and grains. A farmer needs to pay a brokerage amount to a broker to sell his own products. Farmers also need to keep all records manually, which may take huge storage, and most of the existing applications are not user-friendly.

B. Drawbacks
• Storing information requires huge storage
• Need to maintain quantity records
• Need to keep records for selling and purchasing agriculture products
• No accuracy in work
• Extra security is needed to protect the data

C. Advantages of proposed system
• Provides searching facilities based on various factors, such as different forms of products in different seasons
• Manages the information of seasons and vegetables
• Shows the information and description of the seasons, vegetables, and grains
• Adding and updating of records for proper management of buying and selling vegetables and grains
• Weather forecast
• Bidding system
• Crop-related information
• Multiple languages
• Membership facility

VI. Future Work
Based on our analysis and evaluation of Farmers E-portal, we have identified several areas where the web-app can be improved to better address the needs of farmers. Firstly, the web-app could benefit from a more user-friendly and intuitive interface. While the web-app offers a range of features and functions, navigating through them can be confusing and overwhelming for some users. Therefore, simplifying the user interface and making it more intuitive would greatly enhance the user experience. Secondly, the web-app could include more detailed and tailored information for farmers, such as crop-specific advice and localized weather forecasts. This could be achieved through partnerships with local agricultural experts and meteorologists to provide more accurate and relevant information.

VII. Conclusion
In conclusion, the Farmers E-portal web-app offers a promising solution to address the challenges faced by farmers in India. Through its features such as easy access to market information, a bidding system for selling produce, and an open forum for discussion, the web-app provides a comprehensive platform for farmers to make informed decisions and connect with other stakeholders in the agriculture industry. While there are areas for improvement, such as enhancing the user interface and addressing connectivity issues in rural areas, the web-app's overall effectiveness in addressing the needs of farmers is significant. With continued efforts to improve and expand its reach, the Farmers E-portal web-app has the potential to significantly improve the livelihoods of farmers in India and contribute to the growth of the agriculture sector.

ACKNOWLEDGMENT
We would like to express our sincere gratitude to the team behind the Farmers E-Portal web-app for their cooperation and support in providing us with the necessary information and access to the web-app, which allowed us to conduct a thorough analysis and evaluation of its effectiveness in addressing the needs of farmers in India.

REFERENCES
[1] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3rd ed., Massachusetts: Pearson Education, 2005.
[2] D. Carrington, CSSE3002 Course Notes, School of ITEE, University of Queensland, 2008.
[3] IEEE Recommended Practice for Software Requirements Specifications, IEEE Standard 830, 1998.
[4] The Quint, news on agriculture issues.
[5] Nethrapal, on how much farmers earn.
[6] Nutr, "Recipe Menu Dev", 2005.
[7] Bayou and Bennet, "Agriculture Farming System", 1992.
[8] Software Engineering of Airline Reservation Systems by Web Services.
[9] GHIRS: Integration of OOPS System by Web Services.
[10] V. Swapna, M. Fridouse Ali Khan, "Design and Implementations of Web Application," International Journal of Engineering Research & Technology.
Websites: www.google.com, www.w3schools.com, www.javatpoint.com, www.java2s.com
Voice Assistant for Disease Diagnosis Using Machine Learning and Natural Language Processing

Dr Swati Sharma
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
swati.sharma@presidencyuniversity.in

Smitha Reddy S (20191CCE0061)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100730@presidencyuniversity.in

Shilpa N (20191CCE0058)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100306@presidencyuniversity.in

Thanusha M (20191CCE0076)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100204@presidencyuniversity.in

Sowhardh C K (20191CCE0065)
Dept. of Computer Science Engineering
Presidency University
Bengaluru, India
201910100737@presidencyuniversity.in
In recent times, healthcare applications have been increasingly adopting machine learning and natural language processing techniques. Among these applications, voice assistants for disease

In this section, a detailed description of dataset collection, model development, disease prediction, and voice assistant creation is given. The initial step in constructing a machine learning model is to collect data. Datasets were obtained from Kaggle, a data science platform. After data collection, the data is processed and divided into training and testing datasets. The datasets were then trained and tested with machine learning algorithms such as SVM, Naïve Bayes, Decision Trees, and Random Forest (RF); when compared for accuracy, RF was selected. This model is then integrated with the voice assistant program.

The following are the steps involved in the creation of the Voice Assistant for Disease Diagnosis.

3.1. Data Collection.

3.2. Data Preprocessing. Once preprocessing is done, the data is ready for training and testing.

3.3. Disease Prediction Using Random Forest. The proposed system uses the Random Forest algorithm to predict acute diseases. The processed dataset is split into train and test data using the train_test_split function from the sklearn library. The split data (symptoms and diseases) is then fitted to the Random Forest model for training. Later, the model is tested on the test dataset. An illustration of a Random Forest consisting of 3 different decision trees is shown in Fig 1. Each decision tree was trained using a random subset of the training data.
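The split-and-train procedure of Section 3.3 can be sketched with scikit-learn, whose train_test_split and Random Forest the paper names. The symptom matrix below is a tiny synthetic stand-in for the Kaggle dataset, not the study's data.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in: each row is a binary symptom vector,
# each label a disease class (0/1 here for brevity).
X = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 1, 0, 1],
     [1, 0, 1, 1], [0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0]] * 5
y = [0, 0, 1, 1, 0, 1, 0, 1] * 5

# Split into train and test sets, as described in Section 3.3.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Each tree in the forest is trained on a random (bootstrap) subset
# of the training data, matching the Fig 1 illustration.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

In the proposed system, the trained model's prediction would then be handed to the voice-assistant component for spoken output.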
Fig 2: Architecture of the proposed Voice Assistant for disease diagnosis system.

4. EXPERIMENTAL RESULTS
The experimental results of the proposed Voice Assistant system indicate that it achieved the desired outcomes. The accuracy of the Random Forest

5. CONCLUSION

6. FUTURE SCOPE
Future work can involve expanding the dataset to include a wider range of diseases and symptoms, and incorporating other technologies such as image recognition. The study can further be incorporated with neural networks to achieve a better understanding of the user's input and generate more accurate disease diagnoses. The system only accepts the user's input in the English language; it can be extended to include other regional and international languages. Future systems could explore the potential use of the system in clinical settings, where it could aid healthcare professionals in making more informed diagnosis and treatment decisions.

[4] Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu, "NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality", arXiv preprint arXiv:2205.04421, 2022.
[8] Dong Jin Park, Min Woo Park, Homin Lee, Young-Jin Kim, Yeongsic Kim & Young Hoon Park, "Development of machine learning model for diagnostic disease prediction based on laboratory tests", Nature Portfolio, 2021.
10) Sodium (sod): Sodium aids in the transmission of nerve impulses, the contraction and relaxation of muscles, and the preservation of the ideal ratio of water and minerals in the body.

17) Diabetes Mellitus (dm): Diabetes mellitus is a collective term for a number of conditions that have an impact on how your body utilizes glucose (blood sugar).

18) Coronary Artery Disease (cad): The coronary arteries, which supply blood to the heart, develop plaque buildup, which results in coronary artery disease.

19) Appetite (appet): A hunger-driven desire to eat.

20) Pedal edema (pe): Pedal edema is characterized by an abnormal fluid buildup in the ankles, feet, and lower legs, which results in swelling in the feet and ankles.

21) Anemia (ane): Anemia is a condition in which your body doesn't produce enough healthy red blood cells to adequately oxygenate your tissues.

XIII. METHODOLOGY FLOW DIAGRAM

XIV. FLOW DIAGRAM DETAILS
Dataset: A dataset is an example of how machine learning can aid in prediction, with labels representing the outcome of a specific prediction.

Data Pre-processing: Today's real-world datasets, particularly clinical datasets, are prone to missing, noisy, redundant, and inconsistent data. Working with poor-quality data yields poor-quality results. As a result, the first step in any machine learning application is to explore and understand the dataset in order to prepare it for the modelling stage. This is commonly referred to as data pre-processing: the stage in which distorted or encoded data is transformed so that the machine can easily analyse it. A dataset may be regarded as a set of data objects, which are labelled by a number of features that capture the basic characteristics of an object, such as the mass of a physical object or the time at which it was created. Missing values in the dataset can be either eliminated or estimated; the most common way to deal with them is to fill them in with the mean, median, or mode value of the corresponding feature. Because object values cannot be used for analysis, we must convert numeric values with object type to float64 type. Null values in categorical attributes are replaced by the most frequently occurring value in that attribute column. Label encoding is used to convert categorical attributes into numeric attributes by associating each unique attribute value with an integer; this converts the attributes to the int type. The mean value is calculated for each column and used to replace all missing values in that column; we use the imputer function to find the mean value of each column. After the data has been replaced and encoded, it should be split for training, validation, and testing. Training the data is the process by which our algorithms are taught to build a model. Validation is the portion of the dataset that is used to validate or improve our various model fits. Data testing is used to put our model hypothesis to the test.

Feature Selection: The method of computationally selecting the features that contribute the most to our prediction variable or output is known as feature selection. We used Ant Colony Optimization (ACO) to select the best features from the dataset in this study. It is a method for solving computational
problems that can be reduced to finding good paths through graphs. Artificial Ants are multi-agent methods inspired by real ant behaviour; the pheromone-based communication of biological ants is frequently used as the primary paradigm. Combinations of Artificial Ants and local search algorithms have emerged as the preferred method for a wide range of optimisation tasks involving some form of graph. Rather than accumulating pheromone intensities, this algorithm evaluates them during each iteration. The proposed algorithm alters a small number of features in subsets chosen by selecting the best ants. To evaluate the performance of the subsets, a classification algorithm must be used as the wrapper evaluation function.

Classification: This study used four classification algorithms: support vector machine (SVM), k-nearest neighbours (KNN), decision tree, and random forest. All of the classification algorithms performed well; the random forest algorithm outperformed all the other algorithms used.

SVM (Support Vector Machine): SVM is a supervised learning model that is commonly used in classification problems. The SVM algorithm is designed to find the optimal hyperplane that best separates all objects of one class from those of another class with the greatest margin between the two classes. To achieve satisfactory computational efficiency, objects that are far from the boundary are discarded from the calculation, while data points that are close to the boundary are kept and designated as "support vectors". The kernel functions of the SVM algorithm are radial basis function (RBF), linear, sigmoid, and polynomial.

KNN (K-Nearest Neighbour): There are three types of variables: continuous, nominal, and binary. Nominal variables such as specific gravity, albumin, and sugar are therefore used; we use KNN classification to convert all nominal variables to binary, and k values are selected. K-Nearest Neighbour is a simple machine learning algorithm based on the supervised learning technique. The KNN algorithm assumes similarity between the new case and the available cases, and places the new case in the category most similar to the available categories. The K-NN algorithm stores all available data and uses similarity to classify new data points; this means that when new data appears, it can be easily classified into a well-suited category. The K-NN algorithm can be used for both regression and classification.

Decision Tree: Decision Tree is a supervised learning technique that can be used for both classification and regression problems, but it is most commonly used for classification. It is a tree-structured classifier in which internal nodes represent dataset features, branches represent decision rules, and each leaf node represents the result. A decision tree has two kinds of nodes: decision nodes and leaf nodes. Decision nodes have multiple branches and are used to make decisions, while leaf nodes are the outcomes of those decisions and do not have any additional branches. The decisions or tests are based on the characteristics of the given dataset. A decision tree is a graphical representation of every option for solving a problem or making a choice under specific circumstances. It is called a decision tree because, like a tree, it begins with the root node and branches out from there.
polynomial. The radial basis function was chosen for Classification and Regression Tree algorithm, or
this study based on the results of nested cross CART, is used to construct a tree. A decision tree
validation. merely poses a query and divides into subtrees in
accordance with the response (Yes/No).
KNN: Knn Classification employs the Euclidean Random Forest: Breiman's "decision tree" machine
distance. It computes the distance between the new learning mechanism is the foundation of the bagging
element and other known element classes. In this ensemble method known as random forest (RF). In a
paper, the Chronic Kidney Disease dataset from the random forest, decision trees are the "weak learners"
UCI database is used, which has 25 variables and 400 in an ensemble. Random forest forces the diversity of
each tree separately by choosing a random feature.
After producing lots of trees, they cast their votes for XVII OBJECTIVES
Early-stage CKD goes undetected, and patients only
the class that is the most prevalent. The runtimes for
become aware of the severity of the condition once it
the random forest algorithm are significantly reduced, has progressed.
and it can handle unbalanced data. The supervised Consequently, a major challenge today is to identify
learning technique includes the well-known machine such a disease at an earlier stage. This initiative
learning algorithm Random Forest. In machine promotes early diagnosis and awareness among the
public. Early diagnosis and effective treatments may
learning, it can be used to solve both classification be able to halt or slow the progression of this
and regression issues. To solve a challenging issue persistent illness. The main criterion for the success
and enhance the performance of the model, RF of this project is the use of machine learning to
combines multiple classifiers. The random forest uses recognize behaviors or patterns of behavior in the
early stages of CKD in order to improve the quality
the predictions from each decision tree to predict the of life of patients.
final result based on the majority vote of predictions
rather than relying solely on one tree. The accuracy XVIII. GRAPHS
and risk of overfitting increase with the size of the
forest's tree cover. Classification
Feature Selection
Pus cell clumps
Hypertension
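The majority-vote mechanism of the random forest described in the methods can be illustrated with a toy stand-in. This is not the paper's model: the "trees" below are deliberately simplistic one-feature threshold stumps, and binary 0/1 labels are assumed; only the feature-bagging and voting ideas are shown.

```python
from collections import Counter
import random

def majority_vote(predictions):
    """Final label = the class most trees voted for (the RF bagging vote)."""
    return Counter(predictions).most_common(1)[0][0]

class TinyStumpForest:
    """Toy stand-in for a random forest: each 'tree' is a one-feature
    threshold rule on a randomly chosen feature (illustrative only)."""

    def __init__(self, n_trees=25, seed=0):
        self.n_trees = n_trees
        self.rng = random.Random(seed)
        self.stumps = []

    def fit(self, X, y):
        n_features = len(X[0])
        for _ in range(self.n_trees):
            f = self.rng.randrange(n_features)      # random feature: forces diversity
            t = sum(row[f] for row in X) / len(X)   # mean value as split threshold
            above = [label for row, label in zip(X, y) if row[f] > t]
            label_above = majority_vote(above) if above else 0
            self.stumps.append((f, t, label_above))

    def predict(self, x):
        # Binary labels assumed: the "below threshold" side predicts 1 - label.
        votes = [la if x[f] > t else 1 - la for f, t, la in self.stumps]
        return majority_vote(votes)                 # majority vote, not one tree

# Tiny two-feature example (values are made up)
X = [[0.0, 0.1], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]
forest = TinyStumpForest()
forest.fit(X, y)
```

An odd number of trees avoids tied votes, which is one reason ensemble sizes are often chosen odd for binary problems.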
XIX. OUTCOMES
This investigation allows us to propose a model for predicting chronic renal failure. Early diagnosis and treatment of CKD can be performed inexpensively, reducing the burden of ESRD, improving outcomes for diabetes and cardiovascular disease (including hypertension), and significantly reducing patient morbidity and mortality. Our intention is to provide, in the simplest possible way, an effective system that helps both physicians and patients to predict chronic kidney disease at an early stage. To better predict chronic renal insufficiency, future research should address a variety of supervised and unsupervised machine learning strategies, as well as feature selection strategies with additional performance measurements. Physicians and radiologists can benefit from a computer-aided diagnostic system that helps them reach better diagnostic conclusions, and our method enables doctors to treat more patients in less time. Appropriate feature selection methods reduce the number of features required by the prediction algorithm and thus the number of medical tests required.

XX. CONCLUSION
As a result, we believe that the practical diagnosis of CKD could benefit from employing this method. It could be used to gauge a person's likelihood of developing CKD in the future, which would be incredibly helpful and economical. This model might be integrated with routine blood report generation: if a person is at risk, the system flags them automatically, and patients would not need to see a doctor unless the algorithms flagged them. For the modern and busy person, this would make screening more affordable and simple. Using machine learning techniques, we developed a novel method for detecting CKD. We evaluated a dataset of 400 patients, 250 of whom were in the early stages of CKD. There are some noisy and missing values in this dataset, so we require a classification algorithm that can handle missing and noisy values. Additionally, in actual medical diagnosis, this method may be applicable to the clinical data of other diseases. Furthermore, through cost analysis of all 24 attributes, we identify a cost-effective, highly accurate detection classifier that uses only 8 attributes: specific gravity, diabetes mellitus, hypertension, haemoglobin, albumin, appetite, red blood cell count, and pus cell. Importantly, the findings of this study introduce new factors that classifiers can use to detect CKD more accurately than the current state of the art. The model's generalization performance may nevertheless be limited, and the model is unable to determine the severity of CKD because there are only two categories of data samples in the set: ckd and notckd. To predict CKD at an early stage, this system offered the best prediction algorithm. The models are trained and validated using the input parameters obtained from the CKD patients in the dataset. To perform the CKD diagnosis, learning models for the K-Nearest Neighbours classifier, decision tree classifier, logistic regression, and artificial neural networks are created. In order to train the model to detect the severity of the disease and improve its generalization performance, a large amount of more complex and representative data will be collected in the future. We believe that as the data grow in size and quality, this model will get better and better. To improve the identification of CKD, more research and studies in this field are required; this will help doctors spot the disease earlier and give patients the chance to regain their renal function.

REFERENCES

[1] Reshma S., Salma Shaji, S. R. Ajina, Vishnu Priya S. R., Janisha A., "Predicting Chronic Kidney Disease by Machine Learning", International Journal of Engineering Research and Technology (IJERT), Vol. 9, Iss. 7, 2020, pp. 137-140.

[2] Chen, G.; Ding, C.; Li, Y.; Hu, X.; Li, X.; Ren, L.; Ding, X.; Tian, P.; Xue, W., "Prediction of Chronic Kidney Disease Using Adaptive Hybridized Deep Convolutional Neural Network on the Internet of Medical Things Platform", IEEE Access, Vol. 8, 2020, pp. 100497-100508.

[3] Marwa Almasoud, Tomas E. Ward, "Detection of Chronic Kidney Disease using Machine Learning Algorithms with Least Number of Predictors", International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 10, No. 8, 2019, pp. 89-96.

[4] J. Xiao et al., "Comparison and development of machine learning tools for chronic kidney disease progression prediction", Journal of Translational Medicine, Vol. 17, No. 1, 2019, p. 119.

[5] I. A. Pasadana, D. Hartama, M. Zarlis, A. S. Sianipar, A. Munandar, S. Baeha, A. R. M. Alam, "Chronic Kidney Disease Prediction by Using Different Decision Tree Techniques", Journal of Physics: Conference Series, Vol. 1255, 2019.

[6] Cheng, L. C.; Hu, Y. H.; Chiou, S. H., "Applying the Temporal Abstraction Technique to the Prediction of Chronic Kidney Disease Progression", Journal of Medical Systems, Vol. 41, 2019, p. 85.

[7] Pinar Yildirim, "Predicting Chronic Kidney Disease from Unbalanced Data by Multilayer Perceptron", IEEE, July 2017, doi:10.1109/COMPSAC.2017.

[8] H. D. Mehr, A. Cetin, H. Polat, "Diagnosis of chronic renal disease based on support vector machine using feature selection approaches", Journal of Medical Systems, Vol. 41, No. 4, 2017, p. 55.
robinrohit.vincent@presidencyuniversity.in, 201910101478@presidencyuniversity.in, 201910101669@presidencyuniversity.in, 20191010998@presidencyuniversity.in, 201910102008@presidencyuniversity.in
ABSTRACT.
This project enables real-time tracking of a vehicle, mainly a bike, and seeks to minimize deaths caused by delays in the arrival of aid by alerting the concerned people about a mishap involving the vehicle. According to a government survey, drowsiness and drunk driving account for 22 and 33 percent of accidents respectively in India. The number of lives lost can be reduced if assistance is procured at the earliest. To develop a system that can notify the concerned people about the mishap, a GPS module, a GSM module, and an accelerometer are interfaced with a NodeMCU, which acts as the controller. The accelerometer detects the accident from a change in the vehicle's orientation and sends the location through the GPS module to a registered SIM card via the GSM module, without any intervention by the driver or passengers. The information can also be relayed to a guardian through IoT. The proposed system aims to reduce road-accident deaths by more than nine percent.
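The orientation-change detection the abstract describes might look roughly like the following sketch. The tilt threshold, axis convention, and SMS format are assumptions for illustration, not details from the paper:

```python
import math

TILT_THRESHOLD_DEG = 60.0  # assumed threshold; would be tuned on the real bike

def tilt_angle(ax, ay, az):
    """Tilt from the vertical, computed from accelerometer axes (in g units)."""
    return math.degrees(math.atan2(math.sqrt(ax * ax + ay * ay), az))

def crash_detected(ax, ay, az):
    """Flag a probable fall when the bike leans past the threshold."""
    return tilt_angle(ax, ay, az) > TILT_THRESHOLD_DEG

def alert_sms(lat, lon):
    """Text the GSM module would send to the registered SIM (format assumed)."""
    return f"Accident detected! Location: https://maps.google.com/?q={lat},{lon}"
```

On the actual NodeMCU the same logic would run in firmware, with the accelerometer polled in a loop and the message handed to the GSM module.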
yamanu.sjce06@gmail.com, rockchiranjeevi07@gmail.com, deepakcharie82965@gmail.com, chithrachithra4382@gmail.com, bencymohan21514525@gmail.com, saiprasadnaga332@gmail.com
ABSTRACT.
This paper presents a server-based FPGA resource pooling approach for cloud computing using a software implementation with JDK 8 (64-bit), MySQL, Apache, and HeidiSQL technologies. The proposed approach enables multiple users to share FPGA devices, improving resource utilization and reducing costs. We propose a web-based interface that allows users to access FPGA resources and allocate them based on their needs. The underlying infrastructure is built using the Apache web server, a MySQL database, and JDK 8 (64-bit) with HeidiSQL for database management. Our approach includes a resource allocation algorithm that ensures efficient use of FPGA resources while providing fair access to all users. We demonstrate the feasibility of our approach through a proof-of-concept implementation and performance evaluation. Our results show that our approach can significantly improve resource utilization and reduce costs compared to dedicated FPGA devices. The server-based implementation also simplifies FPGA resource management, as the FPGA devices can be centrally managed and allocated to users as needed. Overall, our software-based FPGA resource pooling approach can help accelerate the development of FPGA-based applications in cloud computing environments, particularly for users who cannot afford dedicated FPGA devices.
KEYWORDS: FPGA (field-programmable gate array), Java-based simulation, resource pooling, cloud service.
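The abstract names a fair allocation algorithm without specifying it; a minimal first-come-first-served sketch of the pooling idea might look like this (board names, queueing policy, and method names are illustrative, not the paper's implementation):

```python
from collections import deque

class FpgaPool:
    """Minimal sketch of FPGA pooling: a fixed set of boards shared among
    users, handed out first-come-first-served with a waiting queue."""

    def __init__(self, board_ids):
        self.free = deque(board_ids)   # boards nobody holds
        self.owner = {}                # board_id -> current user
        self.waiting = deque()         # users queued for the next free board

    def request(self, user):
        """Give the user a free board, or queue them if none is available."""
        if self.free:
            board = self.free.popleft()
            self.owner[board] = user
            return board
        self.waiting.append(user)
        return None

    def release(self, board):
        """Return a board; hand it straight to the next queued user, if any."""
        del self.owner[board]
        if self.waiting:
            self.owner[board] = self.waiting.popleft()
        else:
            self.free.append(board)
```

In the server-based design described above, this bookkeeping would live behind the web interface, with the allocation state persisted in the MySQL database.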
Fetal health classification

Arun Kumar S, Umar Haseeb, Rahul Kumar, Gopal Krishna Birabar, Vinay Gupta

arunkumar.s@presidencyuniversity.in, 201910100133@presidencyuniversity.in, 201910102192@presidencyuniversity.in, 201910101575@presidencyuniversity.in, 201910101636@presidencyuniversity.in, 201910102178@presidencyuniversity.in
ABSTRACT.
Fetal health classification is an essential aspect of modern obstetrics, and prenatal care aims to
prevent adverse pregnancy outcomes. Currently, one of the most reliable methods to assess fetal
health is through ultrasound imaging. However, manual interpretation of ultrasound images by
medical professionals can be subjective, time-consuming, and prone to human error. Recently,
deep learning models, such as Convolutional Neural Networks (CNN), have shown promising
results in medical image recognition tasks. In this article, we will explore the potential application of CNN models in fetal health classification. CNN models are a type of artificial
neural network commonly used in image recognition tasks. These models have shown high
accuracy in image classification, segmentation, and object detection tasks. In medical imaging,
CNN models have been used in various applications, such as breast cancer detection, skin lesion
diagnosis, and lung disease detection. In fetal health classification, CNN models can be trained
to recognize patterns and features in ultrasound images that are indicative of fetal health status.
One approach to training CNN models for fetal health classification is to use a large dataset of
ultrasound images labeled with fetal health status. These labels can be binary (e.g., healthy vs.
unhealthy) or multi-class (e.g., healthy, mild, moderate, and severe health conditions). Once the
dataset is labeled, it can be split into training, validation, and testing sets. The training set is used
to train the CNN model to recognize patterns and features in the ultrasound images, while the
validation set is used to tune the hyperparameters of the model. Finally, the testing set is used to
evaluate the performance of the trained CNN model. In fetal health classification, the
performance of the CNN model can be evaluated using metrics such as accuracy, sensitivity,
specificity, and area under the receiver operating characteristic (ROC) curve. The accuracy of the
model indicates the percentage of correctly classified images, while the sensitivity and
specificity measure the proportion of true positives and true negatives, respectively. The area
under the ROC curve is a measure of the overall performance of the model, where a value of 1
indicates perfect classification and a value of 0.5 indicates random classification.
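The evaluation metrics defined above follow directly from the confusion matrix; a small self-contained sketch (the convention 1 = unhealthy, 0 = healthy is assumed here for illustration):

```python
def confusion_counts(y_true, y_pred):
    """Counts of true/false positives and negatives (1 = unhealthy, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),  # fraction correctly classified
        "sensitivity": tp / (tp + fn),        # true-positive rate
        "specificity": tn / (tn + fp),        # true-negative rate
    }
```

The ROC curve mentioned in the text is obtained by sweeping the model's decision threshold and plotting sensitivity against 1 - specificity at each setting; the area under it summarizes the whole sweep.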
soumya@presidencyuniversity.in, leonejacob2001@gmail.com, Poornimachowdary9562@gmail.com, kumarswamymp2002@gmail.com, harshithkurapati23@gmail.com, ajaykumarmagham7@gmail.com
ABSTRACT.
Facial recognition technology is an emerging field that has revolutionized the way we interact with machines. It has numerous applications, one of which is voting systems. In this paper, we present a face recognition voting system that utilizes facial recognition technology to ensure a more secure and dependable voting process. The proposed system consists of three main components: face detection, face recognition, and voting. The system operates by first detecting faces in a given image or video feed, followed by recognition of the detected faces using a trained machine learning model. Once a face is recognized, the system retrieves the corresponding voter ID and checks whether the voter is eligible to cast their vote. If the voter is eligible, the system allows them to cast their vote, and the vote is recorded. The system also ensures that voters cannot vote more than once by maintaining a record of voters who have already cast their votes. Our experiments show that the proposed system is accurate, effective, and can be a valuable tool for ensuring a fair and secure voting process.
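The double-voting guard described above amounts to keeping a record of voter IDs that have already voted; a minimal sketch of that bookkeeping (the status strings and identifiers are illustrative, and the face-recognition step that produces `voter_id` is assumed to happen upstream):

```python
class VotingSession:
    """Sketch of the one-person-one-vote guard: once a recognized voter ID
    has voted, further attempts by the same ID are rejected."""

    def __init__(self, eligible_ids):
        self.eligible = set(eligible_ids)
        self.voted = set()   # record of voters who have already cast a vote
        self.tally = {}

    def cast(self, voter_id, choice):
        if voter_id not in self.eligible:
            return "not eligible"
        if voter_id in self.voted:
            return "already voted"   # duplicate attempt blocked
        self.voted.add(voter_id)
        self.tally[choice] = self.tally.get(choice, 0) + 1
        return "recorded"
```

In a deployed system this state would live in persistent, audited storage rather than in memory, but the eligibility and duplicate checks are the same.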
riyaz@presidencyuniversity.in, 201910100226@presidencyuniversity.in, 201910101191@presidencyuniversity.in, 201910101438@presidencyuniversity.in, 201910100578@presidencyuniversity.in, 201910101279@presidencyuniversity.in
ABSTRACT.
This project, an IoT-based smart garbage management system, is a smart system that will help keep our villages and cities clean. In our cities, public dustbins are often overloaded, which creates unhygienic conditions for people and leaves the place with a bad smell. To avoid all these problems, we are going to implement an IoT-based smart garbage management system. The dustbins are interfaced with an Arduino-based system having an ultrasonic sensor, along with a central system showing the current status of the garbage on a display and a web server with a GSM/GPRS module. To increase cleanliness in the country, the government has started various projects