Flood Prediction Using Supervised Machine Learning Algorithms
Abstract—The most frequent type of calamity is flooding, which happens when water overflows and submerges normally dry terrain. Flood prediction models, which are typically based on historic data and specified thresholds, are intended to forecast when the water level will exceed a predetermined threshold. Rather than relying on complex numerical formulations of actual flood cycles, Machine Learning (ML) methods generate predictions about future events that can be considerably more accurate than predictions made by humans. Among the many ML techniques, classification is widely used. This paper implements several supervised learning algorithms: Logistic Regression, Random Forest, XGB classifier, ExtraTree classifier, LGBM classifier, and CatBoost classifier. The algorithms are analyzed on the basis of their performance, and the most appropriate model for flood prediction is identified. The selected model can be used by both the government and the general public to predict floods in advance.

Keywords—Flood Prediction, Machine Learning Algorithms

I. INTRODUCTION

Floods are the most common type of disaster, causing loss of human life and damage to infrastructure, property, agriculture, and livestock. The process of forecasting and estimating the occurrence, intensity, and potential impact of flooding events in a particular area is known as flood forecasting. The aim of flood prediction is to provide prior information to communities, authorities, and individuals, allowing them to take preventive measures and mitigate the potential damage caused by floods. Predicting floods calls for a model that can analyze complex datasets, identify patterns, and improve prediction accuracy, and machine learning offers a wide range of approaches for this task. Machine learning methods predict floods by finding patterns in a set of data [1].

Machine learning approaches fall into three categories: supervised, unsupervised, and reinforcement learning [2]. In supervised learning, a dataset is given as input to the algorithm, which is optimized so that its predictions match a set of specified outputs [3]. Unsupervised learning algorithms analyze and cluster unlabeled datasets, and reinforcement learning algorithms train themselves through reward and punishment mechanisms. Supervised learning algorithms include Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, Naive Bayes, and so on [4]. On the basis of the above studies, supervised learning methods show the best results for prediction, which motivates their use in this article.

Based on past precipitation data, the better of two methodologies is selected for prediction [5]. A dependable deep learning and ML model has been proposed for a real-time flood detection system; it uses convolutional neural networks, random forests, and naive Bayes to identify water levels and evaluate floods that may have humanitarian implications before they happen [6]. The work in [7] outlines many advantages of machine learning techniques for flood forecasting and develops an intuitive web interface system using an Intelligent Hydro Informatics Integration Platform to enhance online forecasting and flood risk management. This is achieved by combining machine learning, visualization, and system development methodologies. The most promising short- and long-term flood prediction methods are presented in [8], which also examines significant advances in improving the quality of flood prediction models. The researchers found that the most effective strategies for enhancing ML techniques were data decomposition, hybridization, ensemble modeling, and optimization. In [9], a flood modeling framework for simulations was implemented. By combining a hybrid hydraulic model with learning algorithms, the framework offers a novel, quick, effective, and extensible method for determining flood levels; two machine learning models were used. The authors of [10] presented Random Forest (RF) as a competitive alternative to Support Vector Machine (SVM) that often outperforms SVM in flood prediction models. An expert prediction model was developed in [11] using the self-adaptive evolutionary extreme learning machine (SaE-ELM) and a non-tuned machine learning method. Water level prediction is the main use of the SaE-ELM. Creating such models for water level prediction and monitoring is a crucial optimization issue in water resources management and flood prediction. The study in [12] examined the effectiveness of RF, artificial neural network (ANN), and SVM in general flood applications and found that RF provided the best results. In [13], the support vector machine was used to forecast floods. The authors pointed out, though, that even though their
proposed model performs better than the benchmark models in the absence of cutting-edge flood monitoring technology, it still requires a lot of work. In [14], a prediction model was developed using rainfall data to forecast the frequency of floods brought on by precipitation. Based on the range of rainfall in specific locations, the model predicts whether a "flood" will happen. The forecast model runs on rainfall information from districts in India. Several methods were trained on the dataset, including multilayer perceptron (MLP), support vector machine, K-nearest neighbor, and linear regression. The MLP algorithm achieved a precision of 97.30%. Based mostly on upstream stage observations, ML models may forecast flood stages at a major gauge station [15]. The case study for this analysis is the lower Parma River in Italy, and a 9-hour forecast horizon was used. Three machine learning algorithms, including support vector regression (SVR) and multi-layer perceptron (MLP), were compared, also with respect to processing speed.

The objective of this article is to improve the performance of flood prediction using supervised learning models. Its contributions include (i) implementing various supervised machine learning models and (ii) comparing the performance of Logistic Regression with Random Forest, ExtraTree classifier, LGBM classifier, and CatBoost classifier.

II. METHODOLOGY

combining numerous weak learners, where each subsequent learner corrects the errors made by its predecessor.

Gradient Descent: Rather than training each new model independently, XGBoost builds trees sequentially, with each tree chosen to reduce a predefined loss function. It uses gradient descent optimization techniques to minimize the loss and improve the overall model performance.

Regularization: XGBoost reduces overfitting through regularization. The objective function of the XGBoost classifier includes L1 (Lasso) and L2 (Ridge) regularization terms, which penalize complex models and promote simplicity.

Handling Missing Values: Preprocessing is not always necessary because XGBoost has built-in support for missing values: it learns how to treat them during training.

D. LGBM Classifier:

The LGBM (Light Gradient Boosting Machine) Classifier is a supervised learning algorithm known for its speed, efficiency, and high performance, especially with large datasets. It belongs to the family of gradient-boosting algorithms, like XGBoost, but with specific optimizations that enhance its speed and reduce memory usage.
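
The boosting idea described in this section (weak learners added one after another, each fitted to the errors left by the ensemble so far, with an L2 penalty on the leaf values) can be illustrated with a minimal pure-Python sketch. It is not XGBoost or LightGBM and does not reproduce their algorithms: it uses one-split regression stumps, squared-error loss on 0/1 labels, and no missing-value handling. The toy rainfall data, the names `fit_stump`, `fit_boosting`, and `predict_flood`, and all parameter values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Stump:
    """A one-split regression tree (the weak learner)."""
    threshold: float
    left_value: float
    right_value: float

    def predict(self, x):
        return self.left_value if x <= self.threshold else self.right_value


@dataclass
class BoostedModel:
    base: float
    learning_rate: float
    stumps: list = field(default_factory=list)

    def score(self, x):
        # Ensemble output: base value plus the scaled sum of all stumps.
        return self.base + self.learning_rate * sum(s.predict(x) for s in self.stumps)


def fit_stump(xs, residuals, l2):
    """Fit a stump to the residuals; assumes xs has at least two distinct values.

    Each leaf value is sum(residuals) / (count + l2), so a larger l2
    shrinks the leaf values towards zero (L2 regularization).
    """
    best, best_loss = None, float("inf")
    for threshold in sorted(set(xs))[:-1]:  # keeps both leaves non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv = sum(left) / (len(left) + l2)
        rv = sum(right) / (len(right) + l2)
        loss = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if loss < best_loss:
            best, best_loss = Stump(threshold, lv, rv), loss
    return best


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.3, l2=1.0):
    """Add stumps sequentially; returns the model and the loss after each round."""
    model = BoostedModel(base=sum(ys) / len(ys), learning_rate=learning_rate)
    preds = [model.base] * len(ys)
    losses = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - pred)^2 the negative gradient is the residual,
        # so fitting the next stump to the residuals is a gradient descent step.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals, l2)
        model.stumps.append(stump)
        preds = [p + learning_rate * stump.predict(x) for x, p in zip(xs, preds)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return model, losses


def predict_flood(model, rainfall_mm, cutoff=0.5):
    """Turn the ensemble score into a flood / no-flood decision."""
    return model.score(rainfall_mm) > cutoff


# Hypothetical toy data: rainfall in mm and whether a flood occurred (1) or not (0).
rainfall = [10, 20, 35, 50, 80, 120, 150, 200]
flooded = [0, 0, 0, 0, 1, 1, 1, 1]

model, losses = fit_boosting(rainfall, flooded)
print("loss after first and last round:", losses[0], losses[-1])
print("flood predicted at 30 mm:", predict_flood(model, 30))
print("flood predicted at 100 mm:", predict_flood(model, 100))
```

The training loss falls from round to round because every new stump is fitted to what the previous stumps still get wrong. Production libraries differ in important ways: they typically use a logistic loss for classification, second-order gradient information, and deeper trees.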