
Random Forest Algorithm
Gayathri Prasad S
Introduction
 A random forest is a supervised machine learning algorithm that is built from decision tree algorithms.
 It uses ensemble learning, a technique that combines many classifiers to solve complex problems.
 A random forest algorithm consists of many decision trees.
 The random forest algorithm establishes its outcome from the predictions of those decision trees.
 For regression it predicts by averaging the outputs of the individual trees; for classification it takes a majority vote.
 Increasing the number of trees generally improves the accuracy and stability of the outcome.
 A random forest mitigates the main limitations of a single decision tree: it reduces overfitting and improves precision.
Features of the Random Forest Algorithm
 It is typically more accurate than a single decision tree.
 It can produce reasonable predictions without hyper-parameter tuning.
 It addresses the overfitting problem of individual decision trees.
 In every tree of the forest, a random subset of features is considered at each node's splitting point.
 The main difference between the decision tree algorithm and the random forest algorithm is that in the latter, root nodes are established and nodes are split using randomly chosen samples and features.
 The random forest employs bagging (bootstrap aggregation) to generate the required prediction.
 Bagging means training on different random samples of the training data rather than a single sample. A training dataset comprises observations and features that are used for making predictions. Because each decision tree is fed a different sample, the trees produce different outputs; these outputs are then aggregated, by majority vote for classification or by averaging for regression, to give the final output, as sketched below.
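The bagging procedure can be sketched from scratch as follows; this is an illustrative sketch (dataset, tree count, and function names are assumptions, not the slides' own code):

```python
import numpy as np
from collections import Counter
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Bagging: each tree is trained on a bootstrap sample (rows drawn
# with replacement); max_features="sqrt" additionally gives every
# split a random subset of features to choose from.
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier(max_features="sqrt")
    trees.append(tree.fit(X[idx], y[idx]))

def forest_predict(x):
    # Majority vote over the individual trees' outputs.
    votes = [t.predict(x.reshape(1, -1))[0] for t in trees]
    return Counter(votes).most_common(1)[0][0]

print(forest_predict(X[0]), "vs true label", y[0])
```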
Classification in random forests
Classification in random forests employs an ensemble methodology to attain the outcome. The training data is used to train multiple decision trees. This dataset consists of observations and features that are selected randomly during the splitting of nodes. The leaf node reached in each tree is the final output produced by that specific decision tree, and the final output of the forest is selected by majority voting.
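One nuance worth noting: scikit-learn's RandomForestClassifier implements this aggregation as soft voting, averaging the per-tree class probabilities rather than counting hard votes; the predicted class is the one with the highest mean probability. A brief sketch:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# predict_proba averages the class-probability estimates of the trees;
# predict returns the class with the highest averaged probability.
print(forest.predict_proba(X[:1]))
print(forest.predict(X[:1]))
```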
Regression in random forests
 In random forest regression, each tree produces a specific prediction, and the mean of the individual trees' predictions is the output of the regression. This is in contrast to random forest classification, whose output is the mode (majority vote) of the classes predicted by the decision trees.
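A short sketch of this averaging step, assuming scikit-learn and a synthetic dataset: the forest's prediction is exactly the mean of its trees' predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=4, random_state=0)
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Every fitted tree lives in forest.estimators_; the forest's output
# is the mean of the individual trees' predictions.
per_tree = np.stack([t.predict(X[:5]) for t in forest.estimators_])
print(np.allclose(per_tree.mean(axis=0), forest.predict(X[:5])))  # True
```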
Random forest algorithms are not ideal in the following situations:
Extrapolation
Random forest regression is not suited to extrapolation: unlike linear regression, it cannot use existing observations to estimate values beyond the observed range. This is one reason most applications of random forest relate to classification.
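The extrapolation failure is easy to demonstrate on synthetic data (all names and parameters in this sketch are illustrative): a forest trained on x in [0, 10] cannot predict targets beyond the range it has seen, while linear regression can.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Train both models on x in [0, 10] with the relationship y = 2x.
X_train = np.linspace(0, 10, 200).reshape(-1, 1)
y_train = 2 * X_train.ravel()

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
linear = LinearRegression().fit(X_train, y_train)

# Query far outside the training range; the true value at x=20 is 40.
X_new = np.array([[20.0]])
print("forest:", forest.predict(X_new))  # stuck near 20, the largest target seen
print("linear:", linear.predict(X_new))  # extrapolates to ~40
```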
Sparse data
Random forest does not produce good results when the data is very sparse: sparse feature subsets yield unproductive splits, which degrades the outcome.
Advantages of random forest
 It can perform both regression and classification tasks.
 A random forest produces good predictions that can be understood easily.
 It can handle large datasets efficiently.
 It provides a higher level of accuracy in predicting outcomes than the decision tree algorithm.
Disadvantages of random forest
 It requires more computational resources than a single decision tree.
 Training and prediction take more time than with a decision tree algorithm.
Datasets
 https://drive.google.com/file/d/15pc24lVzokKXhPvjqjvgmMNqSc611EoL/view?usp=sharing
 https://drive.google.com/file/d/1ailAwduVTt08yG12MYIzq86-Etz4N9kM/view?usp=sharing
Thank You!!
