ID4 Algorithm - Incremental Decision Tree Learning

This document discusses the drawbacks of batch decision tree learning and how incremental decision tree learning (the ID4 algorithm) addresses them. ID4 allows the decision tree to be updated incrementally as new examples are received, without rebuilding the entire tree from scratch each time. When a new example is received, ID4 either adds it to an existing terminal node, splits a terminal node into a decision node, or replaces a sub-tree if the optimal attribute choice changes.

Drawbacks of Decision Tree Learning

• In ID3, we looked at learning decision trees in a single batch process.
• A complete set of examples is provided, and the algorithm returns a complete decision tree ready for use.
• This is fine for offline learning, where a large number of observation–action examples can be provided in one go.
• The learning algorithm can spend a short time processing the example set to generate a decision tree.
• When used online, however, new examples are generated while the game is running, and the decision tree should change over time to accommodate them.
• With a small number of examples, only broad-brush distinctions can be made, and the tree will typically be quite flat.
• With hundreds or thousands of examples, subtle interactions between attributes and actions can be detected by the algorithm, and the tree is likely to be more complex.
• The simplest way to support this scaling is to re-run the learning algorithm each time a new example is provided (a sketch follows below).
• This guarantees that the decision tree will be the best possible at each moment.
• Unfortunately, we have seen that decision tree learning is a moderately inefficient process; with large databases of examples, rebuilding can prove very time consuming.
• The simplest incremental alternative (described in the next section) avoids the rebuild, but it always adds further examples at the bottom of the tree and can generate huge trees with many sequential branches.
• Ideally, we would like to create trees that are as flat as possible, where the action to carry out can be determined as quickly as possible.
Incremental Decision Tree Learning
• Incremental algorithms update the decision tree based on the new information, without requiring the whole tree to be rebuilt.
• The simplest approach (sketched in code below):
  – Take the new example and use its observations to walk through the decision tree.
  – When we reach a terminal node of the tree, compare the action there with the action in our example.
  – If they match, then no update is required, and the new example can simply be added to the example set at that node.
  – If the actions do not match, then the node is converted into a decision node using SPLIT_NODE in the normal way.
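Here is a sketch of this simplest scheme, assuming minimal Action and Decision node classes (which the later sketches reuse). The function `split_node` is a hypothetical stand-in for the SPLIT_NODE routine named above, not a real implementation.

```python
# Minimal node classes used by the sketches in this document.

class Action:
    """Terminal node: an action plus the examples that support it."""
    def __init__(self, action, examples):
        self.action = action
        self.examples = examples

class Decision:
    """Internal node: tests one attribute, with one daughter per value."""
    def __init__(self, attribute, daughters, examples):
        self.attribute = attribute
        self.daughters = daughters   # attribute value -> child node
        self.examples = examples

def simple_update(tree, example):
    # Walk down the tree using the example's observations.
    node, parent, value = tree, None, None
    while isinstance(node, Decision):
        parent, value = node, example[node.attribute]
        node = node.daughters[value]

    node.examples.append(example)
    if node.action != example["action"]:
        # Actions disagree: convert the terminal node into a decision
        # node (the SPLIT_NODE step; `split_node` is hypothetical here).
        new_node = split_node(node.examples)
        if parent is None:
            return new_node
        parent.daughters[value] = new_node
    return tree
```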
ID4 ALGORITHM

• In ID4, we are effectively combining the decision tree with the decision tree learning algorithm.
• To support incremental learning, we can ask any node in the tree to update itself given a new example.
• When asked to update itself, one of three things can happen:
1. If the node is a terminal node (i.e., it represents an action), and the added example shares the same action, then the example is added to the list of examples for that node.
2. If the node is a terminal node, but the example's action does not match, then we make the node into a decision node and use the ID3 algorithm to determine the best split to make.
3. If the node is not a terminal node, then it is already a decision node. We add the new example to the current list and determine the best attribute to make the decision on, using the information gain metric we saw in ID3.
  – If the attribute returned is the same as the current attribute for the decision (and it will be most of the time), then we determine which of the daughter nodes the new example maps to, and we update that daughter node with the new example.
  – If the attribute returned is different, then the new example makes a different decision optimal. If we changed the decision at this point, all of the tree further down the current branch would be invalid. So we delete the whole tree from the current decision down and perform the basic ID3 algorithm using the current decision's examples plus the new one.
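To make the three cases concrete, the following sketch puts the update step into code, reusing the Action and Decision classes from the earlier sketch. The helper names (`entropy`, `best_attribute`, `id3_build`, `id4_update`) are our own, following the description above rather than the book's actual pseudo-code.

```python
import math
from collections import Counter, defaultdict

def entropy(examples):
    # Shannon entropy of the action labels.
    counts = Counter(e["action"] for e in examples)
    n = len(examples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_attribute(examples, attributes):
    # The attribute with the highest information gain, as in ID3.
    def gain(attr):
        groups = defaultdict(list)
        for e in examples:
            groups[e[attr]].append(e)
        remainder = sum(len(g) / len(examples) * entropy(g)
                        for g in groups.values())
        return entropy(examples) - remainder
    return max(attributes, key=gain)

def id3_build(examples, attributes):
    # Basic batch ID3: recursively split on the best attribute.
    counts = Counter(e["action"] for e in examples)
    if len(counts) == 1 or not attributes:
        # Pure node (or nothing left to split on): make an action node,
        # taking the majority action if the node is impure.
        return Action(counts.most_common(1)[0][0], examples)
    attr = best_attribute(examples, attributes)
    groups = defaultdict(list)
    for e in examples:
        groups[e[attr]].append(e)
    rest = [a for a in attributes if a != attr]
    daughters = {v: id3_build(g, rest) for v, g in groups.items()}
    return Decision(attr, daughters, examples)

def id4_update(node, example, attributes):
    # Ask a node to absorb one example; returns the (possibly new) node.
    if isinstance(node, Action):
        if node.action == example["action"]:
            # Case 1: terminal node, matching action -- store the example.
            node.examples.append(example)
            return node
        # Case 2: terminal node, mismatching action -- turn it into a
        # decision node using basic ID3.
        return id3_build(node.examples + [example], attributes)

    # Case 3: already a decision node. Add the example, then re-check
    # which attribute now has the best information gain.
    node.examples.append(example)
    best = best_attribute(node.examples, attributes)
    if best == node.attribute:
        # Optimal attribute unchanged: update the daughter this example
        # maps to.
        rest = [a for a in attributes if a != best]
        value = example[best]
        if value not in node.daughters:
            node.daughters[value] = Action(example["action"], [example])
        else:
            node.daughters[value] = id4_update(node.daughters[value],
                                               example, rest)
        return node

    # Optimal attribute changed: the sub-tree below is now invalid, so
    # delete it and rebuild with ID3 from this node's examples.
    return id3_build(node.examples, attributes)
```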
Figure: The example tree in ID4 format
Walk Through
• It is difficult to visualize how ID4 works from the algorithm description alone,
so let’s work through an example.
• We have seven examples. The first five are similar to those used before:
– Healthy, Exposed, Empty → Run
– Healthy, In Cover, With Ammo → Attack
– Hurt, In Cover, With Ammo → Attack
– Healthy, In Cover, Empty → Defend
– Hurt, In Cover, Empty → Defend
• We use these to create our initial decision tree (before applying ID4). The resulting decision tree is shown in the figure.
• We now add two new examples, one at a time, using ID4:
  – Example 1: Hurt, Exposed, With Ammo → Defend
  – Example 2: Healthy, Exposed, With Ammo → Run
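To follow along with the sketches above, the seven examples can be encoded as attribute dictionaries. The key names (`health`, `cover`, `ammo`, `action`) are our own illustrative choices; the values come straight from the lists above.

```python
# The walk-through data: five initial examples, then two new ones.

FIRST_FIVE = [
    {"health": "healthy", "cover": "exposed",  "ammo": "empty",     "action": "run"},
    {"health": "healthy", "cover": "in cover", "ammo": "with ammo", "action": "attack"},
    {"health": "hurt",    "cover": "in cover", "ammo": "with ammo", "action": "attack"},
    {"health": "healthy", "cover": "in cover", "ammo": "empty",     "action": "defend"},
    {"health": "hurt",    "cover": "in cover", "ammo": "empty",     "action": "defend"},
]

NEW_EXAMPLES = [
    {"health": "hurt",    "cover": "exposed", "ammo": "with ammo", "action": "defend"},
    {"health": "healthy", "cover": "exposed", "ammo": "with ammo", "action": "run"},
]
```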
The first example enters at the first decision node. ID4 uses the new example, along with the five existing examples, to determine that ammo is still the best attribute to use for the decision.
This matches the current decision, so the example is sent to the appropriate daughter node. Currently, that daughter node is an action: attack. The action doesn't match, so we need to create a new decision node here.
Using the basic ID3 algorithm, we decide to make the decision based on cover. Each daughter of this new decision now contains examples with a single action and is therefore an action node. The current decision tree is then as shown in the figure.
• Now we add our second example (Healthy, Exposed, With Ammo → Run), again entering at the root node. This time, ID4 determines (based on information gain) that ammo is no longer the best attribute; cover is the best attribute to use for this decision.
• So we throw away the sub-tree from this point down (which is the whole tree, since we're at the first decision) and run the ID3 algorithm with all seven examples. ID3 runs in the normal way and leaves the tree complete, as shown in the figure.
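Tying the sketches together, the whole walk-through could be replayed as follows (all names are from the hypothetical sketches above, not from the original text):

```python
ATTRIBUTES = ["health", "cover", "ammo"]

# Initial tree from the first five examples (the "before ID4" tree).
tree = id3_build(FIRST_FIVE, ATTRIBUTES)

# Feed in the two new examples incrementally. The first one splits the
# "with ammo" action node on cover; the second changes the optimal root
# attribute to cover and triggers a full ID3 rebuild.
for example in NEW_EXAMPLES:
    tree = id4_update(tree, example, ATTRIBUTES)
```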
