
B+ Tree Rules


B+ Tree

1. The B+-tree index structure is the most widely used of several index
structures that maintain their efficiency despite insertion and deletion of
data.
2. A B+ Tree combines features of ISAM (Indexed Sequential Access Method)
and B Trees. It contains index pages and data pages. The data pages always
appear as leaf nodes in the tree. The root node and intermediate nodes are
always index pages. These features are similar to ISAM. Unlike ISAM,
overflow pages are not used in B+ trees.
3. The index pages in a B+ tree are constructed through the process of inserting
and deleting records. Thus, B+ trees grow and contract like their B Tree
counterparts. The contents and the number of index pages reflect this
growth and shrinkage.
4. A B+ tree index takes the form of a balanced tree in which every path from
the root of the tree to a leaf of the tree is of the same length. The letter B
is commonly taken to stand for balanced, after this property. This balance
property ensures good performance for lookup, insertion and deletion.
5. Data records are only stored in the leaves.
6. Internal nodes store just key values.
7. Keys are used for directing a search to the proper leaf.
8. If a target key is less than a key in an internal node, then the pointer just to
its left side is followed.
9. If a target is greater than or equal to a key in an internal node, the pointer
just to its right side is followed.
10.B+ Trees and B Trees use a "fill factor" to control the growth and the
shrinkage. A 50% fill factor would be the minimum for any B+ or B tree. As
our example, we use the smallest page structure. This means that our B+ tree
conforms to the following guidelines.

11.A root node of n pointers must have at least 2 children and can have fewer
than int(n/2) keys.
12.If there are n number of pointers in a leaf node then the number of keys it
can hold is (n-1).
13.In B+ tree structure, a leaf node having n pointers must have at least
ceiling((n-1)/2) keys and it can have at most (n-1) keys.
a. If n = 3, then the minimum number of keys a leaf node must have is
ceiling((3-1)/2) = 1 and the maximum number of keys is (3-1) = 2.
b. If n = 4, then the minimum number of keys a leaf node must have is
ceiling((4-1)/2) = 2 and the maximum number of keys is (4-1) = 3.
c. If n = 5, then the minimum number of keys a leaf node must have is
ceiling((5-1)/2) = 2 and the maximum number of keys is (5-1) = 4.
d. If n = 6, then the minimum number of keys a leaf node must have is
ceiling((6-1)/2) = 3 and the maximum number of keys is (6-1) = 5.

14.In B+ tree structure, a non-leaf node other than the root with n pointers
must have between ceiling(n/2) and n children.
a. A non-leaf node other than the root with 3 pointers must have between
ceiling(3/2) and 3 children, i.e., it must have 2 to 3 children.
b. A non-leaf node other than the root with 4 pointers must have between
ceiling(4/2) and 4 children, i.e., it must have 2 to 4 children.
c. A non-leaf node other than the root with 5 pointers must have between
ceiling(5/2) and 5 children, i.e., it must have 3 to 5 children.
d. A non-leaf node other than the root with 6 pointers must have between
ceiling(6/2) and 6 children, i.e., it must have 3 to 6 children.
15.The B+ tree contains a relatively small number of levels.
a. Level below root has at least 2*int(n/2) values.
b. Next level has at least 2*int(n/2)*int(n/2) values and so on.
16.If there are K search key values in the file, the tree height is no more than
ceiling(log(K)), with the logarithm taken to base ceiling(n/2).
17.The ranges of values in each leaf do not overlap, except if there are duplicate
search-key values, in which case a value may be present in more than one
leaf. Specifically, if Li and Lj are leaf nodes and i < j, then every search-key
value in Li is less than or equal to every search-key value in Lj.
18.The non-leaf nodes of the B+ tree form a multilevel (sparse) index on the leaf
nodes. The structure of non-leaf nodes is the same as that for leaf nodes,
except that all pointers are pointers to tree nodes. A non-leaf node may hold
up to n pointers, and must hold at least ceiling(n/2) pointers. The number of
pointers in a node is called the fanout of the node. Non-leaf nodes are also
referred to as internal nodes.
19.We shall see that the B+-tree structure imposes performance overhead on
insertion and deletion, and adds space overhead. The overhead is acceptable
even for frequently modified files, since the cost of file reorganization is
avoided. Furthermore, since nodes may be as much as half empty (if they
have the minimum number of children), there is some wasted space. This
space overhead, too, is acceptable given the performance benefits of the
B+-tree structure.
20.A search can be done for a specific key value under B+ tree.
21.A search can be done for a range of values under B+ tree.
22.Insertion and deletion can be easily performed.
23.While processing a query in a B+ tree, we traverse a path in the tree
from the root to some leaf node. If the number of records in the file is N and
the number of pointers in a node is n, then the maximum number of nodes
to be accessed for a query like lookup is ceiling(log(N)), with the logarithm
taken to base ceiling(n/2).
24.In practice, only a few nodes need to be accessed. Typically, a node is made
to be the same size as a disk block, which is typically 4 kilobytes. With a
search-key size of 12 bytes, and a disk-pointer size of 8 bytes, n is around
200. Even with a more conservative estimate of 32 bytes for the search-key
size, n is around 100. With n = 100, if we have 1 million search-key values in
the file, a lookup requires only Ceiling(log50(1,000,000)) = 4 nodes to be
accessed. Thus, at most four blocks need to be read from disk for the lookup.
The root node of the tree is usually heavily accessed and is likely to be in the
buffer, so typically only three or fewer blocks need to be read from disk.
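
The occupancy limits and the lookup bound above can be checked with a few lines of Python. This is a minimal sketch, not part of the original notes: the function names are made up for illustration, and approx_fanout assumes that each entry costs one key plus one pointer and ignores any per-node header.

```python
import math

def leaf_key_bounds(n):
    """(min, max) number of keys in a leaf node with n pointers."""
    return math.ceil((n - 1) / 2), n - 1

def internal_child_bounds(n):
    """(min, max) number of children of a non-leaf node other than the root."""
    return math.ceil(n / 2), n

def approx_fanout(block_bytes, key_bytes, pointer_bytes):
    """Rough n for one disk block (assumption: no node header overhead)."""
    return block_bytes // (key_bytes + pointer_bytes)

def max_lookup_nodes(num_keys, n):
    """ceiling(log(num_keys)) to base ceiling(n/2), computed with integers."""
    base = math.ceil(n / 2)
    levels, reach = 0, 1
    while reach < num_keys:      # smallest h with base**h >= num_keys
        reach *= base
        levels += 1
    return levels

for n in (3, 4, 5, 6):
    print(n, leaf_key_bounds(n), internal_child_bounds(n))

print(approx_fanout(4096, 12, 8))        # 204, i.e. "around 200"
print(approx_fanout(4096, 32, 8))        # 102, i.e. "around 100"
print(max_lookup_nodes(1_000_000, 100))  # 4
```

The loop in max_lookup_nodes avoids floating-point logarithms, so the ceiling cannot be thrown off by rounding.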

B+ Tree Insertion Rules


1. Insertion and deletion are more complicated than lookup, since it may be
necessary to split a node that becomes too large as the result of an insertion,
or to coalesce nodes (that is, combine nodes) if a node becomes too small
(fewer than ceiling(n/2) pointers). Furthermore, when a node is split or a pair
of nodes is combined, we must ensure that balance is preserved. To
introduce the idea behind insertion and deletion in a B+-tree, we shall
assume temporarily that nodes never become too large or too small. Under
this assumption, insertion and deletion are performed as defined next.
2. When splitting a leaf, the lowest value in the right part gets inserted into the
parent, and the value also stays in the leaf.
3. When splitting an internal node, the lowest value in the right part gets inserted
into the parent and the value is removed from the right part.
4. We first find the leaf node in which the search-key value would appear. We
then insert an entry (that is, a search-key value and record pointer pair) in the
leaf node, positioning it such that the search keys are still in order.
5. The general technique for insertion into a B+-tree is to determine the leaf
node l into which insertion must occur. If a split results, insert the new node
into the parent of node l. If this insertion causes a split, proceed recursively
up the tree until either an insertion does not cause a split or a new root is
created.
6. While inserting values into a node, if the node is full then we need to follow
the two rules shown here.
a. Rule 1: If the node is a leaf node then break it into two partitions. The first
partition should hold int(n/2) key values and the second partition holds
the rest of the key values, where n is the number of pointers of the leaf.
After this is done, the smallest key value from the second partition should
be copied to the parent node.
b. Rule 2: If the node is a non-leaf node then break it into two partitions. The
first partition should have (ceiling(n/2) - 1) key values and the second
partition should have the rest of the key values, where n is the number
of pointers of the node. After this is done, the smallest key value from the
second partition should be moved up to the parent node.
Example 1:
Consider the key values 2, 5, 7, 10, 13, 16, 20, 22, 23, 24 to be inserted
into a B+ tree whose nodes have n = 4 pointers. With n = 4, the
maximum number of key values that a leaf node can have is 3.
Insert 2, 5, and 7; all three fit in a single leaf node: [2, 5, 7].

Now let us insert 10. The node overflows, as it already holds 3 keys, and
hence it must be split. It is treated as a leaf node, since it is the initial
node and not an index root. After inserting 10 using Rule 1, we get the
leaves [2, 5] and [7, 10]. There must be a parent node above these two
nodes, and hence 7 is copied into the new parent node.

Now insert 13 and 16. After inserting 13 into the right leaf node, it
is full, so inserting 16 requires a further split. The modified leaf nodes
are [7, 10] and [13, 16].

As a new leaf node appears, its least value, 13, should be copied to the
root, which now holds [7, 13].

After this we need to add 20 and 22. After adding the key value 20, the
leaf node is full, and adding the key value 22 requires a split into
[13, 16] and [20, 22].

As a new leaf node appears, its least value, 20, should be copied to the
root, which now holds [7, 13, 20].
Now let us add 23 and 24. After adding the key value 23, the last leaf
node becomes full and cannot accommodate 24 and hence a split is
needed as follows.

As a new leaf node appears, its least value, 23, should be copied to the
root, but the root is full and hence it must be partitioned.
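
Example 1 can be replayed with a small in-memory sketch. The class and method names (Node, BPlusTree, levels) are hypothetical and not taken from the notes. The sketch applies Rule 1 with int(n/2) keys kept in the left leaf and assumes that Rule 2 keeps ceiling(n/2) - 1 keys in the left non-leaf node, which is one key when n = 4. The final shape is what these rules produce, which may differ in layout from the original figures.

```python
from bisect import bisect_right, insort
import math

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children          # None marks a leaf node

    @property
    def is_leaf(self):
        return self.children is None

class BPlusTree:
    def __init__(self, n):
        self.n = n                        # pointers per node, at most n - 1 keys
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:             # the root split: create a new root
            sep, right = split
            self.root = Node([sep], [self.root, right])

    def _insert(self, node, key):
        if node.is_leaf:
            insort(node.keys, key)        # keep the leaf keys in order
            if len(node.keys) <= self.n - 1:
                return None
            keep = self.n // 2            # Rule 1: int(n/2) keys stay on the left
            right = Node(node.keys[keep:])
            node.keys = node.keys[:keep]
            return right.keys[0], right   # separator is copied up
        # Target >= key follows the right pointer, target < key the left one.
        i = bisect_right(node.keys, key)
        split = self._insert(node.children[i], key)
        if split is None:
            return None
        sep, right = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, right)
        if len(node.keys) <= self.n - 1:
            return None
        keep = math.ceil(self.n / 2) - 1  # Rule 2 (assumed form): keys kept left
        up = node.keys[keep]              # separator is moved up, not copied
        new_right = Node(node.keys[keep + 1:], node.children[keep + 1:])
        node.keys = node.keys[:keep]
        node.children = node.children[:keep + 1]
        return up, new_right

    def levels(self):
        """Keys of every node, level by level from the root down."""
        out, level = [], [self.root]
        while level:
            out.append([nd.keys for nd in level])
            level = [c for nd in level if not nd.is_leaf for c in nd.children]
        return out

tree = BPlusTree(4)
for k in [2, 5, 7, 10, 13, 16, 20, 22, 23, 24]:
    tree.insert(k)
for level in tree.levels():
    print(level)
# [[13]]
# [[7], [20, 23]]
# [[2, 5], [7, 10], [13, 16], [20, 22], [23, 24]]
```

After the last insertion the full root [7, 13, 20, 23] is partitioned into [7] and [20, 23], and 13 moves up into a new root, so every leaf stays at the same depth.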

Let us see the following example, before inserting Adams.

After inserting Adams


Before the insertion of Lamport

After inserting Lamport

B+ Tree Deletion Rules


1. The delete operation works the same way on leaves and internal nodes.
However, we need to choose the correct value for the middle separating key.
2. If the deletion is from a leaf node and the number of search-key values
left in it falls below the minimum of ceiling((n-1)/2), where n is the number
of pointers in the leaf node of the B+ tree, do one of the following.
a. Redistribute entries with a sibling, keeping the values in the right node
not less than those in the left node, and replace the separating value in the
parent with the smallest value of the right node.
b. Merge by moving all values and pointers to the left node and removing the
separating value from the parent.
Let us delete the key value 10 from the following B+ tree of n = 3.

After the deletion, the tree is as follows.


3. If the deletion is from a non-leaf node and the node is left with fewer than
ceiling(n/2) pointers, do one of the following.
a. Redistribute entries from the sibling through the parent, keeping the
values in the right node not less than those in the left node.
b. Merge by bringing down the separating value from the parent, moving all
values and pointers to the left node, and deleting the right node and its
pointer in the parent.
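
The leaf-level deletion rule can be sketched on a single parent with its leaf children. This is a simplified illustration under stated assumptions, not the notes' own procedure: the function name and the list-based layout are hypothetical, the sketch merges when the two siblings fit in one leaf and redistributes otherwise, and it does not propagate underflow from the parent further up the tree.

```python
import math

def delete_from_leaf(parent_keys, leaves, idx, key, n):
    """Delete key from leaves[idx] and repair an underflow in place.

    parent_keys[i] is the separating value between leaves[i] and leaves[i + 1].
    Raises ValueError if key is not in the leaf.
    """
    leaf = leaves[idx]
    leaf.remove(key)
    min_keys = math.ceil((n - 1) / 2)     # minimum keys in a leaf with n pointers
    if len(leaf) >= min_keys or len(leaves) == 1:
        return                            # no underflow, or nothing to borrow from
    # Pair the leaf with its right sibling, or its left one if it is the last leaf.
    sep = idx if idx + 1 < len(leaves) else idx - 1
    left, right = leaves[sep], leaves[sep + 1]
    if len(left) + len(right) <= n - 1:
        # Merge: move all values to the left node, drop the right node
        # and the separating value in the parent.
        left.extend(right)
        del leaves[sep + 1]
        del parent_keys[sep]
    else:
        # Redistribute: the right node ends up with at least as many values
        # as the left node, and its smallest value becomes the separator.
        both = left + right
        half = len(both) // 2
        left[:] = both[:half]
        right[:] = both[half:]
        parent_keys[sep] = right[0]

# Redistribution: the sibling has a spare value (n = 4, minimum 2 keys per leaf).
keys, leaves = [7, 13], [[2, 5], [7, 10], [13, 16, 20]]
delete_from_leaf(keys, leaves, 1, 10, 4)
print(keys, leaves)   # [7, 16] [[2, 5], [7, 13], [16, 20]]

# Merge: both siblings together fit in one leaf.
keys, leaves = [7, 13], [[2, 5], [7, 10], [13, 16]]
delete_from_leaf(keys, leaves, 1, 10, 4)
print(keys, leaves)   # [7] [[2, 5], [7, 13, 16]]
```

Trying redistribution first and merging only when the sibling has nothing to spare is an equally common ordering, and the notes do not say which one they intend.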
