Oracle Join Algorithms
Note: You will see nested loops used more often under the FIRST_ROWS optimizer mode, because
that mode is built around showing results to the user as soon as they are fetched. A nested
loop does not need to cache any data before rows are returned to the user. A hash join does,
as explained below.
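The point about returning rows as they are fetched can be illustrated with a minimal Python sketch. This is only an illustration of the general nested loop idea, not Oracle's implementation; the function name `nested_loop_join` and the sample row layout are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def nested_loop_join(
    outer_rows: Iterable[Row], inner_rows: List[Row], key: str
) -> Iterator[Tuple[Row, Row]]:
    """Yield (outer, inner) pairs whose join key matches.

    Hypothetical helper: for every outer row, the inner rows are scanned
    again. A real database would normally probe an index on the inner
    table instead of rescanning it.
    """
    for outer in outer_rows:
        for inner in inner_rows:
            if outer[key] == inner[key]:
                # A match is handed back immediately; nothing is cached
                # or built up front before the first row is returned.
                yield (outer, inner)
```

Because the function is a generator, calling `next()` on it produces the first joined row after reading only as much input as that row needs, which is the behaviour FIRST_ROWS favours.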
Hash join
Hash joins are used when joining large tables. The optimizer uses the smaller of the two tables
to build a hash table in memory, then scans the larger table and compares the hash value of each
of its rows with this hash table to find the joined rows.
The hash join algorithm is divided into two parts: building the hash table from the smaller
table, and probing it with the rows of the larger table.
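The build-then-scan procedure described above can be sketched in Python as follows. This is a simplified in-memory illustration under the assumption that the smaller input fits in memory; the name `hash_join` and the row layout are hypothetical, and Oracle's actual implementation (partitioning, spilling to temp space) is far more involved.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def hash_join(
    small_rows: Iterable[Row], large_rows: Iterable[Row], key: str
) -> Iterator[Tuple[Row, Row]]:
    """Yield (small_row, large_row) pairs with equal join keys."""
    # Part 1 (build): hash every row of the smaller input on the join key.
    hash_table: Dict[object, List[Row]] = defaultdict(list)
    for row in small_rows:
        hash_table[row[key]].append(row)

    # Part 2 (probe): scan the larger input once and look each row up.
    for row in large_rows:
        for match in hash_table.get(row[key], ()):
            yield (match, row)
```

Note that no joined row can be produced until the whole build input has been read, which is why a hash join needs to cache data before returning anything.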
28 comments:
Sachin said...
I wanted to put some examples in the post itself, but missed that earlier.
Here they are:
Table created.
Table created.
Index created.
Gather stats on D as it is, and set artificial stats on E:
SQL> exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'E', numrows => 100, numblks => 100, avgrlen => 124);
Execution Plan
----------------------------------------------------------
Plan hash value: 3204653704
-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |   100 |  2200 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | E        |    25 |   225 |     1   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |          |   100 |  2200 |     6   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | D        |     4 |    52 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | E_DEPTNO |    33 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
B) Let us set some more artificial stats to see which plan gets used:
SQL> exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'E', numrows => 1000000, numblks => 10000, avgrlen => 124);
SQL> exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'D', numrows => 1000000, numblks => 10000, avgrlen => 124);
Now both the E and D tables have 1000000 rows, and the index on E(DEPTNO) reflects the same.
The plan changes!
Execution Plan
----------------------------------------------------------
Plan hash value: 51064926
-----------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   250G|  5122G|       |  3968K(100)| 13:13:45 |
|*  1 |  HASH JOIN          |      |   250G|  5122G|    20M|  3968K(100)| 13:13:45 |
|   2 |   TABLE ACCESS FULL | E    |  1000K|  8789K|       |  2246   (3)| 00:00:27 |
|   3 |   TABLE ACCESS FULL | D    |  1000K|    12M|       |  2227   (2)| 00:00:27 |
-----------------------------------------------------------------------------------
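The estimated row counts in the two plans so far look consistent with the textbook equi-join estimate, rows(E) * rows(D) / max(NDV), where NDV is the number of distinct join-key values. The small check below is only a sketch: the function name `estimated_join_rows` is hypothetical, and the NDV of 4 for DEPTNO is an assumption (D originally held 4 rows, and set_table_stats changes only table-level figures), not something the plans state.

```python
def estimated_join_rows(rows_a: int, rows_b: int, ndv_a: int, ndv_b: int) -> int:
    """Textbook equi-join cardinality: rows_a * rows_b / max(ndv_a, ndv_b).

    Nulls and histograms are ignored; this is not Oracle's full formula.
    """
    return rows_a * rows_b // max(ndv_a, ndv_b)


# Assumption: DEPTNO has 4 distinct values on both sides of the join.
ASSUMED_NDV = 4

# First plan: E set to 100 rows, D at its real 4 rows.
nested_loop_estimate = estimated_join_rows(100, 4, ASSUMED_NDV, ASSUMED_NDV)

# Second plan: both tables set to 1000000 rows.
hash_join_estimate = estimated_join_rows(10**6, 10**6, ASSUMED_NDV, ASSUMED_NDV)
```

Under that assumption the first value is 100 and the second is 250,000,000,000, matching the 100 and 250G shown in the Rows column of the two plans.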
C) Now, to test MERGE JOIN, we set a moderate number of rows and add some ordering.
SQL> exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'E', numrows => 10000, numblks => 1000, avgrlen => 124);
SQL> exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'D', numrows => 1000, numblks => 100, avgrlen => 124);
Execution Plan
----------------------------------------------------------
Plan hash value: 915894881
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |  2500K|    52M|   167  (26)| 00:00:02 |
|   1 |  MERGE JOIN                   |          |  2500K|    52M|   167  (26)| 00:00:02 |
|   2 |   TABLE ACCESS BY INDEX ROWID | E        | 10000 | 90000 |   102   (1)| 00:00:02 |
|   3 |    INDEX FULL SCAN            | E_DEPTNO | 10000 |       |   100   (0)| 00:00:02 |
|*  4 |   SORT JOIN                   |          |  1000 | 13000 |    25   (4)| 00:00:01 |
|   5 |    TABLE ACCESS FULL          | D        |  1000 | 13000 |    24   (0)| 00:00:01 |
------------------------------------------------------------------------------------------
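In this plan E is read in DEPTNO order through the index (INDEX FULL SCAN), so only D needs a SORT JOIN step before the two inputs are merged. A minimal Python sketch of that shape follows; the name `merge_join` and the row layout are assumptions for illustration, and this is not how Oracle implements the operation internally.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, int]


def merge_join(
    presorted_left: List[Row], right_rows: Iterable[Row], key: str
) -> Iterator[Tuple[Row, Row]]:
    """Merge a left input already ordered on `key` with a right input
    that is sorted here first (the SORT JOIN step of the plan)."""
    right = sorted(right_rows, key=lambda r: r[key])
    i = j = 0
    while i < len(presorted_left) and j < len(right):
        left_key = presorted_left[i][key]
        right_key = right[j][key]
        if left_key < right_key:
            i += 1  # left side is behind, advance it
        elif left_key > right_key:
            j += 1  # right side is behind, advance it
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # Pair every left row in the matching run with that right run.
            while i < len(presorted_left) and presorted_left[i][key] == left_key:
                for right_row in right[j:j_end]:
                    yield (presorted_left[i], right_row)
                i += 1
            j = j_end
```

Each input is walked once after sorting, which is why the plan can skip the sort on E when an ordered index scan is available.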
Hope these examples help in learning ...