Mod 7 - Query Optimization

Uploaded by

Sanjay Kumar
Database Management System

Query Processing and Optimization

Dr. Balasundaram A

VIT-Chennai
SWE1004 Syllabus

Module-7

Heuristic query optimization

Dr. Balasundaram A (VIT-Chennai) Database Management System 2 / 28


Text Books and References

Text Books
R. Elmasri & S. B. Navathe, Fundamentals of Database Systems, Addison Wesley, 7th
Edition, 2015.
Raghu Ramakrishnan, Database Management Systems, McGraw-Hill, 4th Edition, 2015.

References
A. Silberschatz, H. F. Korth & S. Sudarshan, Database System Concepts, McGraw Hill,
6th Edition, 2010.
Thomas Connolly, Carolyn Begg, Database Systems: A Practical Approach to Design,
Implementation and Management, 6th Edition, 2012.
Pramod J. Sadalage and Martin Fowler, NoSQL Distilled: A Brief Guide to the Emerging
World of Polyglot Persistence, Addison Wesley, 2012.
Shashank Tiwari, Professional NoSQL, Wiley, 2011.



Introduction to Query Processing
A query expressed in a high-level language such as SQL must first be scanned,
parsed, and validated.
The scanner identifies the query tokens (SQL keywords, attribute names, and
relation names).
The parser checks the query syntax to determine whether it is formulated
according to the syntax rules (rules of grammar) of the query language.
The query is then validated by checking that all attribute and relation names
are valid and semantically meaningful names in the schema.
An internal representation of the query is then created, usually as a query
tree or query graph.
The DBMS must then devise an execution strategy (query plan) for retrieving
the results of the query from the database.
A query typically has many possible execution strategies, and the process of
choosing a suitable one for processing a query is known as query optimization.
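
The scanning and validation steps can be illustrated with a minimal sketch. It is not how a real DBMS front end is built: the SCHEMA dictionary, the token pattern, and the assumed query shape (SELECT attributes FROM one relation) are all hypothetical simplifications, and the grammar-checking parser step is skipped entirely.

```python
import re

# Hypothetical schema for illustration: relation name -> set of attribute names.
SCHEMA = {"EMPLOYEE": {"SSN", "LNAME", "BDATE"}}

def scan(query):
    """Scanner: split the query text into tokens (names, numbers, strings, symbols)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|'[^']*'|[<>=]+|[,;*]", query)

def validate(tokens):
    """Validator: check relation and attribute names against SCHEMA.

    Assumes the simple shape SELECT a1, a2 FROM relation; a real system
    would work on the parser's output rather than on raw tokens.
    """
    upper = [t.upper() for t in tokens]
    from_pos = upper.index("FROM")
    relation = upper[from_pos + 1]
    if relation not in SCHEMA:
        raise ValueError(f"unknown relation: {relation}")
    attrs = [t for t in upper[1:from_pos] if t != ","]
    for attr in attrs:
        if attr not in SCHEMA[relation]:
            raise ValueError(f"unknown attribute: {attr}")
    return relation, attrs

tokens = scan("SELECT LNAME, BDATE FROM EMPLOYEE;")
print(tokens)
print(validate(tokens))
```

A query that names an attribute missing from the schema, such as SALARY here, is rejected by validate with a ValueError before any plan is built.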



Query Processing



Translating SQL Queries into Relational Algebra

An SQL query is first translated into an equivalent extended relational
algebra expression, represented as a query tree or query graph data structure,
and then optimized.

SQL queries are decomposed into query blocks (units or chunks).

Query block: a single SELECT-FROM-WHERE expression, together with the
GROUP BY and HAVING clauses if these are part of the block.

Nested queries: nested queries within a query are identified as separate query blocks.

Aggregate operators: aggregate operators in SQL must be included in the extended algebra.

Finally, the query optimizer chooses an execution plan for each query block.

To process a query written in a high-level language (SQL) once it has been
translated into relational algebra, the DBMS needs appropriate strategies for
the following:
Algorithms for the selection operation
Algorithms for projection and set operations
Algorithms for external sorting
Implementation of JOIN, set, and aggregate operations
Combining operations using pipelining
Parallel algorithms for query processing



Query Trees and Heuristics for Query Optimization

Heuristic: problem solving by experimental (trial-and-error) methods.

Heuristic rules: used to modify the internal representation of a query (a query
tree or query graph data structure) to improve its performance.

Process for heuristic optimization

1. The parser of a high-level query generates an initial internal representation.
2. Heuristic rules are applied to optimize the internal representation.
3. A query execution plan is generated to execute groups of operations based on
the access paths available on the files involved in the query.

The main heuristic is to apply first the operations that reduce the size of
intermediate results.

Query tree: a data structure that corresponds to a relational algebra expression.
It represents the input relations of the query as leaf nodes of the tree and
the relational algebra operations as internal nodes.
An execution of the query tree consists of executing an internal node operation
whenever its operands are available and then replacing that internal node by the
relation that results from executing the operation.
Query graph: a graph data structure that corresponds to a relational calculus
expression. It does not indicate the order in which operations are performed.
There is only a single graph corresponding to each query.
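
As a rough illustration of the query tree idea, the sketch below stores relations at the leaves and operations at the internal nodes, and executes a node only after its operands have been evaluated. The Node class, the execute function, and the two small relations are hypothetical names and data chosen for this example, not part of any DBMS.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Node:
    op: str                                        # "relation", "select", "project", "join"
    children: List["Node"] = field(default_factory=list)
    rows: Optional[List[dict]] = None              # tuples of a leaf relation
    pred: Optional[Callable[[dict], bool]] = None  # condition for select/join
    attrs: Optional[List[str]] = None              # attribute list for project

def execute(node):
    """Evaluate the operands first, then replace the node by its result relation."""
    if node.op == "relation":
        return node.rows
    inputs = [execute(child) for child in node.children]
    if node.op == "select":
        return [t for t in inputs[0] if node.pred(t)]
    if node.op == "project":
        result = []
        for t in inputs[0]:
            projected = {a: t[a] for a in node.attrs}
            if projected not in result:            # a relation holds no duplicate tuples
                result.append(projected)
        return result
    if node.op == "join":
        merged = ({**l, **r} for l in inputs[0] for r in inputs[1])
        return [t for t in merged if node.pred(t)]
    raise ValueError(f"unknown operation: {node.op}")

# Toy relations (made-up rows) at the leaves.
PROJECT = Node("relation", rows=[
    {"Pnumber": 10, "Dnum": 4, "Plocation": "Stafford"},
    {"Pnumber": 1, "Dnum": 5, "Plocation": "Bellaire"},
])
DEPARTMENT = Node("relation", rows=[
    {"Dnumber": 4, "Dname": "Administration"},
    {"Dnumber": 5, "Dname": "Research"},
])

tree = Node("project", attrs=["Pnumber", "Dname"], children=[
    Node("join", pred=lambda t: t["Dnum"] == t["Dnumber"], children=[
        Node("select", pred=lambda t: t["Plocation"] == "Stafford", children=[PROJECT]),
        DEPARTMENT,
    ]),
])
result = execute(tree)
print(result)
```

The recursion in execute mirrors the definition above: each internal node is computed once both of its operand relations are available.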



Example-1

Example:

For every project located in 'Stafford', retrieve the project number, department
number, and department manager's last name, address, and birth date.

SQL Query:
SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM = D.DNUMBER AND D.MGRSSN = E.SSN AND
P.PLOCATION = 'Stafford';
Relational Algebra:
π_{Pnumber, Dnum, Lname, Address, Bdate} (((σ_{Plocation='Stafford'} (PROJECT))
⋈_{Dnum=Dnumber} (DEPARTMENT)) ⋈_{Mgrssn=Ssn} (EMPLOYEE))
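
To see the example query run end to end, the following sketch loads a few made-up rows into an in-memory SQLite database using Python's standard sqlite3 module. The table definitions and data are assumptions for illustration only; the project-number column is assumed to be named PNUMBER, matching the relational algebra expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema and rows, just enough to exercise the query.
conn.executescript("""
CREATE TABLE PROJECT (PNUMBER INTEGER, PLOCATION TEXT, DNUM INTEGER);
CREATE TABLE DEPARTMENT (DNUMBER INTEGER, MGRSSN TEXT);
CREATE TABLE EMPLOYEE (SSN TEXT, LNAME TEXT, ADDRESS TEXT, BDATE TEXT);
INSERT INTO PROJECT VALUES (10, 'Stafford', 4), (1, 'Bellaire', 5);
INSERT INTO DEPARTMENT VALUES (4, '987654321'), (5, '333445555');
INSERT INTO EMPLOYEE VALUES
  ('987654321', 'Wallace', '291 Berry, Bellaire, TX', '1941-06-20'),
  ('333445555', 'Wong', '638 Voss, Houston, TX', '1955-12-08');
""")

rows = conn.execute("""
SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM = D.DNUMBER AND D.MGRSSN = E.SSN
  AND P.PLOCATION = 'Stafford'
""").fetchall()
print(rows)
```

Only project 10 is located in Stafford in this toy data, so the result contains a single row with the manager of department 4.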

Query Tree -1

Query Tree -2

Query Tree -3



Using Heuristics in Query Optimization

Heuristic Query Optimization:

Oracle calls this rule-based optimization.
A query can be represented as a tree data structure. Operations are at the
interior nodes and data items (tables, columns) are at the leaves.
The query is evaluated in a depth-first pattern.

Heuristic Optimization of Query Trees: The same query could correspond to
many different relational algebra expressions and hence many different query trees.
The task of heuristic optimization of query trees is to find a final query tree that
is efficient to execute.

Example:

SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'AQUARIUS' AND PNUMBER = PNO AND
ESSN = SSN AND BDATE > '1957-12-31';
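
The effect of choosing a better tree for this query can be sketched by counting intermediate tuples. The code below compares a canonical evaluation (Cartesian product of all three relations, then selection) with an evaluation that applies the selections first. The rows are invented toy data and the function names are hypothetical.

```python
from itertools import product

# Toy relations (made-up rows).
EMPLOYEE = [("111", "Smith", "1965-01-09"), ("222", "Wong", "1955-12-08"),
            ("333", "English", "1972-07-31"), ("444", "Borg", "1937-11-10")]  # (SSN, LNAME, BDATE)
WORKS_ON = [("111", 1), ("111", 2), ("222", 2),
            ("333", 1), ("333", 3), ("444", 3)]                               # (ESSN, PNO)
PROJECT = [(1, "AQUARIUS"), (2, "PRODUCTY"), (3, "PRODUCTZ")]                 # (PNUMBER, PNAME)

def canonical():
    """Cartesian product first, then one big selection, then projection on LNAME."""
    cart = list(product(EMPLOYEE, WORKS_ON, PROJECT))
    names = {e[1] for e, w, p in cart
             if p[1] == "AQUARIUS" and p[0] == w[1]
             and w[0] == e[0] and e[2] > "1957-12-31"}
    return names, len(cart)

def pushed_down():
    """Selections applied to the base relations before any join."""
    proj = [p for p in PROJECT if p[1] == "AQUARIUS"]
    emp = [e for e in EMPLOYEE if e[2] > "1957-12-31"]
    pw = [(w, p) for p in proj for w in WORKS_ON if p[0] == w[1]]
    joined = [(e, w, p) for (w, p) in pw for e in emp if w[0] == e[0]]
    largest = max(len(proj), len(emp), len(pw), len(joined))
    return {e[1] for e, w, p in joined}, largest

names_a, size_a = canonical()
names_b, size_b = pushed_down()
print(names_a == names_b, size_a, size_b)
```

Both strategies return the same last names, but on this data the canonical tree builds a 72-tuple intermediate result while the pushed-down version never holds more than 2 tuples at once.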

Query Tree-1

Query Tree-2 Query Tree-3

Query Tree-4 Query Tree-5



Steps to Optimize Query

Query optimization steps in converting a query tree:

Initial (canonical) query tree for the SQL query Q

Moving SELECT operations down the query tree

Applying the more restrictive SELECT operation first

Replacing CARTESIAN PRODUCT and SELECT operations with JOIN operations

Moving PROJECT operations down the query tree



Transformation Rules for Relational Algebra

An overall rule for heuristic query optimization is to perform as many select and
project operations as possible before doing any joins.
1. Cascade of σ: A conjunctive selection condition can be broken up into a
cascade (that is, a sequence) of individual σ operations:
σ_{c1 AND c2 AND ... AND cn}(R) = σ_{c1}(σ_{c2}(...(σ_{cn}(R))...))
2. Commutativity of σ: The σ operation is commutative:
σ_{c1}(σ_{c2}(R)) = σ_{c2}(σ_{c1}(R))
3. Cascade of π: In a cascade (sequence) of π operations, all but the last one
can be ignored:
π_{List1}(π_{List2}(...(π_{Listn}(R))...)) = π_{List1}(R)
4. Commuting σ with π: If the selection condition c involves only those
attributes A1, ..., An in the projection list, the two operations can be
commuted:
π_{A1, A2, ..., An}(σ_{c}(R)) = σ_{c}(π_{A1, A2, ..., An}(R))
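
Rules 1 to 4 can be checked on a small relation. The sketch below is only a sanity check on made-up data, not a proof; select and project are hypothetical helper functions written for this example, with project removing duplicate tuples.

```python
# Toy relation R(A, B, C) with made-up tuples.
R = [{"A": 1, "B": 10, "C": "x"},
     {"A": 2, "B": 20, "C": "y"},
     {"A": 3, "B": 20, "C": "x"}]

def select(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    """Projection: keep only attrs and drop duplicate tuples."""
    out = []
    for t in rel:
        p = {a: t[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out

c1 = lambda t: t["A"] > 1      # condition on attribute A only
c2 = lambda t: t["B"] == 20    # condition on attribute B only

# Rule 1: a conjunctive selection equals a cascade of selections.
rule1 = select(lambda t: c1(t) and c2(t), R) == select(c1, select(c2, R))
# Rule 2: selections commute.
rule2 = select(c1, select(c2, R)) == select(c2, select(c1, R))
# Rule 3: in a cascade of projections only the outermost list matters.
rule3 = project(["A"], project(["A", "B"], R)) == project(["A"], R)
# Rule 4: c1 uses only attribute A, which is in the projection list.
rule4 = project(["A", "B"], select(c1, R)) == select(c1, project(["A", "B"], R))
print(rule1, rule2, rule3, rule4)
```

Rule 4 would not apply to a condition on C here, since C is not in the projection list and is no longer available after the projection.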

5. Commutativity of ⋈ and ×: The join operation is commutative, as is the ×
operation:
R ⋈_{c} S = S ⋈_{c} R
R × S = S × R
Notice that although the order of attributes may not be the same in the
relations resulting from the two joins (or two Cartesian products), the
meaning is the same because the order of attributes is not important in the
alternative definition of relation.
6. Commuting σ with ⋈ (or ×): If all the attributes in the selection condition c
involve only the attributes of one of the relations being joined, say R,
the two operations can be commuted as follows:
σ_{c}(R ⋈ S) = (σ_{c}(R)) ⋈ S
Alternatively, if the selection condition c can be written as (c1 AND c2), where
condition c1 involves only the attributes of R and condition c2 involves only
the attributes of S, then the operations commute as follows:
σ_{c}(R ⋈ S) = (σ_{c1}(R)) ⋈ (σ_{c2}(S))
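
Rules 5 and 6 can be tried out on two small relations. This is a sketch on invented data; join, select, and as_set are hypothetical helpers, and tuples are compared as sets of attribute-value pairs so that attribute order does not matter, as the note under rule 5 describes.

```python
def select(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def join(on, r, s):
    """Theta join: merge every pair of tuples and keep those satisfying on."""
    return [{**a, **b} for a in r for b in s if on({**a, **b})]

def as_set(rel):
    """Compare relations independent of tuple and attribute order."""
    return {frozenset(t.items()) for t in rel}

# Toy relations (made-up rows).
R = [{"Pname": "ProductX", "Dnum": 5}, {"Pname": "Newbenefits", "Dnum": 4}]
S = [{"Dnumber": 5, "Dname": "Research"}, {"Dnumber": 4, "Dname": "Administration"}]

on = lambda t: t["Dnum"] == t["Dnumber"]   # join condition
c1 = lambda t: t["Pname"] == "ProductX"    # uses attributes of R only
c2 = lambda t: t["Dname"] == "Research"    # uses attributes of S only

# Rule 5: join is commutative (up to attribute order).
rule5 = as_set(join(on, R, S)) == as_set(join(on, S, R))
# Rule 6, first form: a condition on R alone can be applied before the join.
rule6a = select(c1, join(on, R, S)) == join(on, select(c1, R), S)
# Rule 6, second form: c = c1 AND c2 splits across both operands.
rule6b = (select(lambda t: c1(t) and c2(t), join(on, R, S))
          == join(on, select(c1, R), select(c2, S)))
print(rule5, rule6a, rule6b)
```

Applying the selection first means the join processes one R tuple instead of two, which is the size reduction the heuristic is after.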

7. Commutativity of set operations: The set operations ∪ and ∩ are
commutative, but − is not.
8. Associativity of ⋈, ×, ∪, and ∩: These four operations are individually
associative; that is, if θ stands for any one of these four operations
(throughout the expression), we have:
(R θ S) θ T = R θ (S θ T)
9. Commuting π with ⋈ (or ×): Suppose that the projection list is L =
{A1, ..., An, B1, ..., Bm}, where A1, ..., An are attributes of R and
B1, ..., Bm are attributes of S. If the join condition c involves only
attributes in L, the two operations can be commuted as follows:
π_{L}(R ⋈_{c} S) = (π_{A1, ..., An}(R)) ⋈_{c} (π_{B1, ..., Bm}(S))
10. Commuting σ with set operations: The σ operation commutes with ∪, ∩,
and −. If θ stands for any one of these three operations, we have:
σ_{c}(R θ S) = (σ_{c}(R)) θ (σ_{c}(S))

11. Commuting π with ∪: The π operation commutes with ∪:
π_{L}(R ∪ S) = (π_{L}(R)) ∪ (π_{L}(S))
12. Converting a (σ, ×) sequence into ⋈: If the condition c of a σ that follows a ×
corresponds to a join condition, convert the (σ, ×) sequence into a ⋈ as
follows:
σ_{c}(R × S) = R ⋈_{c} S
13. Pushing σ in conjunction with set difference:
σ_{c}(R − S) = σ_{c}(R) − σ_{c}(S)
However, σ may be applied to only one relation:
σ_{c}(R − S) = σ_{c}(R) − S
14. Pushing σ to only one argument in ∩: If all attributes in the condition c
are from relation R, then:
σ_{c}(R ∩ S) = σ_{c}(R) ∩ S
15. Trivial transformations: If S is empty, then R ∪ S = R. If the condition c in
σ_{c} is true for the entire R, then σ_{c}(R) = R.
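
The set-operation rules are easy to check with Python sets of tuples. The relations below are invented for illustration, and sigma and pi are hypothetical helper names. The last check shows, on this particular data, that pushing a projection through set difference changes the result, which is why the projection rule above is stated for ∪ only.

```python
# Two union-compatible toy relations, as sets of (id, label) tuples.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (4, "d")}

def sigma(pred, rel):
    """Selection on a set of tuples."""
    return {t for t in rel if pred(t)}

def pi(index, rel):
    """Projection on a single attribute position."""
    return {(t[index],) for t in rel}

c = lambda t: t[0] >= 2

# Rule 13: selection with set difference, in both forms.
diff_both = sigma(c, R - S) == sigma(c, R) - sigma(c, S)
diff_one = sigma(c, R - S) == sigma(c, R) - S
# Rule 14: selection pushed to one argument of an intersection.
inter_one = sigma(c, R & S) == sigma(c, R) & S
# Rule 11: projection distributes over union.
proj_union = pi(0, R | S) == pi(0, R) | pi(0, S)

# Projection and set difference: a small case where the two sides differ.
R2, S2 = {(1, "a")}, {(1, "z")}
proj_diff_equal = pi(0, R2 - S2) == pi(0, R2) - pi(0, S2)
print(diff_both, diff_one, inter_one, proj_union, proj_diff_equal)
```

In the final case R2 − S2 still contains (1, "a"), so its projection is {(1,)}, while the difference of the two projections is empty.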

Outline of a Heuristic Algebraic Optimization Algorithm:


Using rule 1, break up any select operations with conjunctive conditions into a
cascade of select operations.
Using rules 2, 4, 6, and 10 concerning the commutativity of select with other
operations, move each select operation as far down the query tree as is permitted
by the attributes involved in the select condition.
Using rule 8 concerning associativity of binary operations, rearrange the leaf nodes
of the tree so that the leaf node relations with the most restrictive select operations
are executed first in the query tree representation.
Using rule 12, combine a Cartesian product operation with a subsequent select
operation in the tree into a join operation.
Using rules 3, 4, 9, and 11 concerning the cascading of project and the commuting
of project with other operations, break down and move lists of projection attributes
down the tree as far as possible by creating new project operations as needed.
Identify subtrees that represent groups of operations that can be executed by a
single algorithm.
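
Part of this outline can be sketched in code. The version below covers only two steps: moving single-condition selections down the tree and turning a selection over a Cartesian product into a join. It assumes the conjunctive conditions have already been split into a cascade, and it does not reorder leaves or move projections. The tuple-based tree encoding and the names attrs_of, push, and optimize are hypothetical choices for this example.

```python
# Tree nodes are tuples:
#   ("rel", name, attrs)  ("select", cond, child)
#   ("product", left, right)  ("join", cond, left, right)
# A condition is (text, frozenset of the attribute names it mentions).

def attrs_of(t):
    """Attributes available in the result of subtree t."""
    if t[0] == "rel":
        return set(t[2])
    if t[0] == "select":
        return attrs_of(t[2])
    return attrs_of(t[-2]) | attrs_of(t[-1])      # product or join

def push(cond, t):
    """Push one selection as far down tree t as its attributes permit."""
    needed = cond[1]
    if t[0] == "select":                          # selections commute
        return ("select", t[1], push(cond, t[2]))
    if t[0] in ("product", "join"):
        left, right = t[-2], t[-1]
        if needed <= attrs_of(left):              # condition uses left side only
            return t[:-2] + (push(cond, left), right)
        if needed <= attrs_of(right):             # condition uses right side only
            return t[:-2] + (left, push(cond, right))
        if t[0] == "product":                     # select over product becomes a join
            return ("join", cond, left, right)
    return ("select", cond, t)

def optimize(t):
    """Bottom-up pass: optimize subtrees, then push each selection down."""
    if t[0] == "rel":
        return t
    if t[0] == "select":
        return push(t[1], optimize(t[2]))
    return t[:-2] + (optimize(t[-2]), optimize(t[-1]))

E = ("rel", "EMPLOYEE", ("SSN", "LNAME", "BDATE"))
W = ("rel", "WORKS_ON", ("ESSN", "PNO"))
P = ("rel", "PROJECT", ("PNUMBER", "PNAME"))
c_pname = ("PNAME='AQUARIUS'", frozenset({"PNAME"}))
c_pno = ("PNUMBER=PNO", frozenset({"PNUMBER", "PNO"}))
c_ssn = ("ESSN=SSN", frozenset({"ESSN", "SSN"}))
c_bdate = ("BDATE>'1957-12-31'", frozenset({"BDATE"}))

canonical = ("select", c_pname, ("select", c_pno, ("select", c_ssn,
             ("select", c_bdate, ("product", ("product", E, W), P)))))
optimized = optimize(canonical)
print(optimized)
```

On the canonical tree for the EMPLOYEE, WORKS_ON, PROJECT query, the two single-relation conditions end up directly above EMPLOYEE and PROJECT, and both Cartesian products are replaced by joins.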

Summary of Heuristics for Algebraic Optimization:

The main heuristic is to apply first the operations that reduce the size of
intermediate results.

Perform select operations as early as possible to reduce the number of tuples,
and perform project operations as early as possible to reduce the number of
attributes. (This is done by moving select and project operations as far down
the tree as possible.)

The select and join operations that are most restrictive should be executed
before other similar operations. (This is done by reordering the leaf nodes of
the tree among themselves and adjusting the rest of the tree appropriately.)

Query Execution Plans:
An execution plan for a relational algebra query consists of a combination of
the relational algebra query tree and information about the access methods
to be used for each relation as well as the methods to be used in computing
the relational operators stored in the tree.
Materialized evaluation: the result of an operation is stored as a temporary
relation.
Pipelined evaluation: as the result of an operator is produced, it is forwarded
to the next operator in sequence.
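
The difference between the two evaluation styles can be imitated with Python lists and generators. This is only an analogy under simplifying assumptions: the operator functions and rows are hypothetical, a list stands in for a temporary relation, and a chain of generators stands in for a pipeline.

```python
def scan(rows):
    """Leaf operator: produce tuples one at a time."""
    for r in rows:
        yield r

def select(pred, rows):
    """Selection operator: pass on each qualifying tuple as soon as it arrives."""
    for r in rows:
        if pred(r):
            yield r

def project(attrs, rows):
    """Projection operator (duplicates not removed in this sketch)."""
    for r in rows:
        yield tuple(r[a] for a in attrs)

# Toy rows (made-up data).
EMPLOYEE = [{"LNAME": "Smith", "BDATE": "1965-01-09"},
            {"LNAME": "Wong", "BDATE": "1955-12-08"},
            {"LNAME": "English", "BDATE": "1972-07-31"}]
born_after_1957 = lambda r: r["BDATE"] > "1957-12-31"

# Materialized evaluation: the selection result is stored in full
# (the list plays the role of a temporary relation) before projecting.
temp = list(select(born_after_1957, scan(EMPLOYEE)))
materialized = list(project(["LNAME"], temp))

# Pipelined evaluation: operators are chained, and each tuple flows
# through select and project without an intermediate list.
pipeline = project(["LNAME"], select(born_after_1957, scan(EMPLOYEE)))
first = next(pipeline)      # available before the scan has finished
rest = list(pipeline)
print(materialized, first, rest)
```

Both styles produce the same tuples; the pipelined version simply avoids holding the intermediate result, and its first output is ready after reading a single input row.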
Cost Components for Query Execution :
Access cost to Secondary Storage
Disk Storage Cost
Computation Cost
Memory Usage Cost
Communication Cost



Query Optimization

Query optimization is a difficult part of query processing.

It determines an efficient way to execute a query from among the different
possible query plans.
Users cannot access the optimizer directly: once a query is submitted to the
database server and parsed by the parser, it is passed to the query optimizer,
where optimization occurs.
The main aim of query optimization is to minimize the cost function:
I/O Cost + CPU Cost + Communication Cost
It defines how an RDBMS can improve the performance of a query by
re-ordering its operations.
It is the process of selecting the most efficient query evaluation plan from
among the various strategies available for a complex query.
The chosen plan computes the same result as the given expression, but in the
least costly way.
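
Choosing a plan by this cost function amounts to taking a minimum over the candidate plans. The sketch below assumes the three cost components have already been estimated; the PlanCost class, the plan names, and every number in it are hypothetical placeholders, since real optimizers derive such estimates from catalog statistics.

```python
from dataclasses import dataclass

@dataclass
class PlanCost:
    name: str
    io: float      # estimated I/O cost
    cpu: float     # estimated CPU cost
    comm: float    # estimated communication cost

    def total(self):
        """Cost function: I/O cost + CPU cost + communication cost."""
        return self.io + self.cpu + self.comm

def choose_plan(plans):
    """Return the candidate plan with the smallest total estimated cost."""
    return min(plans, key=lambda p: p.total())

# Made-up estimates for two equivalent plans of the same query.
plans = [
    PlanCost("cartesian product then select", io=720.0, cpu=72.0, comm=0.0),
    PlanCost("selections pushed down", io=40.0, cpu=10.0, comm=0.0),
]
best = choose_plan(plans)
print(best.name, best.total())
```

The communication term is zero in both plans here, as it would be for a query that runs on a single site.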



Importance of Query Optimization

Query optimization provides faster query processing.
It lowers the cost per query.
It puts less load on the database.
It improves the overall performance of the system.
It consumes less memory.



Thanks
