Chapter 2: Query Processing and Optimization
Overview of Query Processing
What is query processing?
The activities involved in parsing, validating, optimizing, and executing a query.
The aims of query processing are to transform a query written in a high-level language (typically SQL) into a correct and efficient execution strategy, and to choose the strategy that minimizes resource usage.
Generally, we try to reduce the total execution time of the query, which is the sum of the execution times of all the individual operations that make up the query.
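Query optimization: Example
Comparison of different processing strategies
Find all Managers who work at a London branch.
SELECT *
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo
  AND s.position = 'Manager' AND b.city = 'London';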
Three equivalent relational algebra queries corresponding to this query are:
1. σ_{(position=‘Manager’) ∧ (city=‘London’) ∧ (Staff.branchNo=Branch.branchNo)}(Staff × Branch)
2. σ_{(position=‘Manager’) ∧ (city=‘London’)}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch)
3. (σ_{position=‘Manager’}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (σ_{city=‘London’}(Branch))
For this particular example, assume there are 1000 tuples in Staff, 50 tuples in Branch, 50 Managers (one for each branch), and 5 London branches. The three queries are compared by the number of disk accesses required.
There are no indexes or sort keys on either relation.
The first query calculates the Cartesian product of Staff and Branch:
σ_{(position=‘Manager’) ∧ (city=‘London’) ∧ (Staff.branchNo=Branch.branchNo)}(Staff × Branch)
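The second query joins Staff and Branch on the branch number branchNo:
σ_{(position=‘Manager’) ∧ (city=‘London’)}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch)
This requires (1000 + 50) disk accesses to read each of the relations.
The join of the two relations has 1000 tuples, one for each member of staff, and the Selection operation must then read all of these tuples.
The third query applies the Selections before the join. The first Selection reads each Staff tuple to determine the Manager tuples. The second Selection operation reads each Branch tuple to determine the London branches, which leaves only a few tuples.
The final operation is the join of the reduced Staff and Branch relations.
In disk accesses, the third strategy improves on the first by a factor of about 87:1.
If the amount of data is increased tenfold, the factor is about 870:1.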
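The comparison can be checked with a short calculation. The sketch below is not part of the original slides. It assumes 50 Managers and 5 London branches, one disk access per tuple read or written, intermediate results written to disk, and the final write ignored. The function name strategy_costs is hypothetical, and the tenfold case (Staff and Branch scaled, London branches kept at 5) is an assumption chosen because it reproduces the quoted factors.

```python
def strategy_costs(staff, branch, managers, london):
    """Disk accesses for the three strategies under the stated assumptions."""
    # 1. Read both relations, write the Cartesian product, read it back to select.
    cost1 = (staff + branch) + 2 * (staff * branch)
    # 2. Read both relations, write the join (one tuple per staff member),
    #    read it back to select.
    cost2 = (staff + branch) + 2 * staff
    # 3. Read Staff and write the Managers, read Branch and write the London
    #    branches, then read both reduced relations for the join.
    cost3 = (staff + managers) + (branch + london) + (managers + london)
    return cost1, cost2, cost3

base = strategy_costs(staff=1000, branch=50, managers=50, london=5)
print(base)                        # (101050, 3050, 1160)
print(round(base[0] / base[2]))    # 87

# Assumed tenfold case: ten times the Staff and Branch tuples, still 5 London branches.
scaled = strategy_costs(staff=10000, branch=500, managers=500, london=5)
print(round(scaled[0] / scaled[2]))  # 870
```

Under these assumptions the cost of the first strategy is dominated by the size of the Cartesian product, which is why reducing the relations before combining them pays off so strongly.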
Figure: Phases of query processing.
Dynamic versus static optimization
Dynamic: decomposition and optimization are carried out every time the query is run.
Static: the query is parsed, validated, and optimized once.
Query Decomposition
Query decomposition is the first phase of query processing.
The aims of query decomposition are to transform a high-level query into a relational algebra query.
Stages of query decomposition
1. Analysis
2. Normalization
3. Semantic analysis
4. Simplification
5. Query restructuring
1. Analysis
The query is lexically and syntactically analyzed using the techniques of programming language compilers.
This stage also verifies that the relations and attributes specified in the query are defined in the system catalog.
Example: assume we have a Staff table with a staffNo attribute and a position attribute that holds a variable-length character string. In the following query, staffNumber is not defined in Staff, and position is compared with a number, which is incompatible with its data type.
SELECT staffNumber
FROM Staff
WHERE position > 10;
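The two checks in this example can be sketched in a few lines. This is an illustration only: the CATALOG dictionary and the analyze function are hypothetical names, not the API of any real DBMS, and the query is passed in already split into its parts rather than parsed from SQL.

```python
# Hypothetical system catalog: relation name -> {attribute name: accepted Python types}.
CATALOG = {
    "Staff": {"staffNo": (str,), "position": (str,), "salary": (int, float)},
}

def analyze(relation, select_attrs, comparisons):
    """Return analysis errors for a single-relation query.

    comparisons is a list of (attribute, operator, literal) triples
    taken from the WHERE clause.
    """
    if relation not in CATALOG:
        return [f"unknown relation: {relation}"]
    attrs = CATALOG[relation]
    errors = []
    for name in select_attrs:
        if name not in attrs:
            errors.append(f"unknown attribute: {relation}.{name}")
    for name, op, literal in comparisons:
        if name not in attrs:
            errors.append(f"unknown attribute: {relation}.{name}")
        elif not isinstance(literal, attrs[name]):
            errors.append(f"type mismatch: {name} {op} {literal!r}")
    return errors

# SELECT staffNumber FROM Staff WHERE position > 10;
errors = analyze("Staff", ["staffNumber"], [("position", ">", 10)])
print(errors)
```

Both problems are reported: staffNumber is not in the catalog entry for Staff, and the literal 10 is not a character string.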
2. Normalization
Converts the query into a normalized form that can be more easily manipulated.
In SQL, the WHERE condition is converted into one of two forms by applying a few transformation rules.
Conjunctive normal form: a sequence of conjuncts connected with the ∧ (AND) operator. Each conjunct contains one or more terms connected by the ∨ (OR) operator.
e.g. (position = ‘Manager’ ∨ salary > 20000) ∧ branchNo = ‘B003’
Disjunctive normal form: a sequence of disjuncts connected with the ∨ (OR) operator. Each disjunct contains one or more terms connected by the ∧ (AND) operator.
e.g. (position = ‘Manager’ ∧ branchNo = ‘B003’) ∨ (salary > 20000 ∧ branchNo = ‘B003’)
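The two example predicates describe the same condition, since the second follows from the first by distributing ∧ over ∨. A small truth-table check makes this concrete. The sketch treats each atomic comparison as a Boolean variable; the function and parameter names are illustrative assumptions.

```python
from itertools import product

def cnf(is_manager, high_salary, in_b003):
    # (position = 'Manager' OR salary > 20000) AND branchNo = 'B003'
    return (is_manager or high_salary) and in_b003

def dnf(is_manager, high_salary, in_b003):
    # (position = 'Manager' AND branchNo = 'B003') OR (salary > 20000 AND branchNo = 'B003')
    return (is_manager and in_b003) or (high_salary and in_b003)

# Compare the two forms on all 8 combinations of truth values.
equivalent = all(cnf(*v) == dnf(*v) for v in product([False, True], repeat=3))
print(equivalent)  # True
```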
3. Semantic analysis
The objective of semantic analysis is to reject normalized queries that are incorrectly formulated or contradictory.
A query is incorrectly formulated if some of its components do not contribute to the generation of the result, which may happen if some join specifications are missing.
A query is contradictory if its predicate cannot be satisfied by any tuple.
For example, the predicate (position = ‘Manager’ ∧ position = ‘Assistant’) on the Staff relation is contradictory, as a member of staff cannot be both a Manager and an Assistant simultaneously.
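One simple case of a contradiction can be detected mechanically: a conjunction that equates the same attribute with two different constants. The sketch below covers only that case, and the function name is_contradictory is a hypothetical helper; a real optimizer would also need to handle ranges, inequalities, and joins.

```python
def is_contradictory(conjuncts):
    """conjuncts: list of (attribute, value) equality predicates joined by AND."""
    required = {}
    for attribute, value in conjuncts:
        # Two different required values for one attribute can never both hold.
        if attribute in required and required[attribute] != value:
            return True
        required[attribute] = value
    return False

# position = 'Manager' AND position = 'Assistant'
print(is_contradictory([("position", "Manager"), ("position", "Assistant")]))  # True
# position = 'Manager' AND branchNo = 'B003'
print(is_contradictory([("position", "Manager"), ("branchNo", "B003")]))       # False
```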
4. Simplification
The objectives of the simplification stage are to detect redundant qualifications, eliminate common subexpressions, and transform the query into a semantically equivalent but more easily and efficiently computed form.
For example, from Boolean algebra:
p ∧ (p) ≡ p          p ∨ (p) ≡ p
p ∧ false ≡ false    p ∨ false ≡ p
p ∧ true ≡ p         p ∨ true ≡ true
p ∧ (~p) ≡ false     p ∨ (~p) ≡ true
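These identities can be applied as rewrite rules. The sketch below is a minimal simplifier under assumed conventions: a predicate is a nested tuple ('and' | 'or', left, right) or ('not', operand), a Python bool stands for true or false, and a string names an atomic predicate. The representation and the function name simplify are illustrative choices, not taken from the slides.

```python
def simplify(expr):
    """Apply the Boolean identities listed above, bottom-up."""
    if not isinstance(expr, tuple):
        return expr                      # atomic predicate or constant
    if expr[0] == "not":
        inner = simplify(expr[1])
        return (not inner) if isinstance(inner, bool) else ("not", inner)
    op, left, right = expr[0], simplify(expr[1]), simplify(expr[2])
    if left == right:
        return left                      # p AND p = p, p OR p = p
    if ("not", left) == right or ("not", right) == left:
        return op == "or"                # p AND ~p = false, p OR ~p = true
    for const, other in ((left, right), (right, left)):
        if isinstance(const, bool):
            if op == "and":
                return other if const else False   # p AND true = p, p AND false = false
            return True if const else other        # p OR true = true, p OR false = p
    return (op, left, right)

print(simplify(("and", "p", "p")))                             # p
print(simplify(("or", "p", False)))                            # p
print(simplify(("and", "p", ("not", "p"))))                    # False
print(simplify(("or", ("and", "p", True), ("not", "p"))))      # True
```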
5. Query restructuring
The query is restructured to provide a more efficient implementation.
Heuristical Approach to Query Optimization
Uses transformation rules to convert one relational algebra expression into an equivalent form that is known to be more efficient.
Transformation Rules for the Relational Algebra Operations
By applying transformation rules, the optimizer can transform one relational algebra expression into an equivalent expression.
In listing these rules, we use three relations R, S, and T, with R defined over the attributes A = {A1, A2, ..., An}, and S defined over B = {B1, B2, ..., Bn}; p, q, and r denote predicates, and L, L1, L2, M, M1, M2, and N denote sets of attributes.
1. Conjunctive Selection operations can cascade into individual Selection operations (and vice versa).
This transformation is sometimes referred to as cascade of selection.
σ_{p ∧ q ∧ r}(R) = σ_p(σ_q(σ_r(R)))
E.g. σ_{branchNo=‘B003’ ∧ salary>15000}(Staff) = σ_{branchNo=‘B003’}(σ_{salary>15000}(Staff))
2. Commutativity of Selection operations.
σ_p(σ_q(R)) = σ_q(σ_p(R))
E.g. σ_{branchNo=‘B003’}(σ_{salary>15000}(Staff)) = σ_{salary>15000}(σ_{branchNo=‘B003’}(Staff))
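The cascade and commutativity of Selection can be tried out on a small in-memory relation. The sketch below models a relation as a list of dictionaries. The sample rows and the helper name select are assumptions made for illustration.

```python
# Hypothetical sample of the Staff relation.
staff = [
    {"staffNo": "S1", "branchNo": "B003", "salary": 18000},
    {"staffNo": "S2", "branchNo": "B003", "salary": 12000},
    {"staffNo": "S3", "branchNo": "B005", "salary": 30000},
]

def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]

in_b003 = lambda t: t["branchNo"] == "B003"
well_paid = lambda t: t["salary"] > 15000

combined = select(lambda t: in_b003(t) and well_paid(t), staff)  # one conjunctive Selection
cascaded = select(in_b003, select(well_paid, staff))             # rule 1: cascade
swapped = select(well_paid, select(in_b003, staff))              # rule 2: commuted order

print(combined == cascaded == swapped)  # True
```

All three forms return the same tuple (S1). They differ only in how many tuples each step has to examine, which is what the optimizer exploits.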
3. In a sequence of Projection operations, only the last (outermost) in the sequence is required.
Π_L(Π_M(...(Π_N(R)))) = Π_L(R)
E.g. Π_{lName}(Π_{branchNo, lName}(Staff)) = Π_{lName}(Staff)
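The Projection example can be checked on sample data. The sketch below implements Projection with duplicate elimination, as in the relational algebra. The rows and the helper name project are hypothetical.

```python
# Hypothetical sample of the Staff relation.
staff = [
    {"staffNo": "S1", "lName": "White", "branchNo": "B005"},
    {"staffNo": "S2", "lName": "Beech", "branchNo": "B003"},
    {"staffNo": "S3", "lName": "White", "branchNo": "B003"},
]

def project(attrs, rel):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for t in rel:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

nested = project(["lName"], project(["branchNo", "lName"], staff))
direct = project(["lName"], staff)
print(nested == direct)  # True
```

The inner Projection is redundant because every attribute the outer Projection needs is already available in Staff.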
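4. Commutativity of Selection and Projection.
If the predicate p involves only the attributes in the projection list, then the Selection and Projection operations commute.
Π_L(σ_p(R)) = σ_p(Π_L(R)), where p involves only attributes in L
E.g. Π_{fName, lName}(σ_{lName=‘Beech’}(Staff)) = σ_{lName=‘Beech’}(Π_{fName, lName}(Staff))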
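Commutativity of Selection and Projection can be illustrated the same way. In this sketch the predicate uses only lName, which is in the projection list, so the two orders agree. The sample rows, attribute names, and helper functions are assumptions made for the illustration.

```python
# Hypothetical sample of the Staff relation.
staff = [
    {"fName": "Ann", "lName": "Beech", "salary": 12000},
    {"fName": "John", "lName": "White", "salary": 30000},
]

def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    """Projection; duplicates are not removed, as the sample rows stay distinct."""
    return [{a: t[a] for a in attrs} for t in rel]

is_beech = lambda t: t["lName"] == "Beech"
attrs = ["fName", "lName"]

select_first = project(attrs, select(is_beech, staff))
project_first = select(is_beech, project(attrs, staff))
print(select_first == project_first)  # True
```

A predicate on salary would not commute with this Projection, because salary is no longer present once the Projection has been applied.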
5. Commutativity of Theta join (and Cartesian product).
R ⋈_p S = S ⋈_p R
R × S = S × R
E.g. Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch = Branch ⋈_{Staff.branchNo=Branch.branchNo} Staff
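Join commutativity can be demonstrated with a nested-loop join over two small relations. The sketch below is illustrative: the rows and the helper names theta_join and as_set are assumptions, and because both relations call the join attribute branchNo, the merged tuple keeps a single copy of it.

```python
# Hypothetical samples of the Staff and Branch relations.
staff = [
    {"staffNo": "S1", "branchNo": "B003"},
    {"staffNo": "S2", "branchNo": "B005"},
]
branch = [
    {"branchNo": "B003", "city": "Glasgow"},
    {"branchNo": "B005", "city": "London"},
    {"branchNo": "B007", "city": "Aberdeen"},
]

def theta_join(r, s, pred):
    """Nested-loop theta join; each result tuple merges the two inputs."""
    return [{**a, **b} for a in r for b in s if pred(a, b)]

same_branch = lambda x, y: x["branchNo"] == y["branchNo"]

def as_set(rel):
    """Compare relations as sets of tuples, ignoring row and attribute order."""
    return {frozenset(t.items()) for t in rel}

left = theta_join(staff, branch, same_branch)
right = theta_join(branch, staff, same_branch)
print(as_set(left) == as_set(right))  # True
```

The result is the same either way, but the cost need not be: with a nested-loop join, which relation is scanned in the outer loop can matter, so the optimizer is free to pick the cheaper order.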
Heuristical Processing Strategies
Perform Selection operations as early as possible.
Combine the Cartesian product with a subsequent Selection operation whose predicate represents a join condition into a Join operation.
Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive Selection operations are executed first.
Perform Projection operations as early as possible.
Compute common expressions once.
Heuristical Query optimization: Example
Consider the following tables:
Employee (Fname, Mname, Lname, Ssn, Bdate, Address, Gender, Salary, Superssn, Dno)
Project (Pname, Pnumber, Plocation, Dnum)
Works_On (Essn, Pno, Hours)
Query Q on these tables: find the last names of employees born after 1957 who work on a project named ‘Aquarius’.
This query can be specified in SQL as follows:
Q: SELECT Lname
   FROM EMPLOYEE, WORKS_ON, PROJECT
   WHERE Pname = 'Aquarius' AND Pnumber = Pno
     AND Essn = Ssn
     AND Bdate > '1957-12-31';
Simplified steps in converting a query tree during heuristic optimization:
1. Initial (canonical) query tree for SQL query Q.
2. Moving SELECT operations down the query tree.
3. Applying the more restrictive SELECT operation first.
4. Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.
5. Moving PROJECT operations down the query tree.
1. Initial (canonical) query tree for SQL query Q.
2. Moving SELECT operations down the query tree.
3. Applying the more restrictive SELECT operation first.
4. Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.
σ_{R.a=S.b}(R × S) = R ⋈_{R.a=S.b} S
5. Moving PROJECT operations down the query tree.
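The effect of these steps can be seen by running query Q in two ways on a tiny data set: once as the canonical plan (Cartesian products, then one Selection, then the Projection) and once with the Selections pushed down and the products replaced by joins. Everything in the sketch is an assumption made for illustration: the sample rows, the helper functions, and the use of ISO date strings so that string comparison orders dates correctly. Only the attributes needed by Q are included.

```python
from itertools import product

# Hypothetical sample data.
employee = [
    {"Lname": "Adams", "Ssn": "1", "Bdate": "1965-01-09"},
    {"Lname": "Baker", "Ssn": "2", "Bdate": "1955-12-08"},
    {"Lname": "Clark", "Ssn": "3", "Bdate": "1968-01-19"},
]
projects = [
    {"Pname": "Aquarius", "Pnumber": 10},
    {"Pname": "Gemini", "Pnumber": 20},
]
works_on = [
    {"Essn": "1", "Pno": 10},
    {"Essn": "2", "Pno": 10},
    {"Essn": "3", "Pno": 20},
]

def cross(r, s):
    """Cartesian product of two relations."""
    return [{**a, **b} for a, b in product(r, s)]

def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]

def join(r, s, pred):
    """Theta join: the predicate is tested on each merged tuple."""
    return [{**a, **b} for a in r for b in s if pred({**a, **b})]

# Canonical plan: product of all three relations, then one big Selection.
big = cross(cross(employee, works_on), projects)
matches = select(
    lambda t: t["Pname"] == "Aquarius" and t["Pnumber"] == t["Pno"]
    and t["Essn"] == t["Ssn"] and t["Bdate"] > "1957-12-31",
    big,
)
canonical = sorted({t["Lname"] for t in matches})

# Optimized plan: Selections first, most restrictive leaf first, joins instead of products.
aquarius = select(lambda t: t["Pname"] == "Aquarius", projects)
born_after_1957 = select(lambda t: t["Bdate"] > "1957-12-31", employee)
assigned = join(aquarius, works_on, lambda t: t["Pnumber"] == t["Pno"])
final = join(assigned, born_after_1957, lambda t: t["Essn"] == t["Ssn"])
optimized = sorted({t["Lname"] for t in final})

print(canonical, optimized)           # ['Adams'] ['Adams']
print(len(big), len(assigned))        # 18 2
```

Both plans return the same answer, but the canonical plan materializes 18 intermediate tuples even on this tiny input, while the largest intermediate result of the optimized plan has 2.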