Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
2 views

Chapter 12, 13 - Query Processing and Optimization

The document outlines the basic steps in query processing, focusing on optimization techniques for evaluating relational algebra expressions. It discusses generating equivalent expressions, cost-based query optimization, and various equivalence rules that can be applied to improve query performance. Additionally, it emphasizes the importance of choosing efficient evaluation plans based on statistical information and the potential cost differences between them.

Uploaded by

20yatripatel
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

Chapter 12, 13 - Query Processing and Optimization

The document outlines the basic steps in query processing, focusing on optimization techniques for evaluating relational algebra expressions. It discusses generating equivalent expressions, cost-based query optimization, and various equivalence rules that can be applied to improve query performance. Additionally, it emphasizes the importance of choosing efficient evaluation plans based on statistical information and the potential cost differences between them.

Uploaded by

20yatripatel
Copyright
© © All Rights Reserved
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 24

QUERY

PROCESSING AND
Book Referred: Korth
6th Edition, Chapter
12 & 13
OPTIMIZATION
DATABASE
MANAGEMEN
T SYSTEMS
INTERNALS
BASIC STEPS IN QUERY
PROCESSING
1. Parsing and translation
2. Optimization
3. Evaluation
BASIC STEPS IN QUERY
PROCESSING (CONT.)
BASIC STEPS IN QUERY
PROCESSING : OPTIMIZATION
A relational algebra expression may have many equivalent
expressions
 E.g., 
salary75000(salary(instructor)) is equivalent to
salary(salary75000(instructor))

Each relational algebra operation can be evaluated using


one of several different algorithms
 Correspondingly, a relational-algebra expression can be
evaluated in many ways.
Annotated expression specifying detailed evaluation strategy
is called an evaluation-plan.
 E.g., can use an index on salary to find instructors with
salary < 75000,
 or can perform complete relation scan and discard
instructors with salary  75000
INTRODUCTION TO
QUERY OPTIMIZATION
Alternative ways of evaluating a given query
 Equivalent expressions
 Different algorithms for each operation
Find the names of all instructors in the Music department together with the course title
of all the courses that the instructors teach.
π name, title (σdept name =“Music” (instructor ⋈ (teaches ⋈ πcourse_id, title (course))))
INTRODUCTION (CONT.)
An evaluation plan defines exactly what algorithm is used
for each operation, and how the execution of the operations is
coordinated.

 Find out how to view query execution plans on your favorite database
INTRODUCTION (CONT.)
Cost difference between evaluation plans for a query can be
enormous
 E.g. seconds vs. days in some cases

Steps in cost-based query optimization


1. Generate logically equivalent expressions using
equivalence rules
2. Annotate resultant expressions to get alternative query
plans
3. Choose the cheapest plan based on estimated cost
Estimation of plan cost based on:
 Statistical information about relations. Examples:
 number of tuples, number of distinct values for an
attribute
 Statistics estimation for intermediate results
 to compute cost of complex expressions
 Cost formulae for algorithms, computed using statistics
GENERATING
EQUIVALENT
EXPRESSIONS
TRANSFORMATION OF
RELATIONAL EXPRESSIONS
Two relational algebra expressions are said to be equivalent
if the two expressions generate the same set of tuples on
every legal database instance
 Note: order of tuples is irrelevant

In SQL, inputs and outputs are multisets of tuples


 Two expressions in the multiset version of the relational
algebra are said to be equivalent if the two expressions
generate the same multiset of tuples on every legal
database instance.
An equivalence rule says that expressions of two forms are
equivalent
 It is statement that enables us to find an equivalent
expression from the given expression.
 If on all legal instances of the database, both the expressions
provide the same result.
 Can replace expression of first form by second, or vice versa.
EQUIVALENCE RULES
1. Conjunctive selection operations can be deconstructed
into a sequence of individual selections.
Cascading of conditions 
2. Selection operations are commutative.

3. Only the last in a sequence of projection operations is


needed, the others can be omitted.
 L1 ( L2 (...( Ln ( E ))...))  L1 ( E )

4. Selections can be combined with Cartesian products


and theta joins.
a. (E1 X E2) = E1  E2
b. 1(E1 2 E 2) = E 1 1 2 E2
EQUIVALENCE RULES
(CONT.)
5.Theta-join operations (and natural joins) are
commutative.
E1  E2 = E2  E1
6. (a) Natural join operations are associative:
(E1 E 2) E3 = E1 (E2 E 3)

(b) Theta joins are associative in the following


manner:

(E1 1 E2) 2 3 E3 = E1 1 3 (E2 2 E 3)

where 2 involves attributes from only E2 and E3.


PICTORIAL DEPICTION OF
EQUIVALENCE RULES
EQUIVALENCE RULES
(CONT.)
7. The selection operation distributes over the theta join
operation under the following two conditions:
(a) When all the attributes in  involve only the attributes
of one of the expressions (E1) being joined.

E1  E2) = ((E1))  E2

(b) When 1 involves only the attributes of E1 and 2 involves


only the attributes of E2.
1 E1  E2) = (1(E1))  ( (E2))
EQUIVALENCE RULES
(CONT.)
8.The projection operation distributes over the theta join
operation as follows:
(a) If  involves only attributes from L1  L2:
L1 L2 ( E1  E2 )  ( L1 ( E1 ))  ( L2 ( E2 ))
EQUIVALENCE RULES
(CONT.)
9. The set operations union and intersection are commutative
E1  E2 = E2  E1
E1  E2 = E2  E1
 (set difference is not commutative).
10.Set union and intersection are associative.
(E1  E2)  E3 = E1  (E2  E3)
(E1  E2)  E3 = E1  (E2  E3)
11.The selection operation distributes over ,  and –.
 (E1 – E2) =  (E1) – (E2)
and similarly for  and  in place of –
Also:  (E1 – E2) = (E1) – E2
and similarly for  in place of –, but not for

12. The projection operation distributes over
union
L(E1  E2) = (L(E1))  (L(E2))
TRANSFORMATION EXAMPLE: PUSHING
SELECTIONS

Query: Find the names of all instructors in the Music


department, along with the titles of the courses that
they teach
 name, title(dept_name= “Music”
(instructor (teaches course_id, title
(course))))
Transformation using rule 7a.

 name, title((dept_name= “Music”(instructor))


(teaches course_id, title (course)))
instructor(ID, name, dept name, salary)
teaches(ID, course id, sec id, semester, year)
Performing the selection as early as possible reduces
course(course id, title, dept name, credits)
the size of the relation to be joined.
EXAMPLE WITH
MULTIPLE
TRANSFORMATIONS
Query: Find the names of all instructors in the Music
department who have taught a course in 2009, along with
the titles of the courses that they taught.
 name, title(dept_name= “Music”year = 2009
(instructor (teaches course_id, title (course))))

Transformation using join associatively (Rule 6a):


 name, title(dept_name= “Music”gear = 2009
((instructor teaches) course_id, title (course)))

Second form provides an opportunity to apply the


“perform selections early” rule, resulting in the
subexpression
dept_name = “Music” (instructor)  year = 2009 (teaches)
MULTIPLE
TRANSFORMATIONS
(CONT.)
TRANSFORMATION EXAMPLE: PUSHING
PROJECTIONS

Consider: name, title(dept_name= “Music” (instructor) teaches)


course_id, title (course))))
When we compute
(dept_name = “Music” (instructor teaches)

we obtain a relation whose schema is:


(ID, name, dept_name, salary, course_id, sec_id, semester,
year)
Push projections using equivalence rules 8a; eliminate
unneeded attributes from intermediate results to get:
name, title(name, course_id (
dept_name= “Music” (instructor) teaches))
course_id, title (course))))
Performing the projection as early as possible reduces the
size of the relation to be joined.
JOIN ORDERING
EXAMPLE
For all relations r1, r2, and r3,
(r1 r2) r3 = r1 (r2 r3 )
(Join Associativity)
If r2 r3 is quite large and r1 r2 is small, we choose

(r1 r2) r3
so that we compute and store a smaller temporary
relation.
JOIN ORDERING EXAMPLE
(CONT.)
Consider the expression
name, title(dept_name= “Music” (instructor) teaches)
course_id, title (course))))
Could compute teaches course_id, title (course) first, and
join result with
dept_name= “Music” (instructor)
but the result of the first join is likely to be a large relation.
Only a small fraction of the university’s instructors are likely
to be from the Music department
 It is better to compute

dept_name= “Music” (instructor) teaches


first.
END OF CHAPTER

You might also like