DE_Module5_QueryOptimization

The document provides an overview of the internal workings of Relational Database Management Systems (RDBMS), focusing on database structures, query processing, and optimization techniques. It explains the steps involved in query processing, including parsing, optimization, and evaluation, while detailing various operations and rules for efficient query execution. Additionally, it discusses cost-based and heuristic optimization algorithms, highlighting their importance in improving query performance.

Uploaded by

nayakasutosh85701


Module – V

INTERNALS OF RDBMS
Introduction:
A database (DB) is a collection of homogeneous sets of data, with relationships defined among them, stored in permanent memory and used by means of a DBMS, a piece of software that provides the following key features:
• A language for defining the database schema, the restrictions on allowable values of the data (integrity constraints), and the relationships among data sets.
• The data structures for the storage and efficient retrieval of large amounts of data in permanent memory.
• A language that allows authorized users to store and manipulate data.
• A transaction mechanism to protect data from hardware and software malfunctions and from unwanted interference during concurrent access by multiple users.
Terminologies:
o Query: A query is a request for information from a database.
o Query Plans: A query plan (or query execution plan) is an ordered set of steps used to access data in a SQL relational database management system.
o Query Optimization:
• A single query can be executed through different algorithms or rewritten in different forms and structures.
• The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.
• The goal of query optimization is to reduce the system resources required to fulfill a query, and ultimately provide the user with the correct result set faster.
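
To see a query plan in practice, the sketch below uses SQLite (through Python's standard sqlite3 module) as a stand-in RDBMS. The table account, its columns, the index name idx_balance and the sample rows are assumptions made for illustration only. EXPLAIN QUERY PLAN asks SQLite to report the plan it chose; the wording of that report depends on the SQLite version, so no particular output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT, branch_name TEXT, balance INTEGER)"
)
# Hypothetical index so the optimizer has more than one access path to consider.
conn.execute("CREATE INDEX idx_balance ON account (balance)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Downtown", 500), ("A-102", "Uptown", 3000), ("A-103", "Downtown", 1200)],
)

query = "SELECT balance FROM account WHERE balance < 2500"

# Ask the optimizer which plan it picked; each row describes one step of the plan.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row)

# Execute the query itself.
balances = sorted(b for (b,) in conn.execute(query))
print(balances)
```

Whichever plan is reported, the result set is the same; only the resources used to compute it differ.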
Query Processing:
• Query processing refers to the activities involved in answering a query, including translation of high-level language (HLL) queries into operations at the physical file level, query optimization transformations, and actual evaluation of queries.
A query expressed in a high-level query language such as SQL must first be scanned, parsed, and validated.

The steps involved in processing a query

• Parsing and translation
• Optimization
• Evaluation

Parsing and translation:

• First the given query is translated into its internal form.
• The parser checks the syntax of the user’s query, verifies that the relation names appearing in the query are names of relations in the database, and so on.
• The system constructs a parse-tree representation of the query, which it then translates into a relational-algebra expression.
Optimization:
A relational algebra expression may have many equivalent expressions.
• Example:
select balance from account where balance < 2500
• The relational algebra form is:
Π balance (σ balance<2500 (account))
or, equivalently,
σ balance<2500 (Π balance (account))

• Each relational algebra operation can be evaluated using one of several different algorithms.
• A sequence of primitive operations that can be used to evaluate a query is a query execution plan, or query-evaluation plan.

• An index (denoted in the figure as “index 1”) on balance has been used for the selection operation in order to find accounts with balance < 2500.
• Among all equivalent evaluation plans, the one with the lowest cost is chosen.
• Cost is estimated using statistical information from the database catalog, such as the number of tuples in each relation and the size of the tuples.
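
As a quick check that the two relational algebra forms of the example query really are equivalent, the following minimal sketch evaluates both on a small, made-up instance of account. The sample rows and the function names are assumptions for illustration; relations are modelled as Python sets because relational algebra works on sets of tuples.

```python
# Hypothetical sample of the account relation: (account_number, balance).
account = [("A-101", 500), ("A-102", 3000), ("A-103", 1200), ("A-104", 500)]

def select_then_project(rows):
    """Π balance (σ balance<2500 (account)): filter rows, then keep balance."""
    selected = [r for r in rows if r[1] < 2500]
    return {balance for (_, balance) in selected}

def project_then_select(rows):
    """σ balance<2500 (Π balance (account)): keep balance, then filter."""
    projected = {balance for (_, balance) in rows}
    return {b for b in projected if b < 2500}

plan_a = select_then_project(account)
plan_b = project_then_select(account)
print(plan_a == plan_b)  # both plans produce the same set of balances
```

The two plans agree on the result but not on the work done: the first filters four tuples, while the second filters only the three distinct balances left after projection.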
Evaluation:
The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers
to the query.

Query Optimization
Strictly speaking, optimization means choosing the best of all possible options. A query optimizer does not consider every possible option, so query optimization is better described as query improvement. It is a function of many RDBMSs in which multiple query plans are examined and a good query plan is identified. The approaches are:
i. Rewriting the query in a more efficient form, and
ii. Estimating the cost of various execution strategies for the query.
The system first translates the query into its internal form. Optimization then begins by finding an equivalent expression that is more efficient, after which the optimizer selects a detailed strategy for processing the query. The final choice of strategy is based on the number of disk accesses required.

Equivalent Expressions
• The first step is to find a relational algebra expression that is equivalent to the given query and is efficient to execute.
• Two relational algebra expressions are said to be equivalent if they generate the same set of tuples on every legal database instance.

i. Selection Operation
Rules for optimization are
a. Perform select operation as early as possible.
b. Conjunctive selection operations can be deconstructed into a sequence of individual selections. This is called a sigma-cascade.

Replace σ P1 ∧ P2 (e) by σ P1 (σ P2 (e)),
where P1 and P2 are predicates and e is a relational algebra expression.
σ P1 ∧ P2 (e) = σ P1 (σ P2 (e)) = σ P2 (σ P1 (e))

• Selection operations are commutative.
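
The sigma-cascade and the commutativity of selection can be checked on a toy relation. In this sketch the employees rows, the helper select and the predicates p1 and p2 are all hypothetical names chosen for illustration.

```python
# Hypothetical relation; each tuple is a dict of attribute values.
employees = [
    {"name": "Asha", "age": 31, "sal": 120000},
    {"name": "Ravi", "age": 45, "sal": 90000},
    {"name": "Meena", "age": 28, "sal": 150000},
]

def select(pred, rows):
    """σ pred (rows): keep the tuples that satisfy the predicate."""
    return [r for r in rows if pred(r)]

p1 = lambda r: r["sal"] > 100000   # P1
p2 = lambda r: r["age"] < 30       # P2

combined = select(lambda r: p1(r) and p2(r), employees)   # σ P1 ∧ P2 (e)
cascade_12 = select(p1, select(p2, employees))            # σ P1 (σ P2 (e))
cascade_21 = select(p2, select(p1, employees))            # σ P2 (σ P1 (e))

print(combined == cascade_12 == cascade_21)
print([r["name"] for r in combined])
```

Because the order is free, an optimizer would normally apply the more restrictive (or cheaper) predicate first so that the second one sees fewer tuples.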

ii. Project Operation

Projections reduce the size of relations, so the rules are
- Apply projections early.
- Only the last in a sequence of projection operations is needed; the others can be omitted. This is called a pi-cascade.

Selections can be combined with Cartesian products and theta joins.
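
In the usual textbook notation this rule reads σ θ (E1 × E2) = E1 ⋈θ E2, where E1, E2 and θ are symbols introduced here rather than taken from the notes. The sketch below, on made-up emp and dept tuples, compares selecting over a fully built Cartesian product with testing the predicate while the pairs are formed.

```python
from itertools import product

# Hypothetical relations as lists of tuples.
emp = [("Asha", 10), ("Ravi", 20), ("Meena", 10)]    # (name, dno)
dept = [(10, "Sales"), (20, "HR"), (30, "Ops")]      # (dno, dname)

theta = lambda e, d: e[1] == d[0]   # join predicate: emp.dno = dept.dno

# σ theta (emp × dept): build the whole product first, then filter it.
cartesian = list(product(emp, dept))
select_over_product = [e + d for (e, d) in cartesian if theta(e, d)]

# emp ⋈ theta dept: test the predicate while pairing, never storing the product.
theta_join = [e + d for e in emp for d in dept if theta(e, d)]

print(len(cartesian), len(theta_join))
print(select_over_product == theta_join)
```

The results are identical, but the first form stores every pair of tuples as an intermediate result, which is why the optimizer prefers to fold the selection into the join.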

iii. Natural Join Operation

Rules for conversion are

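a. Choose an optimal ordering of the natural join operations.
Since natural join is associative,
( R1 ⋈ R2 ) ⋈ R3 = R1 ⋈ ( R2 ⋈ R3 ) (the result is the same, but the cost of computing it may differ)

Similarly, natural join is commutative:

( R1 ⋈ R2 ) = ( R2 ⋈ R1 )
b. Choose an optimal ordering of the theta join operations.
Theta join is associative in the following sense:

( R1 ⋈θ1 R2 ) ⋈θ2∧θ3 R3 = R1 ⋈θ1∧θ3 ( R2 ⋈θ2 R3 ), where θ2 involves attributes of R2 and R3 only

Similarly, theta join is commutative:

( R1 ⋈θ R2 ) = ( R2 ⋈θ R1 )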
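
Why the ordering matters even though the result does not change can be seen from the sizes of the intermediate results. This is a minimal sketch on three made-up relations R1(a, b), R2(b, c) and R3(c, d); the helper names natural_join and as_set are assumptions for illustration.

```python
def natural_join(r, s):
    """Join two relations (lists of dicts) on all attribute names they share."""
    out = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                out.append({**a, **b})
    return out

def as_set(rel):
    """Order-independent view of a relation, used only for comparison."""
    return {frozenset(t.items()) for t in rel}

# Hypothetical instances.
R1 = [{"a": 1, "b": 1}, {"a": 2, "b": 1}, {"a": 3, "b": 2}]
R2 = [{"b": 1, "c": 1}, {"b": 1, "c": 2}, {"b": 2, "c": 3}]
R3 = [{"c": 3, "d": 9}]

r1_r2 = natural_join(R1, R2)       # intermediate result of (R1 ⋈ R2) ⋈ R3
left = natural_join(r1_r2, R3)

r2_r3 = natural_join(R2, R3)       # intermediate result of R1 ⋈ (R2 ⋈ R3)
right = natural_join(R1, r2_r3)

print(as_set(left) == as_set(right))   # associativity: same final result
print(len(r1_r2), len(r2_r3))          # but very different intermediate sizes
print(as_set(natural_join(R1, R2)) == as_set(natural_join(R2, R1)))  # commutativity
```

On this data the first order builds a five-tuple intermediate relation and the second builds only one tuple, which is the kind of difference the optimizer tries to exploit.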

iv. Union and Intersection are commutative.

( R1 ∪ R2 ) = ( R2 ∪ R1 )
( R1 ∩ R2 ) = ( R2 ∩ R1 )

v. Union and Intersection are associative.

( R1 ∪ R2 ) ∪ R3 = R1 ∪ ( R2 ∪ R3 )
( R1 ∩ R2 ) ∩ R3 = R1 ∩ ( R2 ∩ R3 )

vi. Other Operations

Selection operation distributes over the union, intersection, and difference operations.
a. σ P ( R1 ∪ R2 ) = σ P ( R1 ) ∪ σ P ( R2 )
b. σ P ( R1 - R2 ) = σ P ( R1 ) - σ P ( R2 )
c. σ P ( R1 ∩ R2 ) = σ P ( R1 ) ∩ σ P ( R2 )

vii. Projection operation distributes over the union operation.

Π L ( R1 ∪ R2 ) = Π L ( R1 ) ∪ Π L ( R2 )

Π A1, A2 ( σ c ( R ) ) = σ c ( Π A1, A2 ( R ) ), if c involves only A1 and A2.
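
These distribution rules can be tried out with Python sets standing in for relations. The two relations, the predicate P and the helpers select and project are hypothetical. The last line checks projection over set difference, which is not among the rules listed, and finds that it fails on this data.

```python
# Hypothetical relations with the same schema, modelled as sets of tuples.
R1 = {(1, "x"), (2, "y"), (3, "z")}
R2 = {(2, "y"), (3, "w"), (4, "x")}

def select(pred, rel):
    """σ pred (rel)."""
    return {t for t in rel if pred(t)}

def project(indexes, rel):
    """Π over the given column positions; duplicates vanish because rel is a set."""
    return {tuple(t[i] for i in indexes) for t in rel}

P = lambda t: t[0] >= 2

union_ok = select(P, R1 | R2) == select(P, R1) | select(P, R2)
diff_ok = select(P, R1 - R2) == select(P, R1) - select(P, R2)
inter_ok = select(P, R1 & R2) == select(P, R1) & select(P, R2)
proj_union_ok = project([1], R1 | R2) == project([1], R1) | project([1], R2)

# Projection over difference is NOT one of the rules; this data is a counterexample.
proj_diff_ok = project([1], R1 - R2) == project([1], R1) - project([1], R2)

print(union_ok, diff_ok, inter_ok, proj_union_ok, proj_diff_ok)
```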

Heuristic Rule:
The heuristic rule is to apply select (σ) and project (Π) operations before applying the join (⋈) or other binary operations.

Example:
instructor(ID, name, dept_name, salary)
teaches(ID, course_id, sec_id, semester, year)
course(course_id, title, dept_name, credits)

Query 1: Find the names of all instructors in the “Physics” department, along with the titles of the
courses that they teach.

Optimized Query:

Query 2: Find the names of all instructors in the “CSE” department who have taught a course in 2009, along
with the titles of the courses that they taught.

Optimized Query:
By using join associativity and then the rule “perform selection early”:
Π name, title ((σ dept_name=“CSE” (instructor) ⋈ σ year=2009 (teaches)) ⋈ Π course_id, title (course))
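
The effect of pushing the selections down for Query 2 (CSE department, year 2009) can be sketched on a tiny, invented database instance. All rows below and the helpers natural_join, select and project are assumptions for illustration; course is projected onto course_id and title first so that its dept_name attribute does not take part in the natural join.

```python
def natural_join(r, s):
    """Join two lists of dicts on every attribute name they share."""
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def select(pred, rel):
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    """Keep only attrs and drop duplicate tuples."""
    distinct = {tuple(t[a] for a in attrs) for t in rel}
    return [dict(zip(attrs, values)) for values in sorted(distinct)]

# Hypothetical data following the three schemas in the example.
instructor = [
    {"ID": 1, "name": "Sen", "dept_name": "CSE", "salary": 90000},
    {"ID": 2, "name": "Rao", "dept_name": "Physics", "salary": 80000},
    {"ID": 3, "name": "Das", "dept_name": "CSE", "salary": 85000},
]
teaches = [
    {"ID": 1, "course_id": "CS101", "sec_id": 1, "semester": "Fall", "year": 2009},
    {"ID": 1, "course_id": "CS201", "sec_id": 1, "semester": "Spring", "year": 2010},
    {"ID": 2, "course_id": "PH101", "sec_id": 1, "semester": "Fall", "year": 2009},
    {"ID": 3, "course_id": "CS201", "sec_id": 1, "semester": "Fall", "year": 2009},
]
course = [
    {"course_id": "CS101", "title": "Programming", "dept_name": "CSE", "credits": 4},
    {"course_id": "CS201", "title": "Databases", "dept_name": "CSE", "credits": 4},
    {"course_id": "PH101", "title": "Mechanics", "dept_name": "Physics", "credits": 3},
]
course_titles = project(["course_id", "title"], course)

# Unoptimized: join everything, then apply the selection.
all_joined = natural_join(instructor, natural_join(teaches, course_titles))
naive = project(["name", "title"], select(
    lambda t: t["dept_name"] == "CSE" and t["year"] == 2009, all_joined))

# Optimized: select on each base relation first, then join the smaller inputs.
cse = select(lambda t: t["dept_name"] == "CSE", instructor)
in_2009 = select(lambda t: t["year"] == 2009, teaches)
small = natural_join(cse, in_2009)
optimized = project(["name", "title"], natural_join(small, course_titles))

print(naive == optimized, len(all_joined), len(small))
```

Both plans return the same name and title pairs, but the optimized plan joins two tuples with course where the unoptimized plan carries four joined tuples into the selection.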

Query tree
• It is a tree data structure that corresponds to a relational algebra expression.
• It represents the input relations of the query as leaf nodes and the relational algebra operations as internal nodes.
• Execution consists of executing an internal node operation whenever its operands are available and then replacing that node with the resulting relation.
• The heuristic optimizer transforms the initial query tree into a final query tree that is efficient to execute.
• It does so by applying the equivalence rules to the initial tree.
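
A query tree and its bottom-up execution can be sketched in a few lines. The Node class, the evaluate function and the emp and dept rows are hypothetical; the sketch supports only selection, projection and natural join, and it builds an initial tree and a transformed tree by hand instead of implementing an optimizer.

```python
from dataclasses import dataclass

@dataclass
class Node:
    op: str                 # "relation", "select", "project" or "join"
    children: tuple = ()    # operand subtrees (empty for a leaf)
    arg: object = None      # rows, predicate, or attribute list

def evaluate(node):
    """Evaluate the operands first, then replace the node by its result relation."""
    inputs = [evaluate(c) for c in node.children]
    if node.op == "relation":
        return node.arg
    if node.op == "select":
        return [t for t in inputs[0] if node.arg(t)]
    if node.op == "project":
        seen = {tuple(t[a] for a in node.arg) for t in inputs[0]}
        return [dict(zip(node.arg, vals)) for vals in sorted(seen)]
    if node.op == "join":
        left, right = inputs
        return [{**a, **b} for a in left for b in right
                if all(a[k] == b[k] for k in a.keys() & b.keys())]
    raise ValueError(f"unknown operation: {node.op}")

# Hypothetical leaf relations.
emp = Node("relation", arg=[{"name": "Asha", "dno": 10}, {"name": "Ravi", "dno": 20}])
dept = Node("relation", arg=[{"dno": 10, "dname": "Sales"}, {"dno": 20, "dname": "HR"}])
is_sales = lambda t: t["dname"] == "Sales"

# Initial tree: Π name, dname (σ dname="Sales" (emp ⋈ dept))
initial = Node("project", (Node("select", (Node("join", (emp, dept)),), is_sales),),
               ["name", "dname"])
# Transformed tree: Π name, dname (emp ⋈ σ dname="Sales" (dept))
final = Node("project", (Node("join", (emp, Node("select", (dept,), is_sales))),),
             ["name", "dname"])

print(evaluate(initial) == evaluate(final))
```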

Examples of Transformations
Branch-schema = (branch-name, branch-city, assets)
Account-schema = (account-number, branch-name, balance)
Depositor-schema = (customer-name, account-number)

Query 1: Display the names of customers having an account in “BBSR” city.

Π customer-name (σ branch-city=“BBSR” (branch ⋈ (account ⋈ depositor)))
Optimized Query:
Π customer-name ((σ branch-city=“BBSR” (branch)) ⋈ (account ⋈ depositor))

Query 2: Display the names of customers having an account in “BBSR” city and a balance of more than 1000.
Π customer-name (σ branch-city=“BBSR” ∧ balance > 1000 (branch ⋈ (account ⋈ depositor)))
Optimized Query:
Π customer-name ((σ branch-city=“BBSR” (branch)) ⋈ ((σ balance > 1000 (account)) ⋈ depositor))

Query optimization algorithms:
• Several different algorithms can be used for each relational operation, giving rise to alternative evaluation plans.
• Hash join is usually the best choice when large, unsorted, non-indexed inputs are to be joined.
• When no other join algorithm is favoured (for example, because the inputs are neither sorted nor indexed), hash join is used.
• If both join inputs are large and of similar size, a merge join with prior sorting can give better results.
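
The two join algorithms named above can be sketched as in-memory functions. This is an illustrative simplification that assumes both inputs fit in memory (real systems partition to disk), and the function names and sample rows are hypothetical.

```python
from collections import defaultdict

def hash_join(r, s, key):
    """Build a hash table on r, then probe it once for each tuple of s."""
    buckets = defaultdict(list)
    for a in r:
        buckets[a[key]].append(a)
    return [{**a, **b} for b in s for a in buckets.get(b[key], [])]

def merge_join(r, s, key):
    """Sort both inputs on the key, then merge them in a single pass."""
    r = sorted(r, key=lambda t: t[key])
    s = sorted(s, key=lambda t: t[key])
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][key] < s[j][key]:
            i += 1
        elif r[i][key] > s[j][key]:
            j += 1
        else:
            # Find the run of equal keys on each side and pair the runs up.
            k = r[i][key]
            i_end = i
            while i_end < len(r) and r[i_end][key] == k:
                i_end += 1
            j_end = j
            while j_end < len(s) and s[j_end][key] == k:
                j_end += 1
            out.extend({**a, **b} for a in r[i:i_end] for b in s[j:j_end])
            i, j = i_end, j_end
    return out

# Hypothetical inputs.
account = [{"account_number": "A-1", "balance": 500},
           {"account_number": "A-2", "balance": 3000}]
depositor = [{"customer_name": "Asha", "account_number": "A-2"},
             {"customer_name": "Ravi", "account_number": "A-1"},
             {"customer_name": "Meena", "account_number": "A-2"}]

hashed = hash_join(account, depositor, "account_number")
merged = merge_join(account, depositor, "account_number")
same = ({frozenset(t.items()) for t in hashed}
        == {frozenset(t.items()) for t in merged})
print(same, len(hashed))
```

Hash join needs no ordering of its inputs, while merge join pays for sorting up front but then reads each sorted input only once.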

Evaluation of an expression containing multiple operations:
• An expression with multiple operations can be evaluated broadly in two different ways: materialization and pipelining.
• In materialization, the operations are evaluated one at a time, and the result of each evaluation is materialized (stored) in a temporary relation for subsequent use. This should not be confused with a materialized view, which is a view whose contents are computed from its definition and stored.
• A disadvantage of this approach is the need to construct the temporary relations, which must be written to disk.
• An alternative approach is to evaluate several operations simultaneously in a pipeline, with the results of one operation passed on to the next, without the need to store a temporary relation.
• Pipelining is the approach of sending the output of one computation as the input to the next computation.
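
Python generators give a convenient way to picture the difference between the two approaches. In this sketch the account rows and all function names are hypothetical: the first function stores its intermediate result in full, while the generator version hands each tuple to the next operator as soon as it is produced.

```python
# Hypothetical account relation: (account_number, balance).
account = [("A-101", 500), ("A-102", 3000), ("A-103", 1200)]

def materialized(rows):
    """Evaluate one operation at a time, storing the temporary relation in full."""
    temp = [r for r in rows if r[1] < 2500]      # σ balance<2500, materialized
    return [balance for (_, balance) in temp]    # Π balance over the stored result

def scan(rows):
    for r in rows:
        yield r

def select_lt(rows, limit):
    for r in rows:
        if r[1] < limit:
            yield r          # passed on immediately, nothing is stored

def project_balance(rows):
    for (_, balance) in rows:
        yield balance

# Operators are chained; tuples flow through one at a time when the result is consumed.
pipeline = project_balance(select_lt(scan(account), 2500))
pipelined = list(pipeline)

print(materialized(account) == pipelined)
```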

Practical query optimizers incorporate elements of the following two broad approaches:
• Search all the plans and choose the best plan in a cost-based fashion.
• Use heuristics to choose a plan.

Cost-Based Optimization Algorithm:

• A cost-based optimizer generates a range of query-evaluation plans from the given query by using the equivalence rules, and chooses the one with the least cost.
Example:
• Suppose we want to find the best join order for r1 ⋈ r2 ⋈ … ⋈ rn.

• If n = 3, then 12 join orderings can be formed.

• To find the best join order by brute force, we would have to compute the cost of every possible join order.
• There are (2(n − 1))!/(n − 1)! different join orders for n relations.
• For n = 7, the number is 665280, so it is impractical to compute the cost of all possible orders.
• Dynamic programming avoids computing the cost of every join order: the least-cost join order for any subset of {r1, r2, …, rn} is computed only once and stored for future use.
• The time complexity of the dynamic programming algorithm is O(3^n) and its space complexity is O(2^n).
• Cost-based optimization is expensive, but worthwhile for queries on large datasets.
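
The join-order count and the dynamic programming idea can both be sketched briefly. The cost model here is an assumption made for illustration (the cost of a join is the estimated size of its result, and sizes come from made-up cardinalities and selectivities); real optimizers use far richer statistics. The dictionary best plays the role of the stored least-cost order for each subset.

```python
from itertools import combinations
from math import factorial

def join_order_count(n):
    """Number of join orders for n relations: (2(n-1))! / (n-1)!"""
    return factorial(2 * (n - 1)) // factorial(n - 1)

def best_join_order(card, sel):
    """card: {name: cardinality}; sel: {frozenset({a, b}): selectivity}, default 1.0."""
    names = sorted(card)

    def size(subset):
        # Assumed estimate: product of cardinalities times pairwise selectivities.
        result = 1.0
        for r in subset:
            result *= card[r]
        for a, b in combinations(sorted(subset), 2):
            result *= sel.get(frozenset((a, b)), 1.0)
        return result

    # best[subset] = (cost, plan); each subset is solved once and then reused.
    best = {frozenset([r]): (0.0, r) for r in names}
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            subset = frozenset(combo)
            for i in range(1, k):
                for left_combo in combinations(combo, i):
                    left = frozenset(left_combo)
                    right = subset - left
                    cost = best[left][0] + best[right][0] + size(subset)
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, (best[left][1], best[right][1]))
    return best[frozenset(names)]

print(join_order_count(3), join_order_count(7))

# Hypothetical statistics for three relations.
card = {"r1": 1000, "r2": 100, "r3": 10}
sel = {frozenset(("r1", "r2")): 0.01, frozenset(("r2", "r3")): 0.1}
cost, plan = best_join_order(card, sel)
print(plan)
```

Trying every split of every subset is what gives the O(3^n) running time, and keeping one entry per subset gives the O(2^n) space. On these statistics the cheapest plan joins r2 with r3 first, because that intermediate result is the smallest.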

Heuristic Optimization Algorithm:

• Cost-based optimization is expensive, even with dynamic programming.
• Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.
• Heuristic optimization transforms the query tree by using a set of rules that typically improve execution performance:
o Perform selection early (reduces the number of tuples).
o Perform projection early (reduces the number of attributes).
o Perform the most restrictive selection and join operations (i.e. those with the smallest result size) before other similar operations.
o Some systems use only heuristics, whereas others combine heuristics with partial cost-based optimization.
o The heuristic rule is to apply select and project operations before applying the join (⋈) or other binary operations.

Example:
Let emp(name, age, sal, dno)
dept(dno, dname, floor, mgr, ano)
Question: Display the name and departmental floor of employees earning a salary of more than 100k.
Ans: select name, floor from emp, dept where emp.dno = dept.dno and sal > 100000
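
The query can be run as written, and compared against a hand-built heuristic plan, using SQLite through Python's sqlite3 module. SQLite, the sample rows and the variable names are assumptions for illustration, and the hand-built plan further assumes that dno is unique in dept. The plan applies the selection sal > 100000 and the projections first, and only then joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp  (name TEXT, age INTEGER, sal INTEGER, dno INTEGER);
    CREATE TABLE dept (dno INTEGER, dname TEXT, floor INTEGER, mgr TEXT, ano INTEGER);
    INSERT INTO emp VALUES ('Asha', 31, 120000, 10),
                           ('Ravi', 45, 90000, 20),
                           ('Meena', 28, 150000, 20);
    INSERT INTO dept VALUES (10, 'Sales', 1, 'Sen', 100),
                            (20, 'HR', 2, 'Rao', 200);
""")

# The SQL answer, left to the DBMS to optimize.
sql_rows = conn.execute(
    "SELECT name, floor FROM emp, dept WHERE emp.dno = dept.dno AND sal > 100000"
).fetchall()

# Heuristic plan by hand: Π name, floor (Π name, dno (σ sal>100000 (emp)) ⋈ Π dno, floor (dept))
emp_rows = conn.execute("SELECT name, sal, dno FROM emp").fetchall()
dept_rows = conn.execute("SELECT dno, floor FROM dept").fetchall()
high_paid = [(name, dno) for (name, sal, dno) in emp_rows if sal > 100000]  # select, project
floor_of = dict(dept_rows)        # dno -> floor (assumes dno is unique in dept)
manual_rows = [(name, floor_of[dno]) for (name, dno) in high_paid if dno in floor_of]

print(sorted(sql_rows) == sorted(manual_rows))
```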



Example 2:
