Query Processing and Optimization: Dessalegn Mequanint

The document summarizes query processing and optimization. It discusses how a DBMS executes queries by scanning, parsing, validating, and evaluating queries to access and present data. It describes the main steps in query processing as scanning, parsing, validation, generating a query tree, and obtaining results by traversing the tree. The document also discusses relational operations, selection, projection, joins, and how query optimization improves performance by choosing better execution plans.

Query Processing and Optimization
Dessalegn Mequanint
Overview
• Querying, and algorithms to evaluate queries
• Declarative query versus Algebraic query
How does the DBMS execute queries?
• A DBMS scans an input query, parses, validates, and
evaluates it by accessing the actual data, finally
presenting the results.
• Example: Given the relations:
– EMPLOYEE(FNAME, MINIT, LNAME, SSN, BDATE,
ADDRESS, SEX, SALARY, SUPERSSN, DNO)
WORKS_ON(ESSN, PNO, HOURS)
– get the names of employees who work on project No. 3:
• SELECT EMPLOYEE.LNAME FROM EMPLOYEE, WORKS_ON
WHERE EMPLOYEE.SSN = WORKS_ON.ESSN AND
WORKS_ON.PNO = 3;
Query Processing
• Scanner: identifies the tokens (language components).
– In the above example SELECT, FROM, and so on are all
tokens.
• Parser: verifies the query syntax to make sure the
syntax rules are obeyed.
• Validation: checks that all attribute and relation
names are valid and semantically meaningful.
– SELECT EMPLOYEE.ESSN FROM ... would be invalid since
ESSN does not exist in the EMPLOYEE relation.
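The scanning and validation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token pattern, the CATALOG dictionary and the function names scan and validate are hypothetical, the parser step (syntax checking) is skipped, and a real DBMS does considerably more.

```python
import re

# Hypothetical token pattern: identifiers/keywords, integers, punctuation.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*|\d+|[=.,;*]")

# Hypothetical catalog built from the two relations in the example.
CATALOG = {
    "EMPLOYEE": {"FNAME", "MINIT", "LNAME", "SSN", "BDATE",
                 "ADDRESS", "SEX", "SALARY", "SUPERSSN", "DNO"},
    "WORKS_ON": {"ESSN", "PNO", "HOURS"},
}

def scan(query):
    """Scanner: split the query text into tokens."""
    return TOKEN_RE.findall(query)

def validate(tokens):
    """Validation: report every REL.ATTR reference missing from the catalog."""
    errors = []
    for i in range(len(tokens) - 2):
        rel, dot, attr = tokens[i:i + 3]
        if dot == "." and (rel not in CATALOG or attr not in CATALOG[rel]):
            errors.append(f"{rel}.{attr}")
    return errors

good = scan("SELECT EMPLOYEE.LNAME FROM EMPLOYEE, WORKS_ON "
            "WHERE EMPLOYEE.SSN = WORKS_ON.ESSN AND WORKS_ON.PNO = 3;")
bad = scan("SELECT EMPLOYEE.ESSN FROM EMPLOYEE")

good_errors = validate(good)  # no invalid references
bad_errors = validate(bad)    # EMPLOYEE.ESSN is rejected
```

Only qualified references of the form REL.ATTR are checked here; unqualified attribute names would need the FROM list to be resolved first.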
Query Processing ...
• Generate a query tree: the internal representation of the
query is usually a tree or graph, constructed from
bottom to top.

Project LNAME attribute
|
Filter for PNO = 3
|
Join on ESSN = SSN
/ \
Table WORKS_ON Table EMPLOYEE

• Results are obtained by going through the steps in the tree,
from the leaves up to the root.
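As a rough sketch of how the tree is evaluated bottom-up, the Python below runs the join first, then the filter, then the projection. The sample rows and the helper names (join, select, project, evaluate_tree) are made up for illustration and show only the attributes the query needs.

```python
# Hypothetical sample data; only the attributes used by the query are shown.
EMPLOYEE = [
    {"LNAME": "Smith", "SSN": "111"},
    {"LNAME": "Wong", "SSN": "222"},
    {"LNAME": "Zelaya", "SSN": "333"},
]
WORKS_ON = [
    {"ESSN": "111", "PNO": 3},
    {"ESSN": "222", "PNO": 1},
    {"ESSN": "333", "PNO": 3},
]

def join(left, right, left_key, right_key):
    """Join on left_key = right_key using simple nested loops."""
    return [{**l, **r} for l in left for r in right
            if l[left_key] == r[right_key]]

def select(rows, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Keep only attrs; duplicates are dropped, first occurrence wins."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(key)
    return out

def evaluate_tree():
    """Walk the query tree from the leaves to the root."""
    joined = join(WORKS_ON, EMPLOYEE, "ESSN", "SSN")     # Join on ESSN = SSN
    filtered = select(joined, lambda r: r["PNO"] == 3)   # Filter for PNO = 3
    return project(filtered, ["LNAME"])                  # Project LNAME

result = evaluate_tree()
```

With this sample data the result holds the last names Smith and Zelaya.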
Relational Operations
• There are methods for carrying out all relational
operations:
– Selection ( σ)—selects a subset of rows from relation
– Projection ( π)—deletes unwanted columns from
relation
– Set-difference ( −)—tuples in relation 1, but not in
relation 2
– Union ( ∪)—tuples in relation 1 together with tuples in
relation 2
– Aggregation (such as SUM, MIN, etc.) and group by
Selection ( σ)
• Selects rows that satisfy selection condition.
• No duplicates in result! (Why?)
• Schema of result identical to schema of (only)
input relation
• Result relation can be the input for another
relational algebra operation! (Operator
composition.)
Selection ( σ)…
• σAcc-no>300(BOOK) =
Acc-No
400
500

• σTitle="DBMS"(BOOK) =
Title
DBMS
DBMS
Selection ( σ)…
• σ <Cond1> and <Cond2> and ….
• σ <Cond1> or <Cond2> and ….
• σ <Cond1> or <Cond2> or ….
Projection ( π)
• Deletes attributes that are not in projection list.
• Schema of result contains exactly the fields in
the projection list, with the same names that
they had in the (only) input relation. (Unary
operation.)
• Projection operator has to eliminate duplicates!
– as it returns a relation which is a set
Projection ( π)…
• πTitle(BOOK)
• πIDNo, FName(Student)
• πEmpno, Fname, Sname, Salary(Employee)
Nesting Selection in Projection
• πAcc-no (σTitle="DBMS" (BOOK))
• πIDNo, FName(σCGPA>=3.25 (Student))
• πEmpno, Fname, Sname, Salary(σSalary>=10000 (Employee))
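Selection, projection and their nesting can be sketched with Python sets, which model a relation as a set of tuples. The BOOK rows below are invented for illustration, and select and project are hypothetical helper names.

```python
from typing import NamedTuple

class Book(NamedTuple):
    acc_no: int
    title: str

# Hypothetical BOOK relation; a relation is a set, so it has no duplicates.
BOOK = {Book(100, "OS"), Book(400, "DBMS"), Book(500, "DBMS")}

def select(relation, condition):
    """Selection: keep the tuples that satisfy the condition."""
    return {t for t in relation if condition(t)}

def project(relation, attrs):
    """Projection: keep only attrs; the set drops duplicate rows."""
    return {tuple(getattr(t, a) for a in attrs) for t in relation}

# Selection with a single condition and with a conjunctive condition.
recent = select(BOOK, lambda b: b.acc_no > 300)
recent_dbms = select(BOOK, lambda b: b.acc_no > 300 and b.title == "DBMS")

# Projection on Title: two DBMS rows collapse into one.
titles = project(BOOK, ["title"])

# Nesting selection in projection: accession numbers of the DBMS books.
dbms_acc_nos = project(select(BOOK, lambda b: b.title == "DBMS"), ["acc_no"])
```

Selection cannot introduce duplicates because it only removes whole tuples from a set, whereas projection can make two distinct tuples identical, which is why it has to eliminate duplicates.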
Equality Joins With One Join Column
• Three forms of outer join:
– Left outer join (⟕): tuples from the left relation that have no match
in the natural join are also added to the result, with null values
in the fields of the right relation.
– Right outer join (⟖): tuples from the right relation that have no match
in the natural join are also added to the result, with null values
in the fields of the left relation.
– Full outer join (⟗): unmatched tuples from both relations are added
to the result, padded with null values.
• select * from employee e1, department d1 where e1.did =
d1.did
• In algebra: R ⋈ S. It is so commonly used that it must be
carefully optimised. R ×S is large; so, R ×S followed by a
selection is inefficient.
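The difference between R × S followed by a selection and a direct equality join can be sketched as follows. The rows are invented, the hash join is one possible join algorithm chosen for illustration (the slides do not prescribe one), and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: employee(name, did) and department(did, dname).
employee = [("Abebe", 1), ("Sara", 2), ("Kebede", 1), ("Lily", 9)]
department = [(1, "Research"), (2, "Sales"), (3, "Admin")]

def product_then_select(r, s):
    """R x S followed by a selection: builds |R| * |S| intermediate pairs."""
    pairs = [(e, d) for e in r for d in s]
    return [e + d for e, d in pairs if e[1] == d[0]], len(pairs)

def build_index(s):
    """Group the department tuples by their did value."""
    index = defaultdict(list)
    for d in s:
        index[d[0]].append(d)
    return index

def hash_join(r, s):
    """Equality join on did without materialising the product."""
    index = build_index(s)
    return [e + d for e in r for d in index.get(e[1], [])]

def left_outer_join(r, s):
    """Like hash_join, but unmatched left tuples are padded with None."""
    index = build_index(s)
    out = []
    for e in r:
        matches = index.get(e[1])
        if matches:
            out.extend(e + d for d in matches)
        else:
            out.append(e + (None, None))
    return out

slow_result, intermediate_size = product_then_select(employee, department)
fast_result = hash_join(employee, department)
outer_result = left_outer_join(employee, department)
```

Both strategies return the same three rows, but the first one materialises 12 intermediate pairs to do so; the left outer join additionally keeps Lily, whose did has no matching department.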
Query Optimization
• DBMS Architecture
Query Optimization…
• Optimiser Architecture
Benefits of Query Optimisation
• We know how to evaluate queries.
– So why is there a need to optimise?
• The query language is declarative: the user specifies what result is
required, not how to obtain it.
• Because the user does not specify the details of how to obtain the
required result, there is opportunity for query optimisation.
• Query optimisation is necessary for high-level relational languages.
• The term "optimal solution" is often used for the best solution obtained.
– Under the given constraints, the cost of finding the true optimal
solution may be too high, so we often settle for non-optimal
solutions.
Advantages of Having a Query Optimiser
• The optimiser can take advantage of information not
available to the programmer such as database
statistics
• Changes to the database, such as addition of an
index, do not require queries to be reprogrammed.
– The optimiser need only calculate a new execution plan
• The execution plan is the result of "intelligence"
built into the optimisers and not dependent on the
capability of the individual programmer
Example of Query Optimisation
• SELECT EMPLOYEE.LNAME
FROM EMPLOYEE, WORKS_ON
WHERE EMPLOYEE.SSN = WORKS_ON.ESSN
AND WORKS_ON.PNO = 3;
• Suppose there are:
100 employees
200 WORKS_ON entries
10 of which are PNO = 3
Example…
• Solution 1: Take the cartesian product (×)
of EMPLOYEE and WORKS_ON.
– This will involve reading 100 + 200 tuples and writing 20,000
tuples.
– Restrict this result by the where clause: read in the 20,000
tuples to give the final result of 10 tuples.
• Solution 2: Apply the WORKS_ON.PNO = 3 condition first.
– This involves reading 200 tuples and writing 10 tuples
(where PNO = 3).
– Perform the join of this result with EMPLOYEE: read 100 tuples,
giving a result of 10 tuples.
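The two solutions can be compared by adding up tuple reads and writes. The counting convention in this sketch is an assumption: writing the final 10-tuple result is not counted, and the join in Solution 2 is charged for re-reading its 10 intermediate tuples as well as the 100 EMPLOYEE tuples.

```python
EMPLOYEES = 100   # EMPLOYEE tuples
WORKS_ON = 200    # WORKS_ON tuples
MATCHING = 10     # WORKS_ON tuples with PNO = 3

def cost_solution_1():
    """Cartesian product first, then apply the WHERE clause."""
    product = EMPLOYEES * WORKS_ON            # 20,000 intermediate tuples
    reads = EMPLOYEES + WORKS_ON + product    # read inputs, then re-read product
    writes = product                          # write the product
    return reads, writes

def cost_solution_2():
    """Apply PNO = 3 first, then join with EMPLOYEE."""
    reads = WORKS_ON + MATCHING + EMPLOYEES   # scan, then read both join inputs
    writes = MATCHING                         # write the filtered tuples
    return reads, writes

total_1 = sum(cost_solution_1())
total_2 = sum(cost_solution_2())
ratio = total_1 / total_2
```

Under these assumptions Solution 1 touches 40,300 tuples and Solution 2 touches 320, a difference of more than two orders of magnitude for the same answer.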
Overview of Query Optimisation
• Plan: Tree of relational algebra operations, with choice of
algorithm for each operation. Each operator is typically
implemented using a "pull" interface: when an operator is
"pulled" for the next output tuples, it "pulls" on its inputs and
computes them.
• Two main issues:
– For a given query, what plans are considered? (We need algorithms
to search the plan space for cheapest [estimated] plan.)
– How is the cost of a plan estimated?
• Ideally: Want to find best plan. Practically: Avoid worst plans!
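The "pull" interface can be sketched with Python generators, where each operator asks its input for the next tuple only when it is itself asked for one. The operator names, the sample rows and the scanned bookkeeping list are assumptions made for this illustration; this project does not eliminate duplicates.

```python
scanned = []  # records which rows the scan has produced so far

def scan(table):
    """Leaf operator: hands out one stored row per pull."""
    for row in table:
        scanned.append(row["sname"])
        yield row

def select(child, predicate):
    """Pulls on its child until a row satisfies the predicate."""
    for row in child:
        if predicate(row):
            yield row

def project(child, attrs):
    """Pulls one row from its child and keeps only attrs."""
    for row in child:
        yield tuple(row[a] for a in attrs)

# Hypothetical sample rows.
sailors = [
    {"sid": 1, "sname": "Dustin", "rating": 7},
    {"sid": 2, "sname": "Brutus", "rating": 1},
    {"sid": 3, "sname": "Lubber", "rating": 8},
]

plan = project(select(scan(sailors), lambda r: r["rating"] > 5), ["sname"])

first = next(plan)                   # one pull at the root of the plan
scanned_after_first = len(scanned)   # only as many rows scanned as needed
rest = list(plan)                    # keep pulling until the input is exhausted
```

No intermediate result is materialised: asking the root for one tuple causes only as many rows to be read from the table as are needed to produce it.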
Example
• Schemas:
– Sailors(sid: int, sname: string, rating: int, age: real)
– Reserves(sid: int, bid: int, day: date, rname: string)
– Reserves: Each tuple is 40 bytes long, 100 tuples per
page, 1000 pages.
– Sailors: Each tuple is 50 bytes long, 80 tuples per page,
500 pages.
• Query:
– select S.sname from Reserves R, Sailors S where S.sid =
R.sid AND R.bid = 100 AND S.rating > 5
Example…
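• Relational Algebra tree:
π sname (σ bid=100 ∧ rating>5 (Reserves ⋈ sid=sid Sailors))
• Cost of a simple (tuple-at-a-time) nested loops join: M + pR × M × N I/Os,
where M is the number of pages of the outer relation, pR the number of
tuples per page of the outer relation, and N the number of pages of
the inner relation.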
Example…
• Plan:
– Cost: 500 + 500 × 1000 I/Os
• By no means the worst plan!
• Misses several opportunities: selections could have been
"pushed" earlier, no use is made of any available indexes,
and so on.
• Goal of optimisation: to find more efficient plans that compute
the same answer.
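The quoted cost can be reproduced with a short calculation. The sketch assumes that 500 + 500 × 1000 corresponds to a page-at-a-time nested loops join with Sailors (500 pages) as the outer relation and Reserves (1000 pages) as the inner one; the plan figure itself is not reproduced here, so that reading is an assumption. The tuple-at-a-time variant M + pr*M*N is shown for comparison, and the function names are hypothetical.

```python
def page_nested_loops_cost(outer_pages, inner_pages):
    """Scan the outer once; scan the inner once per outer page."""
    return outer_pages + outer_pages * inner_pages

def tuple_nested_loops_cost(outer_pages, outer_tuples_per_page, inner_pages):
    """M + pr*M*N: scan the inner once per outer tuple."""
    return outer_pages + outer_tuples_per_page * outer_pages * inner_pages

# Statistics from the example schemas.
SAILORS_PAGES, SAILORS_TUPLES_PER_PAGE = 500, 80
RESERVES_PAGES = 1000

plan_cost = page_nested_loops_cost(SAILORS_PAGES, RESERVES_PAGES)
swapped_cost = page_nested_loops_cost(RESERVES_PAGES, SAILORS_PAGES)
tuple_cost = tuple_nested_loops_cost(
    SAILORS_PAGES, SAILORS_TUPLES_PER_PAGE, RESERVES_PAGES)
```

Under these assumptions the page-at-a-time join costs 500,500 I/Os, swapping the outer and inner relations gives 501,000, and the tuple-at-a-time join costs 40,000,500, which illustrates why the choice of algorithm matters as much as the shape of the tree.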
Example…
• Alternative Plans 1 (No Indexes)
– Main difference: pushes selects.
– Total cost is 3,560 page I/Os
Exercise
• Alternative Plans 2 (With Indexes)
– Total: 1,210 I/Os

Reading Assignment
• What is System R or System R approach?
Using Heuristics in Query Optimization
• The following is a brief outline of the transformation steps which
will lead to an optimised tree that is more efficient to execute.
• The main idea is to apply first the operations that reduce the size of
intermediate results.
– Break up SELECT operation with conjunctive condition into a cascade
of SELECT operations.
– Move SELECT operations as far down the tree as possible.
– Rearrange leaf nodes of the tree so that relations with the most
restrictive SELECT operations are executed first.
– Combine (Cartesian PRODUCT followed by SELECT) into a JOIN operation
where possible.
– Move PROJECT as far down the tree as possible (breaking up the condition
first if necessary).
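The first steps of this outline (breaking up a conjunctive SELECT, pushing each part down, and turning product plus SELECT into a JOIN) can be sketched on the Sailors/Reserves query. The representation of conjuncts as (relations mentioned, condition text) pairs and the function names are assumptions made for this illustration, and the sketch handles only two relations.

```python
# Conjuncts of the WHERE clause: (relations mentioned, condition text).
conjuncts = [
    ({"S", "R"}, "S.sid = R.sid"),
    ({"R"}, "R.bid = 100"),
    ({"S"}, "S.rating > 5"),
]

def push_down(conjuncts, relations):
    """Split the cascade: single-relation conditions go to their leaf,
    conditions over two relations become join conditions."""
    local = {rel: [] for rel in relations}
    join_conds = []
    for rels, text in conjuncts:
        if len(rels) == 1:
            (rel,) = rels
            local[rel].append(text)
        else:
            join_conds.append(text)
    return local, join_conds

def optimised_expression(conjuncts, relations, projection):
    """Render the rewritten tree as a relational algebra string."""
    local, join_conds = push_down(conjuncts, relations)
    leaves = []
    for rel in relations:
        leaf = rel
        for cond in local[rel]:
            leaf = f"σ[{cond}]({leaf})"   # selection sits directly on the leaf
        leaves.append(leaf)
    join = f"{leaves[0]} ⋈[{' AND '.join(join_conds)}] {leaves[1]}"
    return f"π[{projection}]({join})"

expr = optimised_expression(conjuncts, ["R", "S"], "S.sname")
```

The rewritten expression applies R.bid = 100 and S.rating > 5 before the join, so the join sees smaller inputs than in the original plan.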
Semantic Query Optimization
• Semantic – of or relating to meaning or the study of meaning.
• Semantic information stored in databases as integrity
constraints could be used for query optimization.
• Integrity constraints preserve data consistency when changes are
made to a database.
• A different approach to query optimization, called semantic
query optimization, has been suggested. This technique,
which may be used in combination with the techniques
discussed previously, uses constraints specified on the
database schema.
– such as unique attributes and other more complex constraints.
Semantic Query Optimization…
• SELECT E.Lname, M.Lname FROM EMPLOYEE AS
E, EMPLOYEE AS M WHERE E.Super_ssn=M.Ssn AND
E.Salary > M.Salary
• This query retrieves the names of employees who earn
more than their supervisors.
• Suppose that we had a constraint on the database schema
that stated that no employee can earn more than his or
her direct supervisor.
– If the semantic query optimizer checks for the existence of this
constraint, it does not need to execute the query at all because
it knows that the result of the query will be empty.
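A very small sketch of this check is shown below. Predicates and constraints are modelled as (left, operator, right) triples, which is an assumption made for illustration; the sketch also ignores the join condition E.Super_ssn = M.Ssn and simply treats M as the supervisor of E. The names NEGATION, contradicts and run_query are hypothetical.

```python
# Each comparison operator mapped to its logical negation.
NEGATION = {">": "<=", "<=": ">", "<": ">=", ">=": "<", "=": "!=", "!=": "="}

def contradicts(predicate, constraint):
    """True if the constraint states exactly the negation of the predicate."""
    left, op, right = predicate
    return constraint == (left, NEGATION[op], right)

def run_query(predicates, constraints, execute):
    """Skip execution when a predicate can never hold under the constraints."""
    for predicate in predicates:
        if any(contradicts(predicate, c) for c in constraints):
            return []          # the result is known to be empty
    return execute()

# Constraint: no employee earns more than his or her direct supervisor.
constraints = [("E.Salary", "<=", "M.Salary")]
# Predicate from the query: employees who earn more than their supervisor.
predicates = [("E.Salary", ">", "M.Salary")]

calls = []

def execute():
    """Stand-in for the real (expensive) query execution."""
    calls.append("executed")
    return [("Smith", "Wong")]

result = run_query(predicates, constraints, execute)
unconstrained = run_query(predicates, [], execute)
```

With the constraint in place the query is answered without touching the data; without it, the query has to be executed as usual.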
Semantic Query Optimization…
• Query execution can be improved by:
– Analyzing integrity information, and rewriting
queries exploiting this information
– Avoiding expensive sorting costs (Order
Optimization)
– Exploiting uniqueness: knowing that rows will be
unique avoids extra sorts
Semantic Query Optimization techniques
• Join Elimination (JE)

• Predicate Introduction (PI)

• Order Optimization (OO)

• Exploiting Uniqueness (EU)
