CH - 1 Query Process SW
Objectives of the chapter
Introduction to Query Processing
Overview of Query Processing
A query can be a request for data from the database, a request for action on
the data, or both.
A query can answer a simple question, combine data from different tables, or
add, change, or delete data in a database.
Query processing is the sequence of steps the DBMS takes to fetch the requested
data from the database.
Query Processing Steps
There are three phases that a query passes through during the DBMS's
processing of that query:
Decomposition
Optimization
Evaluation
1. Query Decomposition
Query decomposition is the first phase of query processing.
Validator - checks that all attribute and relation names are valid and
semantically meaningful names in the schema of the particular database
being queried.
Parser - extracts the tokens from the raw string of characters and translates
them into the corresponding internal data elements (i.e. relational
algebra operations and operands) and structures.
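The decomposition phase described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DBMS implements it: the `SCHEMA` catalog, the function names, and the restricted grammar (one relation, at most one simple WHERE condition) are all assumptions made for the example.

```python
import re

# Hypothetical catalog for illustration: relation name -> attribute names.
SCHEMA = {"book": {"title", "price"}, "account": {"balance"}}

# One token: a comparison operator, punctuation, a number, or an identifier.
TOKEN_RE = re.compile(r"\s*(>=|<=|<>|[<>=,*()]|\d+(?:\.\d+)?|[A-Za-z_]\w*)")


def tokenize(sql):
    """Extract tokens from the raw query string."""
    sql = sql.strip().rstrip(";").strip()
    tokens, pos = [], 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parse SELECT <attrs> FROM <relation> [WHERE <attr> <op> <value>]."""
    upper = [t.upper() for t in tokens]
    if not upper or upper[0] != "SELECT" or "FROM" not in upper:
        raise SyntaxError("expected SELECT ... FROM ...")
    from_at = upper.index("FROM")
    attrs = [t for t in tokens[1:from_at] if t != ","]
    condition = None
    if "WHERE" in upper:
        where_at = upper.index("WHERE")
        condition = tuple(tokens[where_at + 1:where_at + 4])
    # Internal form: the operands of a projection and a selection.
    return {"project": attrs, "relation": tokens[from_at + 1], "select": condition}


def validate(query, schema=SCHEMA):
    """Check that relation and attribute names exist in the schema."""
    if query["relation"] not in schema:
        raise NameError(f"unknown relation: {query['relation']}")
    used = list(query["project"])
    if query["select"]:
        used.append(query["select"][0])
    for attr in used:
        if attr not in schema[query["relation"]]:
            raise NameError(f"unknown attribute: {attr}")
    return query


query = validate(parse(tokenize("SELECT title, price FROM book WHERE price >50")))
```

The resulting dictionary plays the role of the internal representation that the optimizer would receive; a real system would build a query tree instead.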
2. Query Optimization
In the second phase, the query processor applies rules to the internal data structures of the query
to transform these structures into equivalent, but more efficient, representations.
Selecting the proper rules to apply, when to apply them, and how they are applied is the
function of the query optimization engine.
A query typically has many possible execution strategies, and the process of choosing a
suitable one for processing the query is known as query optimization.
3. Query Evaluation
The best evaluation plan among the candidates generated by the optimization engine is selected
and then executed.
The code generator generates the code to execute that plan, either in compiled or
interpreted mode.
The runtime database processor has the task of running (executing) the query code,
whether in compiled or interpreted mode, to produce the query result.
Figure 1: Steps in Query Processing
Figure 2: Steps in Query Processing
Translating SQL Queries into Relational Algebra (RA)
An SQL query is first translated into an equivalent extended relational algebra
expression—represented as a query tree data structure—that is then optimized.
SQL queries are decomposed into query blocks. A query block is a basic unit that can
be translated into the algebraic operators and optimized.
SQL Query 1: SELECT title, price FROM book WHERE price > 50
1. σ_{price>50}(Π_{title,price}(book))
2. Π_{title,price}(σ_{price>50}(book))
SQL Query 2: SELECT balance FROM account WHERE balance < 2500
1. σ_{balance<2500}(Π_{balance}(account))
2. Π_{balance}(σ_{balance<2500}(account))
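To see why the two expressions for the account query are interchangeable, the sketch below evaluates both orders on a small in-memory relation. The sample rows, the `account_no` attribute, and the helper names `select` and `project` are assumptions made for illustration only.

```python
# Hypothetical sample data; only the balance attribute appears in the slides.
account = [
    {"account_no": 1, "balance": 1200},
    {"account_no": 2, "balance": 2500},
    {"account_no": 3, "balance": 300},
    {"account_no": 4, "balance": 4000},
    {"account_no": 5, "balance": 1200},
]


def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attrs):
    """Relational projection with duplicate elimination (set semantics)."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


def low_balance(row):
    return row["balance"] < 2500


# Expression 1: project first, then select.
plan_1 = select(project(account, ["balance"]), low_balance)
# Expression 2: select first, then project.
plan_2 = project(select(account, low_balance), ["balance"])
```

Both plans return the same rows. The reordering is valid here because the selection condition uses only an attribute that the projection keeps; if the projection dropped `balance`, the selection could no longer be applied after it.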
Generating Execution Plan – Example 2
This can also be represented as either of the following query trees:
[Figure: query trees not recovered]
Class Work
Generating Execution Plan – Example 3
[Figure: query trees over the Employee relation not recovered]
Generating Execution Plan – Example 4
SELECT Lname, Fname
FROM EMPLOYEE
WHERE Salary > (SELECT MAX(Salary)
FROM EMPLOYEE
WHERE Dno=5);
This query retrieves the names of employees (from any department in the
company) who earn a salary that is greater than the highest salary in department 5.
The query includes a nested subquery and hence would be decomposed into
two blocks.
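The inner block could be translated into the following extended relational algebra
expression, where ℱ denotes the aggregate function operator:
ℱ_{MAX Salary}(σ_{Dno=5}(EMPLOYEE))
The outer block, with the result of the inner block treated as a constant C, could be
translated into:
Π_{Lname,Fname}(σ_{Salary>C}(EMPLOYEE))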
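The two-block decomposition can be mimicked directly in Python: the inner block is computed once and its result is passed to the outer block as the constant C. The employee rows below are made-up sample data, and the function names are hypothetical.

```python
# Hypothetical sample rows for the EMPLOYEE relation.
EMPLOYEE = [
    {"Fname": "John", "Lname": "Smith", "Salary": 30000, "Dno": 5},
    {"Fname": "Franklin", "Lname": "Wong", "Salary": 40000, "Dno": 5},
    {"Fname": "Ramesh", "Lname": "Narayan", "Salary": 38000, "Dno": 5},
    {"Fname": "Jennifer", "Lname": "Wallace", "Salary": 43000, "Dno": 4},
    {"Fname": "Alicia", "Lname": "Zelaya", "Salary": 25000, "Dno": 4},
    {"Fname": "James", "Lname": "Borg", "Salary": 55000, "Dno": 1},
]


def inner_block(employees):
    """Inner block: maximum salary among employees with Dno = 5."""
    return max(e["Salary"] for e in employees if e["Dno"] == 5)


def outer_block(employees, c):
    """Outer block: names of employees whose salary exceeds the constant c."""
    return [(e["Lname"], e["Fname"]) for e in employees if e["Salary"] > c]


# The inner block does not depend on the outer row, so it runs only once.
c = inner_block(EMPLOYEE)
result = outer_block(EMPLOYEE, c)
```

With this sample data, `c` is 40000 and the outer block returns the two employees who earn more than that.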
The query optimizer would then choose an execution plan for each query block.
Notice that in the above example, the inner block needs to be evaluated only once to
produce the maximum salary of employees in department 5, which is then used as the
constant C by the outer block.
Heuristics Approach to Query Optimization
Heuristic rules are used to modify the internal representation of a query, which is
usually in the form of a query tree or a query graph data structure, to improve its
expected performance.
The scanner and parser of an SQL query first generate a data structure that
corresponds to an initial query representation, which is then optimized according to
heuristic rules.
A main heuristic rule is to apply SELECT and PROJECT operations before a JOIN or
other binary operation. SELECT and PROJECT reduce the size of a file, whereas the
size of the file resulting from a binary operation such as JOIN is usually a
multiplicative function of the sizes of the input files.
A heuristic is a rule that works well in most cases but is not guaranteed to work well in
every possible case.
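The effect of this heuristic on intermediate result size can be illustrated with a small experiment. The relations, their sizes, and the attribute values below are invented for the example; only the idea of applying SELECT before the binary operation comes from the text.

```python
from itertools import product

# Hypothetical data: 100 students spread over 5 departments.
students = [
    {"Name": f"S{i}", "Cgpa": (20 + i % 20) / 10, "Did": i % 5}
    for i in range(100)
]
departments = [{"Dno": d, "Dname": f"Dept{d}"} for d in range(5)]


def product_then_select():
    """Build the full Cartesian product, then filter it."""
    intermediate = list(product(students, departments))
    result = [
        (s["Name"], d["Dname"])
        for s, d in intermediate
        if s["Did"] == d["Dno"] and s["Cgpa"] > 3
    ]
    return len(intermediate), result


def select_then_join():
    """Filter students first, then join the survivors with departments."""
    intermediate = [s for s in students if s["Cgpa"] > 3]
    by_dno = {d["Dno"]: d for d in departments}
    result = [
        (s["Name"], by_dno[s["Did"]]["Dname"])
        for s in intermediate
        if s["Did"] in by_dno
    ]
    return len(intermediate), result


size_a, rows_a = product_then_select()
size_b, rows_b = select_then_join()
```

Both strategies produce the same answer, but the first materializes 100 × 5 = 500 intermediate rows while the second keeps only the 45 students that pass the selection.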
Notation for Query Trees
Query tree - Tree data structure that corresponds to a relational algebra expression.
QUERY
SELECT Name, Cname, Dname
FROM Student, Course, Department
WHERE Cid=Cno AND Did=Dno AND Cgpa>3 ;
Exercise 1:
[Figure: initial query tree with Π_{Name,Cname,Dname} at the root, above (Student × Course) × Department]
Step 2: Moving SELECT operations down the query tree
Step 4: Replacing CARTESIAN PRODUCT and SELECT with JOIN operations
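A query tree for Exercise 1 can be modeled with a few small classes, and the rewrite in Step 4 becomes a recursive function over that tree. This is a sketch under stated assumptions: the class names are hypothetical, the tree after Step 2 is built by hand, and it is assumed that Cgpa, Cid, and Did belong to Student, Cno to Course, and Dno to Department (the slides do not say).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Product:
    left: object
    right: object


@dataclass(frozen=True)
class Join:
    condition: str
    left: object
    right: object


@dataclass(frozen=True)
class Select:
    conditions: Tuple[str, ...]
    child: object


@dataclass(frozen=True)
class Project:
    attrs: Tuple[str, ...]
    child: object


def render(node):
    """Print a query tree as a relational algebra expression."""
    if isinstance(node, Relation):
        return node.name
    if isinstance(node, Product):
        return f"({render(node.left)} × {render(node.right)})"
    if isinstance(node, Join):
        return f"({render(node.left)} ⋈[{node.condition}] {render(node.right)})"
    if isinstance(node, Select):
        return f"σ[{' AND '.join(node.conditions)}]({render(node.child)})"
    if isinstance(node, Project):
        return f"Π[{','.join(node.attrs)}]({render(node.child)})"
    raise TypeError(f"unknown node: {node!r}")


def product_to_join(node):
    """Step 4: replace a single-condition SELECT directly above a PRODUCT by a JOIN.

    Assumes any such condition is a join condition between the two inputs.
    """
    if isinstance(node, Relation):
        return node
    if isinstance(node, Project):
        return Project(node.attrs, product_to_join(node.child))
    if isinstance(node, Product):
        return Product(product_to_join(node.left), product_to_join(node.right))
    if isinstance(node, Join):
        return Join(node.condition, product_to_join(node.left), product_to_join(node.right))
    child = product_to_join(node.child)
    if len(node.conditions) == 1 and isinstance(child, Product):
        return Join(node.conditions[0], child.left, child.right)
    return Select(node.conditions, child)


student, course, dept = Relation("Student"), Relation("Course"), Relation("Department")
attrs = ("Name", "Cname", "Dname")

# Initial tree: one SELECT with all conditions above the Cartesian products.
initial = Project(attrs, Select(("Cid=Cno", "Did=Dno", "Cgpa>3"),
                                Product(Product(student, course), dept)))

# Tree after Step 2 (built by hand): each SELECT sits as low as possible.
pushed = Project(attrs, Select(("Did=Dno",), Product(
    Select(("Cid=Cno",), Product(Select(("Cgpa>3",), student), course)), dept)))

joined = product_to_join(pushed)
```

Calling `render(joined)` shows the Cartesian products replaced by joins, while `initial` is left unchanged by the rewrite because its conditions have not yet been split and pushed down.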
Query Optimization
Using Selectivity and Cost Estimates in Query Optimization
A heuristic is a rule that works well in most cases but is not guaranteed to work well in
every possible case.
A query optimizer does not depend solely on heuristic rules; it also estimates and
compares the costs of executing a query using different execution strategies and
algorithms, and it then chooses the strategy with the lowest cost estimate.
The cost of executing a query includes the following components:
1. Access cost to secondary storage: This is the cost of transferring (reading and
writing) data blocks between secondary disk storage and main memory buffers.
This is also known as disk I/O (input/output) cost.
2. Disk storage cost: This is the cost of storing on disk any intermediate files that are
generated by an execution strategy for the query.
3. Computation cost: This is the cost of performing in-memory operations on the records within
the data buffers during query execution.
Such operations include searching for and sorting records, merging records for a join
or a sort operation, and performing computations on field values.
4. Memory usage cost: This is the cost relating to the number of main memory buffers needed
during query execution.
5. Communication cost: This is the cost that is associated with sending or communicating the
query and its results from one place to another. It also includes the cost of transferring the table
and results to the various sites during the process of query evaluation.
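The five cost components can be combined into a single estimate per execution strategy, after which the optimizer picks the cheapest one. The sketch below assumes a simple weighted sum; the weights, the plan names, and all numbers are invented for illustration and do not come from any real optimizer.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """Estimated cost components for one execution strategy (arbitrary units)."""
    disk_io: int        # 1. access cost to secondary storage
    storage: int        # 2. disk storage cost of intermediate files
    computation: int    # 3. in-memory computation cost
    memory: int         # 4. memory usage cost (buffers needed)
    communication: int  # 5. communication cost between sites

    def total(self, weights):
        """Weighted sum of the components; the weights are an assumption."""
        return sum(weights[name] * getattr(self, name) for name in weights)


# Hypothetical weights: disk I/O and communication dominate.
WEIGHTS = {"disk_io": 10, "storage": 5, "computation": 1, "memory": 1, "communication": 10}

# Hypothetical estimates for two strategies of the same query.
plans = {
    "product_then_select": CostEstimate(500, 200, 50, 10, 0),
    "select_then_join": CostEstimate(120, 20, 30, 10, 0),
}


def cheapest(plans, weights):
    """Return the name of the strategy with the lowest estimated cost."""
    return min(plans, key=lambda name: plans[name].total(weights))


best = cheapest(plans, WEIGHTS)
```

In a centralized database the communication weight could be set to zero, and many optimizers concentrate on disk I/O alone; the weighted sum here is just one simple way to compare strategies.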
Thank you!!