
CO3-Notes-Query Processing and Optimization

The document discusses query processing in databases. It describes what a query is, how queries are processed through parsing, optimization, and evaluation steps, and provides details about each step such as translating SQL queries to relational algebra and choosing the most efficient execution plan.

Uploaded by

Nani Yagneshwar

CO3

Query Processing and Optimization

Query
• A query is a request sent to the database to retrieve data. The query specifies a
condition, and the database returns the data that match it, if any are present.
• A query is most often used to retrieve records from a table.
• A query can be a select query, an action query, or a combination of both. Select queries
retrieve information from data sources; action queries manipulate data, for example
to add, change, or delete it.
• We can write queries in SQL (Structured Query Language).
• Advanced users can also use query commands to perform various programming tasks
and to grant permissions.
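The difference between select and action queries can be sketched with Python's built-in sqlite3 module. This is only an illustration: the in-memory instructor table, its columns, and the sample rows are assumptions made up for the example, not part of the notes.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)

# Action queries: add, change, and delete data.
conn.executemany(
    "INSERT INTO instructor (ID, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 65000), (2, "Ravi", 90000), (3, "Meena", 72000)],
)
conn.execute("UPDATE instructor SET salary = 80000 WHERE ID = 1")
conn.execute("DELETE FROM instructor WHERE ID = 2")

# Select query: retrieve only the rows that match a condition.
rows = conn.execute(
    "SELECT name, salary FROM instructor WHERE salary < 75000 ORDER BY ID"
).fetchall()
print(rows)
```

After the update and the delete, only one of the remaining rows satisfies the condition, so the select query returns a single record.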
Query Processing

• Query processing is a process of translating a user query into an executable form.


• Query Processing is the activity performed in extracting data from the database.
• Query processing includes certain activities for data retrieval. The activities include
translation of queries in high-level database languages into expressions that can be used
at the physical level of the file system, a variety of query-optimizing transformations,
and actual evaluation of queries.
Query processing is a 3-step process for fetching the data from the database:
◼ Parsing and translation
◼ Optimization
◼ Evaluation
1) PARSING AND TRANSLATION:

• SQL is a high-level language: it lets users state what data they need, but the
underlying execution engine of the DBMS cannot run SQL text directly. To execute a
query, the DBMS must first convert it into a lower-level internal form. In this step the
SQL query passes through a processing unit that translates it into relational algebra,
which is well suited to the internal representation of a query.
• The first step in query processing is parsing and translation. The query undergoes
lexical, syntactic, and semantic analysis.
➢ Lexical analysis: the query is broken into tokens, and white space and
comments are removed.
➢ Syntactic analysis: the query processor checks whether the query follows the
rules of SQL syntax.
➢ Semantic analysis: the query processor checks whether the query is
meaningful, for example whether the tables mentioned in the query exist in the
database and whether the columns referred to actually exist in those tables.
• Once these checks pass, the system translates the query into a relational-algebra
expression.
• As an illustration, consider a query that fetches the salaries of the instructors whose
salary is less than 75000:
➢ select salary from instructor where salary < 75000;
➢ First, the query is divided into tokens.
➢ Then the name of the queried table (instructor) is looked up in the data
dictionary.
➢ The column names mentioned in the tokens (salary) are validated for
existence.
➢ The operands being compared must be of compatible types (salary and the
value 75000 should have the same data type).
➢ Next, the validated tokens are translated into a relational-algebra
expression.
➢ This form is easy for the optimizer to handle in the later steps.
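The lexical and semantic checks listed above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how a real DBMS parser is built: DATA_DICTIONARY, tokenize, and check_semantics are hypothetical names, and the checker only accepts the fixed shape "select <col> from <table> where <col> <op> <value>".

```python
import re

# Hypothetical data dictionary: table name -> {column name: Python type}.
DATA_DICTIONARY = {
    "instructor": {"ID": int, "name": str, "dept_name": str, "salary": int},
}

# A token is a number, a word, or an operator/punctuation symbol.
TOKEN_PATTERN = re.compile(r"\s*(?:(\d+)|(\w+)|(<=|>=|<>|[<>=;,*()]))")


def tokenize(query):
    """Lexical analysis: split the query into tokens, dropping white space."""
    tokens, pos = [], 0
    query = query.strip()
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, word, symbol = match.groups()
        tokens.append(int(number) if number else (word or symbol))
        pos = match.end()
    return tokens


def check_semantics(tokens):
    """Semantic analysis for: select <col> from <table> where <col> <op> <value>."""
    keywords = [str(tokens[i]).lower() for i in (0, 2, 4)] if len(tokens) >= 8 else []
    if keywords != ["select", "from", "where"]:
        raise SyntaxError("unsupported query shape")
    column, table, cond_column, value = tokens[1], tokens[3], tokens[5], tokens[7]
    if table not in DATA_DICTIONARY:
        raise LookupError(f"unknown table: {table}")
    columns = DATA_DICTIONARY[table]
    for name in (column, cond_column):
        if name not in columns:
            raise LookupError(f"unknown column: {name}")
    if not isinstance(value, columns[cond_column]):
        raise TypeError(f"{cond_column} cannot be compared with {value!r}")
    return True


tokens = tokenize("select salary from instructor where salary < 75000;")
print(tokens)
print(check_semantics(tokens))
```

A query that names a column missing from the dictionary, such as a hypothetical bonus column, is rejected with a LookupError before any translation takes place.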

TRANSLATION
• To make the system understand the user query, it needs to be translated into
relational algebra. The query above can be translated into either of the following
relational-algebra expressions:
➢ σ_{salary<75000}(Π_{salary}(instructor))
➢ Π_{salary}(σ_{salary<75000}(instructor))
• After translating the query, the system can execute each relational-algebra operation
using one of several algorithms; this is where query processing begins its work.
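The same query can be written as "project, then select" or as "select, then project", and both orders give the same answer. The sketch below checks this on a small relation; the sample rows and the helper names select and project are assumptions for illustration only.

```python
# Hypothetical instructor relation, represented as a list of rows.
instructor = [
    {"ID": 1, "name": "Asha", "salary": 65000},
    {"ID": 2, "name": "Ravi", "salary": 90000},
    {"ID": 3, "name": "Meena", "salary": 72000},
]


def select(predicate, relation):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(attributes, relation):
    """Projection (pi): keep the named attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def low_salary(row):
    return row["salary"] < 75000


# Expression 1: select applied to the result of the projection.
plan_a = select(low_salary, project(["salary"], instructor))
# Expression 2: projection applied to the result of the selection.
plan_b = project(["salary"], select(low_salary, instructor))

print(plan_a == plan_b, plan_b)
```

The two expressions are equivalent in result but not necessarily in cost, which is why the optimizer has a choice to make.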

2) QUERY OPTIMIZATION

• The query optimizer (also called simply the optimizer) is the DBMS component that
identifies the most efficient way, for example the one taking the least time, for a SQL
statement to access data. Query optimization is an automated process whose purpose is
to find an execution plan that reduces the time required to process a query.
• After parsing, the parsed query is passed to the query optimizer. There are usually
many different ways in which the same query can run, so the optimizer generates
several candidate execution plans and selects the plan with the lowest estimated cost.
If the alternatives are pictured as a graph, there can be multiple paths from source
to destination.
• A query execution plan is generated for each of the paths. The cost of evaluation
varies from plan to plan and from query to query. The DBMS picks the most efficient
evaluation plan based on the estimated cost of each plan. This task is performed by
the database system and is known as query optimization.
• The optimizer also evaluates whether the indexes present on the table and on the
columns being used can help. It also determines the best order in which to execute
subqueries, so that only the best of the plans gets executed.
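One way to watch an optimizer take an index into account is SQLite's EXPLAIN QUERY PLAN statement, available through Python's sqlite3 module. The table contents, the index name idx_instructor_salary, and the helper plan_details are assumptions for this sketch; the exact wording of the plan text differs between SQLite versions, so the example only checks whether the index is mentioned.

```python
import sqlite3

# Hypothetical table with 500 generated rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [(i, f"name{i}", 50000 + 100 * i) for i in range(1, 501)],
)

QUERY = "SELECT salary FROM instructor WHERE salary < 75000"


def plan_details(connection, sql):
    """Return the textual 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index the only option is to scan the table.
before = plan_details(conn, QUERY)

# After an index on salary exists, the optimizer can choose to use it.
conn.execute("CREATE INDEX idx_instructor_salary ON instructor (salary)")
after = plan_details(conn, QUERY)

print(before)
print(after)
```

The query text is identical in both runs; only the available access paths changed, and the chosen plan changed with them.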
• For any query there are multiple evaluation plans. Choosing the one that costs the
least is called query optimization in DBMS. Because the system is responsible for
constructing the evaluation plan, the user does not need to write the query in its
most efficient form.
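• Consider the following relational-algebra expression for the query “Find the names of
all instructors in the Music department together with the course title of all the courses
that the instructors teach”:
➢ Π_{name,title}(σ_{dept_name="Music"}(instructor ⋈ (teaches ⋈ Π_{course_id,title}(course))))
• The above expression constructs a large intermediate relation, the join of instructor,
teaches, and course, even though only the tuples for the Music department are needed.
By applying the selection to instructor before the join, we reduce the size of the
intermediate result. Our query is now represented by the relational-algebra expression:
➢ Π_{name,title}((σ_{dept_name="Music"}(instructor)) ⋈ (teaches ⋈ Π_{course_id,title}(course)))
• This expression is equivalent to the original one but generates smaller intermediate
relations. The figure depicts the initial and transformed expressions.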
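The effect of applying the Music selection before the join, rather than after it, can be checked on toy data. The relations below and the natural_join helper are assumptions made up for this sketch; a real system would estimate sizes from statistics rather than materialize both versions.

```python
# Hypothetical relations, used only for illustration.
instructor = [
    {"ID": 1, "name": "Asha", "dept_name": "Music"},
    {"ID": 2, "name": "Ravi", "dept_name": "Physics"},
    {"ID": 3, "name": "Meena", "dept_name": "History"},
]
teaches = [
    {"ID": 1, "course_id": "MU-199"},
    {"ID": 2, "course_id": "PHY-101"},
    {"ID": 2, "course_id": "PHY-201"},
    {"ID": 3, "course_id": "HIS-351"},
]
course = [
    {"course_id": "MU-199", "title": "Music Video Production"},
    {"course_id": "PHY-101", "title": "Physical Principles"},
    {"course_id": "PHY-201", "title": "Optics"},
    {"course_id": "HIS-351", "title": "World History"},
]


def natural_join(left, right):
    """Nested-loop natural join on the attributes the two rows share."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                result.append({**l_row, **r_row})
    return result


def is_music(row):
    return row["dept_name"] == "Music"


teaches_course = natural_join(teaches, course)

# Initial expression: join everything, then select the Music tuples.
joined_all = natural_join(instructor, teaches_course)
original = [(r["name"], r["title"]) for r in joined_all if is_music(r)]

# Transformed expression: select the Music instructors first, then join.
music_only = [r for r in instructor if is_music(r)]
joined_music = natural_join(music_only, teaches_course)
transformed = [(r["name"], r["title"]) for r in joined_music]

print(original == transformed, len(joined_all), len(joined_music))
```

Both versions produce the same final answer, but the intermediate join shrinks from four rows to one once the selection is applied first.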
• An evaluation plan defines exactly what algorithm should be used for each operation,
and how the execution of the operations should be coordinated.
• As we have seen, several different algorithms can be used for each relational operation,
giving rise to alternative evaluation plans. In the figure, hash join has been chosen for
one of the join operations, while the other uses merge join, after sorting the relations on
the join attribute, which is ID.
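To make concrete the point that one relational operation can be carried out by different algorithms, here is a minimal in-memory sketch of a hash join and a sort-merge join on the attribute ID. The sample rows and the function names are assumptions; real implementations work on disk blocks and are considerably more involved.

```python
from collections import defaultdict

# Hypothetical relations, used only for illustration.
instructor = [
    {"ID": 1, "name": "Asha"},
    {"ID": 2, "name": "Ravi"},
    {"ID": 3, "name": "Meena"},
]
teaches = [
    {"ID": 2, "course_id": "PHY-101"},
    {"ID": 1, "course_id": "MU-199"},
    {"ID": 2, "course_id": "PHY-201"},
]


def hash_join(left, right, key):
    """Build a hash table on the left input, then probe it with the right."""
    buckets = defaultdict(list)
    for row in left:
        buckets[row[key]].append(row)
    return [{**l, **r} for r in right for l in buckets.get(r[key], [])]


def merge_join(left, right, key):
    """Sort both inputs on the join attribute, then merge them in one pass."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            value = left[i][key]
            # Find the run of right rows sharing this join value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == value:
                j_end += 1
            # Pair every matching left row with that run.
            while i < len(left) and left[i][key] == value:
                result.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return result


def canonical(rows):
    """Order-independent form, so the two outputs can be compared."""
    return sorted(tuple(sorted(row.items())) for row in rows)


hashed = hash_join(instructor, teaches, "ID")
merged = merge_join(instructor, teaches, "ID")
print(canonical(hashed) == canonical(merged), len(merged))
```

The two algorithms return the same set of joined rows; they differ only in how much work they do, which is what the evaluation plan has to decide.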

• To find the least-costly query-evaluation plan, the optimizer needs to generate
alternative plans that produce the same result as the given expression, and to choose the
least-costly one. Generation of query-evaluation plans involves three steps:
➢ generating expressions that are logically equivalent to the given expression,
➢ annotating the resultant expressions in alternative ways to generate alternative
query-evaluation plans, and
➢ estimating the cost of each evaluation plan, and choosing the one whose
estimated cost is the least.
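The last two steps, annotating an expression with alternative algorithms and picking the cheapest estimate, can be sketched for a single join. The cost formulas are simplified textbook-style estimates counted in block transfers, and the sort cost in particular is a crude assumption; choose_plan and the block counts are hypothetical.

```python
import math


def merge_join_cost(b_r, b_s, inputs_sorted):
    """Read both inputs once, plus a rough sorting cost if they are unsorted."""
    sort_cost = 0
    if not inputs_sorted:
        # Crude assumption: about 2 * b * ceil(log2(b)) transfers per sort.
        sort_cost = sum(2 * b * math.ceil(math.log2(b)) for b in (b_r, b_s))
    return sort_cost + b_r + b_s


def choose_plan(b_r, b_s, inputs_sorted):
    """Annotate one join with alternative algorithms and pick the cheapest."""
    candidates = {
        # Worst case for block nested-loop join with r as the outer relation.
        "block nested-loop join": b_r * b_s + b_r,
        # Partition both inputs, then read the partitions back once.
        "hash join": 3 * (b_r + b_s),
        "merge join": merge_join_cost(b_r, b_s, inputs_sorted),
    }
    best = min(candidates, key=candidates.get)
    return best, candidates


# Hypothetical sizes: r occupies 100 blocks and s occupies 400 blocks.
best_unsorted, costs_unsorted = choose_plan(100, 400, inputs_sorted=False)
best_sorted, costs_sorted = choose_plan(100, 400, inputs_sorted=True)
print(best_unsorted, costs_unsorted)
print(best_sorted, costs_sorted)
```

Under these assumed numbers the preferred algorithm changes depending on whether the inputs already arrive sorted, which shows why the cost must be estimated per plan rather than fixed per operation.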
3) QUERY EVALUATION
• A query execution engine is responsible for generating the output of the given query.
• It takes the query execution plan, executes it, and returns the result to the user.
• After finding the best execution plan, the DBMS executes the optimized query and
returns the results from the database. In this step the DBMS performs the requested
operations on the data, such as selecting, inserting, or updating data.
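A common way to picture an execution engine is as a chain of operators, each pulling rows from the one below it. The sketch below uses Python generators for this; it is an assumption about one possible design rather than a description of any particular DBMS, and the operator names and sample rows are made up for illustration.

```python
# Hypothetical stored relation, used only for illustration.
instructor = [
    {"ID": 1, "name": "Asha", "salary": 65000},
    {"ID": 2, "name": "Ravi", "salary": 90000},
    {"ID": 3, "name": "Meena", "salary": 72000},
]


def scan(relation):
    """Leaf operator: produce the stored rows one at a time."""
    for row in relation:
        yield row


def select(predicate, child):
    """Pass on only the rows from the child that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(attributes, child):
    """Keep only the requested attributes (duplicates are kept, as in SQL)."""
    for row in child:
        yield {a: row[a] for a in attributes}


# Plan for: select salary from instructor where salary < 75000;
plan = project(["salary"], select(lambda r: r["salary"] < 75000, scan(instructor)))

# Executing the plan means pulling rows from its root operator.
result = list(plan)
print(result)
```

Because each operator hands rows upward one at a time, no intermediate relation has to be stored in full before the next operator starts.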
