Query optimization

• A query optimizer decides the best method for executing each query.
SQL query optimization is the process of refining SQL queries to improve
their efficiency and performance. Optimization techniques help to query and
retrieve data quickly and accurately.
Without proper optimization, running a query is like searching through
unorganized data: inefficient, and a waste of time and resources.
Requirement for SQL Query Optimization
• The main goal of SQL query optimization is to reduce the load on system
resources and provide accurate results in less time. It makes the code more
efficient, which is important for optimal query performance.
The major reasons for SQL query optimization are:
• Enhancing Performance: The main reason for SQL query optimization is to reduce
the response time and enhance the performance of the query. The time difference
between request and response needs to be minimized for a better user experience.
• Reduced Execution Time: Optimized queries use less CPU time, so results are
returned faster.
• Enhanced Efficiency: Query optimization reduces the time spent on hardware,
so servers run efficiently with lower power and memory consumption.
Ways or Steps in SQL Query Optimization (query optimization strategies)
• Optimized SQL queries not only enhance performance but also contribute to cost
savings by reducing resource consumption.
1. Use Indexes
• Indexes act like internal guides for the database to locate specific information quickly.
Identify frequently used columns in WHERE clauses, JOIN conditions, and ORDER BY clauses,
and create indexes on those columns. However, creating too many indexes can slow down
adding and updating data, so use them strategically.
It is important to create indexes only on columns where they provide a significant
search speed improvement.
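The effect of an index can be checked by asking the database for its query plan. The sketch below uses Python's built-in sqlite3 module; the table `people`, the index name `idx_people_age` and the sample data are assumptions made for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
# Hypothetical sample data: 1000 rows with ages spread over 50 values.
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [(f"person{i}", 18 + i % 50) for i in range(1000)],
)

def plan(sql):
    """Return the plan description SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT name FROM people WHERE age = 30"
plan_before = plan(query)  # no index on age yet, so the table is scanned
conn.execute("CREATE INDEX idx_people_age ON people (age)")
plan_after = plan(query)   # the planner can now search the index

print(plan_before)
print(plan_after)
```

Comparing the two printed plans shows the query moving from a scan of the whole table to a search through the index on the WHERE column.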
2. Use WHERE Clause instead of HAVING
• Using the WHERE clause instead of HAVING, where the condition allows it, improves
efficiency considerably. WHERE filters rows before groups are created, whereas HAVING
filters groups after they are created. Filtering with WHERE therefore leaves fewer rows
to group and aggregate, which reduces the time taken. HAVING is still required for
conditions on aggregate values.
For Example
• SELECT name FROM table_name WHERE age>=18; displays only the names of rows
whose age is greater than or equal to 18, whereas
• SELECT age, COUNT(A) AS Students FROM table_name GROUP BY age HAVING
COUNT(A)>1; first groups the rows by age and then displays only those groups whose
count is greater than 1.
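A small runnable sketch of the two filters working together, using Python's built-in sqlite3 module. The `students` table and its rows are hypothetical sample data, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Asha", 17), ("Bilal", 18), ("Chen", 18), ("Dara", 19), ("Eli", 17)],
)

rows = conn.execute(
    """
    SELECT age, COUNT(*) AS students
    FROM students
    WHERE age >= 18          -- row filter, applied before GROUP BY
    GROUP BY age
    HAVING COUNT(*) > 1      -- group filter, applied after aggregation
    ORDER BY age
    """
).fetchall()

print(rows)
```

The age condition sits in WHERE because it concerns individual rows; only the condition on the count needs HAVING.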
3. Avoid Queries inside a Loop
• Running queries inside a loop slows down execution considerably. In most
cases, data can be inserted and updated in bulk, which is a far better
approach than issuing one query per iteration.
• The iterative pattern seen in for, while and do-while loops issues a
separate query on every pass, which takes a lot of time and hurts both
performance and scalability. To avoid this, move the queries outside the
loop and operate on the whole set of rows at once.
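The contrast can be sketched with Python's built-in sqlite3 module. The `items` table and its values are hypothetical; the point is only the shape of the two approaches, not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL)")

# Bulk insert: one executemany call instead of an INSERT issued per
# iteration from application code.
new_items = [(i, i * 1.5) for i in range(1, 6)]
conn.executemany("INSERT INTO items (id, price) VALUES (?, ?)", new_items)

wanted_ids = [1, 3, 5]

# Loop version: one SELECT per id, so N separate queries.
looped = [
    conn.execute("SELECT id, price FROM items WHERE id = ?", (i,)).fetchone()
    for i in wanted_ids
]

# Set-based version: a single SELECT with an IN list.
placeholders = ", ".join("?" for _ in wanted_ids)
batched = conn.execute(
    f"SELECT id, price FROM items WHERE id IN ({placeholders}) ORDER BY id",
    wanted_ids,
).fetchall()

print(looped == batched)
```

Both versions return the same rows; the set-based version does so with one statement, which matters most when each query is a network round trip to a database server.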
4. Use SELECT with Column Names instead of SELECT *
• One of the best ways to enhance efficiency is to reduce the load on the
database. This can be done by limiting the amount of information retrieved
by each query. Running queries with SELECT * retrieves every column of the
table, including columns the application does not need, which takes more
time and increases the load on the database.
• The better approach is to use a SELECT statement that names only the
required columns. This decreases the load on the database and enhances
performance.
5. Keep Wildcards at the End of Phrases
• A wildcard is used to substitute one or more characters in a string. It is used with
the LIKE operator, which is used in a WHERE clause to search for a specified
pattern. Pairing a leading wildcard with a trailing wildcard matches every record that
contains the text between the two wildcards anywhere in the value.
Let’s understand this with the help of an example.
• Consider a table Employee which has 2 columns, name and salary. There are 2
employees, namely Rama and Balram.
• SELECT name, salary FROM Employee WHERE name LIKE '%Ram%';
• SELECT name, salary FROM Employee WHERE name LIKE 'Ram%';
• Searching for %Ram% returns both Rama and Balram, whereas Ram% returns just
Rama. A pattern with a leading wildcard generally cannot use an index on the column,
so the database has to examine every row, while a pattern with the wildcard only at
the end can be matched as a prefix. So, when a prefix match is what is needed,
efficiency is enhanced by keeping wildcards at the end of phrases.
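The example can be reproduced with Python's built-in sqlite3 module. The salary figures are made up for illustration, and the sketch relies on SQLite's default behaviour that LIKE is case-insensitive for ASCII letters (which is why Balram matches %Ram%); other databases may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
# Names are from the example above; salaries are hypothetical.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Rama", 50000), ("Balram", 60000)],
)

# Leading and trailing wildcard: match 'Ram' anywhere in the name.
contains = conn.execute(
    "SELECT name FROM Employee WHERE name LIKE '%Ram%' ORDER BY name"
).fetchall()

# Trailing wildcard only: match names that start with 'Ram'.
prefix = conn.execute(
    "SELECT name FROM Employee WHERE name LIKE 'Ram%' ORDER BY name"
).fetchall()

print(contains)
print(prefix)
```

The two queries are not interchangeable, since they return different rows; the tip applies when a prefix match is what the application actually needs.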
6. Use EXISTS instead of COUNT()
• Both EXISTS and COUNT() can be used to check whether a table has a specific record
or not, but in most cases EXISTS is much more effective than COUNT(). EXISTS runs
only until it finds the first matching entry, whereas COUNT() keeps running until it
has counted all the matching records. Hence this practice of SQL query
optimization saves a lot of time and computation power. EXISTS stops as soon as the
logical test proves to be true, whereas COUNT(*) must count each and every row, even
after a match has been found.
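A minimal sketch of the two forms of existence check, using Python's built-in sqlite3 module. The `orders` table and the customer ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
# Hypothetical data: 1000 orders spread over customer ids 0..9.
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(i % 10,) for i in range(1000)],
)

# COUNT-based check: counts every matching row, then compares with zero.
count_based = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 7"
).fetchone()[0] > 0

# EXISTS-based check: the subquery only has to produce one matching row.
exists_based = bool(conn.execute(
    "SELECT EXISTS (SELECT 1 FROM orders WHERE customer_id = 7)"
).fetchone()[0])

# The same check for a customer id that has no orders.
missing = bool(conn.execute(
    "SELECT EXISTS (SELECT 1 FROM orders WHERE customer_id = 42)"
).fetchone()[0])

print(count_based, exists_based, missing)
```

Both checks give the same answer; EXISTS simply expresses that only the presence of a row matters, not how many there are.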
7. Avoid Cartesian Products
• Cartesian products occur when every row from one table is joined with every row from another
table, resulting in a massive dataset. Accidental Cartesian products can severely impact query
performance. Always double-check JOIN conditions to avoid unintended Cartesian
products. Make sure you’re joining the tables based on the specific relationship you want to
explore.
• For Example
• Incorrect JOIN (Cartesian product): SELECT * FROM Authors JOIN Books; (This joins every author
with every book)
• Correct JOIN (retrieves books by author): SELECT Authors.name, Books.title FROM Authors JOIN
Books ON Authors.id = Books.author_id; (This joins authors with their corresponding books
based on author ID).
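The difference in result size is easy to see on a tiny data set. The sketch below uses Python's built-in sqlite3 module; the author and book rows are hypothetical, and CROSS JOIN is used to make the Cartesian product explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE Books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)"
)
# Hypothetical sample data: 3 authors and 4 books.
conn.executemany(
    "INSERT INTO Authors VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")]
)
conn.executemany(
    "INSERT INTO Books VALUES (?, ?, ?)",
    [(1, "T1", 1), (2, "T2", 1), (3, "T3", 2), (4, "T4", 3)],
)

# Cartesian product: every author paired with every book.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM Authors CROSS JOIN Books"
).fetchone()[0]

# Join on the actual relationship: each book paired with its own author.
joined = conn.execute(
    "SELECT Authors.name, Books.title FROM Authors "
    "JOIN Books ON Authors.id = Books.author_id ORDER BY Books.id"
).fetchall()

print(cross_count, len(joined))
```

With 3 authors and 4 books the Cartesian product has 3 × 4 = 12 rows, while the join on author ID returns one row per book.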
8. Consider Denormalization
• Denormalization involves strategically adding redundant data to a database schema to improve
query performance. It can reduce the need for JOIN operations but should be balanced with
considerations for data integrity and maintenance overhead. JOIN operations, which combine
data from multiple tables, can be slow, especially for complex queries. Denormalization aims to
reduce the need for JOINs by copying some data from one table to another.
• For Example
• Imagine tables for “Customers” and “Orders.” Normally, we would link them with a foreign key
(e.g., customer ID) in the Orders table. To speed up queries that retrieve customer information
along with their orders, we could denormalize by adding some customer details (e.g., name,
9. Optimize JOIN Operations
• JOIN operations combine rows from two or more tables based on a related column.
Select the JOIN type that aligns with the data we want to retrieve.
• For example, to find all customers and their corresponding orders (even if a
customer has no orders), use a LEFT JOIN on the customer ID column. The JOIN
operation works by comparing values in specific columns from both tables (the join
condition). Ensure these columns are indexed for faster lookups. Having indexes on
join columns significantly improves the speed of the JOIN.
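The LEFT JOIN case described above can be sketched with Python's built-in sqlite3 module. The table layout, the index name `idx_orders_customer` and the rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
# Index the join column on the orders side (customers.id is already the key).
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Hypothetical data: Ana has two orders, Ben has none.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1)])

# LEFT JOIN keeps every customer, with NULL for customers without orders.
rows = conn.execute(
    """
    SELECT c.name, o.id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
    """
).fetchall()

print(rows)
```

An inner JOIN on the same data would drop Ben, because he has no matching order row.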
Cost-Based Optimization
For a given query and environment, the optimizer assigns a numerical cost
to each step of a possible plan and then adds these values together to get
a cost estimate for the plan or possible strategy. After calculating the
costs of all candidate plans, the optimizer chooses the plan with the
lowest cost estimate.
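The idea of summing step costs and picking the cheapest plan can be illustrated with a toy sketch. The plan names, steps and cost numbers below are invented for illustration; a real optimizer derives its costs from statistics about the data.

```python
# Hypothetical candidate plans: each is a list of (step, estimated cost) pairs.
candidate_plans = {
    "full_scan_then_sort": [("table scan", 120.0), ("sort", 45.0)],
    "index_scan": [("index lookup", 30.0), ("row fetch", 60.0)],
    "index_only": [("covering index scan", 70.0)],
}

def plan_cost(steps):
    """Cost estimate for a plan: the sum of the costs of its steps."""
    return sum(cost for _, cost in steps)

# Estimate every candidate, then choose the one with the lowest estimate.
costs = {name: plan_cost(steps) for name, steps in candidate_plans.items()}
best_plan = min(costs, key=costs.get)

print(costs)
print(best_plan)
```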
Features of cost-based optimization-
• Cost-based optimization is based on the estimated cost of the query to be
optimized.
• A query can be executed along many paths, depending on the available
indexes, sorting methods, constraints, etc.
• The aim of query optimization is to choose the most efficient path for
implementing the query, that is, the algorithm with the lowest possible
cost.
• The query optimizer needs an estimate of the cost of executing each
algorithm so that the most suitable one can be selected for an operation.
Cost Estimation:
To estimate the cost of the different available execution plans or
execution strategies, the query tree is examined.
The cost estimate for a query depends upon the following-
Cardinality-
Cardinality is the number of rows returned by the operations specified in
the query execution plan. The cardinality estimates must be accurate, as
they strongly affect the choice of execution plan.
Selectivity-
Selectivity refers to the proportion of rows that are selected. It depends
upon the condition: a row is selected only if it satisfies the condition.
The condition that is to be satisfied can be any, depending upon the
situation.
Cost-
Cost refers to the amount of resources the system spends to execute the
query. The measure of cost depends upon the work done or the resources used.
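Cardinality and selectivity are linked by a simple estimate: expected rows = table rows × selectivity, where selectivity is taken as the fraction of rows satisfying the condition. The sketch below assumes, purely for illustration, that values are uniformly distributed, so an equality condition on a column with d distinct values has selectivity 1/d. The function names and numbers are hypothetical.

```python
def equality_selectivity(distinct_values):
    """Fraction of rows expected to match `column = constant`,
    assuming values are spread uniformly over the distinct values."""
    return 1.0 / distinct_values

def estimated_cardinality(table_rows, selectivity):
    """Expected number of rows produced by a selection."""
    return table_rows * selectivity

# Hypothetical table: 10,000 rows, 50 distinct values in the filtered column.
selectivity = equality_selectivity(50)
estimate = estimated_cardinality(10_000, selectivity)

print(selectivity, estimate)
```

If the real data is skewed, an estimate based on the uniform assumption can be far off, which is one reason accurate statistics matter to the optimizer.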
Cost Components of Query Execution:
• Access cost to secondary storage- This is the cost of searching for, reading, or writing data
blocks that reside on secondary storage, especially on disk. The cost of searching for records in a
file also depends upon the type of access structure that the file has.
• Memory usage cost- The cost of memory usage can be calculated simply by using the number of
memory buffers that are needed for the execution of the query.
• Storage cost- The storage cost is the cost of storing any intermediate files (files that result from
processing the input but are not the final result) that are generated by the execution strategy for
the query.
• Computational cost- This is the cost of performing in-memory operations on the records within
the data buffers, such as searching for records, merging records, or sorting records. This can also
be called the CPU cost.
• Communication cost- This is the cost that is associated with sending or communicating the query
and its results from one place to another. It also includes the cost of transferring the table and
results to the various sites during the process of query evaluation.

Issues in Cost-Based Optimization:
The following are the issues in cost-based optimization-
• In cost-based optimization, the number of execution strategies that can be considered is not
fixed. It may vary based on the situation.
• The process can be very time-consuming and costly, and it does not always guarantee finding
the optimal strategy.
Heuristic Optimization in DBMS
Cost-based optimization is expensive. Heuristics are used to reduce the number
of choices that must be made in a cost-based approach.

Heuristic optimization transforms the expression tree by using a set of rules
which improve the performance. These rules are as follows −

• Perform the SELECTION operation as early as possible in the query. This should
be the first action on any table. By doing so, we decrease the number of records
carried through the query, rather than using all the records of the tables.
• Perform all projections as early as possible in the query. This is similar to
selection, but it decreases the number of columns carried through the
query.
• Perform the most restrictive join and selection operations first. This means
selecting only those sets of tables and/or views which result in a relatively
small number of records and are strictly necessary for the query. A query
generally executes better when tables with few records are joined.
Steps in heuristic optimization
• Deconstruct the conjunctive selections into a sequence of single selection
operations.
• Move the selection operations down the query tree for the earliest possible
execution.
• First execute those selection and join operations which will produce the
smallest relations.
• Replace a Cartesian product operation followed by a selection operation with
a join operation.
• Deconstruct the projection lists and move them down the tree as far as
possible.
• Identify those subtrees whose operations can be pipelined.
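The benefit of moving a selection below a join can be shown with a toy sketch in plain Python. The relations, the `city` values and the nested-loop `join` helper are all hypothetical; the sketch only compares the two evaluation orders and is not how a DBMS implements joins.

```python
# Hypothetical relations: 20 customers (every fourth one in Pune) and 60 orders.
customers = [{"id": i, "city": "Pune" if i % 4 == 0 else "Delhi"} for i in range(1, 21)]
orders = [{"order_id": 100 + i, "customer_id": 1 + i % 20} for i in range(60)]

def join(left, right):
    """Nested-loop equi-join on customers.id = orders.customer_id."""
    return [{**c, **o} for c in left for o in right if c["id"] == o["customer_id"]]

# Plan A: join everything first, then apply the selection.
joined_all = join(customers, orders)
plan_a = [row for row in joined_all if row["city"] == "Pune"]

# Plan B: apply the selection first, then join the smaller relation.
pune_customers = [c for c in customers if c["city"] == "Pune"]
plan_b = join(pune_customers, orders)

print(len(joined_all), len(pune_customers), len(plan_b))
```

Both plans return the same rows, but Plan A builds an intermediate result of 60 rows and compares 20 × 60 pairs, whereas Plan B joins only 5 customers and compares 5 × 60 pairs.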
