Db2 SQL Tuning

The document provides tips for improving DB2 SQL performance as a developer, including avoiding non-indexable predicates, updating statistics, tuning queries by changing SQL or indexes, and ensuring proper physical clustering to minimize data retrieval during queries. It highlights common issues like table scans and sorts, and how to identify and address them by examining queries with tools like Visual Explain.

Uploaded by

Bathmalakshmi
Copyright
© © All Rights Reserved

DB2 SQL Tuning Tips for Developers
Webinar
Tony Andrews
tandrews@themisinc.com

Twitter
Questions?
I will try my best to get to some questions towards the end of
the webinar.

You can submit questions by typing into the questions area of
your webinar control panel.

Any questions not answered due to time constraints can be
answered afterward via email.

The presentation will be added to our Themis website under
‘Webinar’ at the top of the main page: www.themisinc.com

Webinar Objectives
• Learn what makes queries, programs, and applications perform
poorly
• Learn what you can do as a developer to improve performance
• Better understand what SQL optimization is
• Learn what to do when you see table scans in a query
• Learn the different types of predicates
• Learn the difference between indexable and non-indexable
predicates
• Learn why data statistics and ‘Knowing Your Data’ are so
important
• Learn the top steps to tuning a query or program
• Leave with many SQL standards and guidelines for development

What are some of the key areas that can cause performance
issues within applications, programs, and queries?

• Bad coding practices. Poorly coded SQL
- Non-indexable predicates
- Stage 2 / Residual predicates
- SQL doing more than it needs (extra tables, extra sorts,
etc.)
• Wrong access path / Poor access path. Watch out for table
scans!!
• Poor index design (Low Cardinality, Redundancy, Column
order, etc). Know the application’s workload!
• Too much synchronous I/O
• Too many calls to DB2 from program logic

• Large sorts. Know your data when you see a sort!


• Unneeded materialization of data
• Too much lock contention
• Statistics out of date (especially in test environments).
Need a good test environment with production statistics and
enough data to compare performance tests.
• Wrong clustering order of data

Bad Coding Practice
SQL Tip
1) Take out any and all scalar functions coded on columns in predicates.

For example, this is the most common:

SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE YEAR(HIREDATE) = 2005

Should be coded as:

SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE HIREDATE BETWEEN '2005-01-01' AND '2005-12-31'

V9: Can now create indexes on SQL expressions.
V11: The optimizer now does this date rewrite itself.

Bad Coding Practice
SQL Tip
1) Take out any and all scalar functions coded on columns in predicates. The same applies to arithmetic coded on a column.

For example:

SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE HIREDATE + 7 DAYS > CURRENT DATE

Should be coded as:

SELECT EMPNO, LASTNAME
FROM EMPLOYEE
WHERE HIREDATE > CURRENT DATE - 7 DAYS

V9: Can now create indexes on SQL expressions.
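
The V9 note about indexes on expressions can be sketched as follows. The index name XEMP_HIREYEAR is hypothetical, and which expressions are allowed as an index key depends on the DB2 version and platform, so treat this as an assumption to verify rather than a recommendation. Rewriting the predicate, as shown in the tip, avoids the cost of maintaining an extra index.

```sql
-- Hypothetical index on an expression (the index name is an assumption).
CREATE INDEX XEMP_HIREYEAR
    ON EMPLOYEE (YEAR(HIREDATE));

-- With such an index available, the optimizer may be able to use
-- index access for this predicate even though a scalar function
-- is coded on the column.
SELECT EMPNO, LASTNAME
  FROM EMPLOYEE
 WHERE YEAR(HIREDATE) = 2005;
```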


V11 Stage 1 Predicates Involving Columns in Predicates

New Stage 1 / Indexable predicates

WHERE value BETWEEN COL1 AND COL2
WHERE SUBSTR(COLX, 1, n) = value   (from position 1 only)
WHERE DATE(TS_COL) = value
WHERE YEAR(DT_COL) = value


Bad Coding Practice
Stage 2 Predicates
Use Visual Explain in IBM Data Studio, or query the
DSN_PREDICAT_TABLE directly, to see any stage 2 predicates. Also note
the filter factor information. Example predicate:

WHERE '1900-01-01' BETWEEN DATE_COL1 AND DATE_COL2
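
Querying DSN_PREDICAT_TABLE directly might look like the sketch below. It assumes DB2 for z/OS, that the EXPLAIN tables already exist under the current SQLID, and that 100 is an arbitrary QUERYNO chosen to tag the statement. The explained query is borrowed from the scalar-function tip earlier in the deck.

```sql
-- Explain the statement; QUERYNO 100 is an arbitrary tag for this sketch.
EXPLAIN ALL SET QUERYNO = 100 FOR
SELECT EMPNO, LASTNAME
  FROM EMPLOYEE
 WHERE YEAR(HIREDATE) = 2005;

-- List each predicate with the stage at which it is evaluated
-- (MATCHING, SCREENING, STAGE1, STAGE2) and its estimated filter factor.
SELECT QBLOCKNO, PREDNO, STAGE, FILTER_FACTOR, TEXT
  FROM DSN_PREDICAT_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PREDNO;
```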
Tuning Approaches
• Explain the Query

• Change the SQL. Rewrite the query or predicates


a different way

• Redesign the program flow

• Update / Improve data statistics

• Change Physical Design

What Causes a Table Scan?
• The predicate(s) may be poorly coded in a non-indexable way.
• The predicates in the query do not match any available indexes on
table.
• The table could be small, and DB2 decides a tablespace scan may
be faster than index processing.
• The catalog statistics say the table is small, or maybe there are no
statistics on the table.
• The predicates are such that DB2 thinks the query is going to
retrieve a large enough number of rows to warrant a
tablespace scan. Check the Filter Factor!
• The predicates are such that DB2 picks a non-clustered index, and
the number of pages to retrieve is high enough based on total
number of pages in the table to require a tablespace scan.
• The tablespace file or index files could physically be out of shape
and need a REORG.

Tuning Approach: Change the SQL
and/or Change the program Design

• Can any predicates be rewritten (and still keep the same logic)?
• Can the query be rewritten?
• Can we combine any queries in the program?

Sometimes there can be 2, 3, 4, 5, or 6 different ways to code an
SQL statement and return the same results. They
do not all optimize the same!

Change the SQL Example
Each of these will produce the same results, but operate very differently. Typically one
will perform better than the other depending on data distributions. For example:

Non-correlated subquery:

SELECT E.EMPNO, E.LASTNAME
FROM EMP E
WHERE E.EMPNO IN
  (SELECT D.MGRNO
   FROM DEPT D
   WHERE D.DEPTNO LIKE 'D%')

Can also be coded as a correlated subquery:

SELECT E.EMPNO, E.LASTNAME
FROM EMP E
WHERE EXISTS
  (SELECT 1
   FROM DEPT D
   WHERE D.MGRNO = E.EMPNO
   AND D.DEPTNO LIKE 'D%')

Or as a 2-table join, but watch out for possible duplicates (if there is a 1-to-many relationship):

SELECT DISTINCT E.EMPNO, E.LASTNAME
FROM EMP E, DEPT D
WHERE E.EMPNO = D.MGRNO
AND D.DEPTNO LIKE 'D%'
Tuning Approach: Redesign the
Program Flow
• Know your numbers. How many inserts, updates, deletes,
selects, open cursors, and fetches per execution? Can
they be cut down?
• Code relationally and not procedurally
• Know the many different ways to code for mass inserts,
mass deletes, and mass updates.
• Minimize the number of times your code sends SQL
statements to DB2.
• Take advantage of multi row processing, merge, select from
insert/update/delete, multi table joins, etc.
• Order incoming data by either the primary key or the column(s) of
the index chosen by DB2.
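
As one illustration of cutting down calls to DB2, the sketch below uses select-from-update so that a single statement both changes the rows and returns the new values. The SALARY column and the 5% raise are assumptions made for the example and are not from the slides. In an embedded program a multi-row result like this would be read through a cursor.

```sql
-- Procedural pattern: an UPDATE followed by a separate SELECT to read
-- back the new values, which means two calls to DB2.
-- Relational pattern: one statement that updates and returns the rows.
SELECT EMPNO, SALARY
  FROM FINAL TABLE
       (UPDATE EMP
           SET SALARY = SALARY * 1.05   -- SALARY is an assumed column
         WHERE DEPTNO = 'D11');
```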

Tuning Approach: Explain the Query
• Any Table Scans? What’s causing them?
• Any Index Scans? What’s causing them?
• Any Partition Scans? What’s causing them?
• Which Index? Matching columns? Screening?
• Any Sorts? What’s causing them? How big is the sort?
• Any Join sorts? What other queries join to that table?
• Any subqueries? Can they be rewritten?
• Any materialization from nested table expressions (NTEs) and
common table expressions (CTEs)? Can they be rewritten? (Not
saying these are always bad…)
• Check the predicates. Stage 2 or Residual? Filter factor?
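
Several of these checks can be read straight from PLAN_TABLE. The sketch below assumes DB2 for z/OS, that the statement has already been explained, and that it was tagged with QUERYNO = 100 (an arbitrary value chosen for this example).

```sql
SELECT QBLOCKNO, PLANNO, TNAME,
       ACCESSTYPE,     -- 'R' = table space scan, 'I' = index access
       ACCESSNAME,     -- name of the index used, if any
       MATCHCOLS,      -- number of matching index columns (0 = non-matching scan)
       INDEXONLY,      -- 'Y' if no data pages need to be read
       PREFETCH,       -- 'S' = sequential prefetch, 'L' = list prefetch
       SORTN_JOIN, SORTC_JOIN,                   -- sorts done for joins
       SORTC_ORDERBY, SORTC_GROUPBY, SORTC_UNIQ  -- sorts done for the result
  FROM PLAN_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PLANNO;
```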

Update / Improve Data Statistics

• Are statistics up to date (or close enough)?


• Do all columns have cardinality statistics?
• Are there any columns used in predicates with skewed
distribution of data?
- Are there statistics to support the data skew?
- Frequency value vs Histogram
- Is your code taking advantage of the statistics by either
hard coding or re-optimizing at runtime?
• Has data changed in the table (10% or more increase or
decrease) since last compile? KNOW YOUR DATA!!!!
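
The first few questions can be checked against the DB2 for z/OS catalog, as in the sketch below. The schema name MYSCHEMA is hypothetical; the EMP table comes from the examples elsewhere in this deck.

```sql
-- Table-level statistics. CARDF = -1 means statistics were never collected;
-- STATSTIME shows when they were last updated.
SELECT NAME, CARDF, NPAGESF, STATSTIME
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'MYSCHEMA'     -- hypothetical schema
   AND NAME    = 'EMP';

-- Column cardinality statistics (COLCARDF = -1 means none collected).
SELECT NAME, COLCARDF, STATSTIME
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBCREATOR = 'MYSCHEMA'
   AND TBNAME    = 'EMP';

-- Distribution statistics that support data skew:
-- TYPE 'F' = frequency values, 'H' = histogram.
SELECT NAME, TYPE, COLVALUE, FREQUENCYF, STATSTIME
  FROM SYSIBM.SYSCOLDIST
 WHERE TBOWNER = 'MYSCHEMA'
   AND TBNAME  = 'EMP';
```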

Update / Improve Data
Statistics
Statistics in Test vs Production.

- Just copying statistics is not good enough. You need enough data to see
run time differences.
- You have to test the different code and compare CPU times.
- DB2 is not always correct in its estimates.
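Program Hard Coding for
Performance. Know your data!

STATUS_CODE current values:
‘A’ 90% of data
‘I’ 6% of data
‘T’ 4% of data

1) Select ….. From Table
Where ……..
and ……..
and Status_Code = 'A'
;

2) Select ….. From Table
Where ……..
and ……..
and Status_Code = :HV
and Status_Code <> 'A'
;

Query 1 hard codes the dominant value, so the optimizer can use the
distribution statistics for ‘A’. Query 2 handles the remaining values
through a host variable; the extra <> 'A' predicate tells the optimizer
that the host variable never holds the dominant value.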

Physical Design
Make sure of the clustering order of data in your
tablespaces.
Tables should be physically clustered in the order that they are
typically processed by queries processing the most data.
This ensures the least amount of ‘Getpages’ when
processing.
Long running queries with ‘List Prefetch’ and ‘Sorts’ in many
join processes are good indicators that maybe a table is not
in the correct physical order.
Application queries that join to a table via the foreign key rather than
the primary key are a good indicator.
Too many ‘Getpages’ relative to rows returned is another.
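
To see which index currently defines the clustering order and how well the data follows it, the DB2 for z/OS catalog can be queried as in this sketch. The schema name MYSCHEMA is hypothetical.

```sql
-- CLUSTERING = 'Y' marks the index defined as the clustering index.
-- CLUSTERRATIOF close to 1 means the rows are largely in that index's order.
SELECT NAME, CLUSTERING, CLUSTERED, CLUSTERRATIOF, STATSTIME
  FROM SYSIBM.SYSINDEXES
 WHERE TBCREATOR = 'MYSCHEMA'   -- hypothetical schema
   AND TBNAME    = 'EMP';
```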

Change the Physical Design ?
EMP table clustered by EMPNO
EMPNO   LASTNAME    …  DEPTNO
000010  HAAS        …  A00
000020  THOMPSON    …  B01
000030  KWAN        …  C01
000050  GEYER       …  E01
000060  STERN       …  D11
000070  PULASKI     …  D21
000090  HENDERSON   …  E11
000100  SPENSER     …  E21
000110  LUCHESI     …  A00
000120  O’CONNELL   …  A00
000130  QUINTANA    …  C01
000140  NICHOLLS    …  C01
000150  ADAMSON     …  D11
000160  PIANKA      …  D11

Should this table be in EMPNO Primary Key order?

It Depends…..

What happens here? Where are all the rows that
have ‘A00’ as a DEPTNO value?

SELECT *
FROM EMP
WHERE DEPTNO = 'A00'

If there were 100 rows that contain this value, they could be on 100
pages of data. Yes?
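
If the workload mostly processes EMP by DEPTNO, one option is to move the clustering to an index on that column. The sketch below assumes DB2 for z/OS. The index names XEMP_EMPNO and XEMP_DEPTNO are hypothetical, and, as the earlier slide says, whether this is the right choice depends on the application's workload.

```sql
-- Stop clustering by the existing EMPNO index (index name is an assumption).
ALTER INDEX XEMP_EMPNO NOT CLUSTER;

-- Make a DEPTNO index the clustering index (index name is an assumption).
CREATE INDEX XEMP_DEPTNO
    ON EMP (DEPTNO)
    CLUSTER;

-- Existing rows are not moved by these statements. They are placed in the
-- new order the next time the table space is reorganized (REORG utility).
```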

Thank you for allowing me to share some of my
experience and knowledge today!

Tony Andrews
tandrews@themisinc.com

• I hope that you learned something new today


• I hope that you are a little more inspired when it comes to
SQL coding and performance tuning

The material in this presentation is further developed
in the following Themis courses:

DB1032 – DB2 for z/OS Performance and Tuning
DB1041 – DB2 z/OS Advanced SQL
DB1037 – Advanced Query Tuning using IBM Data Studio
DB1051 – High Performance Application Design
DB1006 – DB2 LUW Advanced Query Tuning using IBM Data Studio

Links to these courses may be found at: www.themisinc.com

Tony’s Email: tandrews@themisinc.com


Twitter: @ThemisTraining
“I have noticed that when the developers get
educated, good SQL programming standards are in
place, and program walkthroughs are executed
correctly, incident reporting stays low, CPU costs do not
get out of control, and most performance issues are
found before promoting code to production.”

Education. Check out
www.db2sqltuningtips.com
www.ibmpress.com
www.amazon.com
Finally! A book of DB2 SQL tuning tips for
developers, specifically designed to improve
performance.

DB2 SQL developers now have a handy
reference guide with tuning tips to improve
performance in queries, programs and
applications.

As of DB2 V10.
Education. Check Out
www.themisinc.com

• On-site and Public
• Instructor-led
• Hands-on
• Customization
• Experience
• Over 30 DB2 courses
• Over 400 IT courses

US 1-800-756-3000
Intl. 1-908-233-8900
