Using Statspack To Track Down Bad Code
Introduction
Many times a developer is given the task of helping the DBA find and resolve an application's poorly performing code.
Of course, first one must define what constitutes poor performance. Is poor performance a certain number of logical or
physical I/Os? Is it a certain number of consistent reads? I believe it needs to be defined in the context of the specific
application. For example, in an order-entry situation inserts should be expected to complete in under a second. Another example would be
that in a customer service application the customer information screen should be expected to populate in less than
seven seconds. Yet another would be that a decision support system needs to return results within 15 minutes. Each of these
applications has different expectations for performance, and each must be tuned with those expectations in mind.
Another key concept when tuning is that enough is enough. This means setting specific tuning goals and,
when you reach them, moving on to the next problem. Tuning Oracle has been likened to a video game with infinite levels: there is
always a way to get a few more milliseconds or microseconds of performance from Oracle, so you have to know when to quit!
In this presentation we will look at using the Statspack tool (and, by extension, AWR) for finding and correcting bad SQL
in an application.
Setting Up Statspack
Before you use Statspack it must be installed. This is fairly simple to do. To install Statspack you must be on release 8.1.7 of
Oracle or later (you can go back as far as 8.1.5, but results can be interesting with those older releases). Also, be very careful
about what you have running; we have seen issues with early versions of ColdFusion and Statspack colliding. The
steps to install are:
1. Make sure dbms_job and dbms_shared_pool are installed in the system.
2. Review the spcreate.sql series of called scripts and eliminate the calls to install the packages in step 1.
3. Create a perfstat tablespace
4. Run the spcreate.sql script, usually in the $ORACLE_HOME/rdbms/admin directory
5. Use the statspack.snap procedure to test the install
6. Start automated statistics runs with spauto.sql
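The steps above can be sketched as a SQL*Plus session. This is a minimal sketch, assuming a Unix datafile path and sizes chosen for illustration; adjust the tablespace name, datafile location, and size for your environment:

```sql
-- Connect as a privileged user (the datafile path and size
-- below are illustrative assumptions)
CONNECT / AS SYSDBA

-- Step 3: create a tablespace to hold the Statspack objects
CREATE TABLESPACE perfstat
  DATAFILE '/u01/oradata/PROD/perfstat01.dbf' SIZE 500M;

-- Step 4: run the install script; it prompts for the perfstat
-- password and for the default and temporary tablespaces
@?/rdbms/admin/spcreate.sql

-- Step 5: test the install by taking a manual snapshot
CONNECT perfstat/perfstat
EXECUTE statspack.snap;

-- Step 6: schedule automatic hourly snapshots via dbms_job
@?/rdbms/admin/spauto.sql
```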
The spcreate.sql script runs a series of other scripts that create the appropriate user, give it the proper grants, and build
the proper tables and packages for the Statspack process. By default it creates the perfstat user with the password perfstat. Note
that in step 2 we suggest editing the appropriate script to remove the build/rebuild of dbms_job and dbms_shared_pool. This
is to prevent locking issues should these packages already be installed. It is strongly suggested that these be prebuilt so you
can ensure there are no issues.
Once statspack is installed, you can begin testing.
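For example, a basic test run brackets a workload with two snapshots and then reports on the interval between them (spreport.sql prompts for the actual begin and end snapshot IDs and a report file name):

```sql
-- Take a snapshot before the workload
CONNECT perfstat/perfstat
EXECUTE statspack.snap;

-- ... run the workload to be measured ...

-- Take a second snapshot after the workload
EXECUTE statspack.snap;

-- Generate a report covering the interval between the two
-- snapshots; the script prompts for begin/end snapshot IDs
@?/rdbms/admin/spreport.sql
```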
AWR Usage
In 10g we now have AWR. The Automatic Workload Repository (AWR) defaults to a collection interval every 30 minutes
and collects data that is the foundation for all of the other self-tuning features. AWR is very much like STATSPACK,
especially the level-5 STATSPACK collection mechanism where top SQL is collected every hour, based on your rolling
thresholds for high-use SQL. In addition to the SQL, AWR collects detailed run-time statistics on the top SQL (disk reads,
executions, consistent gets) and uses this information to adjust the rolling collection threshold. This technique ensures that
AWR always collects the most resource intensive SQL. The AWR system provides reports that can be run against the AWR
tables that provide the same type of data a statspack report used to, and, much more. AWR takes advantage of the ADDM
(Adam) to gather its statistics using non-SQL based collection techniques which is much more efficient than the old SQL
based statspack and less intrusive on the database. The awrrpt.sql script is used to generate the reports that take the place of
the Statspack reports in 10g.
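As a sketch, the AWR collection interval and retention can be adjusted, a manual snapshot taken outside the schedule, and a report generated as follows (the 30-minute interval and 14-day retention shown are illustrative assumptions, not recommendations):

```sql
-- Adjust AWR: snapshot every 30 minutes, retain 14 days of
-- history (both values are in minutes; settings are illustrative)
BEGIN
  dbms_workload_repository.modify_snapshot_settings(
    interval  => 30,
    retention => 14 * 24 * 60);
END;
/

-- Take a manual snapshot outside the regular schedule
EXECUTE dbms_workload_repository.create_snapshot;

-- Generate an AWR report; the script prompts for the report
-- type (HTML or text) and the begin/end snapshot IDs
@?/rdbms/admin/awrrpt.sql
```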
SQL ordered by Gets for DB: TEST10 Instance: TEST10 Snaps: 1240 -1241
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
1,340,065 1 1,340,065.0 3.3 27.85 27.75 2183036294
Module: SQLNav5.exe
SELECT /*+ FIRST_ROWS */ KEY||','||WRTN_PREM||','||WRTN_EXPSR||'
,'||EARNED FROM ( SELECT /*+ INDEX(pcoopt PK_POL_COVG_ON_OF
F_PREM_TRAN) */ pcoopt.pol_id||','||pcoopt.pol_tran_id
||','||pcoopt.veh_unit_nbr||','|| pcoopt.covg_mp_cd||'
,'||pcoopt.covg_cd KEY, SUM(pcoopt.veh_covg_wrtn_prem_
select a.column_name,a.column_length,a.column_position,b.column_
expression,decode(a.descend,'ASC','N','Y') descend,a.index_owner
object_owner,a.index_name object_name from sys.dba_ind_columns
a,sys.dba_ind_expressions b where a.index_owner=:schema and a.in
dex_name=:object_name and b.index_owner(+)=a.index_owner and b.i
-------------------------------------------------------------
SQL ordered by Reads for DB: TEST10 Instance: TEST10 Snaps: 1240 -1241
-> End Disk Reads Threshold: 1000
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
90,051 20,322 4.4 1.9 90.85 3506.03 2624188903
Module: SQL*Plus
SELECT /*+ INDEX(p1 iu_pol_isc) */ t1.pol_tran_eff_dt
FROM pol p1, pol_tran t1 WHERE p1.p
ol_nbr = :b2 AND p1.ign_sys_cd =
:b1 AND t1.pol_id = p1.pol_id A
ND t1.pol_tran_typ_cd = 'CN' AND
F_PREM_TRAN) */ pcoopt.pol_id||','||pcoopt.pol_tran_id
||','||pcoopt.veh_unit_nbr||','|| pcoopt.covg_mp_cd||'
,'||pcoopt.covg_cd KEY, SUM(pcoopt.veh_covg_wrtn_prem_
We know it is the same code, even though we can't see all of it, because the hash value (2624188903) is identical. This, then, is
probably the code we should look at optimizing. We can obtain the entire SQL statement by extracting
it from the V$SQLTEXT view using the hash value provided. Using this hash value, we can also (if we are on 9i or later)
extract the explain plan for the code from the V$SQL_PLAN view as well.
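A minimal sketch of both lookups, using the hash value from the report (on a system with multiple child cursors for the statement, the plan query may return more than one plan):

```sql
-- Reassemble the full statement; V$SQLTEXT stores the text in
-- 64-byte pieces ordered by the PIECE column
SELECT sql_text
  FROM v$sqltext
 WHERE hash_value = 2624188903
 ORDER BY piece;

-- 9i and later: pull the execution plan for the same statement
SELECT LPAD(' ', depth) || operation || ' ' || options AS plan_step,
       object_name
  FROM v$sql_plan
 WHERE hash_value = 2624188903
 ORDER BY id;
```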
Another statement that appears in both sets is this one:
1,466 1 1,466.0 0.0 10.10 13.08 3656618493
Module: SQLNav5.exe
Select /*+ FIRST_ROWS */ Key||','||WRTN_PREM||','||WRTN_EXPSR||
','||EARNED From ( Select /*+ INDEX(pcoopt POL_COVG_ON_OFF_PRE
M_TRAN_IDX1) */ --/*+ INDEX(pcoopt POL_COVG_ON_OFF_PREM_TRAN_I
DX1) */ --- /*+ INDEX(pcoopt PK_POL_COVG_ON_OFF_PREM_TRAN) */
pcoopt.pol_id||','||pcoopt.pol_tran_id||','||pcoop
This pretty much tells us that these statements are the ones we should concentrate our tuning efforts on first, then move
on to the other application-based statements.
But how about clear indicators of troubled SQL? One issue I see over and over again is recursion. Recursion occurs because
a statement doesn't use bind variables. For example, look at Figure 4.
Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 12 07-Jun-04 17:53:55 117 8.2
End Snap: 13 07-Jun-04 18:03:18 107 7.4
Elapsed: 9.38(mins)
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
SQL ordered by Reads for DB: LINEAR Instance: LINEAR Snaps: 12 -13
-> End Disk Reads Threshold: 1000
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
2,081,753 54 38,551.0 89.7 143.82 198.66 992807648
Module: ? @wsrv1.linearlive.com (TNS V1-V3)
SELECT mailer_id, first_name, last_name, zip, tracking_number, c
lass, school_name, year FROM mailer WHERE tracking_number = :tra
cking_number ORDER BY year DESC
% Total
Parse Calls Executions Parses Hash Value
------------ ------------ -------- ----------
4,025 8,043 8.72 3300435981
Module: ? @wsrv1.linearlive.com (TNS V1-V3)
SELECT val FROM storage WHERE sid = :sid AND name = :name
-------------------------------------------------------------
Figure 4: Problem SQL Indications
In looking at the waits, events, and SQL in the above excerpts, we see a lot of CPU time being used, lots of buffer gets, and
lots of physical reads. We also see that our key waits are for SQL area (latch free) events. This usually indicates issues with
recursion. We can see that the various parse-related ratios are very much less than 100%, which is the ultimate tuning goal; a
ratio of 100% would indicate that no re-parsing (at least no hard parsing) was occurring. Re-parsing is generally caused by a lack of bind
variables. As we look through the SQL in Figure 4 we see a number of SQL statements that are not using bind variables and
some that use a mix of literals and bind variables.
In this situation we can apply the same comparisons we used for the buffer gets and the physical reads, as well as the non-use of bind
variables, to find and repair the problem SQL statements. If you are unable to correct the non-use of bind variables, setting the
CURSOR_SHARING initialization parameter will help with the parse situation.
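As an illustration, reusing the storage query from Figure 4: the literal form below forces a new hard parse for every distinct value, while the bind form is parsed once and soft-parsed thereafter; when the application code cannot be changed, CURSOR_SHARING tells Oracle to replace the literals with system-generated binds at parse time (the literal values shown are hypothetical):

```sql
-- Literal values: each distinct value is a new statement and a
-- new hard parse (the values 101/102 are hypothetical)
SELECT val FROM storage WHERE sid = 101 AND name = 'cart';
SELECT val FROM storage WHERE sid = 102 AND name = 'cart';

-- Bind variables: one shared cursor, soft parses after the first
VARIABLE sid NUMBER
VARIABLE name VARCHAR2(30)
EXECUTE :sid := 101
EXECUTE :name := 'cart'
SELECT val FROM storage WHERE sid = :sid AND name = :name;

-- Last resort when the code cannot be fixed: have Oracle rewrite
-- literals into binds (FORCE from 8.1.6, SIMILAR from 9i)
ALTER SYSTEM SET cursor_sharing = 'FORCE';
```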
Summary
In this paper I have tried to convey the importance of using Statspack to help find and isolate SQL statements that need
tuning. We have covered the installation and use of Statspack and have discussed AWR and its use alongside Statspack.
Developers should use Statspack or AWR for point monitoring of development environments and for continuous
monitoring of production environments.