
Performance Counters


Performance Monitoring

Ramanjaneyulu Narra
10/21/2009

Table of Index

Foreword
Windows
UNIX Flavors
Apache Server
JVM Statistics
Tomcat Server
WebLogic Server
WebSphere
Oracle Database
MS SQL Server


Foreword:
This document is a refresher on what to consider while monitoring servers. It will be most helpful
to readers who already have some knowledge of Windows monitoring, UNIX monitoring, Apache server,
Tomcat server, WebLogic, WebSphere, Oracle and MS SQL Server. Others can use it as a guide from
which to start exploring and learning about these topics. Hope this document will be helpful.
Happy reading.

Windows:
CPU - % Processor Time – Percentage of time the processor is being utilized

CPU - % User Time – Percentage of processor time spent in user mode (% Processor Time minus
% User Time yields the OS time)

CPU - % Idle Time – Percentage of time the processor is free

Memory – Available Bytes (Total free RAM)

Committed Bytes (Total virtual memory that has been committed, that is, backed by RAM or the page file)

Page Faults – The number of times a requested page was not found in the working set of the process.
If the page is found elsewhere in RAM, it is called a soft fault; if it is not in RAM and has to be
retrieved from the hard disk, it is called a hard page fault. Soft faults are cheap because the page is
still in RAM. A high rate of hard faults indicates that the machine has too little RAM.

Disk – Reads per Second (Number of read operations on the disk in a second)

Writes per Second (Number of write operations on the disk in a second)

% Disk Busy – Percentage of time the disk was busy

Avg Disk Queue Length – If the disk is already in use when another request arrives, the request waits
in a queue. This counter gives the average length of that queue. A high value indicates a disk
bottleneck; common causes are excessive logging or too little RAM (which forces paging).

IO per Second – Reads per second + Writes per second

Network – Bytes Sent per Sec (Total bytes sent on the network card in a second)

Bytes Received per Sec (Total bytes received on the network card in a second)

Total Bytes per Sec (Total bytes sent and received over the network card in a second)

TCP – Packets Retransmitted (Total number of packets retransmitted. A high number of retransmissions
indicates a network issue: the receiver could not acknowledge receipt of the packets)

TCP – Connections Established (Total number of TCP connections established; should follow the same
pattern as the user load)

TCP – Active Connections (Total number of connections opened by this machine to other servers)

TCP – Passive Connections (Total number of connections opened by other servers or machines to this
machine)

TCP – Connections Reset (Total number of connections that were reset. A high number of resets
indicates a network issue or port unavailability)

Processor Queue Length (from the System object) – The length of the processor queue. Ideally the
average value should not be more than double the number of processors.

If we are monitoring Web/App/DB servers, we can also add the Process object in PerfMon and select
the relevant processes to see CPU and memory utilization at the process level. For example, when
monitoring the Apache server, we can go to the Process object, select the Apache process and select
counters such as % Processor Time, % User Time and Private Bytes.

To monitor Windows resources, we use PerfMon, a built-in utility in the Windows OS.
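The derived values mentioned above (OS time, IO per second and the processor queue rule of thumb)
can be sketched in a few lines of Python. This is only an illustration: the class, the field names
and the sample numbers are hypothetical and are not PerfMon's own names or real measurements.

```python
from dataclasses import dataclass


@dataclass
class WindowsSample:
    """One set of counter readings; field names are illustrative only."""
    processor_time_pct: float      # % Processor Time
    user_time_pct: float           # % User Time
    disk_reads_per_sec: float
    disk_writes_per_sec: float
    processor_queue_length: float  # average over the test window
    processor_count: int


def os_time_pct(s: WindowsSample) -> float:
    # Processor time minus user time is the share spent in the OS.
    return s.processor_time_pct - s.user_time_pct


def io_per_sec(s: WindowsSample) -> float:
    # IO per second is reads per second plus writes per second.
    return s.disk_reads_per_sec + s.disk_writes_per_sec


def queue_too_long(s: WindowsSample) -> bool:
    # Rule of thumb from the text: the average queue length should not
    # be more than double the number of processors.
    return s.processor_queue_length > 2 * s.processor_count


# Made-up readings for a 4-processor machine.
sample = WindowsSample(70.0, 55.0, 120.0, 80.0, 9.0, 4)
print(os_time_pct(sample), io_per_sec(sample), queue_too_long(sample))
```

A queue length of 9 on 4 processors is above the limit of 8, so the last check flags it.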

UNIX Flavors:
CPU – User Time, System Time, Idle Time

Memory – Free Memory, Cached Memory, Swapped Memory (if this counter keeps increasing, it indicates
that the machine has too little RAM)

Disk – KB read/sec, KB written/sec, IO/sec

Network – Rx Bytes (Received Bytes), Tx Bytes (Transmitted Bytes) on each Ethernet card.

TCP – Connections established, Connections Reset, Active Connections, Idle Connections

Note: These names may vary with the flavor of the OS, but the ones above are broadly recognizable.
We use vmstat to get the CPU and memory statistics, iostat to get the disk statistics, and
netstat -s -t and netstat -i to monitor the network and TCP statistics.

We can connect to the UNIX machine through PuTTY and execute these commands, writing the output to
a log file. The log files can later be copied to the local machine with another utility named
WinSCP, which provides a UI for the UNIX file structure.
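Once a vmstat log has been copied to the local machine, it can be summarized with a short script.
The sketch below assumes the Linux vmstat column layout (a header row starting with "r" and
columns named swpd, us, sy, id and so on); other UNIX flavors use different columns, so the parser
keys on the header rather than on fixed positions. The sample text is illustrative, not real output.

```python
def parse_vmstat(text: str) -> list[dict[str, int]]:
    """Parse vmstat output into one dict per sample row, keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name row: r b swpd free ... us sy id wa st
            header = fields
        elif header and fields[0].isdigit() and len(fields) == len(header):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


# Illustrative log content (assumed Linux layout, made-up numbers).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812344  20480 402112    0    0    12     8  150  300 12  3 84  1  0
 2  0   1024 640000  20480 402112    0    4    10    40  180  350 35  9 54  2  0
"""

samples = parse_vmstat(SAMPLE)
avg_user = sum(s["us"] for s in samples) / len(samples)
swap_growing = samples[-1]["swpd"] > samples[0]["swpd"]
print(f"avg user CPU: {avg_user:.1f}%, swap growing: {swap_growing}")
```

Growing swap usage across the log is the signal the text describes as too little RAM.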

Apache Server:
Ready Children – Total number of threads (on Windows) or processes (on UNIX flavors) available to
serve incoming requests

Busy Children – Total number of threads/processes that are currently serving requests

Total Children – Ready Children + Busy Children

Configuration file: httpd.conf (on both Windows and UNIX flavors)

Important configuration elements that can be set:

MaxClients – The maximum number of client requests that can be served simultaneously

StartServers – The number of processes created when the Apache server is started

MinSpareServers – The minimum number of idle processes that should be available to serve new
requests. If this value is set to 10, there should always be 10 idle processes ready to take up new
load. If the number of idle processes falls below this value, new processes are started automatically.

MaxSpareServers – The maximum number of processes that can be kept idle. If this value is set to 20
and there are 45 idle processes, Apache terminates 25 of them so that the idle processes come down
to 20.

ThreadsPerChild – The number of threads created by each child process (applies to threaded
configurations, such as Apache on Windows)

KeepAlive (On/Off) – If this is Off, the connection is closed after each request is processed. If it
is On, the server waits for another request on the same connection for up to the KeepAliveTimeout
setting; if no request arrives within that time, the connection is closed automatically. This
reduces the CPU time spent establishing new connections.

MaxRequestsPerChild – The maximum number of requests that a child process serves before it is
terminated and replaced. Zero means unlimited. Setting a nonzero value releases the memory occupied
by the process, which is helpful if there are memory leaks in the application. (The number of
requests allowed on a single connection when KeepAlive is On is a separate setting,
MaxKeepAliveRequests.)

While increasing any of these values, we should consider the hardware constraints. Each process
typically occupies about 15 MB; the actual figure may be higher and depends on the application type.
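The spare-server behaviour and the hardware constraint described above come down to simple
arithmetic. The following Python sketch uses hypothetical function names and an assumed machine
(4096 MB of RAM with 1024 MB reserved for the OS and other services); only the 15 MB per-process
figure and the 45/20 example come from the text.

```python
def max_clients_for_ram(total_ram_mb: int, reserved_mb: int,
                        per_process_mb: int = 15) -> int:
    """Largest client limit whose processes still fit in the remaining RAM."""
    return max(0, (total_ram_mb - reserved_mb) // per_process_mb)


def processes_to_terminate(idle: int, max_spare: int) -> int:
    """Idle processes above the maximum spare limit are terminated."""
    return max(0, idle - max_spare)


def processes_to_start(idle: int, min_spare: int) -> int:
    """Processes started to bring the idle count back up to the minimum."""
    return max(0, min_spare - idle)


# Assumed machine: 4096 MB RAM, 1024 MB kept for the OS and other services.
print(max_clients_for_ram(4096, 1024))
# Example from the text: 45 idle processes with a maximum of 20 spare.
print(processes_to_terminate(45, 20))
# Hypothetical case: 6 idle processes with a minimum of 10 spare.
print(processes_to_start(6, 10))
```

If the measured per-process size is higher than 15 MB, pass it as per_process_mb and the limit
shrinks accordingly.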

We can monitor the Apache server using the status page. The URL for this is
http://localhost/server-status; http://localhost/server-status?refresh=10 will refresh the status
page every 10 seconds.

To enable status reports only for browsers from the foo.com domain, add this code to your httpd.conf
configuration file:

<Location /server-status>
SetHandler server-status

Order Deny,Allow
Deny from all
Allow from .foo.com
</Location>
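The status page also has a machine-readable form (server-status?auto) that is easier to log during
a test. The sketch below parses such text in Python. The field names BusyWorkers and IdleWorkers
are what Apache 2.x reports; treat them as an assumption for other versions. The sample text is
illustrative, and fetching the page from a live server is left as a comment.

```python
def parse_server_status(text: str) -> dict[str, str]:
    """Split 'Key: value' lines from server-status?auto into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Illustrative content. On a live server it could be fetched with
# urllib.request.urlopen("http://localhost/server-status?auto").
SAMPLE = """\
Total Accesses: 1500
Total kBytes: 4200
BusyWorkers: 6
IdleWorkers: 14
"""

status = parse_server_status(SAMPLE)
busy = int(status["BusyWorkers"])   # Busy Children
ready = int(status["IdleWorkers"])  # Ready Children
print(f"busy={busy} ready={ready} total={busy + ready}")
```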

JVM Statistics:
JVM stands for Java Virtual Machine. Monitoring the JVM is key to identifying memory leaks in a
Java-based application.

» The JVM heap is divided into three major parts:

    » Young generation

    » Permanent generation

    » Old (tenured) generation

» The young generation is further divided into:

    » One Eden space

    » Two Survivor spaces

» Each application has its own memory requirements.

» Monitor the GC logs.

» For applications that allocate large objects, use a larger heap size (1-2 GB).

» For applications that create small objects, use a relatively smaller heap size (128-512 MB).

» Use the command-line parameters -Xms and -Xmx to size the overall heap.

» Set -Xms and -Xmx to the same size for better performance.

jstat is used to monitor JVM memory usage. The main counters we get from jstat are the number of
young-generation collections (YGC), the time spent in them (YGCT), the number of full collections
(FGC), the time spent in them (FGCT) and the total GC time (GCT). If full GCs happen frequently,
either the heap configured for the JVM is too small or there are too many live objects (objects
that are not being cleared, which points to a memory leak).
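To see how often full GCs happen during a test, the first and last rows of a jstat log can be
compared. The Python sketch below assumes the output of jstat -gcutil with the column names YGC,
YGCT, FGC, FGCT and GCT; the exact set of columns depends on the JVM version, so the parser reads
the header row instead of relying on positions. The sample rows are made up for illustration.

```python
def parse_jstat_gcutil(text: str) -> list[dict[str, float]]:
    """Parse 'jstat -gcutil' output into one dict per row, keyed by column."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(float, row))) for row in rows]


# Illustrative log: first and last sample of a test run (made-up values).
SAMPLE = """\
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  45.10  60.25  40.00  55.00    120    1.800     2    0.600    2.400
  0.00  50.30  10.75  92.00  55.10    150    2.300    12    6.100    8.400
"""

samples = parse_jstat_gcutil(SAMPLE)
first, last = samples[0], samples[-1]
young_gcs = int(last["YGC"] - first["YGC"])
full_gcs = int(last["FGC"] - first["FGC"])
full_gc_seconds = last["FGCT"] - first["FGCT"]
print(f"young GCs: {young_gcs}, full GCs: {full_gcs}, "
      f"time in full GC: {full_gc_seconds:.1f}s")
```

A rising full GC count together with old-generation occupancy (the O column) that stays high
after each collection is the pattern the text associates with a leak or an undersized heap.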

Tomcat Server:
Similar to Apache, Tomcat can also be monitored using its status page. The URL looks like
http://localhost/manager/status. Access to this page requires a user with the manager role, so we
need to add entries to the Tomcat users file. Add the following lines to
$CATALINA_HOME/conf/tomcat-users.xml.

<?xml version='1.0' encoding='utf-8'?>


<tomcat-users>
<role rolename="tomcat"/>
<role rolename="role1"/>
<role rolename="manager"/>
<role rolename="admin"/>
<user username="tomcat" password="tomcat" roles="tomcat"/>
<user username="both" password="tomcat" roles="tomcat,role1"/>
<user username="role1" password="tomcat" roles="role1"/>
<user username="TomcatAdmin" password="tcpass" roles="admin,manager"/>
</tomcat-users>

The Tomcat status page also reports the total number of threads and how many of them are busy and
idle. Both the Apache and Tomcat status pages show each request and the time taken to process it,
which helps in identifying the delay added at each layer. For example, suppose the Login transaction
takes 14.35 seconds at the load testing tool, around 8.35 seconds at the web server and 7.05 seconds
at the application server. About 6 seconds are added between the load testing tool and the web
server, against about 1.3 seconds between the web server and the application server. This indicates
that the network latency between the load test environment and the web server is much higher than
the latency between the web and app servers.
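The comparison in the Login example is a subtraction of the timings seen at adjacent layers. A
minimal Python sketch follows, using the three figures from the text; the function name and the
layer labels are hypothetical. Note that each difference also includes any processing done by the
outer layer itself, so it is an upper bound on the network delay between the two layers.

```python
from decimal import Decimal


def layer_breakdown(timings: list[tuple[str, Decimal]]) -> dict[str, Decimal]:
    """Time added between each pair of adjacent layers.

    timings is ordered from the outermost layer (load tool) to the innermost.
    """
    added = {}
    for (outer, t_outer), (inner, t_inner) in zip(timings, timings[1:]):
        added[f"{outer} -> {inner}"] = t_outer - t_inner
    return added


# Login transaction timings from the example, in seconds.
login = [
    ("load tool", Decimal("14.35")),
    ("web server", Decimal("8.35")),
    ("app server", Decimal("7.05")),
]

breakdown = layer_breakdown(login)
for hop, seconds in breakdown.items():
    print(f"{hop}: {seconds} s")
```

Decimal is used here only so that the differences print exactly as 6.00 and 1.30 rather than
with binary floating-point noise.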

WebLogic Server:
WebLogic Server is an industry-standard application server with its own administration console for
monitoring. From the console we can monitor the JVM statistics, the thread statistics and the bean
statistics, and we can use a thread dump to examine hogging threads and their status before a
crash. WebLogic uses a self-tuning mechanism, so we do not need to set a thread count anywhere.
Some more information about WebLogic Server:

» Various aspects of WebLogic Server monitoring:

    » JVM heap usage

    » Thread pools – Active execute threads, total execute threads, queue length, pending user
    request count, completed request count, hogging thread count, standby thread count. In addition,
    we can get information about each thread: its status, the request it is processing, etc.

    » Workload – Pending requests, completed requests, executing requests, out-of-order execution
    count, must-run count, max wait time, current wait time

    » Metrics of web applications – Servlets, current sessions, sessions high, total sessions

    » JDBC connection pools (used to establish connections to the DB) – Active connections in the
    data source, total available connections in the data source, prepared statement cache, waiting
    conditions for connection

    » JMS (messaging services)

    » JTA (transactions)

WebSphere:
Similar to WebLogic, WebSphere also has its own administration console from which we can monitor
the performance data.

Basic Counters:

Counter Name             Parameter                     Description

Enterprise Beans         CreateCount                   The number of times that beans were created
                         RemoveCount                   The number of times that beans were removed
                         ReadyCount                    The number of bean instances in ready state
                         MethodCallCount               The number of calls to the remote methods of the bean
                         MethodResponseTime            The average response time in milliseconds on the remote
                                                       methods of the bean
                         PooledCount                   The average number of objects in the pool
                         MessageCount                  The number of messages delivered to the onMessage method
                                                       of the bean (applies to message-driven beans)
                         PassiveCount                  The number of beans that are in a passivated state
                         MethodReadyCount              The number of bean instances in ready state

JDBC Connection Pools    CreateCount                   The total number of managed connections created
                         CloseCount                    The total number of connections that are closed
                         PoolSize                      The average number of managed connections in the pool
                         FreePoolSize                  The number of free connections in the pool
                         WaitingThreadCount            The average number of threads concurrently waiting for a
                                                       connection
                         PercentUsed                   The average percent of the pool that is in use. The value
                                                       is determined by the total number of configured
                                                       connections in the ConnectionPool, not the current number
                                                       of connections
                         UseTime                       The average time in milliseconds that a connection is in
                                                       use
                         WaitTime                      The average waiting time in milliseconds until a
                                                       connection is granted

JVM Runtime              HeapSize                      The total memory (in KBytes) in the Java virtual machine
                                                       run time
                         UsedMemory                    The amount of used memory (in KBytes) in the Java virtual
                                                       machine run time
                         UpTime                        The amount of time that the JVM has been running
                         ProcessCpuUsage               The CPU usage (in percent) of the Java virtual machine

JCA Connection Pools     CreateCount                   The total number of managed connections that are created
                         CloseCount                    The total number of managed connections that are destroyed
                         PoolSize                      The average number of managed connections in the pool
                         FreePoolSize                  The number of free managed connections currently in the
                                                       pool
                         WaitingThreadCount            The average number of threads concurrently waiting for a
                                                       connection
                         UseTime                       The average time, in milliseconds, that connections are
                                                       in use (measured from when the connection is allocated to
                                                       when it is released)
                         WaitTime                      The average waiting time in milliseconds until a
                                                       connection is granted

Servlet Session Manager  LiveCount                     The total number of sessions that are currently live

System Data              CPUUsageSinceLastMeasurement  The average CPU utilization since the last query

Thread Pools             PoolSize                      The average number of threads in the pool

Transaction Manager      ActiveCount                   The number of concurrently active global transactions
                         CommittedCount                The number of global transactions that are committed
                         RolledbackCount               The number of global transactions that are rolled back

Web Applications         RequestCount                  The total number of requests that a servlet processed
                         ServiceTime                   The average response time, in milliseconds, in which a
                                                       servlet request is finished

These are the basic counters provided by WebSphere (as recommended by WebSphere). Other counters
can be added by selecting the extended option or the custom option. The definition of each counter
is available in the PMI custom monitoring levels.
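The PercentUsed description is easy to misread, so here is a small Python illustration of the
distinction it makes. The function name and the pool figures are hypothetical, and the formula is
an interpretation of the description above, not WebSphere's own implementation.

```python
def percent_used(connections_in_use: int, max_connections: int) -> float:
    """Share of the configured pool that is in use.

    The divisor is the configured maximum, not the number of connections
    that happen to be open right now.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 100.0 * connections_in_use / max_connections


# Hypothetical pool: 40 connections configured, 10 currently open, 8 in use.
configured, currently_open, in_use = 40, 10, 8
print(percent_used(in_use, configured))      # against the configured size
print(100.0 * in_use / currently_open)       # what the counter is NOT
```

Against the configured size the pool is 20% used, even though 80% of the currently open
connections are busy.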

Oracle Database:
We use AWR reports to monitor Oracle databases from 10g onwards. To monitor Oracle 9i, we have
STATSPACK. Depending on the requirement, we take snapshots every 15 minutes or so, and we can
generate HTML reports that compare two snapshots. The major things to look at while analyzing an
AWR report are:

The SGA size – The System Global Area is shared by all the users that access the database. It
contains, among other things, the data related to execution plans and the data fetched from the
physical files (which is stored in the database buffer cache, a part of the SGA). If the library
cache hit ratio is not close to 100%, it may indicate that bind variables are not being used (this
can be confirmed by a high number of hard parses of the SQL statements).

The PGA size – The Program Global Area is the private memory of each server process. It is used to
sort, join and union tables and rows, and it contains the session information of the user.

Top Timed Events – Used to find the reasons for a high elapsed time of a SQL statement. If the CPU
time of a SQL statement is low but its elapsed time is high, the statement is waiting for resources
other than CPU. Often the resource is a table that is locked by another user while this user waits
for the same table; such waits show up under events named latches or locks. If the SQL has to fetch
too many rows and sort them, the execution time will be high; a larger PGA gives more space to sort
the results. The AWR report includes an SGA advisory and a PGA advisory that help find the optimal
settings for the SGA and PGA. These values are changed in the init.ora file of the Oracle database.
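The reasoning about elapsed time versus execution time can be written down as a small check. In
this Python sketch the function names, the sample timings and the 50% threshold are all
assumptions made for illustration; in practice the timings would be read from the AWR report.

```python
def wait_seconds(elapsed_s: float, cpu_s: float) -> float:
    """Time a statement spent waiting rather than executing on CPU."""
    return max(0.0, elapsed_s - cpu_s)


def is_wait_bound(elapsed_s: float, cpu_s: float, threshold: float = 0.5) -> bool:
    """True when more than `threshold` of the elapsed time was spent waiting.

    The default of 0.5 is an arbitrary illustrative cut-off.
    """
    if elapsed_s <= 0:
        return False
    return wait_seconds(elapsed_s, cpu_s) / elapsed_s > threshold


# Made-up timings: (elapsed seconds, CPU seconds).
print(is_wait_bound(12.0, 1.5))   # mostly waiting, e.g. on a lock
print(is_wait_bound(12.0, 11.0))  # mostly executing, e.g. a large sort
```

Statements flagged by the first case are the ones to look up in the wait events; the second case
points at the SQL itself or at sort space.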

Top SQLs – This section lists the SQL statements that took the most time. If we suspect a problem
with a particular SQL statement, we can get the Explain Plan for it and confirm whether the
indexing is proper.

MS SQL Server:
We can monitor MS SQL Server using PerfMon or using SQL Profiler, which ships with SQL Server.
The major things to consider in MS SQL Server are:

• Cache Hit Ratio

• Full Table Scans/Sec

• Locks/Sec

• DeadLocks

• Processing Time
