Monitoring
1 Introduction
This document provides an overview of what needs to be checked as part of daily
monitoring. Currently, all systems are monitored manually.
2 Purpose
The purpose of this document is to provide technical details of what needs to be monitored in
each transaction and the action to be taken in case of abnormalities.
Transaction  Meaning            Production  Non Production
Database checks
DB12                            once daily  once daily
DB13                            once daily  once daily
DB02                            once daily  once daily
ST04                            once daily  once daily
R/3 Checks
SM21                            once daily  once daily
ST22                            once daily  once daily
SM12                            once daily  once daily
SM13                            once daily  once daily
SM51                            once daily  once daily
SM50                            once daily  once daily
SM66                            once daily  once daily
SM37                            once daily  once daily
SP01                            once daily  once daily
ST03         Workload Monitor   once daily  once daily
ST02         Buffer Statistics  once daily  once daily
SMICM                           once daily  once daily
4 Manual Monitoring
As a daily task, we manually monitor all production systems. Only a few critical
transactions are monitored in each production system, and an Excel report is shared with the client.
SAP systems are monitored once in each of the three shifts. At the start of a shift, the
first task is to log on to the SAP systems and make sure that all systems are
up and running. All production systems should be monitored every 4 hours, and the system
status should be sent to the whole team.
5 Monthly Monitoring
a) As part of monthly monitoring, check each server to make sure that the file system
contains no unwanted files that can be cleaned up. These can be installation
files, old kernel backup files, old trace / log files, etc.
b) Clean up transport logs older than 6 months to free space in the
trans directory.
c) Clean up older logs in the work directory.
d) Check that all the SAP housekeeping jobs are running and
performing the cleanup as intended. If there are any issues, the housekeeping jobs must be
fixed. The following table gives an overview of all the housekeeping jobs. For more information
about the housekeeping jobs, read SAP Note 16083.
Job name                        Program                      Variant  Freq
SAP_REORG_JOBS                  RSBTCDEL                     Yes      daily
SAP_REORG_SPOOL                 RSPO0041/1041                Yes      daily
SAP_REORG_BATCHINPUT            RSBDCREO                     Yes      daily
SAP_REORG_ABAPDUMPS             RSSNAPDL                     Yes      daily
SAP_REORG_JOBSTATISTIC          RSBPSTDE                     Yes      monthly
SAP_COLLECTOR_FOR_JOBSTATISTIC  RSBPCOLL                     No       daily
SAP_COLLECTOR_FOR_PERFMONITOR   RSCOLL00                     No       hourly
SAP_COLLECTOR_FOR_NONE_R3_STAT  RSN3_STAT_COLLECTOR          No       hourly
SAP_REORG_PRIPARAMS             RSBTCPRIDEL                  No       monthly
SAP_REORG_XMILOG                RSXMILOGREORG                Yes      weekly
SAP_CCMS_MONI_BATCH_DP          RSAL_BATCH_TOOL_DISPATCHING  No       hourly
RSPO1043                        RSPO1043                     Yes      daily
RSTS0024                        RSTS0024                     Yes      daily
e) In transaction SPAM, check that the maintenance license has not expired. If the
maintenance license has expired, request a new license from the marketplace and add
it to the SAP system in transaction SLICENSE.
Requesting a license from the Marketplace
Go to http://service.sap.com/licensekeys, select "Keys and requests", click "search an
installation" and follow the screens. You will receive the license key by email.
f) Check the SCC4 and SE06 settings as described in sections 7.15 and 7.18.
g) Check that load balancing is working in SMLG as described in section 7.22.
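The housekeeping check in item d) can be partly scripted. The following is a minimal sketch, not an official tool: it compares a list of scheduled job names against the jobs in the table above. The function name and the sample input are hypothetical; it assumes the job names have been copied by hand from the job overview (for example SM37) into a plain list.

```python
# Expected housekeeping jobs and frequencies, copied from the table in item d).
EXPECTED_JOBS = {
    "SAP_REORG_JOBS": "daily",
    "SAP_REORG_SPOOL": "daily",
    "SAP_REORG_BATCHINPUT": "daily",
    "SAP_REORG_ABAPDUMPS": "daily",
    "SAP_REORG_JOBSTATISTIC": "monthly",
    "SAP_COLLECTOR_FOR_JOBSTATISTIC": "daily",
    "SAP_COLLECTOR_FOR_PERFMONITOR": "hourly",
    "SAP_COLLECTOR_FOR_NONE_R3_STAT": "hourly",
    "SAP_REORG_PRIPARAMS": "monthly",
    "SAP_REORG_XMILOG": "weekly",
    "SAP_CCMS_MONI_BATCH_DP": "hourly",
    "RSPO1043": "daily",
    "RSTS0024": "daily",
}


def missing_housekeeping_jobs(scheduled_job_names):
    """Return the expected job names absent from the scheduled list, sorted.

    Hypothetical helper: names are compared case-insensitively after
    stripping surrounding whitespace.
    """
    scheduled = {name.strip().upper() for name in scheduled_job_names}
    return sorted(job for job in EXPECTED_JOBS if job not in scheduled)


# Hypothetical sample input: only three jobs are scheduled.
sample = ["SAP_REORG_JOBS", "SAP_REORG_SPOOL", "RSTS0024"]
for job in missing_housekeeping_jobs(sample):
    print(f"{job} ({EXPECTED_JOBS[job]}) is not scheduled")
```

This only confirms that a job exists; whether it actually performs the cleanup as intended still has to be checked in the job logs.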
mode, report
Check for Failed Updates: All failed updates must be investigated by the appropriate
functional analyst for problem determination and resolution. Check the update status and report
a deactivated update status.
8. SAP: Checking Lock Entries (SM12).
Check CPU utilization and system paging, and check the running processes. Report any
processes = 0 or any high CPU or memory figures. Check the OS collector and make sure its current
state is active.
20. AIX: Monitoring File Systems (ST06).
Check the SAP and Oracle directories in particular.
In addition, we can check the queued and transactional RFC connections using transactions SM58,
SMQ1 and SMQ2, and report any inconsistencies.
We also need to check the trace files and log files (dev_w*, brarchive* etc.) for any inconsistencies in
the work processes or database.
In the SAP Support Portal, subscribe to the following SAP Notes to be notified when they
change:
SAP Note 146289 - Parameter Recommendations for 64-Bit SAP Kernel
SAP Note 830576 - Parameter recommendations for Oracle 10g
Overall system space is OK: make sure that no file system is more than 90%
utilized.
Check for memory and performance bottlenecks. Make sure that CPU
utilization stays below 90% on all application servers. If CPU utilization is
constantly above 90% (as seen in ST06), analyze which process is using the most CPU
and take appropriate action to resolve it. You may need to contact SAP or the
hardware vendor, as applicable, and in some cases the OS support team.
Similarly, check that free physical memory is not below 10%, as seen in
ST06. If it is, inform the BASIS team.
Ensure that the database is well tuned and performing well, and that all alerts reported
in ST04 and DB16 are resolved.
Prepare for an audit of all systems to make sure that all systems in the landscape are at
the same SPS / kernel level.
Identify the critical fragmented tables from a YEND perspective, and plan the reorganization and/or
index rebuild 2 weeks before YEND.
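The utilization thresholds above (file systems at most 90% used, CPU below 90%, free physical memory at least 10%) can be written as a small check. This is a sketch under stated assumptions: the readings are taken manually from ST06 and typed in as percentages, and the function name and mount points in the sample are hypothetical.

```python
# Thresholds taken from the text above.
FS_MAX_USED_PCT = 90.0   # no file system more than 90% utilized
CPU_MAX_UTIL_PCT = 90.0  # CPU utilization should stay below 90%
MEM_MIN_FREE_PCT = 10.0  # free physical memory should not be below 10%


def check_server(filesystems_used_pct, cpu_util_pct, mem_free_pct):
    """Return a list of findings for one server; an empty list means OK.

    filesystems_used_pct maps a mount point to its used percentage.
    All values are single readings, so a CPU finding here still needs
    the "constantly above 90%" judgment described in the text.
    """
    findings = []
    for mount, used in sorted(filesystems_used_pct.items()):
        if used > FS_MAX_USED_PCT:
            findings.append(f"file system {mount} is {used:.0f}% used")
    if cpu_util_pct >= CPU_MAX_UTIL_PCT:
        findings.append(f"CPU utilization is {cpu_util_pct:.0f}%")
    if mem_free_pct < MEM_MIN_FREE_PCT:
        findings.append(f"free physical memory is {mem_free_pct:.0f}%")
    return findings


# Hypothetical readings for one application server.
for finding in check_server({"/usr/sap": 72, "/oracle": 93}, 95, 8):
    print(finding)
```

An empty result means the server passed all three threshold checks; any finding should be followed up as described above.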
Red: More than 10 spool requests are in error status, or the number of spool requests is very high,
which indicates a serious issue with the spool system.
Yellow: One or two work processes are in error status. This might be a one-time problem caused by an
erroneous program.
Red: More than two work processes are in error status, or any of the application servers
is down. This might point to a major issue such as memory bottlenecks in the
system.
DBIF_RSQL_SQL_ERROR: Analyze the dump and find out the reason for it. Such a
dump occurs when there are deadlocks or SQL errors. If there are
deadlocks, investigate and send an email to the BASIS team.
If the dump points to an SQL error, contact the user under whose name the
dump was recorded.
In either case, create a Remedy ticket.
There are not many pending update records: Check whether there are ample free work
processes in the systems and check the system performance.
Check for erroneous update records: Inform the responsible person / team about the
error.
RAG Status: We should report the following status based on the errors we see.
Green: No errors reported.
Yellow: Up to 5 updates have failed. This could be due to an error on the development
side or wrong data being provided. It is not a general problem and may not affect overall system
performance.
Red: More than 5 update failures, or too many updates are pending. This might point to
a serious issue in the system and may affect overall system performance.
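The RAG rule for updates can be sketched as a small function. The text does not say how many pending updates count as "too many", so the `pending_limit` parameter and its default are purely hypothetical placeholders to be set by the team; the function name is also an assumption.

```python
def update_rag(failed_updates, pending_updates, pending_limit=50):
    """Classify the update status as Green, Yellow or Red.

    failed_updates:  number of failed update records.
    pending_updates: number of update records still pending.
    pending_limit:   hypothetical cut-off for "too many pending";
                     the source text gives no number.
    """
    if failed_updates > 5 or pending_updates > pending_limit:
        return "Red"
    if failed_updates > 0:
        return "Yellow"  # 1 to 5 failed updates
    return "Green"


# Hypothetical readings from one monitoring round.
print(update_rag(failed_updates=3, pending_updates=4))
```

Keeping the pending cut-off as a parameter makes the undefined part of the rule explicit instead of hiding a guessed number in the logic.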
Cancelled Jobs
Action Taken: For any issue identified, the monitoring team should create a ticket and inform
the Basis shift lead.
The Basis shift lead should take action according to the situation:
Long running jobs: Find out whether the job is actually doing work. Check the job status
and, if the job is found to be stale, inform the responsible person / team
about it.
Cancelled jobs: Check the job logs. If the issue is related to BASIS, fix
it; otherwise inform the responsible job owner.
Jobs in delayed status: Check whether sufficient work processes (WP) are available in the system and
take appropriate action.
RAG Status: We should report the following status based on the errors we see.
Green: No long running, cancelled or delayed jobs.
Yellow: Fewer than 5 long running or cancelled jobs.
Red: More than 5 long running, cancelled or delayed jobs.
RAG Status: We should report the following status based on the errors we see.
Green: No documents in error status.
Yellow: Around 5-10 documents in error status.
Red: More than 10 documents in error status. This can happen if the settings in the SCOT
transaction are not correct, or if a firewall is blocking the documents from
being transmitted.
Check: The monitoring team should check whether a high number of swaps is occurring in the SAP
system. This check should be performed on all the application servers.
The Basis shift lead should check in SMQR that the scheduler is working and that there are no errors
such as "Destination not reached".
Check: Check every day that there are not many entries in error. Also check whether
the queues are being processed slowly and whether many queues are waiting to be processed.
Action Taken: If many entries in error are displayed, the monitoring team should inform the BASIS
shift lead.
The Basis shift lead should analyze the error. If there are SQL errors, they need to be fixed.
If the issue is due to slow processing, look into the updates and lock entries in the
system. Such errors are also reported in the SM21 logs, and sometimes related
ST22 dumps can be found as well.
Also check whether the user / password of the RFC user is not working or the user is locked. If so,
reach out to the SEC team to get this fixed and inform the functional team to
reprocess.
If the issue points to the functional team, create a Remedy incident and assign it
directly to the functional team.
RAG Status: We should report the following status based on the errors we see.
Green: Queues are processing normally.
Yellow: There are 5-10 queues in error.
Red: More than 10 queues in error, or there are many pending queues in the system.
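The queue RAG rule can be sketched by counting queue entries per status. Several points are assumptions rather than part of the source: the status strings in `ERROR_STATUSES` and `PENDING_STATUSES` are sample values to be replaced with whatever the queue monitor actually shows, `pending_limit` is a hypothetical stand-in for "many pending queues", and 1 to 4 queues in error (a range the text does not cover) are reported as Yellow.

```python
from collections import Counter

# Assumed status strings; adjust to the values shown in the queue monitor.
ERROR_STATUSES = {"SYSFAIL", "CPICERR"}
PENDING_STATUSES = {"READY"}


def queue_rag(queues, pending_limit=100):
    """Classify queue health from (queue_name, status) pairs.

    pending_limit is a hypothetical cut-off for "many pending queues";
    the source text gives no number.
    """
    counts = Counter(status.strip().upper() for _, status in queues)
    in_error = sum(counts[s] for s in ERROR_STATUSES)
    pending = sum(counts[s] for s in PENDING_STATUSES)
    if in_error > 10 or pending > pending_limit:
        return "Red"
    if in_error > 0:
        return "Yellow"  # includes the unspecified 1-4 range (assumption)
    return "Green"


# Hypothetical entries copied by hand from the queue monitor.
sample = [("QUEUE_A", "SYSFAIL"), ("QUEUE_B", "READY"), ("QUEUE_C", "READY")]
print(queue_rag(sample))
```

Counting by status keeps the error and pending conditions separate, so each cut-off can be tuned without touching the other.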