Version 1.6.0 Install and User Guide
TPCHC-160UG-001

REVISION NOTICE
This is the first release of this manual for TPC Health Check 1.6.0. This manual provides technical and editorial updates and replaces all previous editions. Please refer to the Summary of Changes for the complete revision history.

ABSTRACT
The TPC Health Check 1.6.0 software does two things: configuration backup and health checking. These two functions have been combined in a single tool that performs both automatically, based on the inventory from TPC.

FOR FURTHER INFORMATION
If you wish to obtain further information about TPC Health Check, please contact your Account Executive.

RESTRICTIONS ON USE
The following terms are registered or common law trademarks of International Business Machines Corporation in the United States, other countries, or both: AIX, IBM, the IBM logo, ibm.com, TPC Health Check, TPC, and zSeries.

Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product and service names may be trademarks or registered trademarks or service marks of others.

The information contained in this documentation is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this documentation, it is provided "as is" without warranty of any kind, express or implied. In addition, this information is based on IBM's current product plans and strategy, which are subject to change by IBM without notice.
IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this documentation or any other documentation. Nothing contained in this documentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM (or its suppliers or licensors), or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:

    IBM Director of Licensing
    IBM Corporation
    North Castle Drive
    Armonk, NY 10504-1785
    U.S.A.
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

    IBM World Trade Asia Corporation
    Licensing
    2-31 Roppongi 3-chome, Minato-ku
    Tokyo 106-0032, Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged should contact:

    IBM Global Services
    IBM Corporation
    Route 100
    Somers, NY 10589
    U.S.A.
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this information and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement, or any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Copyright IBM Corporation 2013
All rights reserved. Printed in USA.
All specifications are subject to change without notice.

Licensed Material Property of IBM Corporation
www.ibm.com

Document History

This is a snapshot of an on-line document. Paper copies are valid only on the day they are printed. Refer to the author if you are in any doubt about the accuracy of this document.
Revision History

Revision  Revision    Summary of Changes                                      Changes
Number    Date                                                                Marked
1.0.0     11/30/2011  First Release of Document                               N
1.0.1     12/02/2011  Changed Alert Parameter                                 N
1.1.0     12/23/2011  Added Support for Windows platform                      N
1.1a      03/28/2012  Updated Install Section with more step by step details  N
1.2.0     04/25/2012  New Install Scripts                                     N
1.2.1     05/10/2012  Minor Updates                                           N
1.3.3     08/17/2012  New Release, support for DS8000, Brocade, and
                      additional features
1.3.4     10/01/2012  New release, new embedded links in policy file, and
                      new debug tool                                          N
1.4.0     12/17/2012  - New release                                           N
                      - New Health Check engine
                      - Support for new device types
                      - Reports updates, and security enhancements
1.5.0     03/30/2013  - New release                                           N
                      - Support for new device types
                      - Collect support files and policy files updated
1.6.0     06/23/2013  - New release                                           N
                      - Topology info is loaded from devices and used for
                        new balancing checks
                      - Support for Mainframe attached DS8000
                      - TPC 5.1 is now supported
                      - New check type named matchvalue
                      - New Brocade checks for error counters

Authors of this document
    Michael O'Neill/Manassas/IBM
    Per Lutkemeyer/Denmark/IBM

Development Team
    Christian Soender/Denmark/IBM
    Nicolai Kildal/Denmark/IBM
    Per Lutkemeyer/Denmark/IBM
    Thomas Lindgaard/Denmark/IBM

Summary of Changes

This section lists technical updates and new material added to the TPC Health Check Install and User Guide, for each release of TPC Health Check.

What's New in TPCHC Release 1.6.0

This release includes a number of improvements and additional features, listed below by component.

Component: Physical load balancing
  - New check that verifies that the connections from a storage device to 2 switch fabrics are balanced, meaning an equal number of connections to each fabric.
    When this connectivity is in place, TPCHC can state that the connections for this device are balanced.
  - The reason for this check is to ensure that the storage devices have redundancy and that the workload is spread evenly between the 2 fabrics.
  - The check Id is 2.42 for all supported device types, which currently are SVC, DS8000, XIV, and Storwize V7000.

Component: Check type topology_checkconnections
  - Custom check that investigates the number and type of connections a given storage device has to the SAN switches.
  - For each device:
      1. If the device is not connected to any fabrics, the device is compliant.
      2. Check that the device is connected to exactly 2 fabrics.
      3. Check that the device is connected to an equal number of switches in each fabric.
      4. Check that the device has the same number of connections to each switch.
  - Note: For SVC and V7000 each node is checked separately.

Component: Support for mainframe attached DS8000
  - Access to a mainframe attached DS8000 is the same as to an open systems attached DS8000. To add support for mainframe attached DS8000, the following checks have been changed in the DS8K policy file:
      ds8k.2.9   Now allows SCSI-FCP, FICON and FC-AL
      ds8k.2.10  Check name updated. If fixed blocks are not used, check will return compliant.
      ds8k.2.11  Check name updated. If fixed blocks are not used, check will return compliant.
  - Enhancements to existing policies to ensure they don't generate false positive alerts.
  - The following DS8000 policies have been changed:
      ds8k.1.4   Check name updated
      ds8k.2.14  Changed to use output from lsarray. If arrays are not used, check will return compliant.
      ds8k.2.15  If no hosts are connected, check will return compliant.
      ds8k.2.19  If no hosts are connected, check will return compliant.
  - Output from lshostconnect has been reformatted using | as the delimiter between columns; any checks in local policy files must be updated to avoid false noncompliant results.
    ds8k.2.15 and ds8k.2.19 are updated to use the new delimiter in lshostconnect.

Component: TPC support
  - TPC 5.1 is now supported by TPCHC.

Component: Check type matchvalue
  - New check type called matchvalue that captures a numeric value in a stanza and compares it against a minimum, maximum, exact (equal) value, or range specified in the policy.
  - Matchvalue is used for the new Brocade error counter checks.

Component: Brocade policy changes
  - New check (2.33) to verify that trunking is enabled on ISLs. In combination with the existing check 3.4, this new check ensures that un-trunked E-ports are flagged noncompliant.
      2.33  Trunking must be enabled on all ISLs
      3.4   If trunking is enabled, trunking license must be present
  - Threshold lowered
      5.1   Threshold is lowered from 50 to 30 on recommendation of Brocade
  - New error counter checks. The following new policies have been added for Brocade, checking various error counters:
      5.19  CRC errors per port must be below 5 per minute
      5.20  Encoding errors outside of frame per port must be below 25 per minute
      5.21  Encoding errors inside of frame per port must be below 25 per minute
      5.22  Er_bad_os per port must be below 5 per minute
      5.23  Er_rx_c3_timeout per port must be below 5 per minute
      5.24  Er_tx_c3_timeout per port must be below 5 per minute
      5.25  Er_c3_dest_unreach per port must be below 5 per minute

Component: New command line option localfiles
  - Files from another installation can be copied to the server where TPCHC is installed. When TPCHC is run with the localfiles option, backup functionality is disabled and TPCHC uses the specified location as the input to be health checked.

Support Information

If you have a technical issue that you cannot resolve with the provided resources, please contact the Support Center.

- By Email:
    - Send email requests to dmsopen@us.ibm.com to:
        - Open cases for Severity 2, 3, or 4 problems;
        - Track the status of open problem cases;
        - Ask questions;
        - Submit enhancement requests.
- By Phone:
    - For Severity 1 problems (system production down caused by SAT tools), contact the Support Center using the following toll-free phone numbers:
        - North America: 1-800-667-6383
        - Europe:
            - Germany, Ireland, Italy, and UK: 00-800-6676-3835
            - All Other European Countries: +1-303-354-5280
        - Asia-Pacific Region:
            - Australia: 0011-800-6676-3835
            - Hong Kong: 001-800-6676-3835
            - Japan: 010-800-667-63835
            - South Korea: 002-800-6676-3835
            - Philippines: 00-800-6676-3835
            - China: 800 810 1818-5174
        - Other Countries: +800-667-6383
        - Alternate number: +1-303-354-5280

For information about deployment policy and program management:
    - Kevin McQuillan/Ireland/IBM
    - Stanley Wood/Austin/San Jose/IBM
    - BJ Klingenberg/San Jose/IBM

1. Introduction to TPC Health Check

1.1 This Document

This document describes how to use TPC Health Check, the requirements that must be fulfilled in order to get a successful result, and how to create your own Health Check policies. TPC Health Check is also referred to as TPCHC.

TPC Health Check is distributed in the SAT (Storage Automation Tools) package and can be installed on the TPC server or on a separate workstation; see Section 2, Install Scenarios.

1.2 Introduction

TPCHC does two things: configuration backup and health checking. These two functions have been combined into a single tool that performs both automatically, based on the inventory from TPC.

What is Automated Backup?
- Automated collection of configuration settings from storage devices
- Automated means that it is run without user intervention
- Archive configurations to disk
- Number of versions to archive can be configured
- Alert if backup failed

What is Automated Health Check?
- Verify that configuration settings on a storage subsystem are set according to recommendations or standards
- Health Check policy files are used to describe criteria
- A storage subsystem can be verified with one or more policies
- Automated means that it is run as a service without user intervention
- Alert if checks have failed
- Generate summary report (.xls) and send via email
- The output of TPCHC is generally used by the PDT (Problem Determination Tool) GUI wizard to quickly identify the devices in error (see the PDT tool available in the SAT package).

1.3 Health Check Overview

The Health Check engine compares configuration settings with policies based on best practice. Configuration settings are pulled from each storage device and archived on disk. Reports and alerts are generated based on the results of the health check.

- A daily schedule kicks off the Health Check engine.
- The Health Check engine then:
    1. Pulls the subsystem inventory list from TPC using ODBC.
    2. Connects to the storage subsystems and pulls configuration settings.
        a. Resets counters on Brocade switches.
    3. Archives configuration settings on disk (Backup).
    4. Compares the configuration from each subsystem with one or more policies.
    5. Reports results.

1.4 Device Types Supported

TPCHC v1.6 supports the following device types:
- Brocade
- Cisco MDS
- IBM DS8000 - Open systems
- IBM DS8000 - Mainframe attached
- IBM Storwize V7000
- IBM Storwize V7000 Unified
- IBM SVC
- IBM XIV

Devices with firmware levels supported by TPC 4.2.x or 5.1.x are also supported by TPCHC.
The TPC support matrix can be found at this link: https://www-304.ibm.com/support/docview.wss?uid=swg21386446
Table 1.1: Device, TPC 4.2.x Firmware, TPC 5.1.x Firmware

Device:              Brocade 200E, 210E, 300, 3016, 3250, 3850, 3900, 4020, 4100, 4900, 5000, 5100, 5300, 5410, 5470, 5480, 7500, 7500E, 7800, 8000, M5440, 12000, 24000, 48000, DCX-4S, DCX Backbone
TPC 4.2.x Firmware:  7.0.x, 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.3.x, 5.2.x, 5.0.5
TPC 5.1.x Firmware:  7.0.x, 6.4.x, 6.3.x, 6.2.x, 6.1.x, 6.0.x, 5.3.x, 5.2.x, 5.0.5

Device:              Cisco MDS: MDS 9120, MDS 9124, MDS 9134, MDS 9140, MDS 9148, MDS 9216, MDS 9216i, MDS 9216A, MDS 9222i, MDS 9506, MDS 9509, MDS 9513
TPC 4.2.x Firmware:  5.2(2)*, 5.0(4), 4.2(7), 4.2(3), 4.2(1), 4.1(3), 3.3(5), 3.3(4), 3.3(1), 3.2(3), 3.2(1)
TPC 5.1.x Firmware:  5.2(2)*, 5.0(4), 4.2(7), 4.2(3), 4.2(1), 4.1(3), 3.3(5), 3.3(4), 3.3(1), 3.2(3), 3.2(1)

Device:              IBM System Storage DS8100, DS8300
TPC 4.2.x Firmware:  R4.x, R3.x, R2.4.2
TPC 5.1.x Firmware:  R4.x, R3.x, R2.4.2

Device:              IBM System Storage DS8700
TPC 4.2.x Firmware:  R6.3*, R6.2*, R6.1*, R5.1.6 (R5.1.5++)*, R5.1.5, R5.1, R5.0
TPC 5.1.x Firmware:  R6.3*, R6.2*, R6.1, R5.1.6 (R5.1.5++), R5.1.5, R5.1, R5.0

Device:              IBM System Storage DS8800
TPC 4.2.x Firmware:  R6.3*, R6.2*, R6.1*, R6.0*
TPC 5.1.x Firmware:  R6.3*, R6.2*, R6.1*, R6.0*

Device:              IBM System Storage DS8870
TPC 4.2.x Firmware:  R7.0*
TPC 5.1.x Firmware:  R7.0*

Device:              IBM Storwize V7000U
TPC 4.2.x Firmware:  1.3*
TPC 5.1.x Firmware:  1.4*, 1.3*

Device:              IBM SAN Volume Controller
TPC 4.2.x Firmware:  6.42*, 6.32*, 6.22*, 6.12*, 5.12*, 4.3, 4.2
TPC 5.1.x Firmware:  6.42*, 6.32*, 6.22, 6.12, 5.12

Device:              IBM XIV 2810-A14, 2812-A14
TPC 4.2.x Firmware:  10.2.46*, 10.2.2, 10.2, 10.1
TPC 5.1.x Firmware:  10.2.4, 10.2.2, 10.2, 10.1

Device:              IBM XIV Gen3 2810-114, 2812-114
TPC 4.2.x Firmware:  11.1*, 11.0*
TPC 5.1.x Firmware:  11.1*, 11.0*

Device:              IBM Storwize V7000
TPC 4.2.x Firmware:  6.4, 6.3, 6.2, 6.1
TPC 5.1.x Firmware:  6.4, 6.3, 6.2, 6.1

1.4.1 Brocade SAN switch

To collect configuration settings from a Brocade switch, TPCHC uses Secure Shell (SSH), which is installed together with TPCHC, so no extra programs need to be installed. To access a Brocade switch from the TPCHC server, port 22 must be open in firewalls between TPCHC and the Brocade switch. Access to a Brocade switch requires an account with admin privileges. The connection to a Brocade switch uses username/password only. SSH key connection is not supported. Devices are added using the utility ReconfigureTPCHC.bat.
1.4.2 CISCO MDS SAN-OS Systems

To collect configuration settings from a Cisco MDS, TPCHC uses Secure Shell (SSH), which is installed together with TPCHC. To access a Cisco MDS from the TPCHC server, port 22 must be open in firewalls between TPCHC and the Cisco MDS. Access to a Cisco MDS requires an account with admin privileges. The connection to a Cisco MDS currently uses username/password only. SSH key connection is not supported. Devices are added using the utility ReconfigureTPCHC.bat.

1.4.3 IBM System Storage DS8000

To collect configuration settings from a DS8000, TPCHC uses the DSCLI driver. DSCLI must be installed on the same server as TPCHC. If DSCLI is not installed, the DS8000 function will not be activated. To activate DS8000 support, you must run ReconfigureTPCHC.bat.

The program dscli can be downloaded from: http://www-01.ibm.com/support/docview.wss?uid=ssg1S4000641

Access to a DS8000 requires an account with admin privileges to collect user account settings. If an account with fewer privileges than admin is used, security checks cannot be executed. Devices are loaded from TPC or added using the utility ReconfigureTPCHC.bat.

1.4.4 IBM Storwize V7000 Disk Systems

To collect configuration settings from a V7000, TPCHC uses Secure Shell (SSH), which is installed together with TPCHC. To access a V7000 from the TPCHC server, port 22 must be open in firewalls between TPCHC and the V7000. Access to a V7000 requires an account with admin privileges. The connection to a V7000 currently uses username/password only. SSH key connection is not supported. Devices are added using the utility ReconfigureTPCHC.bat.

1.4.5 IBM Storwize V7000 Unified Disk Systems

To collect configuration settings from a V7000 Unified, TPCHC uses Secure Shell (SSH), which is installed together with TPCHC. To access a V7000 Unified from the TPCHC server, port 22 must be open in firewalls between TPCHC and the V7000 Unified.
Access to the V7000 requires an account with admin privileges. The connection to a V7000 currently uses username/password only. SSH key connection is not supported. Devices are added using the utility ReconfigureTPCHC.bat.

1.4.6 IBM System Storage SAN Volume Controller (SVC)

To collect configuration settings from an SVC, TPCHC uses Secure Shell (SSH), which is installed together with TPCHC. To access an SVC from the TPCHC server, port 22 must be open in firewalls between TPCHC and the SVC. Access to an SVC requires an account with admin privileges. The connection to an SVC uses an SSH key. Devices are added using the utility ReconfigureTPCHC.bat.

1.4.7 IBM XIV Storage System series

To collect configuration settings from an XIV, TPCHC uses the XCLI driver. XCLI must be installed on the same server as TPCHC. If XCLI is not installed before the TPCHC install, the XIV function will not be activated. To activate it after the TPCHC install, you must run ReconfigureTPCHC.bat.

The program xcli can be downloaded from: ftp://ftp.software.ibm.com/storage/XIV/GUI/2.4.4/XIVGUI_windows_2.4.4_build3.exe

NOTE: There is a known issue with XCLI version 4.1. Do not use this version.

Access from the TPCHC server to the XIV requires some firewall ports to be opened. Proprietary protocols are used to manage the XIV system from the IBM XIV Storage Management GUI and the IBM XIV command-line interface (XCLI). This management communication is performed over TCP port 7778. The XIV GUI and XCLI act as the client and initiate the connection. The XIV system acts as the server.

Access to an XIV requires an account with monitor privileges; currently no commands that require admin privileges are used on the XIV. Devices are added using the utility ReconfigureTPCHC.bat.

2. Install Scenarios

TPCHC can be installed in three ways: locally on the TPC server, remotely, or standalone. The three scenarios are described below.
2.1 Local

TPCHC is installed locally on an existing TPC server. See the requirements sections Hardware and Software.

2.2 Remote

TPCHC is installed on another server with connectivity to the TPC server and storage subsystems. See all sections of the Pre-Installation requirements.

2.3 Standalone

TPCHC is installed standalone on another server with no connectivity to TPC and is configured to connect directly to the devices.

3 Pre-Installation

3.1 Hardware Requirements

Hardware requirements:
- Memory: 1 GB
- Disk space requirements: 2-3 GB free space
- Temporary space needed during installs: 300 MB
- Program space: 700 MB
- Data space estimate: 1 MB per day per device

Example: with daily backups kept for 30 days (default) and monthly backups kept for 12 months (default), 10 SVCs produce (10 x 30) + (10 x 12) = 420 archived backups of about 1 MB each, which is 420 MB of disk space.

3.2 Software Requirements

3.2.1 Operating System

TPCHC can run on the following operating systems:
- Windows Server 2003 R2 (32 and 64 bit)
- Windows Server 2008 (32 and 64 bit)

3.2.2 TPC Version support

TPCHC supports the following TPC versions:
- TPC 4.2
- TPC 5.1

NOTE: The DB2 ODBC driver v9.7 fp4 is included in the install package.

3.3 DS8000 Support

TPCHC connects to DS8000 devices using the DSCLI driver. DSCLI must be installed on the same server as TPCHC. If DSCLI is not installed before the TPCHC install, the DS8000 function will not be activated. To activate it after the TPCHC install, you first need to install DSCLI.

The URL to download the DSCLI tool is http://www-01.ibm.com/support/docview.wss?uid=ssg1S4000641

NOTE: The DSCLI version to use depends on the DS8000 code bundle.

3.4 XIV Support

To collect configuration settings from an XIV, TPCHC uses XCLI. It needs to be installed on the same server as TPCHC, just like the DSCLI for the IBM DS8000 system.
The program XCLI can be downloaded from: ftp://ftp.software.ibm.com/storage/XIV/GUI/

Program default location: C:\Program Files\XIV\GUI10\xclisession.exe

3.5 Environment Requirements

3.5.1 Generate Windows Account

A Windows account with administrative privileges needs to be created. This account is used to install, run, and administer TPCHC.

NOTE: The account used to install TPCHC must be the same as the one that will be used to execute it (from a command line and from the Windows scheduler).

3.5.2 Third Party Software Requirements

The Problem Determination Tool (PDT) is installed with TPCHC as part of the SAT package. It is highly recommended to use the JRE as provided in the SAT package:

For x86 - \SATHEAD.P4051.gold\ProblemDeterminationTool\Redist\IBMJRE\ibm-java-jre-60-win-i386.exe
For x64 - \SATHEAD.P4051.gold\ProblemDeterminationTool\Redist\IBMJRE\ibm-java-jre-60-win-x86_64.exe

The third-party software included in the SAT package is .NET (4.0) and the Java 6 JRE. These packages do not need to be downloaded separately, but they must be installed.

NOTE: These third-party installation packages are available under the ProblemDeterminationTool\Redist folder.

3.5.3 Reboot Requirements

The installation of .NET 4.0 may require a reboot. In some circumstances the .NET 4.0 install process asks for a reboot when it finishes. Not rebooting the server at that time has no impact on the installation or on TPCHC execution. The reboot can be done later and does not prevent the execution and install test of TPCHC. It does, however, prevent the PDT GUI tool from starting until the reboot is done.

3.6 DB2 User Credentials

TPCHC connects to DB2 in order to pull the list of storage devices to be checked. This applies only when TPCHC is configured to work with TPC. The DB2 user credentials are needed to install TPCHC. The Local Administrator will also have access to DB2.
Use an existing user account or create a new account in DB2. Make sure that the user is able to execute the following SQL statements:

    select * from tpcreport.storagesubsystem;
    select * from tpc.t_res_registered_napi

3.6.1 How to add a DB2 User called tpchc with limited privileges to run TPCHC

1.) You must be logged in as administrator on the server where DB2 is installed.
2.) Create a user tpchc in Windows.
3.) Start DB2 CLI.
4.) Enter the commands below:

    CONNECT TO TPCDB
    GRANT CONNECT ON DATABASE TO USER tpchc
    GRANT SELECT ON tpcreport.storagesubsystem TO USER tpchc
    GRANT SELECT ON tpc.t_res_registered_napi TO USER tpchc

3.7 SVC Certificate (SSH key)

During installation of an SVC, a .ppk SSH key is generated. This key is used by TPC when connecting to the SVC. TPCHC also needs this certificate. To allow TPCHC to connect to an SVC, the key must be converted to OpenSSH format. Converting the .ppk key is described in Section 5 Post-Installation of this document.

3.8 Connectivity

The DB2 Perl modules communicate with TPCDB. If you connect through a firewall, port 50000 must be open for both the UDP and TCP protocols from the server running TPCHC to the TPC server (running TPCDB). Port 50000 is the default port used when connecting to DB2; if this has been changed in your environment, enter the correct port number during the install process. TPCHC uses the SSH protocol to connect to SVC controllers.

The table below shows connectivity from TPCHC to each device type.

Server  Protocol / port(s)                                        Device
TPCHC   SSH (tcp port 22)                                         SVC
TPCHC   SSH (tcp port 22)                                         V7000
TPCHC   SSH (tcp port 22)                                         V7000U
TPCHC   SSH (tcp port 22)                                         Brocade
TPCHC   SSH (tcp port 22)                                         Cisco MDS
TPCHC   via DSCLI, TCP port 1718, 1719, 1750, 1755, 8451-8455     DS8000
TPCHC   via XCLI, TCP port 7778                                   XIV
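Before installing, it can help to confirm that the ports in the connectivity table are actually reachable from the TPCHC server. The following is a minimal sketch of such a pre-check, not part of TPCHC itself: the names DEVICE_PORTS, ports_for, port_is_open, and check_device are hypothetical, the port numbers are copied from the table above, and only TCP reachability is tested (the UDP side of the DB2 port 50000 requirement is not covered).

```python
import socket

# TCP ports per device type, copied from the connectivity table in section 3.8.
# The DS8000 range 8451-8455 is expanded into individual ports.
DEVICE_PORTS = {
    "SVC": [22],
    "V7000": [22],
    "V7000U": [22],
    "Brocade": [22],
    "Cisco MDS": [22],
    "DS8000": [1718, 1719, 1750, 1755] + list(range(8451, 8456)),
    "XIV": [7778],
}


def ports_for(device_type):
    """Return the TCP ports TPCHC needs for the given device type."""
    return DEVICE_PORTS[device_type]


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_device(host, device_type, timeout=3.0):
    """Map each required port to True (reachable) or False (not reachable)."""
    return {
        port: port_is_open(host, port, timeout)
        for port in ports_for(device_type)
    }
```

Calling check_device with a device address and a type such as "DS8000" returns a dictionary of port numbers and reachability flags; any False entry points at a firewall rule or service to investigate before running the installer.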
[Figure: connectivity overview. TPCHC server to TPC server (DB2): port 50000, tcp and udp. TPCHC server to storage subsystems: SSH, port 22 tcp. TPCHC server to mail (SMTP) for alerting: SMTP, port 25 tcp.]

3.9 Checklist

Use the checklist below to verify that you can answer YES to all questions.

NOTE: DSCLI must be downloaded and installed (needed for DS8000 support) and XCLI must be downloaded and installed (needed for XIV support).

3.9.1 Install on local server (Where TPC is installed)

Check / Answer
- User credentials with administrative privileges on TPC server available
- Available disk space greater than the amount calculated in section 3.1
- Extract TPCHealthCheck component from SAT package to local/remote server
- DSCLI downloaded and installed (DS8000 support)
- Operating System listed in install matrix
- TPC version listed in install matrix
- DB2 version listed in install matrix
- DB2 user credentials available
- Certificates available for each SVC
- Connectivity as described in section 3.8 is available
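The disk space item in the checklist refers back to the estimate in section 3.1. A minimal sketch of that arithmetic follows, assuming that the temporary, program, and archive figures from section 3.1 can simply be added together (the manual does not state this, and its general guidance of 2-3 GB free space still applies). The function names are hypothetical and are not part of TPCHC.

```python
import shutil

TEMP_INSTALL_MB = 300  # temporary space needed during installs (section 3.1)
PROGRAM_MB = 700       # program space (section 3.1)
MB_PER_BACKUP = 1      # data space estimate: 1 MB per day per device (section 3.1)


def backup_space_mb(devices, daily_versions=30, monthly_versions=12):
    """Archive space: one backup of about 1 MB per device per retained version."""
    return devices * (daily_versions + monthly_versions) * MB_PER_BACKUP


def required_space_mb(devices, daily_versions=30, monthly_versions=12):
    """Assumption: total need is temporary + program + archive space."""
    return (TEMP_INSTALL_MB + PROGRAM_MB
            + backup_space_mb(devices, daily_versions, monthly_versions))


def has_enough_space(path, required_mb):
    """Compare free space on the volume that holds `path` with the estimate."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= required_mb
```

For the example in section 3.1, backup_space_mb(10) evaluates to 420, matching the 420 MB stated there for 10 SVCs with the default retention of 30 daily and 12 monthly backups.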
3.9.2 Install on remote server

Check / Answer
- User credentials with administrative privileges on remote server available
- Available disk space greater than the amount calculated in section 3.1
- Extract TPCHealthCheck component from SAT package to local/remote server
- DSCLI downloaded and installed (DS8000 support)
- Operating System listed in install matrix
- TPC version listed in install matrix (only required with TPC integration)
- DB2 version listed in install matrix (only required with TPC integration)
- DB2 user credentials available
- Certificates for each SVC copied to remote server
- Connectivity as described in section 3.8 is available

4 Installation

4.1 Overview

As a prerequisite, download the Storage Automation Tools package zip file from http://bldgsa.ibm.com/projects/s/storage_automation/sat/.

The install of TPCHC 1.6.0 from the SAT package differs from that of previous versions of TPCHC. For a fresh install or an upgrade, the TPCHC tool is now installed under c:\IBM\tpchc by default, and it is possible to change the install destination. To start the installation, launch the batch file InstallTPCHC.bat, which you will find under the folder TPCHealthCheck where the SAT package has been unzipped.

NOTE: Support files for DS8000 and XIV must be installed before proceeding with installation.

If a previous instance of the tool is installed, the install process will install TPCHC and apply the previous configuration with no user interaction.

After downloading the Storage Automation Tools package zip file:

1) Unzip it to the server where TPCHC is to be installed (e.g. SAT150.zip into c:\SAT150).
2) Locate InstallTPCHC.bat under the TPCHealthCheck subfolder in the location where the Storage Automation Tools zip file was unzipped (e.g. c:\SAT150\tpchc).
3) You will be prompted to install a Microsoft Visual C++ 2008 Redistributable package unless the package is already installed.
4) Follow the prompts and provide the appropriate information.

NOTE: On Windows 2008 R2, if the user is not the administrator but is a member of the administrator group, all programs must be run by right-clicking the file (for example InstallTPCHC.bat) and selecting Run as administrator. If run from the command line (CMD), cmd.exe must be started with elevated rights.
4.2 TPC Health Check Installation - Fresh Install

To begin installation, whether a fresh install or an upgrade, run InstallTPCHC.bat from the TPCHealthCheck directory.

You will be prompted to enter TPC server details. If the username, password, or server is left blank, TPC server interaction will be disabled. It can be re-enabled (and settings can be changed) using the reconfiguration utility. For more details on reconfiguration, see the Reconfigure TPCHC section.

You will also be prompted to enter e-mail notification details. If any of these fields are left blank (as shown in the following screen), e-mail notification will be disabled. It can be re-enabled (and settings can be changed) using the reconfiguration utility.

In addition to TPCHC, the following packages will be installed if necessary: IBM JRE 6 and .NET 4.0 Client Profile.

NOTE: Installing the SAT Integrated Package (SATIP) is recommended during initial deployment. Please refer to the SAT page - SATIP Installation Guide.

A TPC database connection will be tested during the installation process. If this connection fails, an error message will be displayed with more information. If no suggestions are given by the installation script, please contact support.

NOTE: You can reconfigure database connection settings and test the connection using the reconfiguration utility - ReconfigureTPCHC.bat.

NOTE: The install is captured in installLog.txt, located in the install directory where the SAT package was unzipped.

4.3 TPC Health Check Installation - Upgrade from Previous Version

If upgrading from a previous version of SAT, cygwin may already exist. You will be asked whether or not you wish to overwrite cygwin; it is not necessary to do so. Old settings and certificate files will be saved and imported into the new version.
If upgrading from TPC Health Check 1.2/SAT 1.0.0, you will be prompted about enabling DS8000 support. See the documentation section for DS8000 support for more details.

If upgrading from TPC Health Check 1.3/SAT 1.2.0 or earlier, you will be prompted about enabling XIV support. See the documentation section on XIV support for more details.

If upgrading from a standalone version of TPC Health Check (or if you are changing the installation path) and you previously scheduled checks, you will be prompted to reschedule due to the installation path change.

Database settings will be tested during the installation process. If the database connection fails during installation, you can reconfigure these settings using the reconfiguration utility. See Reconfigure TPCHC in the post-install section.

NOTE: If upgrading from a standalone version of TPC Health Check, the old c:\cygwin\opt\IBM\tpchc path will be renamed to tpchc.OLD. You may remove this path as well as the c:\cygwin directory (unless you are using it for something else on the system) or save it as a backup.

4.4 Reconfiguration

If you need to reconfigure the TPC Health Check installation, you may do so by running c:\IBM\tpchc\ReconfigureTPCHC.bat in the install directory. See Reconfiguring TPCHC in the post-install section.

4.5 Copy Certificate Files

The certificate files for each SVC need to be placed in C:\IBM\tpchc\conf\certs. Depending on how TPCHC is installed, select the appropriate section below:

4.5.1 TPCHC installed local on TPC server

Run the reconfiguration utility and select to add SVC devices from the TPC server. This will automatically copy (and convert, if necessary) the certificate files.

4.5.2 TPCHC installed on remote server

If TPC Health Check is installed on a remote server, then you must copy the certificate files to C:\IBM\tpchc\import.
Transfer the SSH keys from the TPC server to the server hosting TPC Health Check. The files are typically found under:

C:\Program Files\IBM\TPC\device\cert

After copying the files, run the reconfiguration utility and have it add the SVC devices from the TPC server. It will automatically find the certificates in the import directory and copy/convert them as necessary.

How to Determine the Correct SSH Key on the TPC server

To make sure you copy the correct key (i.e. the key on the TPC server that corresponds to the storage device you want to manage), connect to TPC, then go to Storage Subsystems to identify the storage device.

5. Post Installation

TPCHC is now installed. Complete the steps described in this section.

5.1 Reconfiguring TPCHC

Once TPCHC has been installed and/or configured, you can run the c:\IBM\tpchc\ReconfigureTPCHC.bat script. This script allows you to perform several administration tasks such as:

- Reconfiguring TPC settings, such as database information and the TPC server IP.
- Checking and testing the DB2 database connection for debug purposes.
- Adjusting device configuration and support, including manually adding a device and enabling or disabling DS8000 and/or XIV support. This function is to be used when you need to define a device not registered in the TPC database.
- Changing the email notification configuration: enabling/disabling and email settings.
- Changing the schedule for automatic TPCHC launch.

To run the reconfiguration utility, double-click the c:\IBM\tpchc\ReconfigureTPCHC.bat script. You will be presented with a menu as depicted in the screenshot.

How to add devices.

Type of install        Add devices
Integrated with TPC    DS8000 and SVCs are imported/updated from TPC every time TPCHC is run. See additional device types supported in section 1.4. DS8000 credentials must be added using ReconfigureTPCHC.bat. Brocade must be added using ReconfigureTPCHC.bat.
Stand alone            All devices must be added using ReconfigureTPCHC.bat.

How to get SVC certificate files.

Type of install                Certificate files
Local on same server as TPC    ReconfigureTPCHC.bat will pull information from TPC, copy and convert to openssh format.
Remote server                  Copy certificate files for all SVCs manually to the directory C:\IBM\tpchc\import. Then add devices using ReconfigureTPCHC.bat.

5.1.1 Reconfiguration: Manually Adding and Removing Devices

1) Run the c:\IBM\tpchc\ReconfigureTPCHC.bat script to manually add a device. It will update the relevant configuration files and allow TPCHC to perform a test against the device even if the device is not registered in the TPC database.
2) To do so, choose the Adjust device configuration and support sub menu from the Reconfiguration menu.
3) Proceed by providing the IP address of the SVC device and the device name.

In the example shown above, a dummy DS8000 device has been added manually and subsequently removed.

5.1.2 Reconfiguration: Adding Devices from the TPC Server

The reconfiguration utility can also add devices that it finds on a configured TPC server (SVC and DS8000 devices). See section 1.4 - Device Types Supported.

If adding an SVC device from the TPC server, there are three ways to get the certificates. If TPC Health Check is being run on the TPC server, the certificate files will be automatically taken from the TPC server directory. If TPC Health Check is on an external server, you can place the certificates in C:\IBM\tpchc\import and the utility will use them there.
Otherwise, the utility will show instructions on-screen.

5.1.3 Reconfiguration: Getting and Converting keys

Use ReconfigureTPCHC.bat to copy and convert the SSH keys. If the TPCHC install is remote, the keys must first be copied locally to C:\IBM\tpchc\import. If TPCHC is installed on the TPC server itself, nothing needs to be done before running ReconfigureTPCHC.bat. ReconfigureTPCHC.bat will convert the key to the correct format.

5.1.4 Reconfiguration: Adjusting Locally Defined Device Policies

You can adjust which locally defined policies are associated with which devices. Using ReconfigureTPCHC.bat, you can select devices and then add or remove locally defined policy files (located in tpchc\conf\policies\local) from them. Devices may be selected individually; you may also select multiple devices and add or remove policies from all selected devices at once. See the section on local policies for more information.

5.2 Running TPCHC manually

Run the script c:\IBM\tpchc\runTPChc.bat, then look in the tpchc.log file to view the result.

Example:
Type c:\IBM\tpchc\log\tpchc.log

Tue Nov 29 08:15:03 2011: #######################################################################
Tue Nov 29 08:15:03 2011: # bin/tpchc.pl
Tue Nov 29 08:15:03 2011: #######################################################################
Tue Nov 29 08:15:03 2011: Starting check of all devices in TPC
Tue Nov 29 08:15:03 2011: Connecting to TPC
Tue Nov 29 08:15:04 2011: got 3 device(s) from TPC
Tue Nov 29 08:15:04 2011: skipped devices with types ds8000 as they have no default policy defined in c:\IBM\tpchc\conf\tpchc.cfg
Tue Nov 29 08:15:04 2011: writing new devicefile
Tue Nov 29 08:15:04 2011: processing central policy file
Tue Nov 29 08:15:04 2011: SVC-2145-CB_SVC_B6_03-IBM not found in policy file, adding it with default policy svc_standard_v1.0.pol
Tue Nov 29 08:15:04 2011: SVC-2145-CB_SVC_E1_01-IBM not found in policy file, adding it with default policy svc_standard_v1.0.pol
Tue Nov 29 08:15:04 2011: SVC-2145-CB_SVC_P2_02-IBM not found in policy file, adding it with default policy svc_standard_v1.0.pol
Tue Nov 29 08:15:04 2011: writing new policy file
Tue Nov 29 08:15:04 2011: Running checks of 3 device(s)
Tue Nov 29 08:15:04 2011: SVC-2145-CB_SVC_B6_03-IBM: Starting check
Tue Nov 29 08:15:04 2011: SVC-2145-CB_SVC_E1_01-IBM: Starting check
Tue Nov 29 08:15:04 2011: SVC-2145-CB_SVC_P2_02-IBM: Starting check
Tue Nov 29 08:15:06 2011: SVC-2145-CB_SVC_B6_03-IBM: Health check succesful
Tue Nov 29 08:15:06 2011: SVC-2145-CB_SVC_P2_02-IBM: Health check succesful
Tue Nov 29 08:15:14 2011: SVC-2145-CB_SVC_E1_01-IBM: Health check succesful
Tue Nov 29 08:15:14 2011: All devices was checked
Tue Nov 29 08:15:14 2011: Creating healthcheck report
Tue Nov 29 08:15:14 2011: Create an email with subject 'Healthcheck report for TPC for Carlsberg' for 'Per Lytkemeyer <pelu@dk.ibm.com>, Nicolai Kildal <nki@dk.ibm.com>'
Tue Nov 29 08:15:14 2011: attach body part
Tue Nov 29 08:15:14 2011: attach file : /opt/IBM/tpchc/tmp/tpchc-report.2011-11-29.xls
Tue Nov 29 08:15:14 2011: send email
Tue Nov 29 08:15:14 2011: Copied results to daily log directory: /opt/IBM/tpchc/results/daily/2011-11-29
Tue Nov 29 08:15:14 2011: Creating monthly archive zip: /opt/IBM/tpchc/results/monthly/2011-11.zip
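When checking the result of a manual run, it can help to look only at the entries from the most recent run. The following Python sketch is not part of TPCHC; it assumes that tpchc.log accumulates several runs and that each run starts with the "# bin/tpchc.pl" banner seen in the example above. The function name latest_run is hypothetical.

```python
import re

# Banner line written at the start of a run, as seen in the sample log.
# Assumption: every run begins with this banner.
RUN_START = re.compile(r": # bin/tpchc\.pl\s*$")


def latest_run(lines):
    """Return the log lines that belong to the most recent run."""
    lines = [ln.rstrip("\n") for ln in lines]
    starts = [i for i, ln in enumerate(lines) if RUN_START.search(ln)]
    if not starts:
        return []
    first = starts[-1]
    # Include the row of '#' characters printed just before the banner.
    if first > 0 and lines[first - 1].rstrip().endswith("####"):
        first -= 1
    return lines[first:]
```

A typical use would be to open c:\IBM\tpchc\log\tpchc.log, pass the file object to latest_run, and print the returned lines.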
6. Scheduling

To configure scheduling of TPCHC, run the utility ReconfigureTPCHC.bat. Once TPCHC has been run manually with success as described in section 5.2, you are ready to start scheduling. It is recommended to schedule a daily run of TPCHC.
7. Results from TPC Health Check

Results from the daily run are stored in a subfolder for each day.

c:\IBM\tpchc\results\daily\yyyy-mm-dd\

Each month, the first run of TPCHC will result in a copy of the daily files being stored in a zip file for that day.

c:\IBM\tpchc\results\monthly\yyyy-mm-dd.zip

The number of daily and monthly generations to keep is a setting you can change to suit your needs; see the Configuration section. The content of the result folders is described in the sections below.

7.1 Configuration backup

TPCHC pulls configuration settings from storage subsystems and saves the output in a .out file for each system. The .out file is in plain text and can be used during recovery or fail-back.

c:\IBM\tpchc\results\daily\yyyy-mm-dd\devices\device-name.out

Example, small extract from the beginning of a .out file:

$ more /opt/IBM/tpchc/results/daily/2011-11-30/devices/SVC-2145-CB_SVC_B6_03-IBM.out
#### <lsiogrp>
#### /
id:name:location:partnership:bandwidth:id_alias
000002006020B6DC:CB_SVC_B6_03:local:::000002006020B6DC
0000020064604756:CB_SVC_P2_02:remote:fully_configured:800:0000020064604756
id 000002006020B6DC
name CB_SVC_B6_03
location local
partnership
bandwidth
total_mdisk_capacity 393.7TB
space_in_mdisk_grps 393.7TB
space_allocated_to_vdisks 196.41TB
total_free_space 197.3TB
statistics_status off
statistics_frequency 5
required_memory 8192
cluster_locale en_US
time_zone 522 UTC
code_level 5.1.0.7 (build 18.2.1009060000)
FC_port_speed 2Gb
console_IP 158.98.70.3:9080
id_alias 000002006020B6DC
gm_link_tolerance 300
gm_inter_cluster_delay_simulation 0
gm_intra_cluster_delay_simulation 0
email_reply
email_contact
email_contact_primary
email_contact_alternate
email_contact_location
email_state invalid
inventory_mail_interval 0
total_vdiskcopy_capacity 196.41TB
total_used_capacity 196.41TB
total_overallocation 49
total_vdisk_capacity 196.41TB
cluster_ntp_IP_address 158.98.140.164
cluster_isns_IP_address
iscsi_auth_method none
iscsi_chap_secret
auth_service_configured no
auth_service_enabled no
auth_service_url
auth_service_user_name
auth_service_pwd_set no
auth_service_cert_set no
relationship_bandwidth_limit 25
gm_max_host_delay 5
id 0000020064604756
name CB_SVC_P2_02
location remote
partnership fully_configured
bandwidth 800
total_mdisk_capacity
space_in_mdisk_grps
space_allocated_to_vdisks
total_free_space
statistics_status off
statistics_frequency 0
required_memory 8192
cluster_locale en_US
time_zone 522 UTC
code_level 5.1.0.7 (build 18.2.1009060000)
FC_port_speed 2Gb
console_IP 158.98.70.3:9080
id_alias 0000020064604756
gm_link_tolerance 300
gm_inter_cluster_delay_simulation 0
gm_intra_cluster_delay_simulation 0
email_reply
email_contact
email_contact_primary
email_contact_alternate
email_contact_location
email_state invalid
inventory_mail_interval 0
total_vdiskcopy_capacity
total_used_capacity
total_overallocation
total_vdisk_capacity
cluster_ntp_IP_address 158.98.140.164
cluster_isns_IP_address
iscsi_auth_method none
iscsi_chap_secret
auth_service_configured no
auth_service_enabled no
auth_service_url
auth_service_user_name
auth_service_pwd_set no
auth_service_cert_set no
relationship_bandwidth_limit 25
gm_max_host_delay 5
#### </lscluster>

7.2 Log files

TPCHC generates three types of log files: a main log file (tpchc.log) and two with details for each subsystem (devicename.log and devicename_noncompliance.log). The content from the latest run in tpchc.log is copied to the daily folder. The subsystem log files show in detail how the configuration is pulled from the storage subsystem, which health check steps were performed, and details for noncompliant checks.

c:\IBM\tpchc\log\tpchc.log
c:\IBM\tpchc\results\daily\yyyy-mm-dd\tpchc.log
c:\IBM\tpchc\results\daily\yyyy-mm-dd\devices\devicename.log
c:\IBM\tpchc\results\daily\yyyy-mm-dd\devices\devicename_noncompliance.log

Noncompliance log files are described in section 7.2.1. Examples of the main log file and the subsystem log files are listed below.

Example 1, main log file:

Wed Nov 30 08:12:06 2011: #######################################################################
Wed Nov 30 08:12:06 2011: # bin/tpchc.pl
Wed Nov 30 08:12:06 2011: #######################################################################
Wed Nov 30 08:12:06 2011: Starting check of all devices in TPC
Wed Nov 30 08:12:06 2011: Connecting to TPC
Wed Nov 30 08:12:07 2011: got 3 device(s) from TPC
Wed Nov 30 08:12:07 2011: skipped devices with types ds8000 as they have no default policy defined in /opt/IBM/tpchc/conf/tpchc.cfg
Wed Nov 30 08:12:07 2011: writing new devicefile
Wed Nov 30 08:12:07 2011: processing central policy file
Wed Nov 30 08:12:07 2011: Running checks of 3 device(s)
Wed Nov 30 08:12:07 2011: SVC-2145-CB_SVC_B6_03-IBM: Starting check
Wed Nov 30 08:12:07 2011: SVC-2145-CB_SVC_E1_01-IBM: Starting check
Wed Nov 30 08:12:07 2011: SVC-2145-CB_SVC_P2_02-IBM: Starting check
Wed Nov 30 08:12:09 2011: SVC-2145-CB_SVC_B6_03-IBM: Health check succesful
Wed Nov 30 08:12:09 2011: SVC-2145-CB_SVC_P2_02-IBM: Health check succesful
Wed Nov 30 08:12:17 2011: SVC-2145-CB_SVC_E1_01-IBM: Health check succesful
Wed Nov 30 08:12:17 2011: All devices was checked
Wed Nov 30 08:12:17 2011: Creating healthcheck report
Wed Nov 30 08:12:17 2011: Create an email with subject 'Healthcheck report from TPC for Carlsberg' for 'pelu@dk.ibm.com, nki@dk.ibm.com'
Wed Nov 30 08:12:17 2011: attach body part
Wed Nov 30 08:12:17 2011: attach file : /opt/IBM/tpchc/tmp/tpchc-report.2011-11-30.xls
Wed Nov 30 08:12:17 2011: send email
Wed Nov 30 08:12:17 2011: Copied results to daily log directory: /opt/IBM/tpchc/results/daily/2011-11-30
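The per-device lines in the main log follow the pattern "timestamp: device: message", so a quick overview of the last message logged for each device can be pulled out with a few lines of Python. This is a minimal sketch based only on the sample log above; the wording of failure messages does not appear in this guide, so the sketch simply records whatever message follows "Starting check". The function name device_results is hypothetical.

```python
import re

# "<timestamp>: <device>: <message>", as in the sample main log.
# The device name contains no spaces, which separates these lines from
# general entries such as "attach file : ...".
DEVICE_LINE = re.compile(
    r"^\w{3} \w{3} +\d+ [\d:]{8} \d{4}: (?P<device>\S+): (?P<msg>.+)$"
)


def device_results(lines):
    """Map each device name to the last message logged for it."""
    results = {}
    for line in lines:
        match = DEVICE_LINE.match(line.strip())
        if match and match.group("msg") != "Starting check":
            results[match.group("device")] = match.group("msg")
    return results
```

Note that the tool writes "succesful" with this spelling, so any comparison against the message text has to use the same spelling.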
Example 2, detailed log file for a subsystem:

Wed Nov 30 08:12:07 2011: ###############################################
Wed Nov 30 08:12:07 2011: # Device: SVC-2145-CB_SVC_B6_03-IBM
Wed Nov 30 08:12:07 2011: ###############################################
Wed Nov 30 08:12:07 2011: Starting backup
Wed Nov 30 08:12:07 2011: Running script: 'cd /opt/IBM/tpchc/bin/expect; /opt/IBM/tpchc/bin/expect/get_config.exp SVC-2145-CB_SVC_B6_03-IBM 2>&1'
Wed Nov 30 08:12:07 2011: devicename SVC-2145-CB_SVC_B6_03-IBM
Wed Nov 30 08:12:07 2011: type: SVC ipaddress: 158.98.70.4
Wed Nov 30 08:12:07 2011: SSH Key file exist, continue
Wed Nov 30 08:12:07 2011: Writing svcinfo lsiogrp collected data to file
Wed Nov 30 08:12:07 2011: Writing svcinfo lscluster collected data to file
Wed Nov 30 08:12:07 2011: Writing svcinfo lscontroller collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsnode collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsmdiskgrp collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lshost collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsvdisk collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsfabric collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsrcrelationship collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsuser collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsmdisk collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsemailserver collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lssyslogserver collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lssnmpserver collected data to file
Wed Nov 30 08:12:08 2011: Writing svcinfo lsquorum collected data to file
Wed Nov 30 08:12:08 2011: [OK]
Wed Nov 30 08:12:09 2011: Backup status: OK
Wed Nov 30 08:12:09 2011: Finding policies to check:
Wed Nov 30 08:12:09 2011: svc_standard_v1.0.pol
Wed Nov 30 08:12:09 2011: Starting health check
Wed Nov 30 08:12:09 2011: Running script: '/opt/IBM/tpchc/bin/healthcheck/hc.pl /opt/IBM/tpchc/tmp/devices/SVC-2145-CB_SVC_B6_03-IBM.out /opt/IBM/tpchc/conf/policies/svc_standard_v1.0.pol 2>&1'
Wed Nov 30 08:12:09 2011: [Policy: svc_standard_v1.0.pol]
Wed Nov 30 08:12:09 2011: 1.1 - Firmware level: Compliant
Wed Nov 30 08:12:09 2011: 1.2 - Hardware level: Compliant
Wed Nov 30 08:12:09 2011: 2.1 - Max 2000 vdisks in any IOgrp: Compliant
Wed Nov 30 08:12:09 2011: 2.2 - Max 200 hosts in any IOgrp: Compliant
Wed Nov 30 08:12:09 2011: 2.3 - Quorum disk aktive state is no: Compliant
Wed Nov 30 08:12:09 2011: 2.4 - Quorum disk aktive state is yes: Compliant
Wed Nov 30 08:12:09 2011: 2.5 - Configured Quorum disk must be online: Compliant
Wed Nov 30 08:12:09 2011: 2.6 - All Vdisks must be stribed: Compliant
Wed Nov 30 08:12:09 2011: 2.7 - Vdisks max. size 2.0TB: Compliant
Wed Nov 30 08:12:09 2011: 2.8 - Vdisks max size 2000GB: Compliant
Wed Nov 30 08:12:09 2011: 2.9 - Email (SMTP) configured: Noncompliant
Wed Nov 30 08:12:09 2011: 2.10 - SNMP configured: Compliant
Wed Nov 30 08:12:09 2011: 2.11 - SNMP community configured: Compliant
Wed Nov 30 08:12:09 2011: 2.12 - NTP configured: Compliant
Wed Nov 30 08:12:09 2011: 2.13 - Intra Cluster Delay timing configured: Compliant
Wed Nov 30 08:12:09 2011: 2.14 - Inter Cluster Delay timing configured: Compliant
Wed Nov 30 08:12:09 2011: 2.15 - Mdisk max. 4095 configured: Compliant
Wed Nov 30 08:12:09 2011: 2.16 - Mdiskgrp. max 127 configured: Compliant
Wed Nov 30 08:12:09 2011: 2.17 - Hosts must have more than 1 connection: Noncompliant
Wed Nov 30 08:12:09 2011: 2.18 - SVC total capacity max. 300TB: Noncompliant
Wed Nov 30 08:12:09 2011: 2.19 - More than 1023 hosts defined: Compliant
Wed Nov 30 08:12:09 2011: 3.1 - Configured Nodes are online: Compliant
Wed Nov 30 08:12:09 2011: 3.2 - Disk Controllers online: Compliant
Wed Nov 30 08:12:09 2011: 3.3 - Configured Disk controllers are not degraded: Compliant
Wed Nov 30 08:12:09 2011: 3.4 - Configured mirrored vdisk are const_sync.: Compliant
Wed Nov 30 08:12:09 2011: 3.5 - Configured vdisks are online: Compliant
Wed Nov 30 08:12:09 2011: 3.6 - Configured Mdisks are online: Compliant
Wed Nov 30 08:12:09 2011: 3.7 - Configured mdiskgrps are online: Compliant
Wed Nov 30 08:12:09 2011: # Noncompliant TOTAL: 3
Wed Nov 30 08:12:09 2011: Healthcheck status: Noncompliant

7.2.1 Log file for noncompliant checks

A log file named <devicename>.noncompliant.log is stored in the same place as the other two files, in tmp\device (latest run) or results\daily\<date>\device (stored results). To recap the log files and their functions:
- <devicename>.log, used for general logging, such as the connection to the device, which commands have been run, the result of each policy check, a final summary of the number of noncompliant checks, and the overall status of the device.
- <devicename>.out, used for storing the output from each command run, separated into stanzas by XML-like tags.
- <devicename>.noncompliant.log, used for storing which lines failed to comply with the basic match check or, for more complicated checks, an explanation of the reason for failure.

To illustrate, see the example below.

Cisco policy check 2.1

id: cisco.2.1
checkname: Timeout must be set to 10 min. Or less
type: match
segment: terminal
stanzastart: ^Session\sTimeout:
stanzastop: $
stanzacriteria: include:^Session\sTimeout:\s+(?:[0-9]|1[0-9])\s+
stanzaempty: Noncompliant
severity: minor

.out file from Cisco device

#### <terminal>
TTY: /dev/pts/16 Type: "xterm"
Length: 0 lines, Width: 80 columns
Session Timeout: 30 minutes
Event Manager CLI event bypass: no
Redirection mode: ascii
Accounting log all commands (including show commands): no
Vlan batch mode: no
#### </terminal>

From policy check cisco.2.1 we see that the checkname is "Timeout must be set to 10 min. Or less". Examining the data above, we see that the session timeout is set to 30 minutes, which is noncompliant for this check. The corresponding .noncompliant.log for the device is:

#### <cisco.2.1>
############################################################################################################
Id: cisco.2.1
Name: Timeout must be set to 10 min. Or less
Type: match
Segment: terminal
Severity: minor
This check was noncompliant because of the following line(s), which did not meet the check criteria:
------------------------------------------------------------------------------------------------------------
Session Timeout: 30 minutes
------------------------------------------------------------------------------------------------------------
#### </cisco.2.1>

We can see the id, name, type, segment and severity for this check. Because it is a match check, the log also shows which lines conflict with the check, as listed above. For this simple example there is little difference between opening the .out file and the log file, but if several checks fail, you would otherwise have to go back and forth between the .log file and the .out file. With the .noncompliant.log file, all noncompliant checks and their reasons are gathered in one file.

Example of a .noncompliant.log check of type count

#### <svc.2.21>
############################################################################################################
Id: svc.2.21
Name: Check for more that 5 mdisk groups
Type: count
Segment: lsmdiskgrp
Severity: info
Reason for noncompliance: The number of lines matched was 6 which is above 5
The lines counted were:
------------------------------------------------------------------------------------------------------------
0 C05_DS83_P4_0 online 12 57 20.1TB 512 8.0TB 12.04TB 12.04TB 12.04TB 59 0
1 C05_DS83_P4_1 online 12 100 20.1TB 512 9.1TB 10.93TB 10.93TB 10.93TB 54 0
2 C05_DS83_P4_2 online 18 77 27.8TB 512 9.3TB 18.56TB 18.56TB 18.56TB 66 0
3 C05_DS83_P4_3 online 18 38 32.4TB 512 17.1TB 15.33TB 15.33TB 15.33TB 47 0
4 C05_DS83_P426_0 online 12 26 18.5TB 512 7.3TB 11.22TB 11.22TB 11.22TB 60 0
5 C05_DS83_P426_1 online 12 6 21.6TB 512 17.5TB 4.08TB 4.08TB 4.08TB 18 0
#### </svc.2.21>

Now it is easy to see why this check is noncompliant: "The number of lines matched was 6 which is above 5", followed by the exact lines that were matched. This makes it easier to start
correcting the issue.

7.3 Health Check report

The report generated is an Excel format spreadsheet. The report contains the result of the device configuration backup and the Health Check runs. The report is organized into a dashboard, a summary sheet, a noncompliant checks sheet and one sheet for every policy checked. Below you can see an example report.

7.3.1 Example dashboard

Dashboard

7.3.2 Example summary report

Summary Sheet

7.3.3 Example Noncompliance report

Noncompliant report checks

Note the column Link to guide; it contains links to guides on the GSA website for each check.

7.3.4 Example Policy report

Policy report example

Result of all checks in a policy for each device.

8. TPCHC command line options

TPCHC can be called with a number of options when started. Each option is explained below. In the explanations, <tpchc base dir> is used to denote the path to TPCHC. Default path when installing TPCHC is C:/Program Files/IBM/tpchc.

8.1 Oneshot

This option allows for multiple runs of TPCHC without sending alerts, mails, or archiving results. Output from running TPCHC with this option will only be stored in <tpchc base dir>/tmp.

Note: the content of <tpchc base dir>/tmp will be deleted every time TPCHC is started.

Syntax: tpchc.pl oneshot

8.2 Collect support files

TPCHC is able to collect files from a range of device types. Different methods are used to collect the support files from each device type; this is managed by TPCHC.
Device types supported: Cisco MDS, SVC, V7000

Collected files are stored at this location:

<tpchc base dir>/results/support/<device name>

Note: running TPCHC will delete any existing support files for the specified device.

Syntax: tpchc.pl support <device name>

8.3 Enable debugging

In situations where TPCHC returns unexpected results or a backup fails without a meaningful error message, debugging can be turned on. This will make TPCHC log all details during execution. Debug can also be used in combination with other options like oneshot and support.

Syntax: tpchc.pl debug

8.4 Local files

Files from another installation can be copied to the server where TPCHC is installed. When TPCHC is run with the localfiles option, backup functionality is disabled and TPCHC instead uses the specified location as input to be health checked.

Syntax: tpchc.pl localfiles c:\temp\files from server1\results

9. Alerting and Notification

TPCHC can be configured to alert the administrator(s) in case of failures or compliance issues detected during execution of the scripts. Alerts are divided into two categories, system and compliance alerts; each can be sent to different receivers. TPCHC can also notify administrator(s) about the result of the last run by sending an email with the Health Check report.

9.1 System alerts

If an error occurs during execution of TPCHC, a system alert can be sent. Examples of errors that cause system alerts are:

- Wrong access rights to files used by TPCHC
- Missing credentials or SSH-keys to log on to a device
- No connectivity to TPC database

Settings for system alerts are configured in tpchc.cfg. For more details, see the Configuration section (section 10).
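One of the error causes listed above, a missing SSH key, can be checked before a run. The following Python sketch is not part of TPCHC. It relies on the rule stated in section 10.2 that the certificate filename must be the SVC cluster IP address, and it assumes the file name is exactly the address with no extension. The function name missing_certificates is hypothetical.

```python
from pathlib import Path


def missing_certificates(cert_dir, svc_addresses):
    """Return the SVC cluster IP addresses that have no key file in cert_dir.

    Assumption: each key file is named exactly after the cluster IP
    address, as described in section 10.2.
    """
    cert_dir = Path(cert_dir)
    if cert_dir.is_dir():
        present = {p.name for p in cert_dir.iterdir() if p.is_file()}
    else:
        # A missing directory means every certificate is missing.
        present = set()
    return [ip for ip in svc_addresses if ip not in present]
```

For a default installation the directory would be C:\IBM\tpchc\conf\certs, for example missing_certificates(r"C:\IBM\tpchc\conf\certs", ["158.98.70.4"]).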
9.2 Compliance alerts

If a device does not comply with health checks defined in a given policy, a compliance alert can be sent. A device with two or more noncompliant checks will only result in one alert per device. An alert is generated based on the check with the highest severity.

Settings for compliance alerts are configured in tpchc.cfg. For more details, see the Configuration section (section 10).

9.3 Threshold based alerting

When an alert is being generated, a severity is assigned. TPCHC can be configured to only send alerts that have a certain severity level or higher.

List of severities:

Severity    Level
Fatal       6 (High)
Critical    5
Major       4
Minor       3
Warning     2
Info        1 (Low)

Threshold settings for compliance alerts are configured in tpchc.cfg. For more details, see the Configuration section (section 10).

9.4 Notification via email

The Health Check report that is generated by TPCHC can be sent to a list of mail recipients via email. This functionality is especially useful when starting to use Health Checking, because it provides an updated overview and makes keeping track of progress easy.

Settings for email are configured in tpchc.cfg. For more details, see the Configuration section (section 10).

10. Configuration

Configuration settings for TPCHC are stored in c:\IBM\tpchc\conf. The sections below describe each file.

NOTE: A new debug flag has been added to tpchc.pl, both as a command line option and in the global configuration file below. When enabled, all communication between the scripts and the storage devices is logged in the device log files. For example, setting debug to 1 enables debug output from device drivers and can be used to enable debug output for scheduled runs of tpchc.pl.
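The threshold rule from section 9.3 and the alertseverity_compliance setting in tpchc.cfg can be illustrated together. The following Python sketch is not the TPCHC implementation: it assumes that tpchc.cfg consists of "key = value" lines with "#" comments, as in the listing in section 10.1, and it takes the severity levels from the table in section 9.3. The function names read_settings and meets_threshold are hypothetical.

```python
# Severity levels as listed in section 9.3 (1 = low, 6 = high).
SEVERITY_LEVELS = {
    "info": 1, "warning": 2, "minor": 3,
    "major": 4, "critical": 5, "fatal": 6,
}


def read_settings(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def meets_threshold(check_severity, threshold):
    """True if check_severity is at or above the configured threshold."""
    return (SEVERITY_LEVELS[check_severity.lower()]
            >= SEVERITY_LEVELS[threshold.lower()])
```

With alertseverity_compliance = critical, a noncompliant check of severity fatal or critical would pass the threshold, while a check of severity minor would not.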
10.1 Configuration file (tpchc.cfg) ######################################################### # tpchc.cfg ########################################################## # # TPCHC version # ------------- # tpchc_version = 1.6 # TPCHC parameters # ---------------- # Description and unique identifier for this TPCHC instance tpchc_description = tpchc_identifier = 1 debug = 0 doc_url = http://bldgsa.ibm.com/projects/s/storage_automation/sat/Docs/TPC-HC/ Guides/ # # TPC parameters # -------------- # Description of the TPC instance and DB2 connection parameters # #tpc_db_user = #tpc_db_pass = # # Default policies Licensed Material Property of IBM Corporation 44 10. Configuration www.ibm.com 10 # ---------------- # Naming syntax is "policy_<device type>" # Controls which health check policies are used as default for a given device type # policy_svc = svc_standard_v1.6.pol policy_ds8000 = ds8k_standard_v1.6.pol policy_brocade = brocade_standard_v1.6.pol policy_xiv = xiv_standard_v1.6.pol policy_cisco = cisco_standard_v1.6.pol policy_v7000 = v7000_standard_v1.6.pol # # Expect settings # --------------- # Username used to log on to SVC devices # expect_svc_username = admin #dscli = #xcli = # # Parallelism # ----------- # Maximum number of devices checked in parallel by tpchc.pl # num_concurrent_workers = 5 # # Results archive # --------------- # Number of daily and monthly results to keep in the results archive # num_daily_results = 31 num_monthly_results = 12 # # Mail configuration # ------------------ # Mail address to receive the Healthcheck report and name of the sender # mail_recepients can be a comma separated list of mail addresses Licensed Material Property of IBM Corporation 45 10. 
    #mail_sender =
    #mail_recipients =
    #mail_smtpserver =
    #
    # Alert configuration
    # -------------------
    # User configurable alertscripts:
    # alertscript_compliance: command to run when a device is found to be Non-Compliant
    # alertseverity_compliance: minimum compliance alert severity we send compliance alerts for
    # alertscript_system: command to run when the tpchc.pl engine detects a runtime problem
    #
    # The alertscript settings can contain the following parameters, which are then set by tpchc.pl at runtime:
    # <message> : a text describing the alert
    # <severity> : INFO, WARNING or ERROR
    # <device> : name of the device if applicable (only for compliance alerts)
    # If script lines are commented out, no alerts will be sent
    #
    #alertscript_compliance = /bin/sh example_compliance_alert.sh <device> <severity> <message>
    #alertseverity_compliance = critical
    #alertscript_system = /bin/sh example_system_alert.sh <severity> <message>

10.2 Certificate files

Logging on to SVCs requires SSH certificate key files. These files are stored in a sub-folder called certs:

    C:\IBM\tpchc\conf\certs

The certificate filename must be the SVC cluster IP address. Refer to section 5.1 for how to add devices.

10.3 Policy files

TPCHC is supplied with a standard policy for each supported device type. These policies are also listed in the TPCHC configuration file (tpchc.cfg) as the default policies. You can create as many policy files as needed and copy them to the policy folder:

    C:\IBM\tpchc\conf\policies\local

If TPCHC is not installed in the default directory, then substitute C:\IBM with the path you selected.

10.4 Activate local policies

To activate a policy it must be assigned to one or more storage devices. This is done using ReconfigureTPCHC.bat.

10.5 Run as stand alone or integrated with TPC

TPCHC is able to run based on input from TPC, from the local configuration file, or from a combination of both.
The table below shows the combinations.

    TPC integration   Local configuration file   Result
    Yes               No                         Storage subsystems known by TPC will be added to
                                                 TPCHC and checked when TPCHC is run. Brocade
                                                 must be added to the local configuration file.
    No                Yes                        Devices added locally using the
                                                 ReconfigureTPCHC.bat utility will be checked by
                                                 TPCHC.
    Yes               Yes                        Devices from both TPC and the local
                                                 configuration file will be checked by TPCHC. If
                                                 a device with the same IP address exists both in
                                                 TPC and locally, then both devices will be
                                                 checked.

How to enable/disable TPC integration

You set up the integration during installation; see section 4.5 in this guide. To enable TPC integration after the install is complete, run the ReconfigureTPCHC.bat script. When the prompt "Do you want to enable TPC support? (Y/n)" is displayed, answer Y to enable TPC support. To disable TPC integration after the install is complete, run the ReconfigureTPCHC.bat script and answer n to the same prompt.

How to edit local configuration file

Section 5.1 in this guide shows how to use the ReconfigureTPCHC.bat script to manage the local configuration file.

11. Health Check Engine

TPCHC v1.4 is shipped with a new health check engine, allowing for numerous new possibilities for matching and counting. The next sections describe each enhancement, and a use case provides an example of each.

11.1 Global and Local policies

It is possible to have both global and local policy files. The difference is that global policy files are shipped with TPCHC and are based on IBM SSA Best Practices. Local policies are:
- extensions of Global policies, i.e. checks for other or additional things that are special for the local environment; an example might be a special naming convention
- overwrites of Global policies. The reason for overwriting might be security policies that differ from the Global ones, or a special design in the local environment.
Global policies enforce ITSC104.

Global and local policies are stored in different places in the file system:
- Global policies: tpchc\conf\policies\global
- Local policies: tpchc\conf\policies\local

Policy file names must end with the extension .pol, and the name of the file should make it self-evident what it contains, for example svc_standard_v1.6.pol, which is the Global Policy file for TPCHC v1.6.

All Global policy files are under revision control and should not be edited. If they are, the Excel report shows this as a critical change. The right way to change a Global policy is to create a local policy containing checks with the same check id and change the parameters there. This is highlighted in the Excel report as a minor change. The reason for highlighting is to make it easy to spot deviations from SSA Best Practices. It is valid to use local policies when an SSA Best Practice conflicts with local rules for the environment.

When tpchc.pl runs, all the policy files stored in the global and local directories are read. This means that Global policies are always run for all devices, regardless of local policies. Overwriting is based on the id of the check: if a local check has the same id as a global check, the local check overwrites the global check and is run instead of it.

Below is an example where check svc.3.1 overwrites a global policy check, taken from the general log file (devicename.log) for our SVC device. Global checks are listed first in the log file:

    svc.3.1 - Firmware level 4.4 or higher: Disabled by local check : critical
    svc.3.2 - Check SVC node same hardware in cluster: Noncompliant : info

The normal log syntax, as seen for svc.3.2, is "id - name: status of check : severity". For svc.3.1 the status of the check is "Disabled by local check".
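The override rule described above can be sketched as follows. This is a minimal illustration, not the tpchc.pl implementation: the function name effective_checks and the representation of a check as a dictionary with an "id" key are assumptions made for the example.

```python
def effective_checks(global_checks, local_checks):
    """Decide which checks are run, based only on the check id.

    Both arguments are lists of dictionaries with at least an "id" key.
    Returns a list of (origin, id, status) tuples, global checks first,
    mirroring the order in which they appear in the device log.
    """
    local_ids = {check["id"] for check in local_checks}
    plan = []
    for check in global_checks:
        # A global check is disabled when a local check reuses its id.
        if check["id"] in local_ids:
            status = "disabled by local check"
        else:
            status = "run"
        plan.append(("global", check["id"], status))
    for check in local_checks:
        # Local checks always run, whether they override or extend.
        plan.append(("local", check["id"], "run"))
    return plan
```

Applied to global checks svc.3.1 and svc.3.2 and local checks svc.3.1 and svc.9.1, the global svc.3.1 is marked as disabled, the global svc.3.2 still runs, and both local checks run.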
Scrolling further down in the log file we see the local checks:

    [Policy: /cygdrive/c/Program Files/IBM/tpchcv/conf/policies/local/svc_codelevels.pol]
    svc.3.1 - Firmware level 4.4 or higher: Compliant : info
    svc.9.1 - Golden Firmware level 6.3.0.4: Noncompliant : critical

In the local check section we see the result of the custom check svc.3.1. See the "Excel report updated" section for changes to the report regarding Global and Local policies. To create a local policy, like svc.9.1, see the use case section.

Checks are divided into 5 categories

The check id is a combination of device type, category number and a sequential number. There are 5 categories of policy checks:
- 1: security related checks, e.g. password length
- 2: settings related checks, e.g. has an SMTP server been added
- 3: hardware/software version related checks, e.g. is the device running the Golden Firmware level
- 5: status checks, e.g. are all the disks or ports online
- 9: reserved for custom checks

Example:

    id: svc.3.1
    checkname: Golden Firmware level 6.3.0.4

The id says that the check is for storage type SVC, that it is a hardware/software version related check (category 3), and that it is the first check in this category (sequence number 1). When reading the check name, it should be clear what the check is doing.

11.2 Check Types

Prior to TPCHC v1.4 all checks were based on matching values; from TPCHC v1.6 new types have been created. The type must be specified in each check, and each type has some specific syntax to be aware of.
Required parameters when writing checks consist of:
- id <checkid>
- checkname <the check name describes what the check performs>
- type <the type of the check, chosen from a wide range of predefined check types>
- segment <segment name to check in backup file>
- stanzastart <regex> regular expression for the start of the stanza that must match, example ^\d+
- stanzastop <regex> regular expression for the end of the stanza that must match, example $
- match <regex> regular expression capturing the value to check using one set of parentheses
- criteria <min|max|equal|range>:<value> min, max and equal take a single numeric value; range takes 2 numeric values separated by a dash

Examples of the match and criteria parameters:

    match: ^temperature\:\s+(\d+\.\d+)\s*    (captures a temperature value specified as a decimal value)
    criteria: min:10                         (temperature must be at least 10 degrees)
    criteria: max:50.01                      (temperature must be at most 50.01 degrees)
    criteria: range:5-25.3                   (temperature must be between 5 and 25.3 degrees, both included)

The criteria can also be a metric value to compare the number of captured lines against, as in check type count or check type countcompare_balance.

- stanzaempty <compliant|noncompliant> result of the check when either the stanza is empty or no matches are found
- severity <info|warning|minor|major|critical|fatal> describes how critical the check is

Some checks are more complicated and can contain 2 segments, 2 stanzastart, 2 stanzastop and possibly 2 criteria.

Examples of Check types

The type of the check defines how to check for something in the output files from the devices. A check type can be a comparison of text, a count of something, a match of text or even a mathematical calculation. Some types are so specialized that they only support one or very few checks.

match

The match check is a simple match of text or patterns.
It can also match numbers greater or lower than a specific value. The syntax can be seen below.

    id: svc.2.5
    checkname: All vdisks must be striped
    type: match
    segment: lsvdisk
    stanzastart: ^\d+.*
    stanzastop: $
    criteria: include:^\d+?:[^\:]+:\d+?:[^\:]+:[^\:]+:[^\:]+:[^\:]+:[^\:]+:(striped|many):
    stanzaempty: Compliant
    severity: info

What is special about this check is that the criteria can either include the provided regular expression or exclude it. In the example above an include is used. The type attribute contains the value match, as it is a match check. In the example above, the lines captured by stanzastart and stanzastop must contain either the word 'striped' or the word 'many' at a specific location in the line.

multimatch

Multimatch is based on the same principles as match. It is just done in 2 segments, which can be the same segment. The reason for using 2 segments is that there are dependencies between them. This means that:
- For the entire check to be compliant, the check in segment 1 must be Compliant and the check in segment 2 must be Compliant.
- If the check in either segment 1 or segment 2 is Noncompliant, then the entire check is Noncompliant.

The syntax for the check is the same as for match, just with 2 segments, 2 criteria and 2 stanza start/stop pairs:

    id: brocade.3.3
    checkname: SFP vendor on DCX switch must be Brocade
    type: multimatch
    segment_1: switchshow
    stanzastart_1: ^switchType:
    stanzastop_1:$
    criteria_1: include:62.3
    segment_2: sfpshow
    stanzastart_2: ^Slot\s+\d+(?:[^\s]*)\s+\d+\:\s+id\s+(?:[^\s]*)\s+\Vendor\:\s+BROCADE
    stanzastop_2:$
    criteria_2: include:BROCADE
    stanzaempty: Compliant
    severity: major

Here the multimatch check is used for a dependency check: if we are checking a Brocade DCX (type 62.3 in segment switchshow), then the SFP vendor must be BROCADE in segment sfpshow.
If we are checking another switch type, we will not match anything and the check will be empty, resulting in Compliant because of stanzaempty: Compliant.

count

The count check type counts something and compares the count against a predefined value. Some checks need to count the number of lines, and for this purpose the type count is very useful. Sometimes it is also necessary to count the number of lines containing a given text or pattern. The syntax can be seen below.

    id: svc.2.41
    checkname: Max 5 controllers pr. cluster
    type: count
    segment: lscontroller
    stanzastart: ^\d+:
    stanzastop: $
    criteria: max:5
    stanzaempty: Compliant
    severity: info

The type attribute contains the value count. In this example we want to count the number of lines in segment lscontroller that start with a number, and there must be no more than 5 of them. The stanzastart attribute captures lines starting with one or more digits (\d+) followed by a colon. The criteria defines that there must be at most 5 of these lines. The criteria can contain the following values:

    criteria: max:5
    criteria: min:5
    criteria: equal:5
    criteria: range:5-10

countcompare

The countcompare check is a more advanced version of the count check. It is based on matching 1 value for each line in the stanza. This value MUST be captured by parentheses, (), in stanzastart or stanzastop. Countcompare counts the number of times each value occurs in the stanza. Each of these counts is compared against a value in the same way as in the count check type. The syntax can be seen below.

    id: svc.2.37
    checkname: max. 450 vdisks per host
    type: countcompare
    segment: lshostvdiskmap
    stanzastart: ^\d+:(\w+)
    stanzastop: :
    criteria: max:450
    stanzaempty: Compliant
    severity: info

The type is countcompare. The regular expression for stanzastart states that the line starts with a number and a colon, followed by one or more word characters, which is the value we want to capture:
    (\w+)

The criteria states that we want a maximum of 450 occurrences of each captured word. In the criteria of countcompare it is possible to write:

    criteria: max:5
    criteria: min:5
    criteria: equal:5
    criteria: range:5-10

comparetext

The comparetext check type is based on comparing values captured in stanza start/stop. These captured values must meet the stanzacriteria, which can only contain the value allequal. This means that all the captured values must be equal. The syntax can be seen below:

    id: svc.3.2
    checkname: Check SVC node same hardware in cluster
    type: comparetext
    segment: lsnode
    stanzastart: ^hardware\s+(\w+)
    stanzastop: $
    stanzacriteria: allequal
    stanzaempty: Compliant
    severity: info

In this check stanzastart states that all lines must start with the word hardware, followed by one or more spaces (\s+). Finally, the value to compare is captured by (\w+). The stanzacriteria states that all these values must be equal.

multisum

Like multimatch, multisum is a check in 2 segments because of a dependency between these segments. It is based on a sum_1 extracted from segment 1 and a sum_2 extracted from segment 2. The stanzacriteria compares sum_1 and sum_2, which determines whether the check is compliant or noncompliant. The syntax can be seen below:

    checkname: 1.3
    type: multisum
    segment_1: lshost
    stanzastart_1: ^(\d+)
    stanzastop_1: :
    segment_2: lsiogrp
    stanzastart_2: ^\d+?:[^\:]+?:\d+?:(\d+)
    stanzastop_2: :
    stanzacriteria: sum_1 < sum_2
    severity: info
    stanzaempty: Compliant

The values to sum are captured by parentheses in stanza start/stop. The stanzacriteria in this example states that sum_1 must be less than sum_2. The stanzacriteria for multisum can contain the following values:

    stanzacriteria: sum_1 < sum_2
    stanzacriteria: sum_1 > sum_2
    stanzacriteria: sum_1 = sum_2
    stanzacriteria: sum_1 != sum_2

countcompare_balance

This check is based on a balance between 2 captured values.
For each second captured value there must be a maximum of 2 different values of the first captured value, and each of these 2 values must occur the same number of times. The syntax can be seen below:

    id: svc.2.20
    checkname: Controller balancing
    type: countcompare_balance
    segment: lsfabric
    stanzastart: ^
    stanzastop: \w+:(\w\w)\w*:\w*:\w*:\w*:\w*:\w*:active:(\w+):\w*:controller$
    stanzacriteria:
    stanzaempty: Compliant
    severity: major

Stanzastop is responsible for capturing the values to check for balance. (\w\w) is the first captured value and (\w+) is the second captured value.

matchvalue

This check type allows a numeric comparison of a captured value. The implemented comparisons are min, max, equal and range. The check type uses a regular expression with parentheses to capture the value that needs to be checked. Required parameters when writing checks based on this type:

    id: <checkid>
    checkname: <name of check>
    type: matchvalue
    segment: lsfabric
    stanzastart: ^
    stanzastop: \w+:(\w\w)\w*:\w*:\w*:\w*:\w*:\w*:active:(\w+):\w*:controller$
    match: ^temperature\:\s+(\d+\.\d+)\s*
    criteria: min:10
    criteria: max:50.01
    criteria: range:5-25.3
    stanzaempty: Compliant
    severity: major

11.3 Creating custom check types

If you have identified a need for additional check types, a request for new check types can be sent by mail to support. Please include an explanation and examples in the request.

11.4 Renumbered existing policies

Some checks from rel. 1.3.4 have been renumbered in rel. 1.4, but the check names remain the same. This is due to enforcing the 5 categories of policy checks:
- 1: security related checks, e.g. password length
- 2: settings related checks, e.g. has an SMTP server been added
- 3: hardware/software version related checks, e.g. is the device running the Golden Firmware level
- 5: status checks, e.g. are all the disks or ports online
- 9: reserved for custom checks
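The id convention behind these categories (device type, category number, sequential number) can be sketched as a small parser. This is only an illustration; the function name parse_check_id and the short category labels are assumptions made for the example and are not part of TPCHC.

```python
# Category numbers as listed above; the labels are shortened for the example.
CATEGORIES = {
    1: "security",
    2: "settings",
    3: "hardware/software version",
    5: "status",
    9: "custom",
}


def parse_check_id(check_id):
    """Split an id such as 'svc.3.1' into (device type, category, sequence)."""
    device_type, category, sequence = check_id.split(".")
    category = int(category)
    if category not in CATEGORIES:
        raise ValueError(f"unknown check category {category} in {check_id!r}")
    return device_type, CATEGORIES[category], int(sequence)
```

For the id svc.3.1 this returns the device type svc, the hardware/software version category and the sequence number 1. An id using a category outside the list, such as svc.4.1, is rejected.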
Example:

    id: svc.3.1
    checkname: Golden Firmware level 6.3.0.4

The id says that the check is for storage type SVC, that it is a hardware/software version related check (category 3), and that it is the first check in this category (sequence number 1).

The reason for renumbering is alignment and, in coming versions of TPCHC, the possibility of filtering and reporting on the number of failed checks per category, per device type, etc.

In general, policy checks 3.1 and 3.2 are used for Golden Firmware level and Acceptable Firmware level for all devices. The firmware levels come as Best Practice from the IBM SSA Team.

Below is the list of checks that have been renumbered.

DS8K
- Enclosure status must be normal, to 5.8 from 5.24
- IOport masking must be set to "all" for LUNs, to 2.19 from 5.4
- Email notify enabled, to 2.6 from 5.14
- Email address entered, to 2.8 from 5.15
- Check for none expiring Password, to 1.4 from 5.16
- Password expired, to 1.5 from 5.17
- Failed login, to 1.6 from 5.18
- IO Port status must be online, to 5.9 from 5.19
- Topology must be set for all IO ports, to 2.9 from 5.20
- Type must be DS, to 2.10 from 5.5
- Unknown wwpn must be = 0, to 2.2 from 5.12
- LUNs must be assigned to a volume group, to 2.11 from 5.13

SVC
- Acceptable Firmware level 5.1.0.11, to 3.2 from 3.1
- Check for serialnumbers on each PDU, to 3.4 from 2.19

Brocade
- Each cp must have an ipaddress configured, to 2.27 from 5.2
- Remaining buffers must be greater than 100, to 2.28 from 5.9
- Check that mixed zoning are not used, to 2.34 from 5.12

12. Editing Policy files

This section describes the syntax of a policy file and how it is used with the Health Check script. A Health Check can be scheduled to run as a part of TPCHC, or it can be run locally on your own laptop to syntax check the policy files while updating them.
The policy files are located in directory C:\tpchc/conf/policies.

Description of the file types:

The content of a policy file is in readable text format, where each file contains a set of checks. Each check has a set of attributes, where the text patterns for the stanzastart, stanzastop and stanzacriteria attributes are described using regular expressions.

File types:
- Health Check script (hc.pl)
- Policy file (policyname.pol)
- Text file to be checked (devicename.out)

The device output files are generated using expect scripts. Lines starting with #### are defined by the programmer; the rest comes from the device configuration. The segments in the device output files are delimited using an XML-like syntax. An example is the segment lshost, where the start and stop markers for the segment are written #### <lshost> (start of segment) and #### </lshost> (end of segment).

If a segment lists a set of entities, the attributes of each entity are separated by a colon (:) or by whitespace. An example (from the lshost segment):

    id:name:port_count:iogrp_count
    0:ARL_DKRI4EX030A:2:1
    1:PMLANPM02:2:4
    2:ARL_DKRI4EX040A:2:1

The first line displays the header for each attribute, making the output easier to read. The lines that follow represent the entities, with the attributes separated by a colon as described.

How the script works

When the script is run, it first indexes the entire .out file and splits it up into segments. The same happens when the script subsequently indexes the policy files. Next, the script applies each check to all stanzas in the relevant segments (or to the whole file if the check is not limited to a specific segment). Depending on whether a stanza matches a check and whether the stanzacriteria is set to include or exclude, a value of either COMPLIANT or NONCOMPLIANT is returned. It is important to remember that if a segment has multiple stanzas, all of them must pass the stanzacriteria for the check to succeed.
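The processing steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the hc.pl implementation: the function names split_segments and run_match_check are hypothetical, every stanza is assumed to be a single line, and the port_count criteria used in the sample is made up for the example. Only the #### <name> markers and the lshost sample data come from this guide.

```python
import re

SEGMENT_START = re.compile(r"^####\s*<(\w+)>")
SEGMENT_STOP = re.compile(r"^####\s*</(\w+)>")


def split_segments(text):
    """Index a device output file: map each segment name to its lines."""
    segments, current = {}, None
    for line in text.splitlines():
        start = SEGMENT_START.match(line)
        if SEGMENT_STOP.match(line):
            current = None                # end-of-segment marker
        elif start:
            current = start.group(1)      # start-of-segment marker
            segments[current] = []
        elif current is not None:
            segments[current].append(line)
    return segments


def run_match_check(lines, stanzastart, stanzastop, criteria,
                    stanzaempty="COMPLIANT"):
    """Apply an include/exclude criteria to every single-line stanza."""
    mode, _, pattern = criteria.partition(":")
    stanzas = [line for line in lines
               if re.search(stanzastart, line) and re.search(stanzastop, line)]
    if not stanzas:
        return stanzaempty                # nothing matched start/stop
    for stanza in stanzas:
        found = re.search(pattern, stanza) is not None
        # include: the pattern must be found; exclude: it must not be found.
        if found != (mode == "include"):
            return "NONCOMPLIANT"         # all stanzas must pass
    return "COMPLIANT"


SAMPLE = """#### <lshost>
id:name:port_count:iogrp_count
0:ARL_DKRI4EX030A:2:1
1:PMLANPM02:2:4
2:ARL_DKRI4EX040A:2:1
#### </lshost>
"""

segments = split_segments(SAMPLE)
# Hypothetical check: every host must report a port_count of 2.
result = run_match_check(segments["lshost"], r"^\d+:", r"$",
                         r"include:^\d+:[^:]+:2:")
```

In this sample all three hosts have a port_count of 2, so result is COMPLIANT. The header line is not treated as a stanza because it does not match the stanzastart pattern.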
12.1 Regular expressions - examples

Below are examples of regular expressions that are very useful:

    RegEx                  Explanation
    ^                      Start of a line.
    $                      End of a line.
    ^car                   Line must start with car.
    .*                     0 or more of any characters (wildcard).
    .+                     1 or more of any characters.
    \d                     Digit (0-9).
    \d+\.\d+\.\d+\.\d+     Any IP address in dotted notation (the numbers are not range checked).
    [0-9][a-z][A-Z]        The first character must be a digit from 0 to 9, the second
                           character must be a lowercase letter from a to z (case
                           sensitive) and the last character must be a capital letter
                           from A to Z.
    \w                     Word character.
    ?                      Match 1 or 0 times.
    [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}
                           Email validation.

12.2 Guide to create a check

A check can be created in an existing policy file or in a new policy file. The policy files always have the extension .pol. The following sections explain how the example below was created.

    id: svc.5.2
    checkname: Host ports are online
    type:match
    segment: lsfabric
    stanzastart: ^
    stanzastop: (host)$
    stanzacriteria: include:^[^\:]+:[^\:]+:\d+?:[^\:]+:[^\:]+:\d+?:[^\:]+:active
    stanzaempty: COMPLIANT
    severity: critical

This is a real example for the SVC.

12.2.1 Defining check

The first thing to do is to define the name of the check. In this example the name of the check is:

    checkname: Host ports are online

NOTE: Check names must be unique.

12.2.2 Segment to search

The next step is to define where the check should search. Defining the segment attribute limits the search to that segment. To search the entire file and not just a single segment, remove the segment attribute. In this example the segment lsfabric is used.

    segment: lsfabric

The segment to check is located in the device output file. The segment lsfabric can be seen below. The goal here is to make sure that all hosts are active.
12.2.3 Stanza start/stop conditions

The stanzastart attribute should match the start of the lines to check, and the stanzastop attribute should match the end of those lines. Both are described using regular expressions. If you want to check the entire segment instead of the individual stanzas, you can leave out the stanzastart and stanzastop attributes. This option is also possible when searching through the entire file (when the segment attribute is left out as described above). Both stanzastart and stanzastop are defined in this example.

    stanzastart: ^
    stanzastop: (host)$

The segment lsfabric is shown in the previous section (12.2.2). The stanzastart attribute matches the start of any line, and the stanzastop attribute states that all lines within the stanza must end with the word host. This identifies all hosts, which leads to the next attribute, the stanzacriteria.

12.2.4 Stanzacriteria definition

The stanzacriteria must be defined to either include or exclude the match provided by the regular expression. This is done by writing include or exclude at the start of the stanzacriteria, followed by a colon (:) and then the criteria itself written as a regular expression.

    stanzacriteria: include:^[^\:]+:[^\:]+:\d+?:[^\:]+:[^\:]+:\d+?:[^\:]+:active

In the previous section (12.2.3), the stanza was reduced by stanzastart/stanzastop to contain only lines ending with host. The next step is to check whether these hosts are active. This is done using an include followed by a regular expression that matches the word active at the correct location in the line. What the stanzacriteria (and stanzastart/stanzastop) matches can be seen below:

12.2.5 Stanzaempty definition

Stanzaempty is an attribute that tells whether the check should be COMPLIANT or NONCOMPLIANT if no stanza is matched by stanzastart and stanzastop. In the example the check should be compliant if the stanza is empty.
    stanzaempty: COMPLIANT

12.2.6 Severity definition

The severity attribute describes how critical the check is. The severity must be one of the predefined words given below. The severity is found by multiplying impact by probability. Both impact and probability are numbers ranging from 1 to 3. This gives the following severities:

    1 = info
    2 = warning
    3 = minor
    4 = major
    6 = critical
    9 = fatal

Impact and probability are determined by the SMEs (Subject Matter Experts). In the example the severity is set to critical:

    severity: critical

12.2.7 Putting it all together

The check now looks like this:

    id:svc.5.2
    checkname: Host ports are online
    type:match
    segment: lsfabric
    stanzastart: ^
    stanzastop: (host)$
    stanzacriteria: include:^[^\:]+:[^\:]+:\d+?:[^\:]+:[^\:]+:\d+?:[^\:]+:active
    stanzaempty: COMPLIANT
    severity: critical

In the example, a name has been defined for the check. The segment in which to check has been defined. The beginning and the end of the lines to check (the stanza) have been defined by the attributes stanzastart and stanzastop. Within the stanza, a stanzacriteria has been applied, and based on this the check is COMPLIANT or NONCOMPLIANT. If there is nothing to check in the stanza, the check is COMPLIANT, which is defined using the stanzaempty attribute. Finally, the severity of the check has been set to critical.

12.3 Running the hc.pl file correctly

In order to execute the script, an installation of Perl is required. The script uses normal text analysis, driven by regular expressions. It is therefore not necessary to install any Perl add-ons or packages; only the device output file and one or more policy files are needed. These files are supplied as arguments to the Health Check script (hc.pl) on the command line when running the script.
Example:

    C:\tpchc\conf\policies\hc.pl SVC-2145-01-IBM.out policy1.pol policy2.pol policy3.pol

It is also possible to set flags in order to debug policy files.

Stanza debug mode: flag: -s

Setting the stanza debug mode flag makes the script run through the segment files and output all the stanzas delimited by the stanza start/stop attributes. This is a way to manually ensure that the stanza start/stop attributes capture the stanzas correctly (i.e. that they capture all stanzas and that each stanza is complete). The stanza debug mode also outputs the stanzastart and stanzastop conditions, whether the check is an include (exclude-flag = 0) or an exclude (exclude-flag = 1), and the criteria.

Example command for running the script in stanza debug mode:

    ./bin/hc.pl -s expect/backup/SVC-2145-DB_SVC_E2_06-IBM.out conf/policies/policy2.pol

Match mode: flag: -m

Setting the match mode flag makes the script output every stanza that matches the stanzacriteria defined in the check.

Example command for running the script in match mode:

    ./bin/hc.pl -m expect/backup/SVC-2145-DB_SVC_E2_06-IBM.out conf/policies/policy2.pol

Setting both flags: flag: -sm

It is also possible to set both flags. This makes both stanza debug mode and match mode active.

NOTE: This outputs a lot of text and is best used on single checks in test policy files, not on a complete policy file with multiple checks.

12.4 Reporting

The end result of the Health Check is a printed list of all the checks and their status. Each entry is formatted so that the checkname is printed first followed by a colon (:), then the result of the check followed by a colon (:), and finally the severity.

13. Uninstall

To uninstall TPCHC, run the program C:\tpchc\UninstallTPCHC.bat.