IBM Network Advisor Best Practices and Deployment Guide - v3.10
http://ibm.biz/brocdesignbp
Document:
Title:
Table of Contents
Document history
    Document Location
    Approvals
    Distribution
Introduction
    When to use Network Advisor
Network Advisor
    Server Sizing and Configuration
    Server and Client Ports
    Downloading IBM Network Advisor
    Installing IBM Network Advisor
    Launching the Remote Client
    User Account Management
    Server Management Console
    IBM Network Advisor Configuration Screen
Event Logs
Collect SupportSave
    Network Advisor Supportsave
    Supportsave Manual Collection
    Supportsave Scheduled Collection
Event notification
    Call Home
    SNMP
Fabric Watch
Bottleneck Detection
    Recommendations
    Suggested Bottleneck Settings
        FOS 6.3
        FOS 6.4
        FOS 7.0
    Implementation
        Enable Bottleneckmon via GUI
        Enable Bottleneckmon via CLI
    How Bottlenecks are reported in Network Advisor
Port Fencing
    Implementation
    Adding thresholds (Violation types)
    Assigning thresholds to ports
    Unblocking a Port
    Removing Thresholds
Zoning
Conclusion
References
Document history
Document Location
The source of the document can be found in the Team Room, located at:
Database Name: TBD
Server Name: TBD
File Name: IBM Network Advisor Deployment Guide V3.03.doc
Revision Number
Revision Date
Summary of Changes
Changes marked
1.0
6/11/12
No
1.2
7/10/12
No
1.3
7/25/12
No
1.4
7/29/12
No
1.5
8/1/12
No
1.6
9/14/12
1.7
9/24/12
1.8
9/25/12
No
1.9
10/1/12
No
2.0
10/4/12
No
2.1
10/24/12
No
2.2
10/30/12
No
2.3
11/6/12
2.4
11/15/12
No
2.5
12/19/12
2.6
1/14/13
2.7
3/15/13
No
1.1
No
No
No
No
No
7/17/13
No
2.9
01/14/14
No
2.91
01/16/14
3.0
05/28/14
3.1
06/01/14
3.2
06/02/14
Per Jim Olson and Kirby Dahman, changed Fabric Watch F_port Class thresholds to 25 for two alerts: Link Reset and State Change.
Modified appearance of Fabric Watch alerts table for better clarity/detail (no FW values changed).
Added a new section for Flow Vision MAPS, pages 50-54. (Updated as per Jim Olson's directive to include Fabric Vision.)
Added section for Fabric Vision introduction. Added table for MAPS Threshold Values.
Corrected MAPS implementation section for more clarity.
3.3
06/03/14
No
3.4
06/03/14
3.5
06/05/14
3.6
06/17/14
3.7
07/14/2014
3.8
08/15/2014
3.10
11/10/2014
No
No
No
No
No
No
Yes
Yes
Yes
Yes
Approvals
This document requires following approvals:
Name | Title
Jim Olson | Distinguished Engineer
Distribution
This document has been distributed to:
Name | Title
Jim Olson | Distinguished Engineer
Ann Corrao | Distinguished Engineer
John Juenemann |
Karen Haberli | Program Manager
Eric Block | Storage Architect
Sudharsan S Vangal | Storage Administrator
Introduction
The purpose of this document is to present a set of guidelines that incorporate IBM best practices for deploying IBM
Network Advisor (a.k.a. Brocade Network Advisor). This guide should act as a reference point in establishing
consistent, standard deployments across IBM environments.
The best practices noted in this guide present some of the more advanced features of Brocade Fabric OS (FOS), for
example Fabric Watch, Bottleneck Detection, and Port Fencing. Additional best practices are provided for hardware
selection, zoning, and performing scheduled health-related checks and tasks in the SAN.
The guidance found in this document should provide you with an efficient, economic, and effective process by which to
deploy and begin managing IBM Network Advisor.
NOTE: All deployments should be done using the Enterprise version of IBM Network Advisor.
NOTE: DCFM is not qualified or supported for management of switches operating with FOS v7.0 and
later firmware versions. You must first upgrade DCFM to Network Advisor 11.1 or later if you are
planning to upgrade devices to FOS v7.0 or you risk losing management connectivity.
Install and use Network Advisor to manage all switches. See Network Advisor
Set up switch configuration backup. See Backup and Restore Configuration Data
Enable Bottleneck Credit Recovery Tools. See Bottleneck Credit Tools
Configure Call Home and SNMP or email event notification. See Event notification
For switches running FOS 7.2 or higher, set up MAPS. See Monitoring and Alerting Policy Suite (MAPS)
For switches running FOS 7.1 or lower, set up Fabric Watch. See Fabric Watch
Configure and enable Bottleneck Detection. See Bottleneck Detection
Configure Network Advisor Dashboards. See Network Advisor Dashboards
Implement and follow regular SAN health tasks. See Regular Tasks for SAN Health
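The checklist splits monitoring by firmware level: MAPS for FOS 7.2 or higher, Fabric Watch for FOS 7.1 or lower. A minimal sketch of that decision is shown here in Python. The function name and the accepted version-string format (an optional leading "v" followed by major.minor) are assumptions made for illustration and are not part of Network Advisor.

```python
import re


def monitoring_feature(fos_version: str) -> str:
    """Return the monitoring feature this guide recommends for a FOS version.

    Hypothetical helper: only the leading major.minor digits are compared,
    so patch levels such as "7.1.0c" are ignored.
    """
    match = re.match(r"v?(\d+)\.(\d+)", fos_version.strip())
    if match is None:
        raise ValueError(f"unrecognized FOS version: {fos_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # FOS 7.2 or higher uses MAPS; FOS 7.1 or lower uses Fabric Watch.
    return "MAPS" if (major, minor) >= (7, 2) else "Fabric Watch"


if __name__ == "__main__":
    for version in ("v6.4.2a", "7.1.0c", "v7.2.1"):
        print(version, "->", monitoring_feature(version))
```

Comparing the (major, minor) pair as a tuple avoids string comparison pitfalls such as "7.10" sorting before "7.2".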
Daily
Review of Event Logs
The Master Log should be reviewed daily by the operations team as part of the health check process. Network
Advisor's Master Log lists all events and alerts that have occurred in the SAN.
View specific logs by selecting an option from the Monitor menu's Logs submenu. The following logs can be
found here: Audit Log, Product Event Log, Fabric Log, FICON Log, Product Status Log, Security Log, and Syslog
Log.
Fabric Watch, MAPS, Bottleneck Detection, and Port Fencing alerts will process like other alerts in the
environment. They can be found in the IBM Network Advisor Master Log.
Weekly
Backup Switches
Collect a set of configuration files in case they are required to restore the switch configuration.
See the Switch Backup and Restore section for how to do this.
Collect Supportsaves
Collect a complete set of supportsave files from all switches before clearing the switch counters.
This provides a set of switch logs from before the counters were cleared, in case they are required for PD.
It also provides a set of switch logs that can be used as a baseline.
See Supportsave Scheduled Collection.
Counters that are never cleared are hard to troubleshoot because you have no frame of reference for when the
error counters on ports actually increased.
For this reason, the Brocade best practice is to clear the counters on a known schedule, so that any error counters
seen are known to represent recent issues.
NOTE: Any time new devices are added to the SAN or cabling changes are made, it is common for ports to detect
errors. These errors should be cleared any time fabric changes are made.
Action
Automate a counter clear on all switches that runs on Sunday evening (suggested time: 6 PM local). This
should happen after all normally scheduled weekend changes are complete and before the
Sunday night / Monday morning production workloads begin.
Commands to be run:
statsclear
slotstatsclear
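As an illustration of the weekly schedule, the following Python sketch computes the next Sunday 6 PM run time and holds the two commands listed above. How the commands are delivered to each switch (for example over SSH or through a Network Advisor scheduled task) is deliberately left out; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

# The two counter-clear commands named in this guide.
COUNTER_CLEAR_COMMANDS = ["statsclear", "slotstatsclear"]


def next_counter_clear(now: datetime, hour: int = 18) -> datetime:
    """Return the next Sunday at hour:00 (local time) strictly after now."""
    days_ahead = (6 - now.weekday()) % 7  # Monday is 0, Sunday is 6
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Already past this Sunday's slot, so move to the following week.
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    run_at = next_counter_clear(datetime.now())
    print("Next counter clear:", run_at)
    for command in COUNTER_CLEAR_COMMANDS:
        print("  would run:", command)
```

Collecting the supportsave before this run time keeps a copy of the counters as they stood prior to the clear.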
Monthly
Review switch logs for marginal links or other potential switch issues.
The following metrics are some of the key metrics when reviewing supportsave files.
PORTERRSHOW
c3timeout / disc c3
crc_err
crc g_eof
too shrt
too long
bad eof
loss sync
loss sign
SFPSHOW
The primary metric is Rx power, which shows the amount of light the SFP is receiving.
Typically SFPs transmit at around -2 to -3 dBm (about 630 to 500 µW), so for short cables the receive power levels should
be similar. Longer cable lengths result in lower receive light levels, which is not considered an issue. In general,
receive levels should not drop below -10 dBm (100 µW) unless it is an extremely long cable run.
Compare light levels with other cable runs of similar length; noticeably lower levels
than on comparable cables indicate a cabling issue.
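The power figures for SFPs can be cross-checked with the standard conversion between dBm and microwatts, P(µW) = 1000 × 10^(dBm/10). The sketch below assumes the decibel values quoted in this section are dBm (decibels relative to 1 mW), which is consistent with -10 corresponding to 100 µW. The function names and the default -10 floor parameter are illustrative only.

```python
import math


def dbm_to_microwatts(dbm: float) -> float:
    """Convert optical power in dBm to microwatts."""
    return 1000.0 * 10 ** (dbm / 10.0)


def microwatts_to_dbm(microwatts: float) -> float:
    """Convert optical power in microwatts to dBm."""
    return 10.0 * math.log10(microwatts / 1000.0)


def rx_power_low(rx_dbm: float, floor_dbm: float = -10.0) -> bool:
    """Flag a receive level below the general floor quoted in this guide."""
    return rx_dbm < floor_dbm


if __name__ == "__main__":
    for level in (-2.0, -3.0, -10.0):
        print(f"{level} dBm = {dbm_to_microwatts(level):.0f} uW")
    print("Rx of -11.5 dBm flagged:", rx_power_low(-11.5))
```

A flagged value is only a prompt to compare against cable runs of similar length, since very long runs can legitimately read lower.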
ERRDUMP
The errdump log should be reviewed for messages that indicate issues. These range from CDR-xxxx, C2-xxxx, and
C3-xxxx messages indicating credit loss, to excessive network login attempts, to switch hardware issues.
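When reviewing a saved errdump, a quick tally of the credit-loss message families can show where to look first. This is a rough sketch only: it assumes the message IDs appear in the text as a family prefix (CDR, C2, or C3), a hyphen, and four digits, matching the "xxxx" pattern above. The sample lines are invented for illustration and are not real switch output.

```python
import re
from collections import Counter

# Matches IDs such as C3-1012 or CDR-1003 (four-digit suffix is an assumption).
CREDIT_LOSS_ID = re.compile(r"\b(CDR|C2|C3)-(\d{4})\b")


def count_credit_loss_messages(errdump_text: str) -> Counter:
    """Count occurrences of each CDR-/C2-/C3- message ID in errdump text."""
    counts = Counter()
    for line in errdump_text.splitlines():
        for family, number in CREDIT_LOSS_ID.findall(line):
            counts[f"{family}-{number}"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample text, not captured from a real switch.
    sample = "\n".join([
        "2014/08/01-10:00:00, [C3-1012], WARNING, sample text",
        "2014/08/01-10:05:00, [CDR-1003], WARNING, sample text",
        "2014/08/01-10:09:00, [C3-1012], WARNING, sample text",
    ])
    for message_id, count in count_credit_loss_messages(sample).most_common():
        print(message_id, count)
```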
FABRICLOG
Check the fabric log for signs of ports doing repeated Link Resets, ports going offline/online or repeated fabric
rebuilds.
Quarterly
Run Brocade SAN Health Report, see Brocade SAN Health Report
Network Advisor
Server Sizing and Configuration
IBM Network Advisor Sizing Requirements
Small
Medium
Large
Number of Fabrics
16
24
Number of Domains
20
60
120
2000
5000
9000
5000
10000
20000
20
30
40
Server Memory
6GB
8GB
12GB
60GB
80GB
100GB
100GB
100GB
100GB
100GB
100GB
100GB
If further information on server sizing and configuration is needed, see:
http://www.brocade.com/downloads/documents/product_manuals/NetworkAdvisor/NetworkAdvisor_InstallGd_v1230.pdf
Additional Requirements
We want to do everything we can to prevent issues in the SAN from impacting our management interface. Should
the SAN experience an unexpected degradation or failure, we need to ensure our ability to access Network Advisor is
unaffected. This ability could be severely compromised or lost if our main tools (OS, application) reside on the SAN.
Therefore, the following points must be followed when performing a best practice installation of the IBM Network Advisor
server:
NOTE: A virtualized server may be used; however, it must follow the same requirements as a dedicated/stand-alone
server.
The server should be partitioned into three drives: one for the OS, one for the application, and one for backup data.
Backup data needs to be on a physically separate drive.
Port Number | Ports | Transport | Description | Communication Path | Open in Firewall
20 | | TCP | | Client-Server, Switch-Server | Yes
21 | | TCP | | Client-Server, Switch-Server | Yes
22 | | TCP | | Server-Switch, Client-Switch | Yes
23 | Telnet | TCP | | Server-Switch, Client-Switch | Yes
25 | | TCP | | Server-SMTP Server | Yes
49 | | TCP | | Server-TACACS+ Server | Yes
80 | Jboss.web.http.port | TCP | | Client-Server | Yes
80 | Switch http | TCP | | Server-Switch, Client-Switch | Yes
161 | SNMP Port | UDP | | Server-Switch | Yes
162 | Snmp.trap.port | UDP | | Switch-Server | Yes
389 | | TCP | | Server-LDAP Server | Yes
443 | Switch https | TCP | | Server-Switch, Client-Switch | Yes
514 | Syslog Port | UDP | | Switch-Server | Yes
636 | | UDP | | Server-LDAP Server | Yes
1024 | MPI | TCP | | Switch-Server | Yes
1812 | | TCP | | Server-RADIUS Server | Yes
2048 | MPI | TCP | | Server-Switch | Yes
 | MPI | TCP | | Server-Switch | Yes
2638 | | TCP | | Server-Database, Remote-ODBC Database | Yes
4430 | MPI | TCP | | Server-Switch | Yes
5988 | | TCP | | SMI Agent Server-Client | Yes
5988 | | TCP | | SMI Agent Server-Client | Yes
8080 | MPI | TCP | | Server-Switch | Yes
24600 | Jboss.naming.jnp.port - port 0 | TCP | | Client-Server | Yes
24601 | Jboss.connector.ejb3.port - port 1 | TCP | | Client-Server | Yes
24602 | Jboss.connector.bisocket.port - port 2 | TCP | | Client-Server | Yes
24603 | Jboss.connector.bisocket.secondary.port - port 3 | TCP | | Client-Server | Yes
24604 | Jboss.connector.sslbisocket.port - port 4 | TCP | | Client-Server | Yes
24605 | Jboss.connector.sslbisocket.secondary.port - port 5 | TCP | | Client-Server | Yes
24606 | Smp.registry.port - port 6 | TCP | | Client-Server | Yes
24607 | Smp.server.export.port - port 7 | TCP | | Client-Server | Yes
24608 | Smp.server.cliProxyListeningport - port 8 | TCP | | Client-Server | Yes
24609 | Jboss.naming.rmi.port - port 9 | TCP | | Client-Server | Yes
24610 | Jboss.jrmp.invoker.port - port 10 | TCP | | Client-Server | Yes
24611 | Jboss.pooled.invoker.port - port 11 | TCP | | Client-Server | Yes
24612 | Jboss.connector.socket.port - port 12 | TCP | | Server | No
24613 | Jboss.web.ajp.port - port 13 | TCP | | Server | No
24614 | Jboss.web.service.port - port 14 | TCP | | Server | No
24615 | Connector.bind.port - port 15 | TCP | | Server | No
32768-65535 | Ephemeral ports | UDP | | Switch-Server | Yes
55555 | | TCP | | Server-Client | Yes
55556 | | TCP | | Client | No
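Once the firewall rules are in place, TCP reachability can be spot-checked from the client or server side. The sketch below uses only the Python standard library and simply attempts a TCP connection to each port. The host address in the usage section is a placeholder, and UDP entries (for example the SNMP, trap, and syslog ports) cannot be verified this way.

```python
import socket
from typing import Dict, Iterable


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports: Iterable[int], timeout: float = 2.0) -> Dict[int, bool]:
    """Check several TCP ports on one host and report which ones answered."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # 192.0.2.10 is a documentation placeholder; substitute the server address.
    for port, is_open in check_ports("192.0.2.10", [22, 80, 443, 24600]).items():
        print(port, "open" if is_open else "closed or filtered")
```

A closed result does not distinguish between a firewall block and a service that is not listening, so confirm the service state before changing firewall rules.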
Once you've downloaded the application, select the executable file and click Install. This brings up the
Introduction screen...
Accept License...
Select Install Folder (Do Not install to the root directory, usually C:\)...
Once installation is complete, click Done to complete the Network Advisor configuration...
Because this is a new install, select No; no data or settings are being migrated...
You will need to have a Serial Number and License Key available at this point if you plan to perform a
permanent install (these should have been provided when you purchased IBM Network Advisor). Otherwise,
you can opt for a 75-day trial...
Most configurations keep the defaults. However, these settings can be changed later via the Server
Management Console (in the Services tab) noted below.
Select the network size based on the scaling you used to size your server...
At this point installation/configuration is complete and you are ready to start the client...
Following initial login below, you will need to change the Administrator Password from the default. Once you
have logged in you can perform this from Server > Users
The web server port number default is 80. However, if SSL is enabled, this will be 443. You must enter
the web server port number in addition to the IP address (e.g. IP_Address:Port_Number)
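The IP_Address:Port_Number form can be assembled as in the short sketch below. The http/https scheme prefix and the function name are assumptions added for illustration; the guide itself only states the default ports (80, or 443 when SSL is enabled) and the address:port form.

```python
from typing import Optional


def client_address(ip_address: str, ssl_enabled: bool = False,
                   port: Optional[int] = None) -> str:
    """Build the address used to reach the Network Advisor web server.

    If no port is given, the defaults from this guide are used:
    80 without SSL, 443 with SSL.
    """
    scheme = "https" if ssl_enabled else "http"  # scheme prefix is an assumption
    if port is None:
        port = 443 if ssl_enabled else 80
    return f"{scheme}://{ip_address}:{port}"


if __name__ == "__main__":
    print(client_address("192.0.2.10"))
    print(client_address("192.0.2.10", ssl_enabled=True))
```

If the web server port was changed on the Ports tab of the Server Management Console, pass that value explicitly through the port argument.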
Global Storage Service Line Process
IBM Network Advisor Deployment Guide
Work with your security team in securing and managing the Root and Factory accounts
Work with your security team to define non-default Admin and User accounts with the same access for
your users
Disable the default Admin and User accounts
From the Services tab, you can start, stop, refresh, and restart services on the server.
From the Ports tab, you can change the Management application server or web server port numbers.
From the AAA Settings tab, you can configure different authentication methods (LDAP or RADIUS, etc.), and
establish authentication policies.
From the Restore tab, you can restore server application data. Application: Server > Options > Server
Backup.
NOTE: The Restore Path is what you set above in the Server Data Backup section (E:\Backup).
From the Technical Support Information tab, you can collect information for technical support.
Select the switches for which you want to save configuration files from Available Switches.
Click the right arrow to move the selected switches to Selected Switches.
Click OK. Configuration files from the selected switches are saved to the repository.
Select the Backup all fabrics check box, to back up all switch configurations of discovered
switches in all fabrics
Clear the Backup all fabrics check box and select the specific fabric check boxes in the
Selected Fabrics table to back up individual fabrics.
If any switches do not have the EGM license, a message displays. Click OK to enable backup on the switches with
the EGM license.
5. Click OK.
Select Server Backup in the Category list. The currently defined directory displays in the Backup Output
Directory field.
Enter the time (using a 24-hour clock) you want the backup process to begin in the Next Backup Start Time
Hours and Minutes fields.
Select an interval from the Backup Interval drop-down list to set how often backup occurs.
Browse to the hard drive and directory to which you want to back up your data (this should be a separate
physical drive).
Enabling backup
Backup is enabled by default. However, if it has been disabled, complete the following steps to enable the function.
Event Logs
You can view all events that take place through the Master Log at the bottom of the main window. You can also view a
specific log by selecting an option from: Monitor > Logs (submenu). These logs are described in the following list:
Audit Log. Displays all Application Events raised by the application modules and all Audit
Syslog messages from the switches and Brocade HBAs.
Product Event Log. Displays all Product Event type events from all discovered switches and
Brocade HBAs.
Fabric Log. (SAN only) Displays Product Events, Device Status, and Product Audit type events for all
discovered fabrics.
FICON Log. Displays all the RLIR and LRIR type events, for example, link incident type events.
Product Status Log. (SAN only) Displays events which indicate a change in Switch Status for all
discovered switches and Brocade HBAs.
Security Log. Displays all security events for the discovered switches.
Syslog Log. Displays syslog messages from switches and HBAs.
Master Log
The Master Log, which displays in the lower left area of the main window, lists the events and alerts that have
occurred on the SAN. If you do not see the Master Log, select View > Show Panels > All Panels or press F5.
The following fields and columns are included in the Master Log:
Severity. The severity of the event. When the same event (Warning or Error) occurs repeatedly, the
Management application automatically eliminates the additional occurrences.
Acknowledged. Whether the event is acknowledged or not. Select the check box to acknowledge the event.
Source Name. The product on which the event occurred.
Source Address. The IP address (IPv4 or IPv6 format) of the product on which the event occurred.
Origin. The event source type (for example trap, pseudo-event, application, or syslog).
Category. The type of event that occurred (for example, client/server communication events).
Description. A description of the event.
Last Event Server Time. The time and date the event last occurred on the server.
Count. The number of times the event occurred.
Module Name. The name of the module on which the event occurred.
Message ID. The message ID of the event.
Product Address. The IP address of the product on which the event originated.
Contributor. The name of the contributor on which the event occurred.
Node WWN. The world-wide name of the node on which the event occurred.
Fabric Name. The name of the fabric on which the event occurred.
Operational Status. The operational status (such as, unknown, healthy, marginal, or down) of the product on
which the event occurred.
First Event Product Time. The time and date the event first occurred on the product.
Last Event Product Time. The time and date the event last occurred on the product.
First Event Server Time. The time and date the event first occurred on the server.
Audit. The audit of the event.
Virtual Fabric ID. The VFID of the product on which the event occurred.
Zone Alias. Displays the zone alias of the product or port.
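The relationship between the Count field and the first and last event times can be pictured with a small model. This is a sketch under stated assumptions: the guide does not say how the Management application decides that two events are "the same", so the key used here (severity, source name, message ID) is hypothetical, as are the class and function names and the sample message IDs.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class MasterLogEntry:
    """A reduced, illustrative subset of the Master Log columns."""
    severity: str
    source_name: str
    message_id: str
    description: str
    first_event_server_time: str
    last_event_server_time: str
    count: int = 1


def collapse_events(events: Iterable[Tuple[str, str, str, str, str]]) -> List[MasterLogEntry]:
    """Fold repeated events into one entry with a count.

    Each event is (severity, source_name, message_id, description, server_time).
    """
    entries: Dict[Tuple[str, str, str], MasterLogEntry] = {}
    for severity, source, message_id, description, when in events:
        key = (severity, source, message_id)  # assumed notion of "same event"
        if key in entries:
            entries[key].count += 1
            entries[key].last_event_server_time = when
        else:
            entries[key] = MasterLogEntry(
                severity, source, message_id, description, when, when
            )
    return list(entries.values())


if __name__ == "__main__":
    sample = [
        ("Warning", "switch-a", "FW-1510", "port fenced", "10:00"),
        ("Error", "switch-b", "C3-1012", "credit loss", "10:01"),
        ("Warning", "switch-a", "FW-1510", "port fenced", "10:05"),
    ]
    for entry in collapse_events(sample):
        print(entry)
```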
Collect SupportSave
To collect switch and Network Advisor supportsaves, select Monitor > Technical Support.
Event notification
Call Home
Network Advisor supports call home to IBM Support. This will allow automatic creation of a problem record with IBM in
response to significant error events on devices you are managing in your SAN. Additional information can be found at
the following links:
This is a direct link to the Brocade User Manual Call Home section and provides in-depth instruction on
how to configure
This link provides IBM-specific email addresses and phone numbers to use when configuring Call Home.
You may need to consult with your security team to ensure your security model allows call home via email
and/or phone
SNMP
As accounts may not have identical infrastructures, SNMP traps should be configured to be sent to the event capture
and reporting tool deployed for each account. You will need to work with the administrator of your SNMP trap collector
(e.g. Netview, NetCool) to ensure all alerts noted in the sections below are defined properly and are being
received.
NOTE: Recommendation is to configure SNMP v3. If your capture tool does not support this, use SNMP v1 (If you
need to use SNMP v1, do not use the default
Trap enablement tasks
Individual SNMP traps must be configured on a per-switch basis within the Web Tools interface. Enable
SNMP per the following on each of your Brocade products (switches, directors, etc.).
1. From Web Tools, click on Switch Admin > Show Advanced Mode
2. This will bring you to the following screen, select SNMP here
Fabric Watch
Fabric Watch tracks a variety of SAN fabric elements and events. It monitors fabric-wide events, ports, and
environmental parameters to enable early fault detection via SNMP.
Fabric Watch can be enabled and thresholds set to alert on these events for code level 6.3 and above.
Fabric Watch should have been purchased with the switch (it is a FOS feature, and is included automatically
with all Brocade SAN switches purchased from IBM).
When configuring Fabric Watch, the Fabric/Port Class and Alert Type/Threshold settings below should be followed:
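Class | Area | Alert Type | High Boundary | Time | Alert
SFP | ST | | 0 | Minutes | raslog
Fabric | ED | E_Ports Down | 0 | Minutes | raslog,snmp
Fabric | FC | Fabric Reconfigure | 0 | Minutes | raslog,snmp
Fabric | DC | Domain ID Changes | 0 | Minutes | raslog
Fabric | SC | Segmentation | 0 | Minutes | raslog,snmp
Fabric | ZC | Zone Changes | 10 | Minutes | raslog
Fabric | FL | Fabric Logins | 10 | Minutes | raslog
E_Port | ST | State Change | 10 | Minutes | raslog,snmp
E_Port | PE | Protocol Error | 5 | Minutes | raslog
E_Port | LR | Link Reset | 2 | Minutes | raslog,snmp
E_Port | ITW | Invalid Tx Words (enc_out) | 25 | Minutes | raslog
E_Port | CRC | Invalid CRCs | 5 | Minutes | raslog,snmp
E_Port | C3TX_TO | C3 Discards | 5 | Minutes | raslog,snmp
E_Port | RX | Rx Performance | 75% | Minutes | raslog
E_Port | TX | Tx Performance | 75% | Minutes | raslog
FOP_Port (Fibre Optical Port) | ST | State Change | 25 | Minutes | raslog
FOP_Port (Fibre Optical Port) | PE | Protocol Error | 5 | Minutes | raslog
FOP_Port (Fibre Optical Port) | LR | Link Reset | 25 | Minutes | raslog,snmp
FOP_Port (Fibre Optical Port) | ITW | Invalid Tx Words (enc_out) | 25 | Minutes | raslog
FOP_Port (Fibre Optical Port) | CRC | Invalid CRCs | 5 | Minutes | raslog,snmp
FOP_Port (Fibre Optical Port) | C3TX_TO | C3 Discards | 5 | Minutes | raslog,snmp
FOP_Port (Fibre Optical Port) | RX | Rx Performance | 90% | Minutes | raslog
FOP_Port (Fibre Optical Port) | TX | Tx Performance | 90% | Minutes | raslog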
2. Select the appropriate Class (F/FL Optical Port, E-Port, or Fabric) from the left screen pane:
Select the switch for which you just configured all Fabric Class, E_Port, and F_Port Class settings:
Following the above screen you will be presented with Validation and Summary screens to complete the
distribution of Fabric Watch settings.
Bottleneck Detection
As transmission speeds within SAN fabrics continue to increase, devices causing latency within the fabric have a
larger impact on the overall health of the fabric. Devices causing latency have caused multiple customer impacts
within IBM. Bottleneck Detection provides a way to automatically watch for and alert on high-latency
devices. This ability has already been shown to shorten environment impact times within IBM-operated environments
from days to hours.
Recommendations
Field experience shows that the original strategy of enabling Bottleneck Detection with conservative values for
latency thresholds almost always yields no results. There was a concern that aggressive values would result
in Bottleneck Detection alert storms, but this has not been the case. Even the most aggressive values result in
relatively few alerts being generated. As a result, it is now recommended that the most aggressive settings are
tried first and then backed off gradually if too many alerts are seen.
The Brocade 48000 should have no more than 100 ports monitored due to memory constraints.
Congestion Threshold (-cthresh): New starting with code level 6.4. This monitors bandwidth utilization as the
percentage of time that a link exceeds 95% utilization. The recommendation is to stay with the Brocade
default value for this setting (80%). This means that if an individual link exceeds 95% utilization for 80% or more of
the measurement interval (the time specification = 30 seconds), an alert will be sent.
Latency Threshold (-lthresh): The minimum percentage of time for which latency must be detected (default is 20%,
or 0.2). This is the parameter to adjust when fine-tuning Bottleneck Detection.
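The congestion rule can be read as simple arithmetic: count the fraction of the measurement interval during which the link is above 95% utilization, and alert when that fraction reaches the congestion threshold. The sketch below assumes one utilization sample per second; the switch's internal sampling method is not described in this guide, and the function and parameter names are illustrative.

```python
from typing import Sequence


def congestion_alert(utilization_samples: Sequence[float],
                     cthresh: float = 0.8,
                     busy_level: float = 0.95) -> bool:
    """Return True if the link was above busy_level for at least cthresh
    of the interval.

    utilization_samples holds one value per second (0.0 to 1.0) for the
    measurement interval; per-second sampling is an assumption.
    """
    if not utilization_samples:
        return False
    busy_seconds = sum(1 for u in utilization_samples if u > busy_level)
    return busy_seconds / len(utilization_samples) >= cthresh


if __name__ == "__main__":
    # 30-second interval: 24 seconds above 95% utilization, 6 seconds at 50%.
    interval = [0.97] * 24 + [0.50] * 6
    print("Alert:", congestion_alert(interval))
```

With the 80% default, a 30-second interval needs at least 24 seconds above 95% utilization before an alert is raised; lowering cthresh to 0.5 reduces that to 15 seconds.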
FOS 6.3
Parameter | Conservative Settings | Normal Settings | Aggressive Settings
-time | 300 | 60 |
-qtime | 300 | 60 |
-thresh | 0.3 | 0.2 | 0.1
FOS 6.4
Parameter | Conservative Settings | Normal Settings | Aggressive Settings
-time | 300 | 60 |
-qtime | 300 | 60 |
-lthresh | 0.3 | 0.2 | 0.1
-cthresh | 0.8 | 0.5 | 0.1
FOS 7.0
Parameter | Conservative Settings | Normal Settings | Aggressive Settings
-time | 300 | 60 |
-qtime | 300 | 60 |
-lthresh | 0.3 | 0.2 | 0.1
-cthresh | 0.8 | 0.5 | 0.1
-lsubsectimethresh | 0.8 | 0.5 |
-lsubsecsevthresh | 75 | 50 |
Implementation
NOTE: The bottleneck detection feature detects latency bottlenecks only at the point of egress, not ingress
Use the below for your initial settings (see section below for additional tuning settings):
Congestion 50%
Latency 20%
Window 60 seconds
Quiet Time 60 seconds
Click the right arrow to apply the settings in the Bottleneck Detection pane to the selected
elements in the Products/Ports list.
FOS 7.0
bottleneckmon --enable -lthresh 0.2 -cthresh 0.5 -time 60 -qtime 60 -lsubsectimethresh 0.5 -lsubsecsevthresh 50 -alert
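When the same settings have to be applied to many switches, it can help to generate the command text from one set of values rather than retyping it. The Python sketch below only assembles the string shown above from a dictionary; it does not connect to a switch, and the function and constant names are hypothetical. Only the flags that appear in this guide are emitted.

```python
from typing import Dict, Union

# Flag order follows the FOS 7.0 command shown in this guide.
FLAG_ORDER = ("lthresh", "cthresh", "time", "qtime",
              "lsubsectimethresh", "lsubsecsevthresh")

# Initial FOS 7.0 values taken from the command in this guide.
INITIAL_FOS_7_0: Dict[str, Union[int, float]] = {
    "lthresh": 0.2,
    "cthresh": 0.5,
    "time": 60,
    "qtime": 60,
    "lsubsectimethresh": 0.5,
    "lsubsecsevthresh": 50,
}


def bottleneckmon_command(settings: Dict[str, Union[int, float]],
                          alert: bool = True) -> str:
    """Assemble a bottleneckmon --enable command line from settings."""
    parts = ["bottleneckmon", "--enable"]
    for flag in FLAG_ORDER:
        if flag in settings:
            parts += [f"-{flag}", str(settings[flag])]
    if alert:
        parts.append("-alert")
    return " ".join(parts)


if __name__ == "__main__":
    print(bottleneckmon_command(INITIAL_FOS_7_0))
```

Keeping the values in one dictionary makes it easier to back off a single threshold later and regenerate a consistent command for every switch.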
Port Fencing
Reasons to Implement Port Fencing
As transmission speeds within SAN fabrics continue to increase, devices causing latency within the fabric have a
larger impact on the overall health of the fabric. The health of the fabric may degrade faster than an alert can be sent,
received by the monitoring team, support tickets opened, and the required manual action to protect the fabric be
taken.
Port Fencing provides a way to have the fabric respond to error-level thresholds by disabling ports with high error rates.
It sends an alert that this action has been taken so the steady state team can repair the situation and then bring the
port back online.
Implementation
NOTE: Port Fencing should only be done after the environment has successfully implemented Fabric Watch using
the settings recommended in this guide. Healthy SAN fabrics are a prerequisite to implementation of Port Fencing.
DO NOT implement Port Fencing unless the following criteria are met:
The environment is running code level 7.0.2c or newer. In code levels prior to 7.0.2c, the FW-1510 alert sent
by the switch to inform administrators that Port Fencing has disabled ports is at an Informational severity
level. This alert severity has been raised to Error in the 7.0.2c release.
The monitoring or steady state team has the cycles to monitor Informational SNMP alerts from the SAN
switches.
A mature SNMP monitoring and response process must be in place prior to implementation of Port Fencing.
Port Fencing is going to disable ports, so a steady state team must receive these alerts and take action to fix the
port and bring it back online. Failure to take action will result in future Client Impacting Events.
Example: 1 of 2 SAN ports for a server exceeds the Port Fencing threshold and the port is automatically disabled
by the SAN switch. The steady state team does not repair the port and bring it back online. A month later the
remaining HBA in the server fails, now the server has no connectivity to back-end SAN storage devices.
When configuring Port Fencing within FOS v6.4.2a, the Violation Type and Threshold settings below should be
followed:
E Port Class Area (note: the Time Base for all Alerts = 1 minute)
Violation Type             Threshold
Protocol Error             10
Link Reset                 10
Invalid Words (enc out)    60
Invalid CRCs               30
F Port Class Area (note: the Time Base for all Alerts = 1 minute)
Violation Type             Threshold
Protocol Error             5
Link Reset                 200
Invalid Words (enc out)    40
Invalid CRCs               20
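The threshold values above can be held in a small lookup structure so that observed counters are compared against them consistently. This is a minimal Python sketch, not part of Network Advisor. The names PORT_FENCING_THRESHOLDS and violations are hypothetical, and it assumes a violation occurs when a one-minute count exceeds (rather than equals) the threshold.

```python
# Thresholds per one-minute time base, copied from the values listed above.
PORT_FENCING_THRESHOLDS = {
    "E": {"Protocol Error": 10, "Link Reset": 10,
          "Invalid Words": 60, "Invalid CRCs": 30},
    "F": {"Protocol Error": 5, "Link Reset": 200,
          "Invalid Words": 40, "Invalid CRCs": 20},
}


def violations(port_class, counts_per_minute):
    """Return the violation types whose one-minute count exceeds the threshold.

    port_class is "E" or "F"; counts_per_minute maps a violation type to the
    number of events seen in the last minute. Unknown types are ignored.
    """
    limits = PORT_FENCING_THRESHOLDS[port_class]
    return sorted(name for name, count in counts_per_minute.items()
                  if name in limits and count > limits[name])
```

For example, violations("E", {"Invalid CRCs": 31}) reports the CRC violation, while a count of exactly 30 does not.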
Date: August, 2014
Version: 3.8
4. From the Ports table, select the Port Type (E Port Class or F Port Class noted above) to which you want to assign
the threshold. Do NOT assign a Port Type/Class to an incorrect Violation Type.
5. Click the right arrow.
A directly assigned icon displays next to the objects you selected in the Ports table to show that the
threshold was applied at this level.
An added icon appears next to every object in the tree to which the new threshold is applied.
Unblocking a Port
Network Advisor allows you to unblock a port (only if it was blocked by Port Fencing) once the problem that triggered
the threshold is fixed.
When a port is blocked, an Attention icon
Removing Thresholds
To remove thresholds from the All Fabrics object, an individual Fabric, Chassis group, Switch, or
Switch Port, complete the following steps.
Select the object with the threshold you want to remove in the Ports table.
Some Fabric Vision technology features are supported on Gen 4 b-type platforms; others are available only on Gen 5
Fibre Channel platforms with 16 Gbps performance capability. The chart below shows the various Fabric Vision
technology features supported on each generation of products:
Feature                           Gen 4 Platforms               Gen 5 Platforms
                                  (8 Gbps FC and associated     (16 Gbps FC and associated
                                  capabilities)                 capabilities)
Latency Bottleneck Detection      Yes                           Yes
Forward Error Correction          No                            Yes
VC-level BB_Credit Recovery       No                            Yes
ClearLink Diagnostics (D_Port)    No                            Yes
MAPS                              Yes                           Yes
Flow Monitoring                   Yes, with some limitations    Yes
Flow Mirroring                    No                            Yes
Flow Generator                    No                            Yes
MAPS behavior
Supported
Supported
level.
Pause/Continue behavior
Fabric Watch: Occurs at the element or counter level. For example, monitoring can be paused for CRC on one port
and for ITW on another port.
MAPS: Occurs at the element level. Monitoring can be paused on a specific port, but not for a specific counter
on that port.
Fabric Watch: Can configure the polling interval as well as the repeat count.
MAPS: This configuration can be migrated from Fabric Watch, but cannot be changed.
E-mail notification
Configuration
Monitoring
5. Set allowable actions for rules using mapsconfig --actions raslog,snmp,email,sw_critical,sw_marginal,sfp_marginal
Make sure port fencing is not enabled / included in the mapsconfig command.
2. Remove the port rules from the policy using the following commands.
Note: These commands must be run as root.
for i in $(mapspolicy --show IBM_SO | grep defNON | awk '{print $1}'); do mapspolicy --delrule IBM_RTS -rulename $i; done
for i in $(mapspolicy --show IBM_SO | grep E_PORTS | awk '{print $1}'); do mapspolicy --delrule IBM_RTS -rulename $i; done
for i in $(mapspolicy --show IBM_SO | grep F_PORTS | awk '{print $1}'); do mapspolicy --delrule IBM_RTS -rulename $i; done
for i in $(mapspolicy --show IBM_SO | grep T_PORTS | awk '{print $1}'); do mapspolicy --delrule IBM_RTS -rulename $i; done
mapsRule --create F_PORTS_PE_5 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor PE -value 5 -action RASLOG
mapsRule --create F_PORTS_ITW_25 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor ITW -value 25 -action RASLOG
mapsRule --create F_PORTS_CRC_5 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor CRC -value 5 -action RASLOG,SNMP,EMAIL
mapsRule --create F_PORTS_CRC_H25 -group ALL_F_PORTS -timebase hour -op g -policy IBM_SO -monitor CRC -value 25 -action RASLOG,SNMP,EMAIL
mapsRule --create F_PORTS_LR_3 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor LR -value 3 -action RASLOG,SNMP,EMAIL
mapsRule --create F_PORTS_LR_H10 -group ALL_F_PORTS -timebase hour -op g -policy IBM_SO -monitor LR -value 10 -action RASLOG,SNMP,EMAIL
mapsRule --create F_PORTS_C3TXTO_5 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor C3TXTO -value 5 -action RASLOG,SNMP,EMAIL
mapsRule --create F_PORTS_TX_90 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor TX -value 90 -action RASLOG
mapsRule --create F_PORTS_RX_90 -group ALL_F_PORTS -timebase min -op g -policy IBM_SO -monitor RX -value 90 -action RASLOG
mapsRule --create E_PORTS_PE_5 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor PE -value 5 -action RASLOG
mapsRule --create E_PORTS_ITW_25 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor ITW -value 25 -action RASLOG
mapsRule --create E_PORTS_CRC_5 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor CRC -value 5 -action RASLOG,SNMP,EMAIL
mapsRule --create E_PORTS_CRC_H25 -group ALL_E_PORTS -timebase hour -op g -policy IBM_SO -monitor CRC -value 25 -action RASLOG,SNMP,EMAIL
mapsRule --create E_PORTS_LR_3 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor LR -value 3 -action RASLOG,SNMP,EMAIL
mapsRule --create E_PORTS_LR_H10 -group ALL_E_PORTS -timebase hour -op g -policy IBM_SO -monitor LR -value 10 -action RASLOG,SNMP,EMAIL
mapsRule --create E_PORTS_ST_1 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor STATE_CHG -value 1 -action RASLOG,SNMP,EMAIL
mapsRule --create E_PORTS_C3TXTO_5 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor C3TXTO -value 5 -action RASLOG,SNMP,EMAIL
mapsRule --create E_PORTS_TX_75 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor TX -value 75 -action RASLOG
mapsRule --create E_PORTS_RX_75 -group ALL_E_PORTS -timebase min -op g -policy IBM_SO -monitor RX -value 75 -action RASLOG
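Because the rule list above is long and repetitive, it can be generated from a short table rather than typed by hand. The following is a minimal Python sketch that only prints command text in the same layout as the list above; it does not connect to a switch. The helper names maps_rule_command and port_rules and the SPECS table are hypothetical. Deriving the rule name prefix and the group from a single port-type value is a design choice of this sketch, intended to keep the two from drifting apart when lines are copied and edited.

```python
def maps_rule_command(name, group, timebase, monitor, value, actions,
                      policy="IBM_SO"):
    """Build one mapsRule --create line in the layout used in this guide."""
    return (f"mapsRule --create {name} -group {group} -timebase {timebase} "
            f"-op g -policy {policy} -monitor {monitor} -value {value} "
            f"-action {','.join(actions)}")


def port_rules(port_type, specs):
    """Yield commands for one port type ("E" or "F").

    Both the rule name prefix and the group are derived from port_type,
    so an E_PORTS rule always targets ALL_E_PORTS.
    """
    for suffix, timebase, monitor, value, actions in specs:
        yield maps_rule_command(f"{port_type}_PORTS_{suffix}",
                                f"ALL_{port_type}_PORTS",
                                timebase, monitor, value, actions)


ALERT = ("RASLOG", "SNMP", "EMAIL")

# A few rows taken from the F port rules above: (suffix, timebase, monitor, value, actions).
SPECS = [
    ("CRC_5", "min", "CRC", 5, ALERT),
    ("CRC_H25", "hour", "CRC", 25, ALERT),
    ("TX_90", "min", "TX", 90, ("RASLOG",)),
]

for line in port_rules("F", SPECS):
    print(line)
```

The printed lines can be reviewed and then pasted into the switch CLI session in the usual way.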
Select the switch you want to import the MAPS policy into and select the IMPORT button.
To activate the policy, expand the list of policies for the switch, select the IBM_SO policy, and press the Activate push
button.
To enable the appropriate actions for the switch, select the switch and press the Actions push button.
Right-click a device in the Product List or Connectivity Map and select Fabric Vision > MAPS > Configure
Log In to NA. From the Monitor menu choose the Fabric Vision sub menu, select MAPS and Enable
Log In to NA. From the Monitor menu choose the Fabric Vision sub menu, select MAPS and Configure
Highlight the switch to be configured, select the dflt_moderate_policy or IBM_SO policy and click the Activate button.
Log In to NA. From the Monitor menu choose the Fabric Vision sub menu, select MAPS and Configure
Highlight the switch with the policy to be viewed, select the policy and click the View button.
Choose the tab related to the parameter to be viewed (Port, Switch Status, Fabric, FRU, Security, Resource, FCIP, Traffic/Flows)
You can download this utility and instructions for using it from Brocade at:
www.brocade.com/sanhealth
Fabric level information such as total port count, performance, oversubscription ratios, port utilization, and number of
attached devices, followed by specific information on each fabric, such as the connected switches, zoning
configuration, and a port map.
Switch level information such as licenses, port level configurations and ISL usage.
Port level information such as bandwidth utilization, CRC counts and port status, which provides a snapshot of
overall port health.
Visio diagram shows the logical connection of the switches in the fabrics as well as the connected devices.
ISLs, trunks and devices are shown exactly how they are connected to the switch ports. From this diagram,
the fabric topology and other information can be viewed quickly and easily.
Customized views of devices allow for online device identification, snapshot of performance stats and switch
attachment details.
Other items in this report include historical performance graphs plus guidelines and recommendations.
NOTE: Past reports should be saved for trending, troubleshooting, and planning purposes. These reports can be
very helpful when trying to identify the source of an issue and should be readily available for Crit-Sit and Sev-1
situations.
Your name
Your eMail and phone number
Customer name
The geography where the device(s) will be (or are) installed
Device Type / Model, and quantity
Account Focal (e.g. SAN Architect, DPE, etc.): name and contact information
3. When configuring the SAN Health (SH) client (http://www.brocade.com/services-support/drivers-downloads/san-health-diagnostics/download_san_health.page), be sure to include IGSSC@brocade.com (see screenshot below):
7. Make sure you are using the latest client (v3.2.6c), downloaded from http://www.brocade.com/services-support/drivers-downloads/san-health-diagnostics/download_san_health.page
8. Follow-up SAN Health review requests must include the status of all actions called out in the previous review's
Brocade Recommendation Summary.
Zoning
All zoning tasks must be performed from the Zoning dialog box in the Network Advisor application. You can access the
Zoning dialog box from the main screen of the Management application using any of the following methods:
NOTE: The following points need to be observed when performing zoning operations:
Zoning via the CLI or Web Tools interface should never be performed due to the increased potential for
catastrophic customer-impacting mistakes associated with these methods.
Single-Initiator Zoning should be used for all zoning. A single-initiator zone contains one HBA in a zone
with its target device(s).
The default zoning mode should be set to No Access. This means unzoned devices cannot see each
other, so a zone must be established before they can communicate.
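The single-initiator rule above can be checked against a planned zone before the change is made in Network Advisor. The following is a minimal Python sketch under stated assumptions: the caller supplies the set of WWNs known to belong to HBAs (initiators), and every other zone member is treated as a target. The function name is_single_initiator and the WWN values are hypothetical examples.

```python
def is_single_initiator(zone_members, initiator_wwns):
    """True when the zone holds exactly one initiator and at least one target.

    zone_members is an iterable of WWN strings; initiator_wwns is the set of
    WWNs known to be HBAs. Comparison is case-insensitive.
    """
    known = {wwn.lower() for wwn in initiator_wwns}
    members = [wwn.lower() for wwn in zone_members]
    initiators = [wwn for wwn in members if wwn in known]
    targets = [wwn for wwn in members if wwn not in known]
    return len(initiators) == 1 and len(targets) >= 1


# Hypothetical WWNs used only for illustration.
HBAS = {"10:00:00:00:c9:aa:bb:01", "10:00:00:00:c9:aa:bb:02"}
STORAGE_PORT = "50:05:07:68:01:40:aa:01"

print(is_single_initiator(["10:00:00:00:C9:AA:BB:01", STORAGE_PORT], HBAS))
```

A zone that lists both HBAs together with the storage port fails the check and would be split into two zones, one per HBA.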
The following is a procedure for zoning in a Brocade Fabric using IBM Network Advisor and will assure the following:
The current zone configuration in the fabric will be saved to the Network Advisor offline repository and can be
restored to the fabric if necessary.
Multiple copies of the fabric zoning configuration will be stored in the offline repository. The number of copies
will be dependent on your policy for cleaning out old zone DB copies in the offline repository.
The offline repository will be backed up as part of the scheduled Network Advisor backup when that backup
occurs. There will be exposure to lost updates to the zoning DBs should the Network Advisor server become
unavailable and have to be restored. The updates from the time of the last backup until the time the server is
lost would be unrecoverable.
The current active Fabric Zone DB will always be the zoning DB used for updating when zoning changes are
necessary in the fabric. The offline repository zone DBs will only be used for recovery if necessary.
The following will demonstrate the steps necessary to make changes to the current zone configuration and assure a
copy of the current zone DB is stored to the offline repository as a fallback if necessary.
The current Fabric Zone DB consists of only 1 zone configuration.
A request has come in to add an additional zone to the fabric; we will add this zone as Zone4. Updates to fabric
zoning will always be made to the current active zone configuration in the Fabric Zone DB.
To assure that the Network Advisor zoning configuration window is current and assure you are viewing what is
currently active in the fabric, perform a Zone DB Operation to refresh the DB. Verify the Zone DB listed is the Fabric
Zone DB and perform a refresh.
Zone DB Operation > Refresh (See below)
You will receive a message indicating you are overwriting the selected zone DB with the one in the fabric, see below.
Respond Yes; this guarantees that your current view of the Fabric Zone DB is what exists in the fabric.
You will receive a window and need to input a Zone DB Name that will be used to identify the copy of the active Fabric
Zone DB you are saving to the offline repository. You should establish a standard naming convention to be used and
assure it is enforced. In this example we are using the initials of the person making the change followed by the date
the change is being made followed by the name of the active zone configuration.
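A naming convention like the one described above is easier to enforce when the name is built by a small helper rather than typed freehand. The following is a minimal Python sketch. The function name zone_db_copy_name is hypothetical, and the MMDDYY date layout is an assumption, since the example in this procedure shows six digits without stating their order.

```python
from datetime import date


def zone_db_copy_name(initials, change_date, active_config):
    """Compose <initials>_<date>_<active zone configuration name>.

    The MMDDYY date layout is an assumption; adjust the format string if
    your site standard uses a different order.
    """
    stamp = change_date.strftime("%m%d%y")
    return f"{initials.upper()}_{stamp}_{active_config}"


print(zone_db_copy_name("rjp", date(2010, 12, 6), "SANWEST_X_CURRENT"))
```

Whatever layout is chosen, using the same helper for every change keeps the offline repository copies sortable and easy to identify during a fallback.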
Once you respond OK to save you will be presented with the following screen. It is VERY important at this point to notice
that the Zone DB listed below is the Zone DB you just saved to the offline repository:
RJP_120610_SANWEST_X_CURRENT.
You will now see that the Fabric Zone DB is listed in the top middle of the screen. See below.
Now that you have saved a copy of the current active Fabric Zone DB to the offline repository and have assured you
are again editing the active Fabric Zone DB, you are ready to implement your change. For this example you will
create a new zone, Zone4, add it to the current active zone configuration, and activate the zone configuration so it
takes effect in the fabric. Create the new zone and name it Zone4.
New Zone > Type Zone4 as the name (See below)
Add the newly created Zone4 to the active configuration, see below.
Activate the zone configuration so that your changes are pushed to the fabric. You will be presented with a window
that will display the changes you are getting ready to activate in the fabric. You need to VERIFY that these changes
are correct and respond OK once you have completed the verification.
Highlight current zone configuration > Activate > Respond OK once you have verified the intended changes
are accurate (See below)
You will now see that Zone4 is active in your fabric zone configuration, see below.
You will now see the Zone DB that you want to fall back to listed. Review the Zone Configuration to assure it is the
version you wish to fall back to. Notice the yellow triangle in the Active Zone Configuration tab below. This is a
warning to tell you that there is a difference between what is currently active in the fabric and the Zone DB that you
are editing in Network Advisor.
Once you have verified that the fall back Zone Configuration is correct then proceed to activate.
Highlight the Zone Configuration you wish to activate > Click Activate (See below)
You will see a new window displaying the changes to the fabric that will be implemented. After you verify this is
accurate, click OK and the changes will be activated in the fabric. You will need to reply YES to a verification window
that comes up in order to activate the new configuration.
You will want to refresh this screen by selecting the Fabric Zone DB to show what is currently active in the fabric.
Zone DB > Select the Fabric Zone DB (See below)
You will now see the Fabric Zone DB displayed, showing you what is active in the fabric. You have successfully
fallen back to the point you were at prior to beginning the changes.
You will now see the Zone DB RJP_120210_SANWEST_X_CURRENT listed in the Zone DB field.
Conclusion
This document was designed to provide guidance on deploying IBM Network Advisor per IBM Best Practices. Additionally,
guidance for maintenance, monitoring, and performance has been included. This guide is not intended to replace any
of the current documentation that IBM and Brocade have released in support of this product.
References
Below are links to references found in this document in addition to Network Advisor-specific links at Brocade and IBM.
IBM
Brocade link describing and displaying director power efficiency and providing comparisons to Cisco
products
Brocade link required for access to specialized content (e.g. Guides, software, firmware, etc.). You
can sign-up for an account here as well.
This is the email that should be entered under Send a duplicate report to the following people?
section when sending a SAN Health Report
MyBrocade
This link provides an overview of Brocade's SAN Health tool, a link to download it, and instructions on
how to use it.
IGSSC@brocade.com
This link provides IBM-specific email and phone numbers for Call Home
Link to IBM Network Advisor overview, features & benefits, and specifications. This is also the link
used to download Network Advisor.
All the features, and their usage, in Network Advisor are described here