Dell EMC XtremIO Storage Array: FRU Replacement Procedures
Dell EMC
XtremIO Storage Array
X1 Cluster Type
XMS Version 6.4.0
XIOS Versions 4.0.15, 4.0.25, 4.0.26, 4.0.27 and 4.0.31
Copyright © 2022 Dell Inc. or its subsidiaries. All rights reserved. Published in the USA.
Dell believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
The information in this publication is provided as is. Dell makes no representations or warranties of any kind with respect to the
information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use,
copying, and distribution of any Dell software described in this publication requires an applicable software license.
Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be the property of their respective
owners.
For the most up-to-date regulatory document for your product line, go to Dell EMC Online Support (https://support.emc.com).
PREFACE
As part of an effort to improve its product lines, Dell EMC periodically releases revisions of
its software and hardware. Therefore, some functions described in this document might
not be supported by all versions of the software or hardware currently in use. The product
release notes provide the most up-to-date information on product features.
Contact your Dell EMC technical support professional if a product does not function
properly or does not function as described in this document.
Note: This document was accurate at publication time. Go to Dell EMC Online Support
(https://support.emc.com) to ensure that you are using the latest version of this
document.
Purpose
This document provides the required information for replacing EMC XtremIO Storage Array
Field Replaceable Units (FRUs) that have been identified as unserviceable.
Audience
This document is intended for Dell EMC field support personnel.
Related Documentation
The following Dell EMC publications provide additional information:
XtremIO Storage Array Technician Advisor Utility (Ver.2.X) User Guide
XtremIO Storage Array Hardware Installation and Upgrade Guide
XtremIO Storage Array Software Installation and Upgrade Guide
XtremIO Storage Array User Guide
XtremIO Storage Array Release Notes XIOS
XtremIO Storage Array Release Notes XMS
DELL EMC CONFIDENTIAL
Typographical conventions
Dell EMC uses the following type style conventions in this document:
Bold: Use for names of interface elements, such as names of windows, dialog
boxes, buttons, fields, tab names, key names, and menu paths (what the
user specifically selects or clicks)
Italic: Use for full titles of publications referenced in text
Monospace: Use for:
• System output, such as an error message or script
• System code
• Pathnames, filenames, prompts, and syntax
• Commands and options
Monospace italic: Use for variables.
Monospace bold: Use for user input.
[ ]: Square brackets enclose optional values
|: Vertical bar indicates alternate selections; the bar means “or”
{ }: Braces enclose content that the user must specify, such as x or y or z
...: Ellipses indicate nonessential information omitted from the example
Your Comments
Your suggestions will help us continue to improve the accuracy, organization, and overall
quality of the user publications. Send your opinions of this document to:
techpubcomments@emc.com
CHAPTER 1
General Information
Note: Only Technician Advisor version 2.x supports the X1 cluster type.
A KVM, or a keyboard and monitor, is required on-site in case a physical XMS and/or
Storage Controllers need to be re-installed.
To view the part numbers of the XtremIO cluster components, hover the mouse pointer
over the desired component in the GUI; a ToolTip appears, showing the
component’s details, including its part number.
ssh command using command line or SSH client (e.g. PuTTY).
sftp command using command line or SFTP client (e.g. WinSCP).
Example: Executing the sftp command using command line to transfer an XtremIO
Health Check Script to the XMS with the xmsupload credentials.
# sftp xmsupload@<IP of XMS>
...
Connected to <IP of XMS>.
sftp> pwd
Remote working directory: /images
sftp> cd scripts/
sftp> put system_health-v203.5.4-s4.0.0.py.gpg
Uploading system_health-v203.5.4-s4.0.0.py.gpg to /images/scripts/system_health-v203.5.4-s4.0.0.py.gpg
system_health-v203.5.4-s4.0.0.py.gpg 100% 10MB 45.6MB/s 00:00
sftp> exit
The following tools are required for the Storage Controller and XMS replacement
procedures:
Prepare a USB flash drive to restore the Storage Controller or XMS to its original state.
• For details on preparing a USB flash drive before a Storage Controller replacement,
refer to “Re-Installing a Storage Controller” on page 112.
• For details on preparing a USB flash drive before an XMS replacement, refer to
“Re-Installing a Physical XMS” on page 114.
Note: In Storage Controllers with XIOS version 4.0.25-22 (or later versions) and in XMS
with version 4.2.2-18 (or later versions), the sshd configuration was changed to
disable weak ciphers for SSH connectivity. This was done to enhance security and
resolve vulnerabilities associated with those ciphers. Due to this change, some
older versions of the PuTTY SSH client and WinSCP SFTP client may cause an error to
occur when accessing or transferring files to or from the XMS. For further details, refer
to Dell EMC KB# 504645 (https://support.emc.com/kb/504645).
To avoid such errors, make sure that WinSCP and PuTTY (or any SSH client and SFTP
client used) are updated to the most recent version to enable accessing and
transferring files to, or from, the XMS.
Option 1 - Accessing the XMS Using a Tunnel From the Storage Controller
Note: If you are using the Technician Advisor utility, instead of this procedure, you can
launch an XMCLI session on the XMS. For details, refer to the Technician Advisor Utility
User Guide matching the software version currently installed.
Note: This procedure is performed by the customer, who must be able to access XMCLI on the XMS.
Note: Make sure to access the XMS via a Storage Controller that is healthy.
Note: An alert is raised to indicate that the technician port tunnel is open. This alert is
cleared once the tunnel is closed.
Note: Since the Storage Controller may not be able to open an SSH tunnel due to
security issues, the tunnel is opened from the XMS’s side.
5. Connect the computer to the TECH Ethernet (RJ45) port (marked "2") on the rear of
the healthy Storage Controller on X-Brick 1 (X1-SC1 or X1-SC2), as selected in
step 1 of this procedure.
6. Upon completion of the procedure (when access to XMS is no longer required), make
sure to close the tunnel.
To access the XMS via the Storage Controller on a tunnel between the XMS and Storage
Controller:
Note: This procedure should be performed after a tunnel is opened between the XMS and
Storage Controller.
1. Connect your laptop to the TECH Ethernet (RJ45) port (marked "2") on the rear of
the Storage Controller.
2. Configure your laptop's network interface with the following:
• IP: 169.254.254.2
• Prefix mask: 255.255.240.0
3. Connect to the XMS using the following ssh command:
ssh xmsadmin@169.254.254.1 -p 10022
The actual connection is made by SSH to the TECH Ethernet port’s IP address (static
169.254.254.1) on port 10022, with the Username xmsadmin (the password for
xmsadmin is supplied by Dell EMC).
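The addressing in steps 2 and 3 can be sanity-checked before connecting. The following is a minimal sketch, not part of the official procedure, that uses Python's standard ipaddress module to confirm that the laptop address and the TECH port address share a subnet under the stated mask. The function name same_subnet is an illustrative assumption; the IP addresses and mask are the ones given in the steps above.

```python
import ipaddress

# Values taken from steps 2 and 3 of the procedure above.
LAPTOP_IP = "169.254.254.2"
NETMASK = "255.255.240.0"
TECH_PORT_IP = "169.254.254.1"


def same_subnet(local_ip, netmask, remote_ip):
    """Return True if remote_ip lies inside the subnet of local_ip/netmask."""
    interface = ipaddress.ip_interface(f"{local_ip}/{netmask}")
    return ipaddress.ip_address(remote_ip) in interface.network


def subnet_of(local_ip, netmask):
    """Return the network (in CIDR form) that local_ip/netmask belongs to."""
    return str(ipaddress.ip_interface(f"{local_ip}/{netmask}").network)
```

Calling same_subnet(LAPTOP_IP, NETMASK, TECH_PORT_IP) should return True when the laptop is configured as described. A False result suggests a typing error in the address or mask.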
The Storage Controllers can now forward traffic between the Ethernet Tech port and XMS.
To close the tunnel that was opened between the Storage Controller and the XMS:
Note: This procedure should be performed following the completion of the replacement,
once XMCLI access is no longer needed. The subsequent commands referenced in this
procedure (following closure of the tunnel to the XMS) should be completed by the
customer, able to access XMCLI on the XMS.
Note: You can verify whether or not a tunnel was opened, by using the following
command:
show-technician-port-tunnels
Option 2 - Accessing the XMS Using PuTTY Port Forwarding (or Similar on Another
SSH client) from the Storage Controller
Note: This procedure specifically refers to the PuTTY SSH client. To execute this procedure
using another SSH client, consult your SSH client documentation to determine if a similar
port forwarding feature is provided, and how to use the feature on your SSH client.
Note: Make sure to access the XMS via a Storage Controller that is healthy.
Note: If you are using the Technician Advisor utility, instead of this procedure, you can
launch an XMCLI session on the XMS. For details, refer to the Technician Advisor Utility
User Guide matching the software version currently installed.
11. While the PuTTY SSH session to the Storage Controller remains active, perform one of
the following actions:
• To access the XMS using XMCLI - Use PuTTY to open another SSH session to IP
address 127.0.0.1, with the username "xmsadmin" (password for xmsadmin
supplied by Dell EMC), followed by appropriate XMCLI user credentials (e.g.
"tech").
• To access the XMS using WebUI - Using a web browser, open a session at URL:
https://127.0.0.1/webui, with appropriate XMCLI user credentials (e.g. "tech").
Note: You can access the Dell EMC SolVe Desktop at:
https://solve.emc.com/desktopbinaries/setup.exe
The following example shows the script for running an XtremIO HCS on the first cluster that
is connected to the XMS:
run-script script="system_health-vXXX.X.X-s6.0.0.py"
arguments="<cluster name>"
For guidance on running the XtremIO Health-Check Script and on resolving its output, refer
to Dell EMC KB# 206076 (https://support.emc.com/kb/206076). If an unexpected error
is reported by the HCS, submit a standard Service Request to XtremIO Global Technical
Support.
CHAPTER 2
Replacing the Servers and Components
The Storage Controller replacement procedure should be performed, using the XtremIO
Technician Advisor utility, following a Service Request (SR) determined by XtremIO Global
Technical Support. If you have any questions or encounter problems, contact XtremIO
Global Technical Support.
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance with pausing the Consistency Groups in RecoverPoint,
contact RecoverPoint Global Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure
and contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Note: Before arriving on the site, make sure that you have the updated Storage Controller
rescue image for the cluster’s version. In addition, ensure that the latest version of
Technician Advisor utility is installed on your laptop.
Note: If the customer has a Disk Retention Agreement with Dell EMC, remove the hard
disks and SSDs from the replaced Storage Controller and give them to the customer. For
instructions, refer to “Removing the Old Storage Controller Disks” on page 137.
Tolerance
Failure of a single Storage Controller may result in a performance degradation.
Failure of both Storage Controllers in the same X-Brick results in:
• Loss of service in a multiple X-Brick cluster
• Data loss in a single X-Brick cluster
Failure of both InfiniBand links and/or both SAS ports in the same Storage Controller
results in a Storage Controller failure.
Note: Make sure to access the XMS via a Storage Controller that is healthy.
Note: It is recommended to keep the CLI window in a maximized mode. Minimizing the
window may cause the activation progress bar to be displayed on new lines instead of
the same line.
Note: It is recommended to use the cluster name (and not the cluster ID) as the cluster
identifier in cluster-related XMCLI commands.
Note: The cluster-id parameter is not mandatory for single cluster configurations.
6. Use Table 1 to record the configuration data of the defective Storage Controller, and
refer to it when you configure the new Storage Controller.
X-Brick Name
Cluster Name
cluster-id
Note: Make sure to close the tunnel between the Storage Controller and XMS when access
to XMS is no longer required, as described in “Accessing the XMS via a Cluster Storage
Controller” on page 13.
Network ports are assigned for each Storage Controller, two at a time. For a (partial)
example, refer to Table 2 to determine the range of ports assigned for each Storage
Controller.
Table 2 Required Network Ports for Storage Controller Replacement (Partial Example)
.... ....
Note: The network port 11112 is only required if Storage Controllers are using an IPv6
management IP address.
It is necessary to confirm that each required network port from the XMS is open to its
respective Storage Controller.
Note: For checking the required ports to a defective Storage Controller, use the existing
Storage Controller to verify whether the port is open. However, if the defective Storage
Controller is not responsive, work with the customer to check the required ports for the
peer Storage Controller instead.
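As an illustration only, the port check described above can be thought of as a TCP connection attempt per required port. The sketch below uses Python's standard socket module; it is not the tool prescribed by this document, and it is only meaningful when run from a host whose network path matches that of the XMS. The function names, the host argument and the port list are hypothetical placeholders; the actual port numbers must be taken from Table 2.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port in ports to True (open) or False (not reachable)."""
    return {port: is_port_open(host, port, timeout) for port in ports}
```

A port reported as False may be blocked by a firewall or simply have no listener; in either case, work with the customer to resolve it before continuing.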
Closing the Tunnel Between the Storage Controller and the XMS (if Previously Opened)
After checking the cluster’s health using the XtremIO Health-Check Script (HCS), make
sure to close the tunnel between the Storage Controller and XMS (if one was previously
opened), as described in “Accessing the XMS via a Cluster Storage Controller” on page 13.
Replacing the Defective Storage Controller Using the Technician Advisor Utility
The Storage Controller replacement procedure should be performed using the XtremIO
Technician Advisor utility following a Service Request (SR), determined by XtremIO Global
Technical Support. If you have any questions or encounter problems, contact XtremIO
Global Technical Support.
Prior to commencing the replacement procedure, check the XtremIO XIOS version installed
on the cluster, and the replacement Storage Controller’s P/N. If the cluster XIOS version is
4.0.25-22 or earlier, and the replacement Storage Controller is P/N 100-586-077 or P/N
100-586-078 (equipped with a Cobra-F type HDD), do not commence the replacement
procedure! Instead, contact XtremIO Global Technical Support to obtain a compatible
replacement Storage Controller.
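The compatibility rule above can be expressed as a short check. The following sketch assumes that XIOS versions are written as dot-separated numbers followed by a hyphen and a build number (as in 4.0.25-22); the function names are illustrative and do not correspond to any XtremIO tool. The part numbers and the threshold version are the ones stated in the paragraph above.

```python
import re

# Part numbers and threshold version as stated in the procedure text.
AFFECTED_PART_NUMBERS = {"100-586-077", "100-586-078"}
THRESHOLD_VERSION = "4.0.25-22"


def parse_xios_version(version):
    """Split a version such as '4.0.25-22' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def replacement_blocked(cluster_xios, part_number):
    """Return True if the replacement must NOT start (contact support instead)."""
    version_is_old = parse_xios_version(cluster_xios) <= parse_xios_version(THRESHOLD_VERSION)
    return version_is_old and part_number.strip() in AFFECTED_PART_NUMBERS
```

For example, replacement_blocked("4.0.25-22", "100-586-077") returns True, while the same part number on a cluster running a later build returns False.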
Note: For information on essential preparations required for using the XtremIO Technician
Advisor utility at a customer's site prior to your arrival, refer to Appendix G.
Note: For details on the XtremIO Technician Advisor utility, refer to the XtremIO Technician
Advisor Utility User Guide, which is posted in the XtremIO SolVe Generator, under XtremIO
> XtremIO X1 (XIOS 2.x, 3.x, 4.x) > Service Scripts and Utilities > XtremIO Technician Advisor
> Install XtremIO Technician Advisor.
Note: If XtremIO Global Technical Support instructs you to follow the manual configuration
procedures, refer to Appendix D.
The Storage Controller DIMM replacement procedure should be performed, using the
XtremIO Technician Advisor utility version 2.6.0 (or later 2.x versions), following a Service
Request (SR) determined by XtremIO Global Technical Support. If you have any questions
or encounter problems, contact XtremIO Global Technical Support.
Approval from XtremIO Global Tech Support is required prior to any DIMM Replacement.
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance with pausing the Consistency Groups in RecoverPoint,
contact RecoverPoint Global Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure
and contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Tolerance
Significant DIMM issues may cause a Storage Controller to fail.
Failure of a single Storage Controller may result in a performance degradation.
Note: A DIMM replacement procedure is supported only when a single Storage Controller
channel requires replacement, and can be performed only once per Storage
Controller.
2. List the currently outstanding alerts for the cluster, using the following command:
show-alerts
3. Perform a DIMM replacement procedure when the following alert is raised on the
affected Storage Controller:
• node_dimm_level_5_major - (Alert Code 0403305) 2500 DIMM Correctable Errors
When the node_dimm_level_5_major alert is raised, the affected Storage Controller is
disabled, as well as its journaling (alerts node_system_disabled and journal_failed
are also raised on the affected Storage Controller).
Example: node_dimm_level_5_major alert raised
4. If the above alert (0403305) does not appear in the show-alerts output, DIMM
replacement may still be appropriate, as determined by XtremIO Global Tech Support.
For further details, refer to Dell EMC KB# 459351
(https://support.emc.com/kb/459351).
Note: DIMM replacement can also be considered if suspicion is raised (from the
customer or from field personnel) that a DIMM replacement procedure is required on a
Storage Controller, which should be performed, following a Service Request (SR)
determined by XtremIO Global Technical Support.
5. Confirm that a DIMM FRU replacement has not previously been performed on this Storage
Controller. For instructions on running a script that displays whether or not the DIMM
replacement for this Storage Controller has occurred already, and the historical CE
count, refer to Dell EMC KB# 459351 (https://support.emc.com/kb/459351).
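When the show-alerts output from steps 2 to 4 has been saved to a text file, the relevant lines can be located with a simple text search. The sketch below makes no assumption about the column layout of the output; it only looks for the alert name and alert code quoted in step 3. The function name is illustrative and is not part of XMCLI.

```python
# Alert name and code as quoted in step 3 of the procedure above.
DIMM_ALERT_NAME = "node_dimm_level_5_major"
DIMM_ALERT_CODE = "0403305"


def find_dimm_alert_lines(show_alerts_output):
    """Return the lines that mention the DIMM alert name or its alert code."""
    return [
        line.strip()
        for line in show_alerts_output.splitlines()
        if DIMM_ALERT_NAME in line or DIMM_ALERT_CODE in line
    ]
```

An empty result does not rule out a DIMM replacement; as noted in step 4, the decision remains with XtremIO Global Tech Support.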
In exceptional cases, a Storage Controller replacement is required instead of replacing the
DIMM. The Technician Advisor utility should be used to determine this.
Note: Refer to the Technician Advisor User Guide for directions on how to log into the
cluster remotely, via Technician Advisor.
Note: If the Technician Advisor utility fails to validate performing a DIMM replacement,
it is necessary to replace the Storage Controller to resolve the issue, as described in
“Replacing a Storage Controller” on page 20.
3. Upon the DIMM replacement wizard’s completion of the “Cluster Health” phase (prior
to the “Deactivate” phase), cancel the wizard by clicking Exit, located in the wizard
window’s lower-left corner.
4. A DIMM Replacement should be performed only if the Technician Advisor DIMM FRU
wizard validations passed successfully.
Note: The identity of the channel with DIMM errors appears in the “Details” pop-up of the
“Querying IPMI log”, and the “Check Storage Controller status” of the “Query DIMMs”
phase, even when the green check-mark is displayed (adjacent to the Details button).
Note: For DIMM replacement procedures, the run-script command should be in the
following format (where "exclude Xn-SCn" refers to the Storage Controller containing
the DIMM to be replaced):
xmcli (tech)> run-script
script="system-health-v[x.x]-s[x.x.x].py" arguments="--exclude
Xn-SCn --check-type fru_sc --cluster-id 1"
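To reduce typing errors, the command string in the format shown above can be assembled from its variable parts. This is a minimal sketch: the helper name build_hcs_command is hypothetical, the Xn-SCn pattern check is an assumption based on the names used in this document (for example, X1-SC1), and only the arguments that appear in the example above are used.

```python
import re

# Storage Controller names in this document follow the Xn-SCn form (e.g. X1-SC1).
SC_NAME_PATTERN = re.compile(r"^X\d+-SC\d+$")


def build_hcs_command(script_name, excluded_sc, cluster_id, check_type="fru_sc"):
    """Build the run-script command string for a DIMM replacement health check."""
    if not SC_NAME_PATTERN.match(excluded_sc):
        raise ValueError(f"Unexpected Storage Controller name: {excluded_sc!r}")
    arguments = (
        f"--exclude {excluded_sc} --check-type {check_type} --cluster-id {cluster_id}"
    )
    return f'run-script script="{script_name}" arguments="{arguments}"'
```

The returned string is intended to be pasted at the xmcli (tech)> prompt; the script file name must match the HCS file that was actually uploaded to the XMS.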
Closing the Tunnel Between the Storage Controller and the XMS (if Previously Opened)
After checking the cluster’s health using the XtremIO Health-Check Script (HCS), make
sure to close the tunnel between the Storage Controller and XMS (if one was previously
opened), as described in “Accessing the XMS via a Cluster Storage Controller” on page 13.
Note: For information on essential preparations required for using the XtremIO Technician
Advisor utility at a customer's site prior to your arrival, refer to Appendix G.
Note: For details on the XtremIO Technician Advisor utility version 2.6.0 (or later 2.x
versions), refer to the XtremIO Technician Advisor Utility User Guide, which is posted in the
XtremIO SolVe Generator, under XtremIO > XtremIO X1 (XIOS 2.x, 3.x, 4.x) > Service Scripts
and Utilities > XtremIO Technician Advisor > Install XtremIO Technician Advisor.
Note: If XtremIO Technician Advisor utility cannot be used to replace the defective DIMM,
replace the defective DIMM’s Storage Controller. For instructions, refer to “Replacing a
Storage Controller” on page 20.
Tolerance
Failure of a single Storage Controller power supply unit does not affect the Storage
Controller operation.
Failure of both Storage Controller power supply units in the same Storage Controller
results in Storage Controller failure.
To identify the defective Storage Controller power supply unit, using the CLI:
1. Log in to the XMCLI as tech.
2. List the Storage Controller power supply unit’s status, using the following command:
show-storage-controllers-psus cluster-id="<cluster name>"
3. Note the Index of Storage Controller power supply units with a non-healthy
Lifecycle-State.
Table 3 describes the possible failed Storage Controller power supply unit states:
State Description
4. Note the Index and Serial Number of Storage Controller power supply units
showing a non-healthy state.
To identify the defective Storage Controller power supply unit, using the GUI:
From the GUI, view the Inventory; the defective Storage Controller power supply unit
appears in orange.
Closing the Tunnel Between the Storage Controller and the XMS (if Previously Opened)
After checking the cluster’s health using the XtremIO Health-Check Script (HCS), make
sure to close the tunnel between the Storage Controller and XMS (if one was previously
opened), as described in “Accessing the XMS via a Cluster Storage Controller” on page 13.
Note: If there are two Storage Controllers adjacent to each other, first tilt the cable
management bracket's tray furthest from the component being replaced and then tilt
the tray of the other Storage Controller.
2. Disconnect the power cable from the defective Storage Controller power supply unit.
To revoke cable retention, release the power cord latch. The cables should remain
fastened by the cable strap in the cable management bracket.
3. To remove the Storage Controller power supply unit, push the green lever and then pull
on the handle.
Note: If the defective Storage Controller power supply unit should be sent to Dell EMC for
Failure Analysis (FA), refer to Appendix C for the procedure details.
2. Connect the power cable to the new Storage Controller power supply unit. To resume
cable retention, fasten the power cord latch.
3. Lift the cable tray of the cable management bracket, while pulling the latches (on the
left and right sides of the bracket) until the latches click in.
Note: Make sure that the latches are engaged and the tray is locked in position.
Note: If there are two Storage Controllers adjacent to each other, first return the cable
management bracket's tray nearest to the component being replaced, to its original
position, and then return the second tray.
To verify that the new Storage Controller power supply unit is healthy, using the CLI:
1. Log in to the XMCLI as tech.
2. Wait several seconds, then run the following command:
show-storage-controllers-psus cluster-id="<cluster name>"
3. If the State is not healthy, inspect the Storage Controller power supply unit.
To verify that the new Storage Controller power supply unit is healthy, using the GUI:
1. Hover the mouse pointer over the new Storage Controller power supply unit; a ToolTip
appears, showing the power supply status.
2. Verify that the State is Healthy.
Replacing an SFP+
Priority Failure Analysis (Priority FA) is required only for XtremIO FRU replacements
involved in an outage (DU/DL).
The SFP+ replacement procedure should be performed following a Service Request (SR)
determined by XtremIO Global Technical Support.
Tolerance
Failure of an SFP+ may result in performance degradation.
Procedure Prerequisite
Make sure to perform the following instruction prior to replacing an SFP+.
Note: XtremIO Global Tech Support should confirm this procedure prerequisite with the
respective Dell EMC network connectivity teams and with the customer.
For suspected SFP+ errors and/or iSCSI/Fibre Channel “connection to XtremIO cluster”
errors, arrange for the Connectivity team to confer with the customer in order to confirm
that iSCSI and Fibre Channel environment(s) to the XtremIO Storage Controller iSCSI or
Fibre Channel ports are validated. This includes confirming the network or Fibre Channel
switches, switch ports, network patch panels, cables and cable reseating (at both ends).
An SFP+ replacement procedure must only be performed after all other network
components and configurations have been verified. If not, replacing an SFP+ may not
resolve the issue.
Note: Identify the Storage-Controller-Name for each target by either the Name
value, or by running the following command:
show-targets prop-list=["Storage-Controller-Name"]
Note: In the example provided, following this step, a subset of SFP+s on the cluster is
detected as potentially defective. However, other SFP+s on the cluster can also be
defective. Complete the remaining steps in this procedure to determine thoroughly
which of the cluster’s SFP+s are defective.
* Assuming the FC network supports 8GFC and was tested as noted in the prerequisites, prior to
starting this procedure.
** Assuming the iSCSI network supports 10Gb and was tested as noted in the prerequisites, prior to
starting this procedure.
Note: If an SFP+ loopback tool is not available, skip this section and proceed with the rest
of the SFP+ replacement procedure.
Note: For further details on using LEDs to identify components, refer to Appendix B.
3. Using the noted details of the defective SFP+ (Name, Index, Port-Type, and
Target-Port-HW-Label), physically locate the SFP+ on the Storage Controller identified
in step 2 of “Identifying the Defective SFP+”. For details, refer to the
Connecting the Cluster to Host section of the XtremIO Hardware Installation and
Upgrade Guide.
4. From the rear of the Storage Controller, unplug the (iSCSI or Fibre Channel) cable
connected to a defective SFP+.
Note: Use an orderable SFP+ extraction tool to raise the SFP+ bail. If an SFP+ extraction
tool is not available, carefully use a flat-headed screwdriver to lift the SFP+ bail.
6. Grasp the bail and slide the SFP+ out from the Storage Controller.
Note: The defective SFP+ should be sent to Dell EMC for Failure Analysis (FA) if possible.
Refer to Appendix C.
Note: For details on the required replacement SFP+ with XtremIO, refer to the XtremIO
Part Number List on XtremIO SolVe (Solve Desktop > XtremIO Generator > XtremIO X1
(XIOS 2.x, 3.x, 4.x) > FRU Replacement Procedures > XtremIO FRU Part Number List).
2. Make sure that the mating connector of the new SFP+ is free of dirt and/or obstacles.
3. Align the new SFP+ with the guides in the slot, and insert the SFP+ by sliding it into the
slot until slight resistance is felt.
Wait for 15 minutes before verifying that the replacement was successful.
5. Run the following command to verify the SFP+ replacement was successful:
show-targets cluster-id="<cluster name>"
6. In the show-targets output, locate the information for the replaced SFP+(s), using
the Name and Index of the replaced defective SFP+.
7. Verify a successful FC SFP+ replacement, as follows:
a. Run the following command to verify that the Port-Speed is 8GFC and that the
Port-State is up:
show-targets-fc-error-counters cluster-id="<cluster name>"
b. In the show-targets-fc-error-counters output, locate the corresponding
FC target, using the Index of the replaced FC SFP+.
c. Verify that for this FC target, the Sync-Loss and Link-Failure column values
no longer increase.
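Checking that the counters "no longer increase" amounts to comparing two readings taken some time apart. The sketch below assumes the counter values for the replaced FC target have already been noted by hand from two runs of show-targets-fc-error-counters; it does not parse XMCLI output. The function name and the dictionary keys are placeholders that should match the column headings in the actual output.

```python
def increased_counters(before, after, watched=("Sync-Loss", "Link-Failure")):
    """Return {counter: delta} for every watched counter that grew between readings."""
    return {
        name: after[name] - before[name]
        for name in watched
        if after[name] > before[name]
    }
```

An empty dictionary means none of the watched counters grew between the two readings. Any non-empty result indicates that errors are still accumulating on that target and the replacement should not yet be considered verified.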
Tolerance
This procedure erases all historical performance and event data previously stored in
the XMS.
Failure of an XMS prevents cluster management.
Note: Failure of an XMS does not have an impact on cluster I/O operations.
Note: If the affected XMS is hard down, the procedure described in this section cannot be
performed and should be skipped.
Before replacing a defective component, a tunnel must be opened in order to access the
XMS via a Storage Controller, and must be closed upon the procedure’s completion (when
access to XMS is no longer required). Once this is done, when handling a replacement
case on site, connect to the TECH port of a cluster’s Storage Controller, and access the
XMS. For instructions, refer to “Accessing the XMS via a Cluster Storage Controller” on
page 13.
Note: If the affected XMS is hard down, the information required for the affected XMS
cannot be retrieved using the Install Menu and XMCLI. In such case, consult with the
customer or a current or recent cluster log file bundle in order to collect the required
information.
Inability to connect to cluster management (after ruling out network problems) indicates
that the XMS is not healthy.
Use Table 4 to record the configuration data of the defective XMS, and refer to it when you
configure the new XMS.
Default Gateway
Before replacing the affected XMS, check with the customer to determine if the affected
XMS was part of a Native Replication environment. In addition, you can use the
show-remote-protection-peer-xms XMCLI command to assist in determining
this. For further details, refer to the XtremIO Storage Array User Guide of the version
running on the affected XMS.
• remote-ip-addr and remote-xms-alias-name: Issue the
show-remote-protection-peer-xms XMCLI command.
• remote-xms-user
• remote-user-password
Note: In some cases, it may not be possible to execute the XtremIO Health Check Script, as
the XMS may be down or not functional.
Before replacing the defective XMS, check the cluster’s health by using the XtremIO
Health-Check Script (HCS). For instructions, refer to “Checking the XtremIO Cluster Health”
on page 17.
Note: For replacing a defective virtual XMS, refer to “Replacing a Virtual XMS” on page 51.
Note: Make sure that all cables are clearly labeled before disconnecting them from the
XMS.
2. Remove the bezel that covers the front of the server by simultaneously pressing the
tabs on both sides of the bezel to release it from its latches, then pull the bezel off the
component.
3. Remove the stabilizing screw behind the latch bracket on each side.
4. Pull the server forward until it locks in place, then slide the blue disconnect tabs
forward to release the inner rails from the slide rails.
Note: If the defective XMS should be sent to Dell EMC for Failure Analysis (FA), refer to
Appendix C for the procedure details.
Note: For more detailed instructions on installing the physical XMS, refer to the XtremIO
Storage Array Hardware Installation and Upgrade Guide.
4. On the outside of each rail assembly, slide the blue disconnect tab forward to unlock
the server, and push the server completely into the rack.
5. Insert and tighten a small stabilizer screw directly behind each bezel latch.
6. Connect the two power cables to the XMS.
7. Connect the network cable to the MGMT1 Ethernet port (marked "1") on the physical
XMS.
8. Press the Power button to power on the XMS.
9. Reinstall the XMS bezel.
Note: For the detailed procedure, refer to XtremIO Storage Array Software Installation
and Upgrade Guide.
Note: If the TECH Ethernet port connection fails, or the OS fails to load, reinstall the
physical XMS with the appropriate XtremIO XMS Rescue Image. Refer to “Re-Installing
a Physical XMS” on page 114 for details.
Note: For the detailed procedure, refer to XtremIO Storage Array Software Installation
and Upgrade Guide.
Note: If the package is not on the Support page for XtremIO, contact XtremIO Global
Technical Support.
Note: When downloading a software package, access the Dell EMC Support page and
verify that the MD5/SHA-256 checksum of the downloaded package matches the MD5
or SHA-256 checksum that appears on the support page for that package.
5. Upload the software image to /images. Use the sftp command via command line or
an SFTP client (e.g., FileZilla, WinSCP) to log in as the xmsupload user and transfer the
package downloaded on your computer to the XMS.
6. When the file transfer is complete, close the SFTP client or exit the SFTP session (if one
was created by executing the sftp command from the command line).
7. Re-open SSH connectivity to the XMS by running the ssh command via command line
or the SSH client.
8. From the Install menu, select Install XMS only.
xbrickTMP
Install Menu
-------------------------------------
1. Configure XMS
2. Check XMS Configuration
3. Display XMS Information
4. Install XMS only
5. Install Storage Controllers
6. ESRS Menu
7. Recover XMS
8. Power Menu
9. Collect Log Bundle
10. IPMI Access Menu
11. Disable Remote Shell
12. Restricted Shell
13. Installation Package Pre-loaded on Storage Controller Menu
99. Exit Install Menu
>
>4
9. From the Install XMS sub-menu, select Installation using image filename.
10. Enter the image file name from the available packages listed.
11. Verify that it is the correct package to install, and select Yes.
12. Proceed to “Recovering the XMS” on page 55.
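The checksum comparison called for in the note on downloading a software package can be scripted on the technician's computer. The following Python sketch is one possible way to do this for SHA-256 only; the function names are hypothetical, and the published checksum must be copied by hand from the support page for that package.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: Path, published: str) -> bool:
    """Compare the local digest with the value copied from the support page.

    The comparison ignores letter case and surrounding whitespace.
    """
    return sha256_of(path) == published.strip().lower()
```

If checksum_matches returns False, download the package again before uploading it to the XMS.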
Note: For replacing a defective physical XMS, refer to “Replacing the Physical XMS” on
page 46.
Note: Request the customer to delete the defective Virtual XMS VM from their vSphere
virtual infrastructure.
Deploying a new Virtual XMS VM, using the XtremIO XMS OVA package
Configuring the XMS (Network connectivity, XMS Server DNS Name, Network interface
information)
Updating the XMS server to the required software version
Recovering the XMS via the xinstall Install menu or via the XMCLI
Note: For detailed instructions, refer to XtremIO Storage Array Software Installation
and Upgrade Guide.
Note: For the detailed procedure, refer to XtremIO Storage Array Software Installation
and Upgrade Guide.
Note: For the detailed procedure, refer to XtremIO Storage Array Software Installation
and Upgrade Guide.
Note: If the package is not on the Support page for XtremIO, contact XtremIO Global
Technical Support.
Note: When downloading a software package, access the Dell EMC Support page and
verify that the MD5/SHA-256 checksum of the downloaded package matches the MD5
or SHA-256 checksum that appears on the support page for that package.
7. Upload the software image to /images. Use the sftp command via command line or
an SFTP client (e.g., FileZilla, WinSCP) to log in as the xmsupload user and transfer the
package downloaded on your computer to the XMS.
8. When the file transfer is complete, close the SFTP client or exit the SFTP session (if one
was created by executing the sftp command from the command line).
9. Re-open SSH connectivity to the XMS by running the ssh command via command line
or the SSH client.
10. Make sure that the software image is of the same version as that used by the running
cluster.
11. From the Install menu, select Install XMS only. Enter the image file name that
was used in the previous step as input.
Install Menu
-------------------------------------
1. Configure XMS
2. Check XMS Configuration
3. Display XMS Information
4. Install XMS only
5. Install Storage Controllers
6. ESRS Menu
7. Recover XMS
8. Power Menu
9. Collect Log Bundle
10. IPMI Access Menu
11. Disable Remote Shell
12. Restricted Shell
13. Installation Package Pre-loaded on Storage Controller Menu
99. Exit Install Menu
>
>4
xms-xbrick277
Logging off
Install menu
1. Configure XMS
2. Check XMS configuration
3. Display XMS Information
4. Install XMS only
5. Install Storage Controllers
6. ESRS Menu
7. Recover XMS
8. Power Menu
9. Collect Log Bundle
10. IPMI Access Menu
11. Disable Remote Shell
12. Restricted Shell
13. Installation Package Pre-loaded on Storage Controller Menu
99. Exit Install Menu
> 7
3. Provide the IP address (or Host Name) of the System Manager Storage Controller and
append the --keep-guid --dry-run flags.
Note: The --keep-guid parameter must be applied when the affected XMS is part of
a Native Replication or RecoverPoint replication configuration.
Note: In a multi-cluster environment, type a list of IP addresses (or Host Names) of the
managing Storage Controllers in all clusters that are connected to the affected XMS.
Note: Make sure to type a single space character between each pair of IP addresses (or
Host Names).
4. Select whether you want to restore the original database in case of failure.
Restarting XMS...
Restart XMS using services
Starting XMS system using XROOT /xtremapp
connectemc stop/waiting
connectemc start/running, process 9050
xtremapp-xms stop/waiting
xtremapp-xms start/running, process 9136
Install menu
1. Configure XMS
2. Check XMS configuration
3. Display XMS Information
4. Install XMS only
5. Install Storage Controllers
6. ESRS Menu
7. Recover XMS
8. Power Menu
9. Collect Log Bundle
10. IPMI Access Menu
11. Disable Remote Shell
12. Restricted Shell
13. Installation Package Pre-loaded on Storage Controller Menu
99. Exit Install Menu
> 7
2. Provide the IP address of the managing Storage Controller and append the
--keep-guid flag.
Note: The --keep-guid parameter must be applied when the affected XMS is part of
a Native Replication or RecoverPoint replication configuration.
Note: In a multi-cluster environment, type a list of IP addresses (or Host Names) of the
managing Storage Controllers in all clusters that are connected to the affected XMS.
Note: Make sure to type a single space character between each pair of IP addresses (or
Host Names).
3. Select whether you want to restore the original database in case of failure.
4. Wait for the recovery process to complete and for the XMS to restart.
Restarting XMS...
Restart XMS using services
Starting XMS system using XROOT /xtremapp
connectemc stop/waiting
connectemc start/running, process 9050
initctl: Unknown instance
xtremapp-xms start/running, process 9136
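The input typed at the Recover XMS prompt (one or more managing Storage Controller addresses separated by single spaces, followed by the --keep-guid flag) can be composed as in the following Python sketch. The function name and the example addresses are hypothetical. The sketch only illustrates the string format described in the steps above and does not contact the XMS.

```python
def recover_xms_input(controllers: list[str], keep_guid: bool = True) -> str:
    """Build the text to type at the Recover XMS prompt.

    controllers: IP addresses (or host names) of the managing Storage
    Controllers, one per connected cluster.
    keep_guid: append --keep-guid (required for replication configurations).
    """
    if not controllers:
        raise ValueError("at least one Storage Controller address is required")
    for address in controllers:
        if not address or any(ch.isspace() for ch in address):
            raise ValueError(f"invalid address: {address!r}")
    parts = list(controllers)
    if keep_guid:
        parts.append("--keep-guid")
    # A single space between each pair of addresses, as the note requires.
    return " ".join(parts)


# Example with placeholder documentation addresses (not real cluster IPs).
print(recover_xms_input(["192.0.2.10", "192.0.2.20"]))
```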
Note: If the affected XMS is part of a Native Replication or RecoverPoint replication
configuration, use the Install Menu for XMS recovery instead. For details, refer to
“Recovering the XMS Using the Install Menu” on page 56.
Old XMS and all of its data will be lost. Are you sure you want to recover the XMS? (Yes/No): yes
XMS recovery has been started
Done!
XMS recovery finished successfully
5. Optional: Following the XMS recovery process, if you want to refresh the SSH key, run
the following command:
refresh-xms-ssh-key
6. After the recovery has successfully completed, log out of XMS CLI.
Note: Even when working with a single cluster, make sure to add the single IP address.
Old XMS and all of its data will be lost. Are you sure you want to recover the XMS? (Yes/No): yes
XMS recovery has been started
Done!
XMS recovery finished successfully
5. After the recovery has successfully completed, log out of XMS CLI.
Note: Following an XMS replacement procedure, run the following XMCLI command to
check the XMS Remote Support configuration:
show-syr-notifier
It may be necessary to manually reconfigure XMS Remote Support.
This is especially important when using the SRS-VE XMS Remote Support configuration on
an XMS running version 6.0.1 (or later).
For further details, refer to Dell EMC KB# 524863 (https://support.emc.com/kb/524863).
After configuring the replaced component, it is necessary to perform the following post
replacement procedures:
With XMS version 6.1 (or later versions), if the XMS you just replaced was previously
part of a Native Replication environment, to restore the Native Replication
configuration on the replaced XMS, run the following command:
add-remote-protection-peer-xms
remote-ip-addr="Remote_XMS_IP"
remote-xms-alias-name="Remote XMS alias used in Remote Protection domain"
remote-xms-user="admin" remote-user-password="XXX" force
Note: Make sure to use the force flag while running the
add-remote-protection-peer-xms XMCLI command.
Note: Check with the customer to determine whether the affected XMS was part of a Native
Replication environment. In addition, you can use the
show-remote-protection-peer-xms XMCLI command to help determine this. For
further details, refer to the XtremIO Storage Array User Guide of the version running on
the affected XMS.
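Because the add-remote-protection-peer-xms command takes several quoted parameters and the force flag, it can help to assemble it before pasting it into the XMCLI. The following Python sketch shows one way to do this. The function name is hypothetical; the parameter names are the ones listed in the command above, and the values in the example are placeholders.

```python
def add_peer_command(remote_ip: str, alias: str, user: str, password: str) -> str:
    """Assemble the add-remote-protection-peer-xms command as a single line.

    Values are wrapped in straight double quotes; a value that itself
    contains a double quote is rejected rather than escaped.
    """
    for value in (remote_ip, alias, user, password):
        if '"' in value:
            raise ValueError("values must not contain double quotes")
    return (
        "add-remote-protection-peer-xms "
        f'remote-ip-addr="{remote_ip}" '
        f'remote-xms-alias-name="{alias}" '
        f'remote-xms-user="{user}" '
        f'remote-user-password="{password}" '
        "force"  # the force flag is always appended, per the note above
    )


# Placeholder values for illustration only.
print(add_peer_command("192.0.2.50", "Remote XMS alias", "admin", "XXX"))
```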
CHAPTER 3
Replacing the DAE Components
The SSD replacement procedure should be performed, using the XtremIO Technician
Advisor utility, following a Service Request (SR) determined by XtremIO Global Technical
Support. If you have any questions or encounter problems, contact XtremIO Global
Technical Support.
SSD Integration on X1 cluster types is possible only after DPG rebuild has completed.
Tolerance
Failure of up to two SSDs in a single X-Brick results in performance degradation during
rebuild.
Concurrent failure of three SSDs in the same X-Brick results in a loss of service.
Failure of six SSDs in the same XDP group results in a degraded state which is called
“degraded (single failure)”, where the data has only a single parity protection. For a
10TB Starter X-Brick (5TB) it is five SSDs.
Failure of seven SSDs in the same XDP group results in dual-degraded state, which is
called “degraded (dual failure)”, where the data has no parity protection. For a 10TB
Starter X-Brick (5TB) it is six SSDs.
Failure of eight SSDs in the same XDP group results in loss of service. For a 10TB
Starter X-Brick (5TB) it is seven SSDs.
Insufficient SSD space may prevent the cluster from rebuilding the XDP group,
resulting in a degraded state where the data does not have double-parity protection.
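The XDP group thresholds listed above can be summarized as a small lookup. The following Python sketch only restates those numbers; the function name is hypothetical, treating counts above the loss-of-service threshold as loss of service is an assumption, and the sketch does not query the cluster.

```python
# Failed-SSD thresholds per XDP group, as listed in the Tolerance section:
# (single failure, dual failure, loss of service)
THRESHOLDS = {
    False: (6, 7, 8),  # regular X-Brick
    True: (5, 6, 7),   # 10TB Starter X-Brick (5TB)
}


def xdp_group_state(failed_ssds: int, starter: bool = False) -> str:
    """Map a failed-SSD count in one XDP group to the state named in the text."""
    if failed_ssds < 0:
        raise ValueError("failed_ssds must not be negative")
    single, dual, loss = THRESHOLDS[starter]
    if failed_ssds >= loss:  # assumption: counts above the threshold also mean loss
        return "loss of service"
    if failed_ssds == dual:
        return "degraded (dual failure)"
    if failed_ssds == single:
        return "degraded (single failure)"
    return "below the listed thresholds"


print(xdp_group_state(6))                # regular X-Brick
print(xdp_group_state(6, starter=True))  # 10TB Starter X-Brick (5TB)
```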
Note: This procedure does not apply to defective SSDs detected by 5D SMART Error. For
more information, see “Handling Defective SSDs, Detected by 5D SMART Error” on
page 66.
Note: It is recommended to use the cluster name (and not the cluster ID) as the cluster
identifier in cluster-related XMCLI commands.
Note: The cluster-id parameter is not mandatory for single cluster configurations.
If the alert is raised on more than one SSD in the XtremIO cluster, make sure to replace the
defective SSDs systematically, one at a time. After each SSD is replaced, wait for the
rebuild and integration of the new SSD to complete entirely BEFORE proceeding to
replace the next SSD. Refer to Dell EMC KB 205558 for further details and up-to-date
information on this scenario.
Closing the Tunnel Between the Storage Controller and the XMS (if Previously Opened)
After checking the cluster’s health using the XtremIO Health-Check Script (HCS), make
sure to close the tunnel between the Storage Controller and XMS (if one was previously
opened), as described in “Accessing the XMS via a Cluster Storage Controller” on page 13.
Note: For further details on using LEDs to identify components, refer to Appendix B.
Note: Make sure to close the tunnel between the Storage Controller and XMS (if one was
opened) when access to XMS is no longer required, as described in “Accessing the XMS
via a Cluster Storage Controller” on page 13.
Note: For details on the XtremIO Technician Advisor utility, refer to the XtremIO Technician
Advisor Utility User Guide, which is posted in the XtremIO SolVe Generator, under XtremIO
> XtremIO X1 (XIOS 2.x, 3.x, 4.x) > Service Scripts and Utilities > XtremIO Technician Advisor
> Install XtremIO Technician Advisor.
Note: If XtremIO Global Technical Support instructs you to follow the manual configuration
procedures, refer to Appendix E.
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance to pause in RecoverPoint, contact RecoverPoint Global
Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure and
contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Tolerance
Failure of a DAE chassis results in loss of service.
Note: For further details on using LEDs to identify components, refer to Appendix B.
Verify that you specify the correct cluster name.
Note: If the DAE is healthy and connected, the force flag must be used for the
replace-dae-prepare command to run successfully. For guidance and
directions, contact XtremIO Global Technical Support.
Note: If there are two Storage Controllers adjacent to each other, first tilt the cable
management bracket's tray furthest from the component being replaced and then tilt
the tray of the other Storage Controller.
6. If cables are not marked, label them so that you can reconnect them as required to the
new DAE chassis.
7. Disconnect the power cables from the DAE’s PSUs.
8. Disconnect the SAS cables from the DAE Controllers.
9. Remove the DAE Controller (LCC) units from the defective DAE and immediately insert
them into the new DAE Chassis (for details, refer to “Replacing the DAE Controllers
(LCCs)”).
10. Remove the DAE power supply units from the defective DAE and immediately insert
them into the new DAE Chassis (for details, refer to “Replacing the DAE Power Supply
Units”).
11. Remove the DAE bezel.
12. Remove each SSD (one at a time) from the defective DAE chassis and immediately
insert it into the same slot in the new DAE Chassis.
13. If you are replacing the DAE of a 10TB Starter X-Brick (5TB):
a. Remove the 12 plastic air seals from slots 13 through 24 of the defective DAE
chassis.
b. Insert the removed air seals into slots 13 through 24 of the new DAE chassis.
If you are replacing the DAE of a regular X-Brick, ignore this step.
14. Remove the four screws (two per side) that secure the front of the enclosure to the
front vertical channels of the cabinet, and save the screws.
15. With help from another person, slide the enclosure out of the cabinet.
Note: If the defective DAE chassis should be sent to Dell EMC for Failure Analysis (FA), refer
to Appendix C for the procedure details.
Note: Make sure that the latches are engaged and the tray is locked in its position.
Note: If there are two Storage Controllers adjacent to each other, first return the cable
management bracket's tray nearest to the component being replaced, to its original
position, and then return the second tray.
4. Wait for several seconds and make sure that the new DAE is in a healthy state, using
the following command:
show-daes cluster-id="<cluster name>"
Note: If the state of the DAE is other than healthy, contact XtremIO Global Technical
Support.
5. Run the following set of commands to verify that all of the other DAE components are
healthy and installed to their respective (correct) locations:
show-daes-controllers cluster-id="<cluster name>"
Note: If the cluster and modules are other than active and healthy, contact
XtremIO Global Technical Support.
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance to pause in RecoverPoint, contact RecoverPoint Global
Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure and
contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Note: Starting from version 4.0.10, if a SAS port’s error level exceeds that of the
(predefined) error threshold, the system disables the port. In such cases, it may be
necessary to replace a DAE Controller per guidance and directions from XtremIO Global
Technical Support.
Tolerance
Failure of both DAE Controllers (or all SAS cables) in the same X-Brick results in loss of
service.
Failure of one or more DAE Controller SAS ports results in degraded service.
2. Note the Index of the DAE Controller with a disconnected state. A disconnected
state indicates that the DAE Controller is defective. Proceed to
“Replacing the Defective DAE Controller” on page 77.
3. If all DAE Controllers in the cluster are healthy, it means that there is no defective DAE
Controller, and it is therefore necessary to identify the system-disabled DAE
Controllers' SAS port, using the following command:
show-daes-controllers-sas-ports
If the Port-Health-State column indicates a defective SAS port, make a note of the
following Indexes for future reference:
• DAE Controller index and port index with the "Port-Enabled-State" column
showing "System-disabled", or a Port-Health-State marked "failed"
• DAE Controller’s Index
4. Proceed to “Replacing the Defective DAE Controller” on page 77.
Note: For further details on using LEDs to identify components, refer to Appendix B.
Note: If one of the Storage Controllers is not operating correctly, contact XtremIO
Global Technical Support before taking any further action.
3. If the cluster is a factory-assembled rack, remove the shipping bracket from behind
the DAE to be serviced.
4. If necessary, from the rear side of the Storage Controller that is adjacent to the
component you are replacing, tilt the cable management bracket's tray (up/down) to
gain better access. Simultaneously pull the latches on the left and right sides of the
cable management bracket, and then push the tray either up or down.
Note: If there are two Storage Controllers adjacent to each other, first tilt the cable
management bracket's tray furthest from the component being replaced and then tilt
the tray of the other Storage Controller.
5. Make sure that the SAS cables are labeled. If not, label them as necessary, so that you
can reconnect them as required to the DAE Controller.
6. Disconnect the SAS cables from the defective DAE Controller.
Note: When disconnecting the cables it is important to note the ports the cables were
disconnected from, so that you can reconnect them to the same ports after installing
the new DAE Controller.
For cabling guidelines refer to XtremIO Storage Array Hardware Installation and
Upgrade Guide.
7. Remove the defective DAE Controller unit from the DAE as follows:
a. Locate the orange handle buttons on the DAE Controller handles.
b. Press the orange handle buttons to release the DAE Controller, pull the latches
outward, and remove the DAE Controller from its slot.
Note: If the defective DAE Controller should be sent to Dell EMC for Failure Analysis (FA),
refer to Appendix C for the procedure details.
Note: Make sure that the latches are engaged and the tray is locked in its position.
Note: If there are two Storage Controllers adjacent to each other, first return the cable
management bracket's tray nearest to the component being replaced, to its original
position, and then return the second tray.
Note: If one of the Storage Controllers is not operating correctly, contact XtremIO
Global Technical Support before taking any further action.
Tolerance
Failure of a single DAE power supply unit bears no consequence.
Failure of both DAE power supply units in the same DAE results in loss of service.
To identify the defective DAE power supply unit, using the CLI:
1. Log in to the XMCLI as tech.
2. List the DAE power supply unit status, using the following command:
show-daes-psus cluster-id="<cluster name>"
3. Note the PSU-HW-Label of the DAE power supply unit with a non-healthy State.
To identify the defective DAE power supply unit, using the GUI:
From the GUI, view the Inventory; the defective DAE power supply unit appears in
orange.
Note: Access to the disks in your DAE times out two minutes after a DAE power supply unit
is removed. While the system continues operating on a single PSU, the loss of the
removed PSU causes a timeout unless the PSU is replaced within two minutes. When
replacing a DAE PSU, ensure that the green light on the PSU remains permanently on for at
least five seconds before removing power on the second PSU.
Note: If there are two Storage Controllers adjacent to each other, first tilt the cable
management bracket's tray furthest from the component being replaced and then tilt
the tray of the other Storage Controller.
3. Disconnect the power cable from the defective DAE power supply unit.
Note: Ensure that the new DAE PSU is prepared for insertion.
Note: If the defective DAE power supply unit should be sent to Dell EMC for Failure
Analysis (FA), refer to Appendix C for the procedure details.
2. Connect the DAE power supply unit power cable. A green light indicates that the DAE
power supply unit is successfully connected.
3. If you initially tilted the cable management bracket's tray (up/down) on the Storage
Controller adjacent to the DAE, return it to its original position, by pulling the latches
(on the left and right sides of the bracket) until the latches click in.
Note: Make sure that the latches are engaged and the tray is locked in its position.
Note: If there are two Storage Controllers adjacent to each other, first return the cable
management bracket's tray nearest to the component being replaced, to its original
position, and then return the second tray.
3. Wait for several seconds and make sure the S/N of the replaced power supply unit has
changed.
4. Make sure that the new DAE power supply unit is in healthy state, using the following
command:
show-daes-psus cluster-id="<cluster name>"
5. Verify that the cluster and modules are active, by running the following commands:
show-clusters
show-modules cluster-id="<cluster name>"
CHAPTER 4
Replacing the InfiniBand Switch Components
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance to pause in RecoverPoint, contact RecoverPoint Global
Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure and
contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Note: As best practice, you can compare the switch’s actual S/N to the one presented in
the GUI. It is always advisable to check the cable connection and LED activities on the
Storage Controllers IB NIC to make sure that you are operating on the correct switch.
Note: Starting from version 4.0.10, if an InfiniBand Switch port’s error level exceeds that of
the (predefined) error threshold, the system disables the port. In such cases, it may be
necessary to replace an InfiniBand Switch per guidance and directions from XtremIO
Global Technical Support.
Tolerance
Failure of a single InfiniBand Switch renders the cluster vulnerable to risk of failure of
the second InfiniBand Switch and, therefore, compromises redundancy.
Failure of both InfiniBand Switches in the same cluster results in loss of service.
Failure of one or more InfiniBand Switch ports results in degraded service.
Note: System Status LEDs are located at the front and rear of the InfiniBand Switch. A solid
red LED indicates that a major error has occurred.
Note: It is recommended to use the cluster name (and not the cluster ID) as the cluster
identifier in cluster-related XMCLI commands.
Note: The cluster-id parameter is not mandatory for single cluster configurations.
3. Make a note of the Index of the defective InfiniBand Switch and proceed to “Checking
the XtremIO Cluster Health” on page 88.
If both InfiniBand Switches are healthy, it is necessary to identify the system-disabled
InfiniBand Switch port, using the following command:
show-infiniband-switches-ports cluster-id="<cluster name>"
Note: If a port is down, the response table cannot show the connecting peer, and lists
it as “None”.
If a port is disabled by the system manager, the port is rendered disabled on a
multiple X-Brick cluster.
Note: Make sure that all cables are clearly labeled to enable proper connection to the new
InfiniBand Switch.
Note: It is recommended to remove and install one rail (for reference) before removing
the second rail.
Note: If the defective InfiniBand Switch should be sent to Dell EMC for Failure Analysis
(FA), refer to Appendix C for the procedure details.
Note: Verify that the correct holes are aligned to ensure that the depth of the
InfiniBand Switch within the rack is adjusted correctly.
2. Secure each inner rail to the InfiniBand Switch, using three screws.
3. Lift the InfiniBand Switch and slide it onto the rails.
4. Align the screw hole of each bezel clip with those on the front side of the inner rails
(one on each side).
5. Through each bezel clip, tighten a screw (one on each side) to secure the unit to the rack.
6. Connect the InfiniBand Switch power cables.
7. Connect the InfiniBand Switch interlink cables (labeled IBSW1-P17 and IBSW1-P18).
8. If you removed a shipping bracket directly above or below the InfiniBand Switch,
re-install it.
9. Wait for the interlinks to synchronize, as shown by the green LEDs on the InfiniBand
Switch associated ports.
10. Connect the remaining InfiniBand cables from the Storage Controllers.
11. If you initially tilted the cable management bracket's tray (up/down), return it to its
original position, by pulling the latches (on the left and right side of the bracket) until
the latches click in.
Note: Make sure that the latches are engaged and the tray is locked in its position.
6. Verify that the new InfiniBand Switch’s Serial Number is shown in the output.
7. Verify that the PSUs are healthy, by running the following command:
show-infiniband-switches-psus cluster-id="<cluster name>"
Note: For guidance on running the XtremIO Health-Check Script and on resolving its
output, refer to Dell EMC KB # 206076 (https://support.emc.com/kb/206076). If an
unexpected error is reported by the HCS, submit a standard Service Request to XtremIO
Global Technical Support.
InfiniBand Switches are equipped with two replaceable power supply units that work in a
redundant configuration. Either unit may be extracted without bringing down the system.
Note: Make sure that the power supply unit that you are NOT replacing is showing all
green, for both the power supply unit and System Status LEDs.
Tolerance
Failure of a single InfiniBand Switch power supply unit does not affect the InfiniBand
Switch operation.
Failure of both InfiniBand Switch power supply units will lead to an InfiniBand Switch
failure.
To identify the defective InfiniBand Switch power supply unit, using the CLI:
1. Log in to the XMCLI as tech.
2. List the InfiniBand Switch power supply unit status, using the following command:
show-infiniband-switches-psus cluster-id="<cluster name>"
Name          Index  Cluster-Name   Index  Location-Index  Location  Input-Power  Lifecycle-State  Power-Feed  PSU-HW-Label
IB-SW1-PSU-L  1      xbrick711-713  1      1               left      on           healthy          PWR-A       PSU1
IB-SW1-PSU-R  2      xbrick711-713  1      2               right     on           failed           PWR-B       PSU2
IB-SW2-PSU-L  3      xbrick711-713  1      1               left      on           healthy          PWR-A       PSU1
IB-SW2-PSU-R  4      xbrick711-713  1      2               right     on           healthy          PWR-B       PSU2
3. Note the Name and PSU-HW-Label of the InfiniBand Switch power supply unit with a
non-healthy Lifecycle-State.
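When the command output is saved as text, the power supply units with a non-healthy Lifecycle-State can be picked out with a short script. The following Python sketch parses the sample output shown above. The function name is hypothetical, and the sketch assumes that columns are separated by whitespace and that no field contains a space.

```python
SAMPLE_OUTPUT = """\
Name Index Cluster-Name Index Location-Index Location Input-Power Lifecycle-State Power-Feed PSU-HW-Label
IB-SW1-PSU-L 1 xbrick711-713 1 1 left on healthy PWR-A PSU1
IB-SW1-PSU-R 2 xbrick711-713 1 2 right on failed PWR-B PSU2
IB-SW2-PSU-L 3 xbrick711-713 1 1 left on healthy PWR-A PSU1
IB-SW2-PSU-R 4 xbrick711-713 1 2 right on healthy PWR-B PSU2
"""


def unhealthy_psus(output: str) -> list[tuple[str, str]]:
    """Return (Name, PSU-HW-Label) for every PSU whose state is not healthy."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    name_col = header.index("Name")
    state_col = header.index("Lifecycle-State")
    label_col = header.index("PSU-HW-Label")
    found = []
    for row in data:
        if len(row) != len(header):
            continue  # skip lines that do not match the header layout
        if row[state_col] != "healthy":
            found.append((row[name_col], row[label_col]))
    return found


print(unhealthy_psus(SAMPLE_OUTPUT))
```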
Note: Do not attempt to insert a power supply unit with a power cord connected to it.
2. Insert the power supply unit by sliding it into the opening until a slight resistance is
felt.
3. Continue pressing the power supply unit until the latch snaps into place, confirming
proper installation.
4. Insert the power cord into the power supply unit connector, until the power cord
retainer is latched.
Note: The green power supply unit indicator should illuminate. If not, repeat the whole
procedure to extract the power supply unit, and re-insert it.
Note: Make sure that the latches are engaged and the tray is locked in its position.
Tolerance
Failure of one or more fan units does not affect the InfiniBand Switch operation, as
long as the ambient temperature is below 45° Celsius.
If one or more fan units fail and the ambient temperature exceeds 45° Celsius, the
InfiniBand Switch fails.
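The two fan tolerance rules above can be expressed as a simple check. The following Python sketch is an illustration only; the function name is hypothetical, and because the text does not say what happens at exactly 45° Celsius, the sketch returns None for that case.

```python
from typing import Optional

AMBIENT_LIMIT_C = 45.0  # limit stated in the Tolerance section


def switch_expected_to_fail(failed_fans: int, ambient_c: float) -> Optional[bool]:
    """Apply the fan tolerance rules for one InfiniBand Switch.

    Returns True if the switch is expected to fail, False if operation is
    not affected, and None at exactly 45 degrees Celsius (not specified).
    """
    if failed_fans < 1:
        raise ValueError("the rules apply only when at least one fan unit has failed")
    if ambient_c < AMBIENT_LIMIT_C:
        return False
    if ambient_c > AMBIENT_LIMIT_C:
        return True
    return None


print(switch_expected_to_fail(1, 30.0))
print(switch_expected_to_fail(2, 50.0))
```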
Note: Operation without a fan unit should not exceed two minutes.
During a fan hot-swap procedure, if the LED indicator is OFF, the fan unit is
disconnected.
Note: Make sure that the fans have the air flow that matches the model number. An air
flow opposite to the system design will cause the system to operate at a higher (less
than optimal) temperature.
To identify the defective InfiniBand Switch fan unit, using the CLI:
1. Log in to the XMCLI as tech.
2. List the InfiniBand Switches status, using the following command:
show-infiniband-switches cluster-id="<cluster name>"
The green Fan Status LED should illuminate. If not, extract the fan unit and reinsert it.
After two unsuccessful attempts to install the fan unit, contact XtremIO Global
Technical Support for guidance and directions. No further action should be taken
without explicit direction from XtremIO Global Technical Support.
3. If you initially tilted the cable management bracket's tray (up/down), return it to its
original position, by pulling the latches (on the left and right side of the bracket) until
the latches click in.
Note: Make sure that the latches are engaged and the tray is locked in its position.
CHAPTER 5
Replacing the Battery Backup Units
The Battery Backup Unit replacement procedure should be performed, using the XtremIO
Technician Advisor utility, following a Service Request (SR) determined by XtremIO Global
Technical Support. If you have any questions or encounter problems, contact XtremIO
Global Technical Support. Technician Advisor is initially used to identify defective Battery
Backup Units on the cluster, and is then used to replace each Battery Backup Unit that is
identified as defective.
The Battery Backup Unit is heavy and should be removed from and installed into the rack
by two people. To avoid personal injury and/or damage to the equipment, do not attempt
to lift or install the BBU without a mechanical lift and/or help from another person.
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance to pause in RecoverPoint, contact RecoverPoint Global
Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure and
contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Note: To order a 5P BBU FRU Kit, use 100-586-122-00 (Eaton 5P 078-000-122-xx
FRU).
Tolerance
Failure of more than half of the BBUs in the same cluster results in loss of service.
Note: It is recommended to use the cluster name (and not the cluster ID) as the cluster
identifier in cluster-related XMCLI commands.
Note: The cluster-id parameter is not mandatory for single cluster configurations.
Note: Make sure to close the tunnel between the Storage Controller and XMS when access
to XMS is no longer required, as described in “Closing the Tunnel Between a Storage
Controller and the XMS” on page 151.
DELL EMC CONFIDENTIAL
Replacing a BBU
Before performing a Battery Backup Unit replacement procedure, ensure that the BBU
power cables (box and PDU) are plugged in tightly at both ends, as power cables
sometimes work loose.
The Battery Backup Unit replacement procedure should be performed, using the XtremIO
Technician Advisor utility, following a Service Request (SR) determined by XtremIO Global
Technical Support. If you have any questions or encounter problems, contact XtremIO
Global Technical Support.
Note: For details on the XtremIO Technician Advisor utility, refer to the XtremIO Technician
Advisor Utility User Guide, which is posted in the XtremIO SolVe Generator, under XtremIO
> XtremIO X1 (XIOS 2.x, 3.x, 4.x) > Service Scripts and Utilities > XtremIO Technician Advisor
> Install XtremIO Technician Advisor.
Note: For information on essential preparations required for using the XtremIO Technician
Advisor utility at a customer's site prior to your arrival, refer to Appendix G.
Note: If the XtremIO Technician Advisor Utility User Guide instructs that the Technician
Advisor utility cannot be used to replace Battery Backup Units on your cluster, contact
XtremIO Global Technical Support for directions on how to manually replace the Battery
Backup Units.
Replacing a Battery Backup Unit manually may lead to data loss if not performed
correctly. Therefore, make every effort to use Technician Advisor to replace a
cluster’s Battery Backup Unit automatically.
Note: These instructions are not applicable for 1550 Evolution BBU serial communication
cables.
Incorrect replacement of 5P 1550i BBU serial communication cables may result in
damage to connectors and/or component ports.
5P 1550i Battery Backup Units are supplied with DB9-RJ45 serial data cables
accompanied by DB9-RJ50 adapters, or with RJ45-RJ50 serial communication cables with
labeling clearly indicating which devices and ports to plug into, depending on the XtremIO
hardware version in use.
A defective cable and/or cable adapter of this type must be replaced with a new RJ45-RJ50
serial communication cable.
Note: Replacement RJ45-RJ50 serial communication cables may not be labeled to indicate
which devices and ports to plug into.
Tolerance
In single X-Brick clusters, a failure of both communication cables (one for each BBU)
results in loss of service.
In multiple X-Brick clusters, a failure of more than half of the overall communication
cables in the cluster results in loss of service.
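The tolerance rules above reduce to simple arithmetic. The following Python sketch is illustrative only: the function name and its parameters are hypothetical and are not part of XMCLI or Technician Advisor, and the caller is assumed to supply the counts.

```python
def cable_failures_cause_outage(x_bricks: int, total_cables: int, failed_cables: int) -> bool:
    """Apply the tolerance rules for BBU serial communication cables.

    Hypothetical helper: all three counts are supplied by the caller.
    """
    if x_bricks < 1 or total_cables < 1 or not 0 <= failed_cables <= total_cables:
        raise ValueError("invalid counts")
    if x_bricks == 1:
        # Single X-Brick cluster: service is lost only when both cables
        # (one for each BBU) have failed, that is, when every cable has failed.
        return failed_cables == total_cables
    # Multiple X-Brick cluster: service is lost when more than half of the
    # overall communication cables have failed.
    return 2 * failed_cables > total_cables
```

For example, one failed cable out of two in a single X-Brick cluster is within tolerance, while three failed cables out of four in a multiple X-Brick cluster is not.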
To verify whether failed serial communication cables exist within the cluster:
1. Log in to the XMCLI as tech.
2. Run the following command:
show-bbus cluster-id="<cluster name>"
Name    Index  Model           Serial-Number  Power-Feed  State    Connectivity-State  Enabled-State  Input  Battery-Charge  BBU-Load  Voltage  FW-Version  Part-Number  Brick-Name  Index  Cluster-Name     Index  ...
X1-BBU  1      Evolution 1550  DV0P2308A      PWR-A       healthy  connected           enabled        on     100             24        210      9901DC      078-000-114  X1          1      xtremio-svt-003  1      ...
X2-BBU  2      Evolution 1550  DV0P23078      PWR-B       healthy  sc_2_disconnected   enabled        on     100             22        211      9901DC      078-000-114  X2          2      xtremio-svt-003  2      ...
3. Disconnect the RJ50 end of the defective communication cable (or cable adapter) from
the COM (R) port of the BBU.
4. Connect the RJ45 end of the replacement communication cable (as indicated in the
figure below) to the 10101 port of the Storage Controller.
5. Connect the RJ50 end of the replacement communication cable (as indicated in the
figure above) to the COM (R) port of the BBU.
Note: Verify that the RJ50 end of the cable is connected to the BBU COM (R) port, and
that the RJ45 end of the cable is connected to the Storage Controller 10101 port.
6. If you initially tilted the cable management bracket's tray (up/down) of the connecting
Storage Controller, return it to its original position by pulling the latches (on the left
and right sides of the bracket) until the latches click in.
Note: Make sure that the latches are engaged and the tray is locked in position.
Name    Index  Model           Serial-Number  Power-Feed  State    Connectivity-State  Enabled-State  Input  Battery-Charge  BBU-Load  Voltage  FW-Version  Part-Number  Brick-Name  Index  Cluster-Name     Index  ...
X1-BBU  1      Evolution 1550  DV0P2308A      PWR-A       healthy  connected           enabled        on     100             24        210      9901DC      078-000-114  X1          1      xtremio-svt-003  1      ...
X2-BBU  2      Evolution 1550  DV0P23078      PWR-B       healthy  connected           enabled        on     100             22        211      9901DC      078-000-114  X2          2      xtremio-svt-003  2      ...
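When many BBUs are listed, the Connectivity-State column of the show-bbus output can be scanned programmatically. The following Python sketch is a minimal illustration, assuming the output was captured as text with columns separated by two or more spaces. The function name and the abbreviated sample text are hypothetical, and the actual XMCLI column spacing may differ.

```python
import re

# Abbreviated, hypothetical capture of show-bbus output (first seven columns only).
SAMPLE = """\
Name    Index  Model           Serial-Number  Power-Feed  State    Connectivity-State
X1-BBU  1      Evolution 1550  DV0P2308A      PWR-A       healthy  connected
X2-BBU  2      Evolution 1550  DV0P23078      PWR-B       healthy  sc_2_disconnected
"""

def disconnected_bbus(output: str) -> list[tuple[str, str]]:
    """Return (BBU name, Connectivity-State) for every BBU that is not 'connected'.

    Assumption: columns are separated by at least two spaces, so a value such as
    'Evolution 1550' (single space) stays in one field.
    """
    lines = [ln for ln in output.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    col = header.index("Connectivity-State")
    result = []
    for ln in lines[1:]:
        fields = re.split(r"\s{2,}", ln.strip())
        if len(fields) > col and fields[col] != "connected":
            result.append((fields[0], fields[col]))
    return result
```

Any BBU returned by disconnected_bbus(SAMPLE) has a serial communication cable that should be inspected, as described in the procedure above.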
APPENDIX A
Software Re-Installation
This section provides instructions for downloading and re-installing a software image on
the Storage Controller and XMS.
This section includes the following topics:
Writing the XtremIO Rescue Image to a USB Drive
Re-Installing a Storage Controller
Re-Installing a Physical XMS
Note: Verify that you have a USB drive that is at least 2GB in capacity.
1. Locate the XtremIO Rescue Image from the XtremIO Global Technical Support page in
support.emc.com.
For details on the XtremIO Storage Controller Rescue Image or XtremIO virtual XMS
Rescue Image to download from the support page, refer to the latest Release Notes for
the XtremIO installed version.
Note: When downloading a software package, access the Dell EMC Support page and
verify that the MD5/SHA-256 checksum of the downloaded package matches the MD5
or SHA-256 checksum that appears on the support page for that package.
2. Download the image to the local machine where the USB drive will be created.
Note: Before you proceed, verify that the USB drive is available.
Note: Use Windows Explorer to make sure that the correct drive letter is selected.
6. Click Write to write the image file to the USB Drive; a warning appears to indicate that
existing data on the selected drive will be overwritten.
7. Verify that the correct drive letter is selected and click Yes to confirm.
8. Follow the write operation progress. When the operation is completed, a message
appears, indicating that the write was successful.
9. From the Windows Notification Area, click Safely Remove Hardware and Eject Media.
Note: The menu option includes the USB drive’s brand name (e.g. "Eject Cruzer Blade"
appears when SanDisk Cruzer Blade USB drive is used).
Wait for the "Safe to remove hardware" message to appear in the Notification Area and
remove the USB drive.
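The checksum comparison described in the note under step 1 can be performed with any standard tool. As a minimal sketch, the following Python code computes the SHA-256 digest of a downloaded file and compares it with the value published on the support page. The function names are hypothetical, and the published checksum must be copied from the support page by the user.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file as a lowercase hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so that a large image file is not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def checksum_matches(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the checksum copied from the support page."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If checksum_matches returns False, download the package again before writing it to the USB drive.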
An X-Brick Storage Controller image is available for USB flash drives to restore a Storage
Controller to its original state.
Extract the image to a USB flash drive (refer to “Writing the XtremIO Rescue Image to a USB
Drive” on page 110).
Note: Before starting the procedure, verify that you have a KVM or keyboard and monitor
connected.
Before using a USB flash drive with a Storage Controller Rescue Image, validate that the
USB flash drive was successfully written with the correct Storage Controller Rescue Image
that is running on the cluster. For further details, refer to “Writing the XtremIO Rescue
Image to a USB Drive” on page 110.
Note: It is important to keep the affected Storage Controller isolated from the rest of
the XtremIO cluster, throughout the re-installation procedure.
3. Power-cycle the Storage Controller by unplugging and re-connecting its two power
cables.
4. When the Storage Controller has booted up, select Install XtremApp from the GRUB
menu.
5. Wait for the installation to complete and for the Storage Controller to reboot.
6. Remove the USB drive.
7. Reconnect the InfiniBand and SAS cables to the Storage Controller.
For cabling guidelines refer to XtremIO Storage Array Hardware Installation and
Upgrade Guide.
If the response shows alerts with the “repeating” text in the prefix, it is necessary to
clear the alert counters.
Note: Clearing alert counters clears all of the system’s alerts. In case of multiple alerts,
make a note of the components with repeated active alerts, prior to clearing alert
counters.
An XMS image is available for USB flash drives to install a physical XMS node.
Extract the image to a USB flash drive (refer to “Writing the XtremIO Rescue Image to a USB
Drive” on page 110) and connect the USB flash drive to the XMS USB port.
Note: Before starting the procedure, verify that you have a KVM or keyboard and monitor
connected.
Before using a USB flash drive with an XMS Rescue Image, validate that the USB flash
drive was successfully written with the correct XMS Rescue Image that is running on the
cluster. For further details, refer to “Writing the XtremIO Rescue Image to a USB Drive” on
page 110.
Note: If the Boot Device menu is not displayed, F6 was pressed too late. Go back to
step 1 and repeat the procedure.
Note: The menu option includes the USB drive’s brand name (e.g. "Eject Cruzer Blade"
appears when SanDisk Cruzer Blade USB drive is used).
7. When the server has booted up, select Install XMS from the GRUB menu.
8. Wait for the installation to complete and for the XMS to reboot.
9. Remove the USB drive.
APPENDIX B
Using LEDs to Identify Hardware Components
This section provides instructions for locating LEDs using CLI commands and the GUI.
This section includes the following topics:
Hardware Components’ LEDs
Using the GUI to Activate Identification LEDs
Using the CLI to Activate the Identification LEDs
Note: If the component’s identification LED is already turned on, a check sign appears
next to the Turn On Identification LED option and the message box that follows states
that the LED will be turned off.
3. In the Change All Other Identification LEDs dialog box, select the desired state of the
LEDs (On or Off) and click OK; LEDs of all components, except for the LED of the
component you want to identify, change their state.
control-led
The control-led command beacons the identification LED.
DAE X1-DAE
DAEController X1-DAE-LCC-A
LocalDisk X1-SC1-LocalDisk1
StorageController X1-SC1
SSD wwn-0x5000cca02b0555dc
Note: It is possible to have SC1, SC2 and/or LCC-A, LCC-B, etc. (per X-Brick).
show-leds
The show-leds command displays the values for the identification and status LEDs.
APPENDIX C
Priority Failure Analysis
Priority Failure Analysis (Priority FA) is required only for XtremIO FRUs involved in an
outage (DU/DL).
This section provides instructions for shipping failed hardware parts to Dell EMC for
Failure Analysis (FA).
When Failure Analysis is required, ship the failed parts to Dell EMC via FedEx.
APPENDIX D
Manually Replacing the Storage Controllers
This section provides procedures for manually replacing defective Storage Controllers
(without the use of the Technician Advisor utility).
This manual installation should only be performed in situations where the Technician
Advisor utility cannot be used.
The manual Storage Controller replacement procedure should be performed following a
Service Request (SR) determined by XtremIO Global Technical Support.
If RecoverPoint is connected to an XtremIO cluster, notify the customer to pause the
activity of Consistency Groups that are configured to replicate with the cluster, using
RecoverPoint native replication, during this FRU procedure.
If the customer requires assistance to pause in RecoverPoint, contact RecoverPoint Global
Technical Support.
If the customer is unable to perform this operation, do not perform this FRU procedure and
contact XtremIO Global Technical Support before taking any further action.
For further details, provide the customer with Dell EMC KB# 479972
(https://support.emc.com/kb/479972).
Before proceeding to replace the defective Storage Controller, contact XtremIO Global
Technical Support for guidance and directions. No further action should be taken without
explicit direction from XtremIO Global Technical Support.
Do not remove the defective Storage Controller until the new Storage Controller is
configured by XtremIO Global Technical Support and is ready to take over.
For further details on replacement Storage Controller P/Ns according to XtremIO cluster
model numbers and installed XtremIO versions, refer to the XtremIO Hardware
Compatibility Matrix on XtremIO SolVe (Solve Desktop > XtremIO Generator > XtremIO X1
(XIOS 2.x, 3.x, 4.x) > FRU Replacement Procedures > XtremIO Hardware Compatibility
Matrix).
Note: If there are two Storage Controllers adjacent to each other, first tilt the cable
management bracket's tray furthest from the component being replaced and then tilt
the tray of the other Storage Controller.
Note: Make sure that all cables are clearly labeled before disconnecting them from the
Storage Controllers. Do not proceed with the replacement procedure until all cables
that are connected to the Storage Controller are labeled.
Note: The disconnected cables can remain fastened to the cable management bracket
during the Storage Controller replacement procedure.
6. If required, release the cables from the cable tray of the cable management bracket
(mounted on the rear side of the Storage Controller) by releasing its cable straps.
7. Pull the tabs on both sides of the cable management bracket to release the bracket
from the Storage Controller’s inner rail.
8. Pull the cable management bracket out and remove it from the Storage Controller.
9. Remove the bezel that covers the front of the server as follows:
a. If the bezel is locked, unlock the bezel with the provided key.
b. Simultaneously press the tabs on both sides of the bezel to release it from its
latches, then pull the bezel off the component.
10. Remove the stabilizing screw behind the latch bracket on each side.
Note: A JIS screwdriver may be required if the rails are from an older version.
11. If a shipping bracket is installed directly above or below the server, remove it to
prevent damage to the foam padding.
12. Pull the server forward until it locks in place, then slide the blue disconnect tabs
forward to release the inner rails from the slide rails.
Execute the following procedure to install the new Storage Controller only when requested
by XtremIO Global Technical Support.
4. On the outside of each rail assembly, slide the blue disconnect tab forward to unlock
the server, and push the server completely into the cabinet.
5. If you removed a shipping bracket directly above or below the server, reinstall it.
6. To further secure the rail assembly and server in the cabinet, insert and tighten a small
stabilizer screw directly behind each bezel latch.
7. From the rear side of the Storage Controller, align the rails of the cable management
bracket with the server's inner rails.
8. Insert the rails of the cable management bracket onto the inner rails of the Storage
Controller.
9. Push to slide in the cable management bracket until an audible click is heard. This
indicates that the cable management bracket and the Storage Controller rails are
engaged and locked.
10. Tilt the cable tray down by simultaneously pulling both latches, on the left and right
sides of the cable management bracket, and then pushing the tray downwards.
Note: If there are two Storage Controllers adjacent to each other, first tilt the cable
management bracket's tray furthest from the component being replaced and then tilt
the tray of the other Storage Controller.
11. Connect the MGMT network cable to the Storage Controller’s " 1" port (leftmost
port), and connect the InfiniBand, SAS, LAN and COM cables.
Note: Leave the FC/iSCSI cables disconnected until you are instructed to connect
them.
Note: Make sure that the InfiniBand, SAS, LAN and COM cables are properly
connected, before connecting the two power cables to the Storage Controller, and
powering on the Storage Controller.
Note: If the cables are properly fastened to the cable management bracket, ignore steps 1
and 2, and proceed to step 3.
3. Lift the cable tray, while pulling the latches (on the left and right sides of the bracket)
until the latches click in.
Note: Make sure that the latches are engaged and the tray is locked in position.
The figure below shows an example of the installed cable management bracket, with
the cables strapped to the tray.
2. Pull the lever open and slide the disk drive assembly (B) from the server.
Note: Once all four disks have been removed, the Storage Controller can be shipped back
to Dell EMC.
Note: It is not always possible to perform Failure Analysis on Storage Controllers that
have been returned to Dell EMC without the Storage Controller’s disks.
APPENDIX E
Manually Replacing the SSDs
This section provides procedures for manually replacing defective SSDs (without the use
of the Technician Advisor utility).
This manual installation should only be performed in situations where the Technician
Advisor utility cannot be used.
The manual SSD replacement should only be performed with direction from XtremIO
Global Technical Support.
Note: Make sure to follow each step of the DAE SSD replacement procedure. Specifically,
do not forget to remove the defective SSD from the DAE, and do not reinsert it.
7. Check the state of the XDP Group to which the defective SSD belongs
(normal/degraded/double-degraded).
8. If XDP state is normal, proceed to “No Rebuild in Progress”. If XDP state is degraded
or double_degraded, proceed to “Rebuild in Progress” on page 144.
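The decision in step 8 can be summarized as a small lookup. The following Python sketch is illustrative only; the function name is hypothetical, and it accepts both the hyphenated and the underscored spelling of the double-degraded state because both appear in this procedure.

```python
def next_procedure(xdp_state: str) -> str:
    """Map the XDP Group state to the name of the procedure to follow next."""
    # Normalize case and treat "double-degraded" and "double_degraded" alike.
    state = xdp_state.strip().lower().replace("-", "_")
    if state == "normal":
        return "No Rebuild in Progress"
    if state in ("degraded", "double_degraded"):
        return "Rebuild in Progress"
    # Any other value is outside the states listed in step 7.
    raise ValueError(f"unrecognized XDP state: {xdp_state!r}")
```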
No Rebuild in Progress
Note: The defective SSD should be sent to Dell EMC for Failure Analysis (FA) if possible.
Refer to Appendix C for the procedure details.
To remove the defective SSD entry from the cluster database, using the CLI:
1. In the XMCLI, run the following command:
remove-ssd ssd-id="<ssd-id-name>"
cluster-id="<cluster-name>"
2. Once the system has received the command, you are prompted to confirm removing
this specific SSD.
3. Run the following command to verify that the defective SSD has been removed:
show-ssds cluster-id="<cluster name>"
To add a new SSD to the relevant cluster and X-Brick using the CLI:
1. Log in to the XMS CLI as tech.
2. List the slot status, using the following command:
show-slots cluster-id="<cluster-name>"
3. Locate the slot with an uninitialized_ssd state and note the slot number
(Slot-Num), the SSD-UID and index of the X-Brick for this slot.
4. Add the new SSD to the relevant cluster and X-Brick, using the following command:
add-ssd cluster-id="<cluster-name>" brick-id=<brick index>
ssd-uid="<SSD-UID>"
For example:
add-ssd cluster-id="xbrick141" brick-id=1
ssd-uid="wwn-0x5000cca050066ca8"
Note: If the SSD added is not a new SSD (out of the box) and was previously used in
this cluster, use the is-foreign-xtremapp-ssd flag.
For example:
add-ssd cluster-id="xbrick141" brick-id=1
ssd-uid="wwn-0x5000cca050066ca8" is-foreign-xtremapp-ssd
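To reduce typing errors when transferring the values noted in step 3, the add-ssd command line can be assembled from its parts. The following Python sketch only formats a string using the syntax shown in the examples above; the function and parameter names are hypothetical, and the result is a single-line command to be pasted into the XMCLI.

```python
def build_add_ssd_command(cluster_name: str, brick_index: int, ssd_uid: str,
                          previously_used: bool = False) -> str:
    """Format an add-ssd command line from the values noted in show-slots."""
    parts = [
        "add-ssd",
        f'cluster-id="{cluster_name}"',
        f"brick-id={brick_index}",
        f'ssd-uid="{ssd_uid}"',
    ]
    if previously_used:
        # Flag described in the note above for an SSD that is not new and was
        # previously used in this cluster.
        parts.append("is-foreign-xtremapp-ssd")
    return " ".join(parts)
```

Calling build_add_ssd_command("xbrick141", 1, "wwn-0x5000cca050066ca8") reproduces the first example above on one line.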
To assign the new SSD to the XDP Group, using the CLI:
1. Log in to the XMCLI as tech.
2. Assign the SSD to the XDP Group, using the following command:
assign-ssd dpg-id="<DPG-Name>" ssd-id="<SSD Name>"
cluster-id="<cluster name>"
For example:
assign-ssd dpg-id="X1-DPG" ssd-id="wwn-0x5000cca050066ca8"
cluster-id="xbrick141"
Note: The SSD name used in step 2 appears in the "Name" field of the show-ssds
output. The newly added SSD is shown with SSD-DPG-State "not_in_rg".
3. Use the following command to check if the integration process has completed:
show-ssds cluster-id="<cluster name>"
The SSD-DPG-State field shown on show-ssds changes for this new SSD, from
"not_in_rg" to "assigning_to_rg" during the integration process. Once the
process finishes, it changes to "in_rg".
4. You can also use the following command to check if the integration process has
completed:
show-data-protection-groups
• The "Preparation-Progress" field changes to 0 when integration has
completed.
• The "Useful-SSD-Space" field increases when integration has completed.
5. Verify that the cluster and modules are active, by running the following commands:
show-clusters
show-modules cluster-id="<cluster name>"
6. Generate and upload a log bundle (refer to “Post Replacement Procedures” on
page 147).
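The SSD-DPG-State progression described in step 3 ("not_in_rg", then "assigning_to_rg", then "in_rg") lends itself to a simple polling loop. The following Python sketch is a minimal illustration under stated assumptions: read_state is a hypothetical callable that returns the current SSD-DPG-State of the new SSD (for example, by reading it from show-ssds output), and any state outside the three listed in step 3 is treated as unexpected.

```python
import time
from typing import Callable

def wait_for_integration(read_state: Callable[[], str],
                         poll_seconds: float = 30.0,
                         max_polls: int = 120) -> bool:
    """Poll until the SSD-DPG-State reaches "in_rg".

    Returns True when integration completes, False if max_polls is exhausted.
    """
    for _ in range(max_polls):
        state = read_state()
        if state == "in_rg":
            return True
        if state not in ("not_in_rg", "assigning_to_rg"):
            # Assumption: only the three states named in step 3 are expected here.
            raise ValueError(f"unexpected SSD-DPG-State: {state}")
        time.sleep(poll_seconds)
    return False
```

The polling interval and the maximum number of polls are arbitrary illustrative defaults, not values taken from this procedure.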
Rebuild in Progress
Note: If any XDP Group is not in the normal state, wait until all XDP Groups return to normal.
4. For each SSD showing SSD-DPG-State "not_in_rg" , perform the steps in “No
Rebuild in Progress” on page 141.
Troubleshooting
APPENDIX F
Post Replacement Procedures
If the response shows alerts with the “repeating” text in the prefix, it is necessary to
clear the alert counters.
Note: Clearing alert counters clears all of the system’s alerts. In case of multiple alerts,
make a note of the components with repeated active alerts, prior to clearing alert
counters.
Note: It is recommended to use the cluster name (and not the cluster ID) as the cluster
identifier in cluster-related XMCLI commands.
Note: The cluster-id parameter is not mandatory for single cluster configurations.
3. Copy the link into a web browser and download the package.
modify-alert-definition alert-type="initiator_redundancy_state_non_redundant"
send-to-call-home="no"
For guidance on running the XtremIO Health-Check Script and on resolving its output, refer
to Dell EMC KB # 206076 (https://support.emc.com/kb/206076). If an unexpected error
is reported by the HCS, submit a standard Service Request to XtremIO Global Technical
Support.
To close the tunnel that was opened between the Storage Controller and the XMS:
Run the following CLI command:
modify-technician-port-tunnel cluster-id=<Cluster ID>
sc-id=<Storage Controller ID> close
APPENDIX G
Essential Pre-Customer-Visit Preparations for
Technician Advisor Utility Use
This section describes preparations for using the XtremIO Technician Advisor utility at a
customer's site prior to your arrival.
This section includes the following topics:
Checking the Network Ports with the Customer
Preparing a Replacement Battery Backup Unit