Configuration Example - Fence Devices

Red Hat Enterprise Linux 5

Configuration Example - Fence Devices
Configuring Fence Devices in a Red Hat Cluster
Edition 1

Copyright © 2009 Red Hat, Inc. This material may only be distributed subject to the terms and
conditions set forth in the Open Publication License, V1.0 or later (the latest version of the OPL is
presently available at http://www.opencontent.org/openpub/).

Red Hat and the Red Hat "Shadow Man" logo are registered trademarks of Red Hat, Inc. in the United
States and other countries.

All other trademarks referenced herein are the property of their respective owners.

1801 Varsity Drive
Raleigh, NC 27606-2072 USA
Phone: +1 919 754 3700
Phone: 888 733 4281
Fax: +1 919 754 3701
PO Box 13588 Research Triangle Park, NC 27709 USA

This book describes a procedure for configuring fence devices in a Red Hat Cluster using Conga.
Introduction
    1. About This Guide
    2. Audience
    3. Software Versions
    4. Related Documentation
    5. Feedback
    6. Document Conventions
        6.1. Typographic Conventions
        6.2. Pull-quote Conventions
        6.3. Notes and Warnings
1. Configuring Fence Devices in a Red Hat Cluster
2. Configuring an APC Switch as a Fence Device
    2.1. APC Fence Device Prerequisite Configuration
    2.2. APC Fence Device Components to Configure
    2.3. APC Fence Device Configuration Procedure
    2.4. Cluster Configuration File with APC Fence Device
    2.5. Testing the APC Fence Device Configuration
3. Configuring IPMI Management Boards as Fencing Devices
    3.1. IPMI Fence Device Prerequisite Configuration
    3.2. IPMI Fence Device Components to Configure
    3.3. IPMI Fence Device Configuration Procedure
    3.4. Cluster Configuration File with IPMI Fence Device
    3.5. Testing the IPMI Fence Device Configuration
4. Troubleshooting
5. The GFS Withdraw Function
A. Revision History
Index

Introduction
1. About This Guide
This book describes procedures for configuring fence devices in a Red Hat Cluster using Conga.

2. Audience
This book is intended to be used by system administrators managing systems running the Linux
operating system. It requires familiarity with Red Hat Enterprise Linux 5 and Red Hat Cluster Suite.

3. Software Versions
Software  Description
RHEL5     refers to RHEL5 and higher
GFS       refers to GFS for RHEL5 and higher
Table 1. Software Versions

4. Related Documentation
For more information about using Red Hat Enterprise Linux, refer to the following resources:

• Red Hat Enterprise Linux Installation Guide — Provides information regarding installation of Red
Hat Enterprise Linux 5.

• Red Hat Enterprise Linux Deployment Guide — Provides information regarding the deployment,
configuration and administration of Red Hat Enterprise Linux 5.

For more information about Red Hat Cluster Suite for Red Hat Enterprise Linux 5, refer to the following
resources:

• Red Hat Cluster Suite Overview — Provides a high level overview of the Red Hat Cluster Suite.

• Configuring and Managing a Red Hat Cluster — Provides information about installing, configuring
and managing Red Hat Cluster components.

• LVM Administrator's Guide: Configuration and Administration — Provides a description of the
Logical Volume Manager (LVM), including information on running LVM in a clustered environment.

• Global File System: Configuration and Administration — Provides information about installing,
configuring, and maintaining Red Hat GFS (Red Hat Global File System).

• Global File System 2: Configuration and Administration — Provides information about installing,
configuring, and maintaining Red Hat GFS2 (Red Hat Global File System 2).

• Using Device-Mapper Multipath — Provides information about using the Device-Mapper Multipath
feature of Red Hat Enterprise Linux 5.

• Using GNBD with Global File System — Provides an overview on using Global Network Block
Device (GNBD) with Red Hat GFS.

• Linux Virtual Server Administration — Provides information on configuring high-performance
systems and services with the Linux Virtual Server (LVS).

• Red Hat Cluster Suite Release Notes — Provides information about the current release of Red Hat
Cluster Suite.

Red Hat Cluster Suite documentation and other Red Hat documents are available in HTML,
PDF, and RPM versions on the Red Hat Enterprise Linux Documentation CD and online at
http://www.redhat.com/docs/.

5. Feedback
If you spot a typo, or if you have thought of a way to make this manual better, we would love to
hear from you. Please submit a report in Bugzilla (http://bugzilla.redhat.com/bugzilla/) against the
component rh-cs.

Be sure to mention the manual's identifier:

Bugzilla component: Documentation-cluster
Book identifier: Cluster_Config_Example_Fence(EN)-5 (2009-06-18T15:20)

By mentioning this manual's identifier, we know exactly which version of the guide you have.

If you have a suggestion for improving the documentation, try to be as specific as possible. If you have
found an error, please include the section number and some of the surrounding text so we can find it
easily.

6. Document Conventions
This manual uses several conventions to highlight certain words and phrases and draw attention to
specific pieces of information.
In PDF and paper editions, this manual uses typefaces drawn from the Liberation Fonts [1] set. The
Liberation Fonts set is also used in HTML editions if the set is installed on your system. If not,
alternative but equivalent typefaces are displayed. Note: Red Hat Enterprise Linux 5 and later includes
the Liberation Fonts set by default.

6.1. Typographic Conventions


Four typographic conventions are used to call attention to specific words and phrases. These
conventions, and the circumstances they apply to, are as follows.

Mono-spaced Bold

Used to highlight system input, including shell commands, file names and paths. Also used to highlight
key caps and key-combinations. For example:

To see the contents of the file my_next_bestselling_novel in your current
working directory, enter the cat my_next_bestselling_novel command at the
shell prompt and press Enter to execute the command.

[1] https://fedorahosted.org/liberation-fonts/

The above includes a file name, a shell command and a key cap, all presented in Mono-spaced Bold
and all distinguishable thanks to context.

Key combinations can be distinguished from key caps by the plus sign connecting each part of a
key combination. For example:

Press Enter to execute the command.

Press Ctrl+Alt+F1 to switch to the first virtual terminal. Press Ctrl+Alt+F7 to
return to your X-Windows session.

The first sentence highlights the particular key cap to press. The second highlights two sets of three
key caps, each set pressed simultaneously.

If source code is discussed, class names, methods, functions, variable names and returned values
mentioned within a paragraph will be presented as above, in Mono-spaced Bold. For example:

File-related classes include filesystem for file systems, file for files, and dir for
directories. Each class has its own associated set of permissions.

Proportional Bold

This denotes words or phrases encountered on a system, including application names; dialogue
box text; labelled buttons; check-box and radio button labels; menu titles and sub-menu titles. For
example:

Choose System > Preferences > Mouse from the main menu bar to launch Mouse
Preferences. In the Buttons tab, click the Left-handed mouse check box and click
Close to switch the primary mouse button from the left to the right (making the mouse
suitable for use in the left hand).

To insert a special character into a gedit file, choose Applications > Accessories
> Character Map from the main menu bar. Next, choose Search > Find… from the
Character Map menu bar, type the name of the character in the Search field and
click Next. The character you sought will be highlighted in the Character Table.
Double-click this highlighted character to place it in the Text to copy field and then
click the Copy button. Now switch back to your document and choose Edit > Paste
from the gedit menu bar.

The above text includes application names; system-wide menu names and items; application-specific
menu names; and buttons and text found within a GUI interface, all presented in Proportional Bold and
all distinguishable by context.

Note the > shorthand used to indicate traversal through a menu and its sub-menus. This is to avoid
the difficult-to-follow 'Select Mouse from the Preferences sub-menu in the System menu of the main
menu bar' approach.

Mono-spaced Bold Italic or Proportional Bold Italic

Whether Mono-spaced Bold or Proportional Bold, the addition of Italics indicates replaceable or
variable text. Italics denotes text you do not input literally or displayed text that changes depending on
circumstance. For example:

To connect to a remote machine using ssh, type ssh username@domain.name at
a shell prompt. If the remote machine is example.com and your username on that
machine is john, type ssh john@example.com.

The mount -o remount file-system command remounts the named file
system. For example, to remount the /home file system, the command is mount -o
remount /home.

To see the version of a currently installed package, use the rpm -q package
command. It will return a result as follows: package-version-release.

Note the words in bold italics above — username, domain.name, file-system, package, version and
release. Each word is a placeholder, either for text you enter when issuing a command or for text
displayed by the system.

Aside from standard usage for presenting the title of a work, italics denotes the first use of a new and
important term. For example:

When the Apache HTTP Server accepts requests, it dispatches child processes
or threads to handle them. This group of child processes or threads is known as
a server-pool. Under Apache HTTP Server 2.0, the responsibility for creating and
maintaining these server-pools has been abstracted to a group of modules called
Multi-Processing Modules (MPMs). Unlike other modules, only one module from the
MPM group can be loaded by the Apache HTTP Server.

6.2. Pull-quote Conventions


Two, commonly multi-line, data types are set off visually from the surrounding text.

Output sent to a terminal is set in Mono-spaced Roman and presented thus:

books        Desktop   documentation  drafts  mss    photos   stuff  svn
books_tests  Desktop1  downloads      images  notes  scripts  svgs

Source-code listings are also set in Mono-spaced Roman but are presented and highlighted as
follows:

package org.jboss.book.jca.ex1;

import javax.naming.InitialContext;

public class ExClient
{
   public static void main(String args[])
       throws Exception
   {
      InitialContext iniCtx = new InitialContext();
      Object ref = iniCtx.lookup("EchoBean");
      EchoHome home = (EchoHome) ref;
      Echo echo = home.create();

      System.out.println("Created Echo");

      System.out.println("Echo.echo('Hello') = " + echo.echo("Hello"));
   }
}

6.3. Notes and Warnings


Finally, we use three visual styles to draw attention to information that might otherwise be overlooked.

Note
A note is a tip or shortcut or alternative approach to the task at hand. Ignoring a note
should have no negative consequences, but you might miss out on a trick that makes your
life easier.

Important
Important boxes detail things that are easily missed: configuration changes that only
apply to the current session, or services that need restarting before an update will apply.
Ignoring Important boxes won't cause data loss but may cause irritation and frustration.

Warning
A Warning should not be ignored. Ignoring warnings will most likely cause data loss.

Chapter 1. Configuring Fence Devices in a Red Hat Cluster
This document provides two configuration examples, showing the steps needed to configure fence
devices in a Red Hat cluster using the Conga configuration tool. For general information about fencing
and fence device configuration, see Configuring and Managing a Red Hat Cluster.

This document includes the procedures for configuring the following two fence devices:

• APC switch: single APC device to fence all cluster nodes

• IPMI management board: separate IPMI management boards for each cluster node

The remainder of this document is organized as follows:

• Chapter 2, Configuring an APC Switch as a Fence Device describes the procedure for configuring
an APC switch as a fence device in a Red Hat cluster.

• Chapter 3, Configuring IPMI Management Boards as Fencing Devices describes the procedure for
configuring an IPMI management board as a fence device in a Red Hat cluster.

• Chapter 4, Troubleshooting provides some guidelines to follow when your configuration does not
behave as expected.

• Chapter 5, The GFS Withdraw Function summarizes some general concerns to consider when
configuring fence devices in a Red Hat cluster.

Chapter 2. Configuring an APC Switch as a Fence Device
This chapter provides the procedures for configuring an APC switch as a fence device in a Red Hat
cluster using the Conga configuration tool.

Figure 2.1, “Using an APC Switch as a Fence Device” shows the configuration this procedure yields.
In this configuration, a three-node cluster uses an APC switch as the fencing device. Each node in the
cluster is connected to a port in the APC switch.

Figure 2.1. Using an APC Switch as a Fence Device

2.1. APC Fence Device Prerequisite Configuration


Table 2.1, “Configuration Prerequisites” summarizes the prerequisite components that have been set
up before this procedure begins.

Component     Name                      Comment
cluster       apcclust                  three-node cluster
cluster node  clusternode1.example.com  node in cluster apcclust configured with APC
                                        switch to administer power supply
cluster node  clusternode2.example.com  node in cluster apcclust configured with APC
                                        switch to administer power supply
cluster node  clusternode3.example.com  node in cluster apcclust configured with APC
                                        switch to administer power supply
IP address    10.15.86.96               IP address for the APC switch that controls the
                                        power for clusternode1.example.com,
                                        clusternode2.example.com, and
                                        clusternode3.example.com
login         apclogin                  login value for the APC switch that controls the
                                        power for clusternode1.example.com,
                                        clusternode2.example.com, and
                                        clusternode3.example.com
password      apcpword                  password for the APC switch that controls the
                                        power for clusternode1.example.com,
                                        clusternode2.example.com, and
                                        clusternode3.example.com
port          1                         port number on APC switch that
                                        clusternode1.example.com connects to
port          2                         port number on APC switch that
                                        clusternode2.example.com connects to
port          3                         port number on APC switch that
                                        clusternode3.example.com connects to
Table 2.1. Configuration Prerequisites

2.2. APC Fence Device Components to Configure


This procedure configures an APC switch as a fence device that will be used for each
node in cluster apcclust. Then the procedure configures that switch as the fencing
device for clusternode1.example.com, clusternode2.example.com, and
clusternode3.example.com.

Table 2.2, “Fence Device Components to Configure for APC Fence Device” summarizes
the components of the APC fence device that this procedure configures for cluster node
clusternode1.example.com.

Fence Device Component  Value             Description
Fencing Type            APC Power Switch  type of fencing device to configure
Name                    apcfence          name of the APC fencing device
IP address              10.15.86.96       IP address of the APC switch to configure as a
                                          fence device for clusternode1.example.com,
                                          clusternode2.example.com, and
                                          clusternode3.example.com
login                   apclogin          login value for the APC switch that controls the
                                          power for clusternode1.example.com,
                                          clusternode2.example.com, and
                                          clusternode3.example.com
password                apcpword          password for the APC switch that controls the
                                          power for clusternode1.example.com,
                                          clusternode2.example.com, and
                                          clusternode3.example.com
Table 2.2. Fence Device Components to Configure for APC Fence Device

Table 2.3, “Fence Agent Components to Specify for Each Node in apcclust” summarizes the
components of the APC fence device that you must specify for the cluster nodes in apcclust.

Fence Agent Component  Value     Description
fence device           apcfence  name of the APC fence device you defined as a shared device
port                   1         port number on the APC switch for clusternode1.example.com
port                   2         port number on the APC switch for clusternode2.example.com
port                   3         port number on the APC switch for clusternode3.example.com
Table 2.3. Fence Agent Components to Specify for Each Node in apcclust

The remainder of the fence device components that you configure for each node appear automatically
when you specify that you will be configuring the apcfence fence device that you previously defined
as a shared fence device.
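
As a rough illustration of how the values in Table 2.3 fit together, the following Python sketch builds a per-node fence entry from a node-to-port mapping. It is not part of Conga. The names APC_PORTS, build_fence_element, and fence_xml_for are hypothetical, and the element layout (a fence element containing a method named 1 that references the shared device by name and port) is assumed to match the cluster.conf format used in this example.

```python
import xml.etree.ElementTree as ET

# Port assignments taken from Table 2.3 (node name -> APC switch port).
APC_PORTS = {
    "clusternode1.example.com": "1",
    "clusternode2.example.com": "2",
    "clusternode3.example.com": "3",
}


def build_fence_element(device_name, port):
    """Return a <fence> element whose single method references a shared device.

    Hypothetical helper: the method name "1" is an assumption based on the
    cluster.conf layout used in this example.
    """
    fence = ET.Element("fence")
    method = ET.SubElement(fence, "method", name="1")
    ET.SubElement(method, "device", name=device_name, port=port)
    return fence


def fence_xml_for(node):
    """Serialize the fence entry for one node as a string."""
    element = build_fence_element("apcfence", APC_PORTS[node])
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    for node in sorted(APC_PORTS):
        print(node, fence_xml_for(node))
```

Only the port differs from node to node; the IP address, login, and password stay with the shared fence device, which is why Conga fills them in automatically.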

2.3. APC Fence Device Configuration Procedure


This section provides the procedure for adding an APC fence device to each node of cluster
apcclust. This example uses the same APC switch for each cluster node, which will first be
configured as a shared fence device. After configuring the APC switch as a shared fence device, the
device will be added as a fence device for each node in the cluster.

To configure an APC switch as a shared fence device using Conga, perform the following procedure:

1. As an administrator of luci, select the cluster tab. This displays the Choose a cluster to
administer screen.

2. From the Choose a cluster to administer screen, you should see the previously configured
cluster apcclust displayed, along with the nodes that make up the cluster. Click on apcclust to
select the cluster.

3. At the detailed menu for the cluster apcclust (below the clusters menu on the left side of the
screen), click Shared Fence Devices. Clicking Shared Fence Devices causes the display of any
shared fence devices previously configured for a cluster and causes the display of menu items for
fence device configuration: Add a Fence Device and Configure a Fence Device.

4. Click Add a Fence Device. Clicking Add a Fence Device causes the Add a Sharable Fence
Device page to be displayed.

5. At the Add a Sharable Fence Device page, click the drop-down box under Fencing Type and
select APC Power Switch. This causes Conga to display the components of an APC Power
Switch fencing type, as shown in Figure 2.2, “Adding a Sharable Fence Device”.

Figure 2.2. Adding a Sharable Fence Device

6. For Name, enter apcfence.

7. For IP Address, enter 10.15.86.96.

8. For Login, enter apclogin.

9. For Password, enter apcpword.

10. For Password Script, leave blank.

11. Click Add this shared fence device.

Clicking Add this shared fence device causes a progress page to be displayed temporarily. After
the fence device has been added, the detailed cluster properties menu is updated with the fence
device under Configure a Fence Device.

After configuring the APC switch as a shared fence device, use the following procedure to configure
the APC switch as the fence device for node clusternode1.example.com:

1. At the detailed menu for the cluster apcclust (below the clusters menu), click Nodes. Clicking
Nodes causes the display of the status of each node in apcclust.

2. At the bottom of the display for node clusternode1.example.com, click Manage Fencing for
this Node. This displays the configuration screen for node clusternode1.example.com.

3. At the Main Fencing Method display, click Add a fence device to this level. This causes a
dropdown menu to display.

4. From the dropdown menu, the apcfence fence device you have already created should display
as one of the menu options under Use an Existing Fence Device. Select apcfence (APC
Power Device). This causes a fence device configuration menu to display with the Name, IP
Address, Login, Password, and Password Script values already configured, as defined when
you configured apcfence as a shared fence device. This is shown in Figure 2.3, “Adding an
Existing Fence Device to a Node”.

Figure 2.3. Adding an Existing Fence Device to a Node

5. For Port, enter 1. Do not enter any value for Switch.

6. Click Update main fence properties. This causes a confirmation screen to be displayed.

7. On the confirmation screen, click OK. A progress page is displayed after which the display returns
to the status page for clusternode1.example.com in cluster apcclust.

After configuring apcfence as the fencing device for clusternode1.example.com, use the
same procedure to configure apcfence as the fencing device for clusternode2.example.com,
specifying Port 2 for clusternode2.example.com, as in the following procedure:

1. On the status page for clusternode1.example.com in cluster apcclust, the other nodes
in apcclust are displayed below the Configure menu item below the Nodes menu item on
the left side of the screen. Click clusternode2.example.com to display the status screen for
clusternode2.example.com.

2. At the Main Fencing Method display, click Add a fence device to this level. This causes a
dropdown menu to display.

3. As for clusternode1.example.com, the apcfence fence device should display as one of the
menu options on the dropdown menu, under Use an Existing Fence Device. Select apcfence
(APC Power Device). This causes a fence device configuration menu to display with the Name,
IP Address, Login, Password, and Password Script values already configured, as defined when you
configured apcfence as a shared fence device.

4. For Port, enter 2. Do not enter any value for Switch.

5. Click Update main fence properties.

Similarly, configure apcfence as the main fencing method for clusternode3.example.com,
specifying 3 as the Port number.

2.4. Cluster Configuration File with APC Fence Device


Configuring a cluster with Conga modifies the cluster configuration file. This section shows the cluster
configuration file before and after the procedures documented in Section 2.3, “APC Fence Device
Configuration Procedure” were performed.

Before the APC fence device was configured, the cluster.conf file appeared as
follows.

<?xml version="1.0"?>
<cluster alias="apcclust" config_version="12" name="apcclust">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
            <fence/>
        </clusternode>
        <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
            <fence/>
        </clusternode>
        <clusternode name="clusternode3.example.com" nodeid="3" votes="1">
            <fence>
                <method name="1"/>
            </fence>
        </clusternode>
    </clusternodes>
    <cman/>
    <fencedevices/>
    <rm>
        <failoverdomains/>
        <resources/>
    </rm>
</cluster>
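
A configuration like the one above can be checked for nodes that have no fence device yet. The following Python sketch is an illustration only, not a Red Hat tool: the function name unfenced_nodes and the abbreviated SAMPLE string are hypothetical, and the check simply assumes that a fenced node has at least one device element under fence/method.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the "before" configuration, for illustration only.
SAMPLE = """<cluster alias="apcclust" config_version="12" name="apcclust">
  <clusternodes>
    <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="clusternode3.example.com" nodeid="3" votes="1">
      <fence>
        <method name="1"/>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices/>
</cluster>"""


def unfenced_nodes(cluster_xml):
    """Return the names of cluster nodes that reference no fence device."""
    root = ET.fromstring(cluster_xml)
    missing = []
    for node in root.findall("clusternodes/clusternode"):
        # An empty <fence/> or an empty <method/> both count as unfenced.
        if node.find("fence/method/device") is None:
            missing.append(node.get("name"))
    return missing


if __name__ == "__main__":
    for name in unfenced_nodes(SAMPLE):
        print("no fence device configured for", name)
```

In this sample all three nodes are reported, including clusternode3.example.com, whose method element exists but contains no device.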

After the APC fence device was configured, the cluster.conf file appears as follows.

<?xml version="1.0"?>
<cluster alias="apcclust" config_version="19" name="apcclust">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="apcfence" port="1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="apcfence" port="2"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="clusternode3.example.com" nodeid="3" votes="1">
            <fence>
                <method name="1">
                    <device name="apcfence" port="3"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman/>
    <fencedevices>
        <fencedevice agent="fence_apc" ipaddr="10.15.86.96"
                     login="apclogin" name="apcfence" passwd="apcpword"/>
    </fencedevices>
    <rm>
        <failoverdomains/>
        <resources/>
    </rm>
</cluster>
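
To review the result without reading the XML by eye, a short script can list which fence device and port each node uses. The Python sketch below is a hedged illustration, not part of Red Hat Cluster Suite: fence_assignments and the abbreviated SAMPLE string are hypothetical names, and the script assumes the element layout shown in the configuration above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the "after" configuration, for illustration only.
SAMPLE = """<cluster alias="apcclust" config_version="19" name="apcclust">
  <clusternodes>
    <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
      <fence><method name="1"><device name="apcfence" port="1"/></method></fence>
    </clusternode>
    <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
      <fence><method name="1"><device name="apcfence" port="2"/></method></fence>
    </clusternode>
    <clusternode name="clusternode3.example.com" nodeid="3" votes="1">
      <fence><method name="1"><device name="apcfence" port="3"/></method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="10.15.86.96"
                 login="apclogin" name="apcfence" passwd="apcpword"/>
  </fencedevices>
</cluster>"""


def fence_assignments(cluster_xml):
    """Return (node, device name, agent, port) tuples for every fenced node."""
    root = ET.fromstring(cluster_xml)
    # Map each shared fence device name to the agent that drives it.
    agents = {
        dev.get("name"): dev.get("agent")
        for dev in root.findall("fencedevices/fencedevice")
    }
    rows = []
    for node in root.findall("clusternodes/clusternode"):
        for dev in node.findall("fence/method/device"):
            name = dev.get("name")
            rows.append((node.get("name"), name, agents.get(name), dev.get("port")))
    return rows


if __name__ == "__main__":
    for row in fence_assignments(SAMPLE):
        print("%s: device=%s agent=%s port=%s" % row)
```

An agent value of None in the output would indicate a node that references a device name with no matching fencedevice entry.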

2.5. Testing the APC Fence Device Configuration


To check whether the configuration you have defined works as expected, you can use the
fence_node program to fence a node manually. The fence_node program reads the fencing settings from
the cluster.conf file for the given node and then runs the configured fencing agent against the
node.

To test whether the APC switch has been successfully configured as a fence device for the three
nodes in cluster apcclust, execute the following commands and check whether the nodes have
been fenced.

# /sbin/fence_node clusternode1.example.com
# /sbin/fence_node clusternode2.example.com
# /sbin/fence_node clusternode3.example.com
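
If you prefer to script the test, the commands above can be generated from a list of node names. The following Python sketch is only an illustration under stated assumptions: the helper names fence_commands and run are hypothetical, and the script defaults to a dry run that prints each command without executing it, because fencing a node interrupts whatever is running on it. How to interpret the return codes collected in a real run is left to the fence_node documentation.

```python
import subprocess

# Node names used in this example cluster.
NODES = [
    "clusternode1.example.com",
    "clusternode2.example.com",
    "clusternode3.example.com",
]


def fence_commands(nodes, fence_node="/sbin/fence_node"):
    """Build one fence_node command line (as an argument list) per node."""
    return [[fence_node, node] for node in nodes]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns the list of return codes from the commands that were executed,
    which is empty for a dry run.
    """
    return_codes = []
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            return_codes.append(subprocess.run(command).returncode)
    return return_codes


if __name__ == "__main__":
    # Dry run by default: nothing is fenced until dry_run=False is passed.
    run(fence_commands(NODES))
```

Fencing one node at a time, and confirming that it has rejoined the cluster before moving on, avoids taking the whole cluster down during the test.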

Chapter 3. Configuring IPMI Management Boards as Fencing Devices
This chapter provides the procedures for configuring IPMI management boards as fencing devices in a
Red Hat cluster using the Conga configuration tool.

Figure 3.1, “Using IPMI Management Boards as Fence Devices” shows the configuration this
procedure yields. In this configuration, each node of a three-node cluster uses an IPMI management
board as its fencing device.

Note
Note that in this configuration each system has redundant power and is hooked into two
independent power sources. This ensures that the management board would still function
as needed in a cluster even if you lose power from one of the sources.

Figure 3.1. Using IPMI Management Boards as Fence Devices

3.1. IPMI Fence Device Prerequisite Configuration


Table 3.1, “Configuration Prerequisites” summarizes the prerequisite components that have been set up
before this procedure begins.

Component     Name                      Comment
cluster       ipmiclust                 three-node cluster
cluster node  clusternode1.example.com  node in cluster ipmiclust configured with IPMI
                                        management board and two power supplies
IP address    10.15.86.96               IP address for IPMI management board for
                                        clusternode1.example.com
login         ipmilogin                 login name for IPMI management board for
                                        clusternode1.example.com
password      ipmipword                 password for IPMI management board for
                                        clusternode1.example.com
cluster node  clusternode2.example.com  node in cluster ipmiclust configured with IPMI
                                        management board and two power supplies
IP address    10.15.86.97               IP address for IPMI management board for
                                        clusternode2.example.com
login         ipmilogin                 login name for IPMI management board for
                                        clusternode2.example.com
password      ipmipword                 password for IPMI management board for
                                        clusternode2.example.com
cluster node  clusternode3.example.com  node in cluster ipmiclust configured with IPMI
                                        management board and two power supplies
IP address    10.15.86.98               IP address for IPMI management board for
                                        clusternode3.example.com
login         ipmilogin                 login name for IPMI management board for
                                        clusternode3.example.com
password      ipmipword                 password for IPMI management board for
                                        clusternode3.example.com
Table 3.1. Configuration Prerequisites

3.2. IPMI Fence Device Components to Configure


This procedure configures the IPMI management board as a fence device for each node in cluster
ipmiclust.

Table 3.2, “Fence Agent Components to Configure for clusternode1.example.com” summarizes
the components of the IPMI fence device that this procedure configures for cluster node
clusternode1.example.com.

Fence Agent Component  Value        Description
Name                   ipmifence1   name of the IPMI fencing device
IP address             10.15.86.96  IP address of the IPMI management board to configure
                                    as a fence device for clusternode1.example.com
IPMI login             ipmilogin    login identity for the IPMI management board for
                                    clusternode1.example.com
password               ipmipword    password for the IPMI management board for
                                    clusternode1.example.com
authentication type    password     authentication type for the IPMI management board for
                                    clusternode1.example.com
Table 3.2. Fence Agent Components to Configure for clusternode1.example.com

Table 3.3, “Fence Agent Components to Configure for clusternode2.example.com” summarizes
the components of the IPMI fence device that this procedure configures for cluster node
clusternode2.example.com.

Fence Agent Component  Value        Description
Name                   ipmifence2   name of the IPMI fencing device
IP address             10.15.86.97  IP address of the IPMI management board to configure
                                    as a fence device for clusternode2.example.com
IPMI login             ipmilogin    login identity for the IPMI management board for
                                    clusternode2.example.com
password               ipmipword    password for the IPMI management board for
                                    clusternode2.example.com
authentication type    password     authentication type for the IPMI management board for
                                    clusternode2.example.com
Table 3.3. Fence Agent Components to Configure for clusternode2.example.com

Table 3.4, “Fence Agent Components to Configure for clusternode3.example.com” summarizes
the components of the IPMI fence device that this procedure configures for cluster node
clusternode3.example.com.

Fence Agent Component  Value        Description
Name                   ipmifence3   name of the IPMI fencing device
IP address             10.15.86.98  IP address of the IPMI management board to configure
                                    as a fence device for clusternode3.example.com
IPMI login             ipmilogin    login identity for the IPMI management board for
                                    clusternode3.example.com
password               ipmipword    password for the IPMI management board for
                                    clusternode3.example.com
authentication type    password     authentication type for the IPMI management board for
                                    clusternode3.example.com
Table 3.4. Fence Agent Components to Configure for clusternode3.example.com

3.3. IPMI Fence Device Configuration Procedure


This section provides the procedure for adding an IPMI fence device to each node of cluster
ipmiclust. Each node of ipmiclust is managed by its own IPMI management board.

Use the following procedure to configure the IPMI management board as the fence device for node
clusternode1.example.com using Conga:

1. As an administrator of luci, select the cluster tab. This displays the Choose a cluster to
administer screen.

2. From the Choose a cluster to administer screen, you should see the previously
configured cluster ipmiclust displayed, along with the nodes that make up the cluster.
Click on clusternode1.example.com. This displays the configuration screen for node
clusternode1.example.com.

3. At the Main Fencing Method display, click Add a fence device to this level. This causes a
dropdown menu to display.

4. From the dropdown menu, under Create a new Fence Device, select IPMI Lan. This displays a
fence device configuration menu, as shown in Figure 3.2, “Creating an IPMI Fence Device”.


Figure 3.2. Creating an IPMI Fence Device

5. For Name, enter ipmifence1.

6. For IP Address, enter 10.15.86.96.

7. For Login, enter ipmilogin.

8. For Password, enter ipmipword.

9. For Password Script, leave the field blank.

10. For Authentication type, enter password. This field specifies the IPMI authentication type.
Possible values for this field are none, password, md2, or md5.

11. Leave the Use Lanplus field blank. You would check this field if your fence device is a Lanplus-
capable interface such as iLO2.

12. Click Update main fence properties. This causes a confirmation screen to be displayed.

13. On the confirmation screen, click OK. After the fence device has been added, a
progress page is displayed after which the display returns to the configuration page for
clusternode1.example.com in cluster ipmiclust.


After configuring an IPMI fence device for clusternode1.example.com, use the following
procedure to configure an IPMI fence device for clusternode2.example.com.

1. From the configuration page for clusternode1.example.com, a menu appears on the left
of the screen for cluster ipmiclust. Select the node clusternode2.example.com. The
configuration page for clusternode2.example.com appears, with no fence device configured.

2. At the Main Fencing Method display, click Add a fence device to this level. This causes a
dropdown menu to display.

3. From the dropdown menu, under Create a new Fence Device, select IPMI Lan. This displays a
fence device configuration menu.

4. For Name, enter ipmifence2.

5. For IP Address, enter 10.15.86.97.

6. For Login, enter ipmilogin.

7. For Password, enter ipmipword.

8. For Password Script, leave the field blank.

9. For Authentication type, enter password. This field specifies the IPMI authentication type.
Possible values for this field are none, password, md2, or md5.

10. Leave the Use Lanplus field blank.

11. Click Update main fence properties. This causes a confirmation screen to be displayed.

12. On the confirmation screen, click OK. After the fence device has been added, a
progress page is displayed, after which the display returns to the configuration page for
clusternode2.example.com in cluster ipmiclust.

After configuring ipmifence2 as the fencing device for clusternode2.example.com,
select node clusternode3.example.com from the menu on the left side of the page and
configure an IPMI fence device for that node using the same procedure as you did to configure
the fence devices for clusternode1.example.com and clusternode2.example.com.
For clusternode3.example.com, use ipmifence3 as the name of the fence device and
10.15.86.98 as the IP address. Otherwise, use the same values for the fence device parameters.

3.4. Cluster Configuration File with IPMI Fence Device


Configuring a cluster with Conga modifies the cluster configuration file. This section shows the cluster
configuration file before and after the procedures documented in Section 3.3, “IPMI Fence Device
Configuration Procedure” were performed.

Before the fence devices were configured, the cluster.conf file appeared as
follows.

<?xml version="1.0"?>
<cluster alias="ipmiclust" config_version="12" name="ipmiclust">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="clusternode3.example.com" nodeid="3" votes="1">
      <fence>
        <method name="1"/>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices/>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

After the fence devices were configured, the cluster.conf file appears as follows.

<?xml version="1.0"?>
<cluster alias="ipmiclust" config_version="27" name="ipmiclust">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmifence1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="ipmifence2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="clusternode3.example.com" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="ipmifence3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="10.15.86.96"
                 login="ipmilogin" name="ipmifence1" passwd="ipmipword"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.15.86.97"
                 login="ipmilogin" name="ipmifence2" passwd="ipmipword"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.15.86.98"
                 login="ipmilogin" name="ipmifence3" passwd="ipmipword"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
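
In the configured file, each device entry under a node's fence method refers by name to a fencedevice entry in the fencedevices section. The following Python sketch illustrates one way to check that relationship. It is not part of the Red Hat tooling: the function name check_fence_references and the shortened SAMPLE_CONF text are assumptions made for this example, and the sample deliberately leaves one node without a fence device.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical cluster.conf used only for illustration.
# clusternode2 is intentionally left without a fence device.
SAMPLE_CONF = """<?xml version="1.0"?>
<cluster alias="ipmiclust" config_version="27" name="ipmiclust">
  <clusternodes>
    <clusternode name="clusternode1.example.com" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmifence1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="clusternode2.example.com" nodeid="2" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="10.15.86.96"
                 login="ipmilogin" name="ipmifence1" passwd="ipmipword"/>
  </fencedevices>
</cluster>
"""


def check_fence_references(conf_text):
    """Return {node name: [problems]} for nodes whose fencing looks incomplete."""
    root = ET.fromstring(conf_text)
    # Names of all fence devices defined in the <fencedevices> section.
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    problems = {}
    for node in root.iter("clusternode"):
        # Names of the devices this node's fence methods refer to.
        devices = [d.get("name") for d in node.iter("device")]
        issues = []
        if not devices:
            issues.append("no fence device configured")
        for name in devices:
            if name not in defined:
                issues.append("device %r is not defined in <fencedevices>" % name)
        if issues:
            problems[node.get("name")] = issues
    return problems


if __name__ == "__main__":
    for node, issues in check_fence_references(SAMPLE_CONF).items():
        print(node, issues)
```

An empty result means that every node refers only to fence devices that are defined in the file.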

3.5. Testing the IPMI Fence Device Configuration


To check whether the configuration you have defined works as expected, you can use the
fence_node command to fence a node manually. The fence_node program reads the fencing settings from
the cluster.conf file for the given node and then runs the configured fencing agent against the
node.

To test whether the IPMI management boards have been successfully configured as fence devices for
the three nodes in cluster ipmiclust, execute the following commands and check whether the nodes
have been fenced.

# /sbin/fence_node clusternode1.example.com
# /sbin/fence_node clusternode2.example.com
# /sbin/fence_node clusternode3.example.com
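
If you prefer to script this test, the following Python sketch wraps the same fence_node invocation. The helper names build_fence_command and fence_nodes are hypothetical, and the sketch assumes that fence_node returns an exit status of 0 when fencing succeeds. The runner argument exists only so that the logic can be exercised without actually fencing a node.

```python
import subprocess

FENCE_NODE = "/sbin/fence_node"  # path used in the commands in this section


def build_fence_command(node):
    """Return the argument list for fencing a single node."""
    return [FENCE_NODE, node]


def fence_nodes(nodes, runner=subprocess.call):
    """Run fence_node for each node in turn and return {node: exit status}.

    Assumption: an exit status of 0 means the node was fenced.
    """
    results = {}
    for node in nodes:
        results[node] = runner(build_fence_command(node))
    return results


if __name__ == "__main__":
    # Fence one node and report the result; run as root on a cluster member.
    for node, status in fence_nodes(["clusternode1.example.com"]).items():
        print(node, "fenced" if status == 0 else "failed (status %d)" % status)
```

Passing one node at a time keeps the remaining nodes available while you confirm that the fenced node was power-cycled.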

Chapter 4.

Troubleshooting
The following is a list of some problems you may see regarding the configuration of fence devices as
well as some suggestions for how to address these problems.

• If your system does not fence a node automatically, you can try to fence the node from the
command line using the fence_node command, as described at the end of each of the fencing
configuration procedures. The fence_node command performs I/O fencing on a single node by reading
the fencing settings from the cluster.conf file for the given node and then running the
configured fencing agent against the node. For example, the following command fences node
clusternode1.example.com:

# /sbin/fence_node clusternode1.example.com

If the fence_node command is unsuccessful, you may have made an error in defining the fence
device configuration. To determine whether the fencing agent itself is able to talk to the fencing
device, you can execute the I/O fencing command for your fence device directly from the command
line. As a first step, you can execute the command with the -o status option specified. For example, if you
are using an APC switch as a fence device, you can execute a command such as the following:

# /sbin/fence_apc -a (ipaddress) -l (login) ... -o status -v

You can also use the I/O fencing command for your device to fence the node. For example, for an
HP ILO device, you can issue the following command:

# /sbin/fence_ilo -a myilo -l login -p passwd -o off -v

• Check the version of firmware you are using in your fence device. You may want to consider
upgrading your firmware. You may also want to search Bugzilla to see if there are any issues
regarding your level of firmware.

• If a node in your cluster is repeatedly getting fenced, it means that one of the nodes in your cluster
is not seeing enough "heartbeat" network messages from the node that is getting fenced. Most of
the time, this is a result of flaky or faulty hardware, such as bad cables or bad ports on the network
hub or switch. Test your communications paths thoroughly without the cluster software running to
make sure your hardware is working correctly.

• If a node in your cluster is repeatedly getting fenced right at startup, it may be due to system
activities that occur when a node joins a cluster. If your network is busy, your cluster may decide
it is not getting enough heartbeat packets. To address this, you may have to increase the
post_join_delay setting in your cluster.conf file. This delay is a grace period that
gives the node more time to join the cluster.

In the following example, the fence_daemon entry in the cluster configuration file shows a
post_join_delay setting that has been increased to 600.


<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="600"/>

• If a node fails while the fenced daemon is not running, it will not be fenced. It will cause problems
if the fenced daemon is killed or exits while the node is using GFS. If the fenced daemon exits, it
should be restarted.
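
The post_join_delay adjustment described in the list above can be sketched in Python as follows. This is an illustration only: the function name set_post_join_delay is hypothetical, and the sketch assumes that the config_version attribute should be incremented whenever the file is changed by hand, in the same way that it increases between the before and after listings in this document. The updated file still has to reach every node in the cluster by whatever method you normally use.

```python
import xml.etree.ElementTree as ET


def set_post_join_delay(conf_text, delay):
    """Return cluster.conf text with a new post_join_delay value.

    Assumption: config_version is incremented by one for every manual
    change to the configuration file.
    """
    root = ET.fromstring(conf_text)
    daemon = root.find("fence_daemon")
    if daemon is None:
        raise ValueError("no <fence_daemon> element found")
    daemon.set("post_join_delay", str(delay))
    # Bump the version so the change can be told apart from the old file.
    root.set("config_version", str(int(root.get("config_version")) + 1))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    conf = (
        '<cluster alias="ipmiclust" config_version="27" name="ipmiclust">'
        '<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>'
        "</cluster>"
    )
    print(set_post_join_delay(conf, 600))
```

The function works on text and returns text, so you can review the result before writing it to a file.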

If you find that you are seeing error messages when you try to configure your system, or if after
configuration your system does not behave as expected, you can perform the following checks and
examine the following areas.

• Connect to one of the nodes in the cluster and execute the clustat(8) command. This command
runs a utility that displays the status of the cluster. It shows membership information, quorum view,
and the state of all configured user services.

The following example shows the output of the clustat(8) command.

[root@clusternode4 ~]# clustat
Cluster Status for nfsclust @ Wed Dec 3 12:37:22 2008
Member Status: Quorate

 Member Name                  ID   Status
 ------ ----                  ---- ------
 clusternode5.example.com     1    Online, rgmanager
 clusternode4.example.com     2    Online, Local, rgmanager
 clusternode3.example.com     3    Online, rgmanager
 clusternode2.example.com     4    Online, rgmanager
 clusternode1.example.com     5    Online, rgmanager

 Service Name       Owner (Last)                 State
 ------- ---        ----- ------                 -----
 service:nfssvc     clusternode2.example.com     starting

In this example, clusternode4 is the local node since it is the host from which the command was
run. If rgmanager did not appear in the Status category, it could indicate that cluster services are
not running on the node.

• Connect to one of the nodes in the cluster and execute the group_tool(8) command. This
command provides information that you may find helpful in debugging your system. The following
example shows the output of the group_tool(8) command.

[root@clusternode1 ~]# group_tool
type             level name       id       state
fence            0     default    00010005 none
[1 2 3 4 5]
dlm              1     clvmd      00020005 none
[1 2 3 4 5]
dlm              1     rgmanager  00030005 none
[3 4 5]
dlm              1     mygfs      007f0005 none
[5]
gfs              2     mygfs      007e0005 none
[5]

The state of the group should be none. The numbers in the brackets are the node ID numbers of
the cluster nodes in the group. The clustat command shows which node IDs are associated with which
nodes. If you do not see a node number in the group, that node is not a member of the group. For example,
if a node ID is not in the dlm/rgmanager group, the node is not using the rgmanager dlm lock space (and
probably is not running rgmanager).

The level of a group indicates the recovery ordering. 0 is recovered first, 1 is recovered second, and
so forth.

• Connect to one of the nodes in the cluster and execute the cman_tool nodes -f command. This
command provides information about the cluster nodes that you may want to look at. The following
example shows the output of the cman_tool nodes -f command.

[root@clusternode1 ~]# cman_tool nodes -f
Node  Sts   Inc   Joined               Name
   1   M    752   2008-10-27 11:17:15  clusternode5.example.com
   2   M    752   2008-10-27 11:17:15  clusternode4.example.com
   3   M    760   2008-12-03 11:28:44  clusternode3.example.com
   4   M    756   2008-12-03 11:28:26  clusternode2.example.com
   5   M    744   2008-10-27 11:17:15  clusternode1.example.com

The Sts heading indicates the status of a node. A status of M indicates the node is a member of
the cluster. A status of X indicates that the node is dead. The Inc heading indicates the incarnation
number of a node, which is for debugging purposes only.

• Check whether the cluster.conf file is identical on each node of the cluster. If you configure your
system with Conga, as in the example provided in this document, these files should be identical, but
one of the files may have accidentally been deleted or altered.
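
The membership checks in the list above can be combined in a short script. The following Python sketch parses output in the format shown for cman_tool nodes -f and group_tool and reports, for each group, which cluster members are not in it. The function names and the embedded sample text are assumptions made for this example, and the parsing relies only on the column layout shown in this chapter.

```python
# Sample output copied from the examples in this chapter.
CMAN_NODES = """\
Node  Sts   Inc   Joined               Name
   1   M    752   2008-10-27 11:17:15  clusternode5.example.com
   2   M    752   2008-10-27 11:17:15  clusternode4.example.com
   3   M    760   2008-12-03 11:28:44  clusternode3.example.com
   4   M    756   2008-12-03 11:28:26  clusternode2.example.com
   5   M    744   2008-10-27 11:17:15  clusternode1.example.com
"""

GROUP_TOOL = """\
type             level name       id       state
fence            0     default    00010005 none
[1 2 3 4 5]
dlm              1     clvmd      00020005 none
[1 2 3 4 5]
dlm              1     rgmanager  00030005 none
[3 4 5]
dlm              1     mygfs      007f0005 none
[5]
gfs              2     mygfs      007e0005 none
[5]
"""


def parse_cman_nodes(text):
    """Map node ID -> (status, name) from cman_tool nodes -f output."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric node ID; anything else is skipped.
        if len(fields) >= 6 and fields[0].isdigit():
            nodes[int(fields[0])] = (fields[1], fields[-1])
    return nodes


def parse_group_tool(text):
    """Return a list of groups with their type, name, state, and member IDs."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("type"):
            continue  # blank line or column header
        if line.startswith("["):
            # A bracketed line lists the member IDs of the preceding group.
            if groups:
                groups[-1]["members"] = [int(n) for n in line.strip("[]").split()]
        else:
            fields = line.split()
            if len(fields) == 5:
                groups.append({"type": fields[0], "name": fields[2],
                               "state": fields[4], "members": []})
    return groups


def missing_members(nodes, groups):
    """For each group, list the cluster members (status M) that are not in it."""
    report = {}
    for group in groups:
        absent = sorted(name for nid, (sts, name) in nodes.items()
                        if sts == "M" and nid not in group["members"])
        report["%s/%s" % (group["type"], group["name"])] = absent
    return report


if __name__ == "__main__":
    nodes = parse_cman_nodes(CMAN_NODES)
    for group, absent in missing_members(nodes, parse_group_tool(GROUP_TOOL)).items():
        print(group, "missing:", ", ".join(absent) or "none")
```

A node that is missing from the fence/default group deserves attention first, since that group has the lowest recovery level.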

Chapter 5.

The GFS Withdraw Function


When a node cannot talk to the rest of the cluster through its normal heartbeat packets, it will be
fenced by another node. If a GFS file system detects corruption due to an operation it has just
performed, it will withdraw itself. The GFS withdraw function is intended to be less severe than a
kernel panic. It means that the node has determined it can no longer operate safely on that file system because
one of its assumptions has proved to be wrong. Instead of panicking the kernel, it gives you an
opportunity to reboot the node.

Appendix A. Revision History
Revision 1.0 Thu Jun 17 2009

