cluster.conf

Language: en

Version: 39164 (fedora - 16/08/07)

Section: 5 (File format)

NAME

cluster.conf - The configuration file for cluster products

DESCRIPTION

The cluster.conf file is located in the /etc/cluster directory. It is the source of information used by the cluster products, which access it indirectly through CCS (see ccs(7)). This file contains all the information needed for the cluster to operate, such as which nodes are in the cluster and how to I/O fence those nodes. Some of this information is generic and applies to all cluster infrastructures; the rest is specific to individual cluster products.

This man page describes the generic contents of the cluster.conf file. The product-specific sections of cluster.conf are left to their respective man pages. For example, after constructing the generic content, a user should look at the cman(5) man page for further information about the cman section of cluster.conf.

The cluster.conf file is an XML file. It has one encompassing section in which everything is contained. That section is named cluster and it has two mandatory attributes: name and config_version. The name attribute specifies the name of the cluster. It is important that this name is distinct from the names of any other clusters the user might set up. The config_version attribute is a number used to identify the revision level of the cluster.conf file. Given this information, your cluster.conf file might look something like:

<cluster name="alpha" config_version="1">

</cluster>
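
The two mandatory attributes can be checked mechanically. The following is a minimal sketch in Python, not part of the cluster tools; the function name read_cluster_header is hypothetical, and treating config_version as an integer is an assumption based on the description of it as a number.

```python
import xml.etree.ElementTree as ET

def read_cluster_header(xml_text):
    """Return (name, config_version) from the root <cluster> section.

    Hypothetical helper: raises ValueError if the root is not <cluster>
    or if one of the two mandatory attributes is missing.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "cluster":
        raise ValueError("root element must be <cluster>, got <%s>" % root.tag)
    missing = [a for a in ("name", "config_version") if a not in root.attrib]
    if missing:
        raise ValueError("missing mandatory attribute(s): " + ", ".join(missing))
    # Assumption: config_version is a plain integer revision number.
    return root.get("name"), int(root.get("config_version"))

SAMPLE = '<cluster name="alpha" config_version="1"></cluster>'
print(read_cluster_header(SAMPLE))
```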

You should specify a <cman/> tag even if no special cman parameters are needed for the cluster.

A mandatory subsection of cluster is fencedevices. It contains all of the I/O fencing devices at the disposal of the cluster. The I/O fencing devices are listed as entities designated as fencedevice and have attributes that describe the particular fencing device. For example:


  <fencedevices>
    <fencedevice name="apc" agent="fence_apc"
            ipaddr="apc_1" login="apc" passwd="apc"/>
  </fencedevices>

Concerning the fencedevice entity, the name and agent attributes must be specified for all I/O fence devices. The remaining attributes are device specific and are used to specify the necessary information to access the device. The name attribute must be unique and is used to reference the I/O fence device in other sections of the cluster.conf file. The agent attribute is used to specify the binary fence agent program used to communicate with the particular device. Your cluster.conf file might now look something like:

<cluster name="alpha" config_version="1">
  <cman/>
  <fencedevices>
    <fencedevice name="apc" agent="fence_apc"
            ipaddr="apc_1" login="apc" passwd="apc"/>
    <fencedevice name="brocade" agent="fence_brocade"
            ipaddr="brocade_1" login="bro" passwd="bro"/>
    <!-- The WTI fence device requires no login name -->
    <fencedevice name="wti" agent="fence_wti"
            ipaddr="wti_1" passwd="wti"/>
    <fencedevice name="last_resort" agent="fence_manual"/>
  </fencedevices>
</cluster>

The final mandatory subsection of cluster is clusternodes. It contains the individual specification of all the machines (members) in the cluster. Each machine has its own section, clusternode, which has the name attribute; this should be the name of the machine. Each machine should be given a unique node ID number with the optional nodeid attribute, for example nodeid="3". The clusternode section also contains the fence section. Not to be confused with fencedevices, the fence section is used to specify all the possible "methods" for fencing a particular machine, as well as the device used to perform each method and the machine-specific parameters it needs. For example, the clusternodes section may look as follows:


  <!-- This example only contains one machine -->
  <clusternodes>
    <clusternode name="nd01" nodeid="1">
      <fence>
        <!-- "power" method is tried before all others -->
        <method name="power">
          <device name="apc" switch="1" port="1"/>
        </method>
        <!-- If the "power" method fails,
             try fencing through the "fabric" -->
        <method name="fabric">
          <device name="brocade" port="1"/>
        </method>
        <!-- If all else fails,
             make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="nd01"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>  
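
Each device named inside a fence method refers, by name, to a fencedevice entry. As an illustration only, the sketch below lists device references that have no matching fencedevice. It is written in Python with the standard library; the function name undefined_fence_devices is hypothetical, and the sample deliberately misspells "brocade".

```python
import xml.etree.ElementTree as ET

def undefined_fence_devices(xml_text):
    """Return (node, method, device) triples whose device name
    does not match any <fencedevice name="..."> in the file."""
    root = ET.fromstring(xml_text)
    defined = {fd.get("name")
               for fd in root.findall("./fencedevices/fencedevice")}
    problems = []
    for node in root.findall("./clusternodes/clusternode"):
        for method in node.findall("./fence/method"):
            for dev in method.findall("device"):
                if dev.get("name") not in defined:
                    problems.append(
                        (node.get("name"), method.get("name"), dev.get("name")))
    return problems

# Sample with a deliberate typo: "brocad" instead of "brocade".
SAMPLE = """
<cluster name="alpha" config_version="1">
  <cman/>
  <clusternodes>
    <clusternode name="nd01" nodeid="1">
      <fence>
        <method name="power">
          <device name="apc" switch="1" port="1"/>
        </method>
        <method name="fabric">
          <device name="brocad" port="1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="apc" agent="fence_apc"
            ipaddr="apc_1" login="apc" passwd="apc"/>
    <fencedevice name="brocade" agent="fence_brocade"
            ipaddr="brocade_1" login="bro" passwd="bro"/>
  </fencedevices>
</cluster>
"""

for problem in undefined_fence_devices(SAMPLE):
    print("undefined fence device: node=%s method=%s device=%s" % problem)
```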

Putting it all together, a three-node cluster's cluster.conf file might look like:

<cluster name="example" config_version="1">
  <cman/>
  <clusternodes>
    <clusternode name="nd01" nodeid="1">
      <fence>
        <!-- "power" method is tried before all others -->
        <method name="power">
          <device name="apc" switch="1" port="1"/>
        </method>
        <!-- If the "power" method fails,
             try fencing through the "fabric" -->
        <method name="fabric">
          <device name="brocade" port="1"/>
        </method>
        <!-- If all else fails,
             make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="nd01"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="nd02" nodeid="2">
      <fence>
        <!-- "power" method is tried before all others -->
        <method name="power">
          <device name="apc" switch="1" port="2"/>
        </method>
        <!-- If the "power" method fails,
             try fencing through the "fabric" -->
        <method name="fabric">
          <device name="brocade" port="2"/>
        </method>
        <!-- If all else fails,
             make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="nd02"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="nd11" nodeid="3">
      <fence>
        <!-- "power" method is tried before all others -->
        <method name="power">
          <!-- This machine has 2 power supplies -->
          <device name="apc" switch="2" port="1"/>
          <device name="wti" port="1"/>
        </method>
        <!-- If the "power" method fails,
             try fencing through the "fabric" -->
        <method name="fabric">
          <device name="brocade" port="11"/>
        </method>
        <!-- If all else fails,
             make someone do it manually -->
        <method name="human">
          <device name="last_resort" ipaddr="nd11"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="apc" agent="fence_apc"
            ipaddr="apc_1" login="apc" passwd="apc"/>
    <fencedevice name="brocade" agent="fence_brocade"
            ipaddr="brocade_1" login="bro" passwd="bro"/>
    <!-- The WTI fence device requires no login name -->
    <fencedevice name="wti" agent="fence_wti"
            ipaddr="wti_1" passwd="wti"/>
    <fencedevice name="last_resort" agent="fence_manual"/>
  </fencedevices>
</cluster>
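
To review a configuration like this one, it can help to print each node's fence methods in the order they are listed. The sketch below does that in Python. The function name fence_plan is hypothetical, and the assumption that methods are attempted in the order they appear is inferred from the comments in the example, not from the fencing code. The sample holds only the nd11 node, trimmed to two methods.

```python
import xml.etree.ElementTree as ET

def fence_plan(xml_text):
    """Map each node name to its fence methods, in document order.

    Each method is a (method_name, [device_names]) pair.
    """
    root = ET.fromstring(xml_text)
    plan = {}
    for node in root.findall("./clusternodes/clusternode"):
        methods = []
        for method in node.findall("./fence/method"):
            devices = [dev.get("name") for dev in method.findall("device")]
            methods.append((method.get("name"), devices))
        plan[node.get("name")] = methods
    return plan

SAMPLE = """
<cluster name="example" config_version="1">
  <cman/>
  <clusternodes>
    <clusternode name="nd11" nodeid="3">
      <fence>
        <method name="power">
          <!-- This machine has 2 power supplies -->
          <device name="apc" switch="2" port="1"/>
          <device name="wti" port="1"/>
        </method>
        <method name="fabric">
          <device name="brocade" port="11"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>
"""

for name, methods in fence_plan(SAMPLE).items():
    for position, (method, devices) in enumerate(methods, start=1):
        print("%s: method %d is %s using %s"
              % (name, position, method, ", ".join(devices)))
```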

Special two-node cluster options:

Two-node clusters have special options in cluster.conf because they need to decide quorum between them without a majority of votes. These options are set as attributes of the <cman/> tag. For example:


  <cman two_node="1" expected_votes="1"/>
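
A quick consistency check for these options might look like the Python sketch below. The function name check_two_node is hypothetical. Pairing two_node="1" with expected_votes="1" follows the example above; the check that exactly two clusternode entries exist is an added assumption.

```python
import xml.etree.ElementTree as ET

def check_two_node(xml_text):
    """Return a list of warnings about the two-node options on <cman/>."""
    root = ET.fromstring(xml_text)
    cman = root.find("cman")
    attrs = {} if cman is None else cman.attrib
    node_count = len(root.findall("./clusternodes/clusternode"))
    warnings = []
    if attrs.get("two_node") == "1":
        if attrs.get("expected_votes") != "1":
            warnings.append('two_node="1" is set without expected_votes="1"')
        # Assumption: the two-node options only make sense with two nodes.
        if node_count != 2:
            warnings.append(
                'two_node="1" is set but %d clusternode entries were found'
                % node_count)
    return warnings

SAMPLE = """
<cluster name="alpha" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="nd01" nodeid="1"/>
    <clusternode name="nd02" nodeid="2"/>
  </clusternodes>
</cluster>
"""

print(check_two_node(SAMPLE) or "two-node options look consistent")
```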

Validating your cluster.conf file:

While cluster.conf files produced by the system-config-cluster GUI are very likely to be well-formed, it is convenient to have a way to validate legacy configuration files, or files that were written by hand in an editor. If you have the system-config-cluster GUI installed, you can validate a cluster.conf file with this command:

xmllint --relaxng /usr/share/system-config-cluster/misc/cluster.ng /etc/cluster/cluster.conf

If validation errors are detected in your conf file, start with the first error. Sometimes fixing the first error removes all the other error messages. Another good troubleshooting approach is to comment out sections of the conf file. For example, it is okay to have nothing beneath the <rm> tag. If you have services, failoverdomains and resources defined there, temporarily comment them all out and rerun xmllint to see if the problems go away. This may help you locate the problem. Errors that contain the string IDREF mean that an attribute value is supposed to appear in two places in the file, and that no other instance of the name string could be located. Finally, the most common problem with hand-edited cluster.conf files is spelling errors. Check your attribute and tag names carefully.
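
Before running the schema validation, a plain well-formedness check can catch misspelled closing tags. The Python sketch below reports only the first parse error, in line with the advice to start there. It is not a substitute for the xmllint --relaxng command, because it does not consult cluster.ng; the function name first_xml_error is hypothetical.

```python
import xml.etree.ElementTree as ET

def first_xml_error(xml_text):
    """Return (line, column, message) for the first well-formedness
    error, or None if the text parses cleanly."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None

# The closing tag on line 4 is misspelled: </fencedevice>
# instead of </fencedevices>.
BROKEN = """<cluster name="alpha" config_version="1">
  <fencedevices>
    <fencedevice name="last_resort" agent="fence_manual"/>
  </fencedevice>
</cluster>"""

error = first_xml_error(BROKEN)
if error is None:
    print("well-formed")
else:
    print("line %d, column %d: %s" % error)
```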

SEE ALSO

ccs(7), ccs_tool(8), cman(5)