Quick and easy ICR cluster building using Moab Cluster Builder

March 25th, 2009 10:20 am
Posted by Isaac Hailperin

Our first Transtec ICR recipe was built on the cluster DVD provided by Cluster Resources. It is basically an enhanced SLES 10 SP2 DVD, to which Cluster Resources has added various extra packages to make it an easy-to-use cluster installation tool.

In principle, you boot your head node off the DVD. The steps needed are hardly more than those required for a default SLES install. After the installation you log in as root, and the Moab Cluster Builder (MCB) pops up. It automatically installs and configures Torque on the compute nodes and the head node. Thus, after this deployment wizard has finished, your cluster is ready to use.

In order to make such a cluster ICR compliant, a few things need to be changed. First of all, a few packages should be added to the head node:
gcc
gcc-c++
termcap
termcap-32bit
libacl-32bit
libattr-32bit
bzip2-32bit
libcap-32bit
libelf
libelf-32bit
gdbm-32bit

The compute nodes also need additional packages:
gcc
gcc-c++
termcap
termcap-32bit
Mesa
Mesa-32bit
openmotif-libs
openmotif-libs-32bit
libacl
libacl-32bit
libattr
libattr-32bit
bzip2
bzip2-32bit
libcap
libcap-32bit
libelf
libelf-32bit
gdbm
gdbm-32bit
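
To see which of the packages listed above are still missing on a given node, a small loop around rpm -q is enough. This is only a sketch: the function name missing_packages is made up for this example, and it has to be run on the node you want to check.

```shell
#!/bin/sh
# Print every package name from the argument list that rpm does not
# report as installed. missing_packages is a hypothetical helper name.
missing_packages() {
    for pkg in "$@"; do
        # rpm -q exits non-zero when the package is not installed
        rpm -q "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}
```

Calling it as "missing_packages gcc gcc-c++ termcap termcap-32bit" prints nothing when all four packages are installed.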

The Moab Cluster Builder uses AutoYaST under the hood, so you can use the AutoYaST mechanism to add these packages to the ComputeNode profile, which is already prepared by default. I found the AutoYaST GUI to be a bit buggy, so after a couple of errors caused by the GUI, I decided to use my favorite text editor to change the respective XML file directly.
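
If you edit the profile by hand, typing one XML element per package gets tedious. The following sketch prints the entries for you. It assumes the profile uses the usual AutoYaST layout, where package names sit as <package> elements inside <packages config:type="list"> in the <software> section. The function name autoyast_packages is made up for this example, and where the ComputeNode profile lives on your head node is something you will have to look up yourself.

```shell
#!/bin/sh
# Print one AutoYaST <package> element per package name given as argument.
# The output is meant to be pasted into the <packages config:type="list">
# element of the profile's <software> section (assumed standard layout).
# autoyast_packages is a hypothetical helper name.
autoyast_packages() {
    for pkg in "$@"; do
        printf '      <package>%s</package>\n' "$pkg"
    done
}

# A few of the compute node packages as an example
autoyast_packages gcc gcc-c++ termcap termcap-32bit Mesa Mesa-32bit
```

The indentation is only cosmetic. Adjust it to match the file you are pasting into.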

The cluster checker also requires Java. If you install the SLES version of Java, you will break the Moab Cluster Suite, which ships with its own, newer version of Java. I found it easiest to download the latest Java package from the Sun website and install it on the head and compute nodes.

Intel MPI uses a multi-purpose daemon (mpd). It expects a file called $HOME/.mpd.conf that is readable and writable only by its owner. I created it like this:
$echo secretword=lindyhop >$HOME/.mpd.conf
$chmod 600 $HOME/.mpd.conf
Note that you should do this only on the head node, as /home is exported via NFS.
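
With the two commands above, the file is briefly readable by others between the echo and the chmod. If that bothers you, the following sketch creates it under a restrictive umask instead. The function name write_mpd_conf is made up for this example, and you pass the target file and the secret word as arguments.

```shell
#!/bin/sh
# Write an mpd.conf-style file that is never readable by other users.
# Arguments: target file, secret word (both chosen by the caller).
# write_mpd_conf is a hypothetical helper name.
write_mpd_conf() {
    conf=$1
    secret=$2
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077 && printf 'secretword=%s\n' "$secret" > "$conf" ) || return 1
    # umask only affects newly created files, so fix up an existing one too.
    chmod 600 "$conf"
}
```

On the head node you would call it as "write_mpd_conf $HOME/.mpd.conf lindyhop", with your own secret word in place of lindyhop.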

For ICR, X11 tools are expected to be located in /usr/bin. However, SLES puts them in /usr/X11R6/bin. We can work around this with soft links:

#on the head node
$cd /usr/bin/
$for i in /usr/X11R6/bin/*;do ln -s $i;done
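
The one-liner prints an error for every name that already exists in /usr/bin. That is harmless, but if you want cleaner output, here is a slightly more careful sketch that skips existing names and copes with an empty source directory. The function name link_tools is made up for this example. Pass the source directory as an absolute path, otherwise the links will not resolve.

```shell
#!/bin/sh
# Create a symbolic link in directory $2 for every entry in directory $1,
# skipping names that already exist in $2.
# link_tools is a hypothetical helper name; $1 should be an absolute path.
link_tools() {
    src=$1
    dst=$2
    for tool in "$src"/*; do
        # If the directory is empty the glob stays unexpanded: skip it.
        [ -e "$tool" ] || continue
        name=$(basename "$tool")
        if [ -e "$dst/$name" ] || [ -L "$dst/$name" ]; then
            echo "skipping $name: already present in $dst" >&2
            continue
        fi
        ln -s "$tool" "$dst/$name"
    done
}
```

For the case described here the call would be "link_tools /usr/X11R6/bin /usr/bin", run as root.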

There seems to be a SLES-specific issue with the hostnames. The nodes contain weird names in /etc/HOSTNAME. So, fix this either in the AutoYaST profile, or by executing
$echo $HOSTNAME.<domainname> >/etc/HOSTNAME
on each host.
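
If $HOSTNAME on a node already carries some domain part, the command above would append a second one. The following sketch guards against that by cutting the name at the first dot before adding the domain. The function name fqdn_for is made up for this example, and it assumes the part before the first dot is the host name you want to keep.

```shell
#!/bin/sh
# Print "<short host name>.<domain>" for a host name ($1) and a domain ($2).
# Anything after the first dot in $1 is dropped before the domain is added.
# fqdn_for is a hypothetical helper name.
fqdn_for() {
    short=${1%%.*}
    printf '%s.%s\n' "$short" "$2"
}
```

On each host you would then redirect the output to /etc/HOSTNAME, for example "fqdn_for $HOSTNAME <domainname> >/etc/HOSTNAME".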

So that's basically it; you should now be ready to run the cluster checker.

Author Info


Isaac studied physics at the Free University in Berlin, Germany. He graduated in 2008 with a thesis in theoretical high energy physics. Using one of Germany's fastest supercomputers, he calculated eigenvalues of large matrices to study discretization errors in lattice gauge theories. Since October of 2008, Isaac has been a cluster engineer with Transtec AG. His main occupation currently covers cluster deployment methods. He is responsible for the technical issues of the Transtec ICR Program.