"Ramses" Diskless Cluster

Picture of Cluster
With the relatively low cost of high-powered PCs, interconnecting a series of machines is not a new idea.  Beowulf clusters and clusters of workstations have been used extensively for many years.  However, maintaining these clusters has always been an issue, especially as the number of nodes increases.  In a truly diskless cluster, maintenance is restricted to a single machine, the master node.

The "Ramses" cluster was a truly diskless setup.  It was modelled after Arthur Weaver's Sirius Cluster located at Cornell University in Ithaca, New York, with modifications in software and in hardware.

With the exception of the master node, every machine was simply a box with a motherboard and the minimum hardware needed to boot the BIOS.  All data was stored on the master node, so failures of the diskless clients did not compromise data. Indeed, the entire cluster configuration can be restored easily from a single bootable hard disk of the master node that we saved.

Hardware Design

Our cluster consisted of 8 dual-processor AMD Athlon 1900MP machines. The master node was configured with:

The 7 diskless nodes were all identical and consisted of:

All interconnected by:

Hardware was purchased from Colfax International. They provided prompt service and we highly recommend them.

Software

All machines ran RedHat Linux 7.2 with a 2.4.18 kernel that was built and optimized for our setup. We used LAM-MPI for distributed computing and the Ganglia Web monitoring tool to obtain real-time cluster usage statistics.
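As an illustration of how a LAM-MPI setup of this size might be described, the following minimal sketch generates a LAM boot schema (hostfile) for one master and 7 diskless clients with 2 CPUs each. The hostnames `master` and `node1` to `node7` are hypothetical; the actual Ramses node names are not given here.

```python
def lam_boot_schema(master="master", n_clients=7, cpus_per_node=2):
    """Return the text of a LAM/MPI boot schema, one line per node.

    Hostnames are hypothetical placeholders, not the real Ramses names.
    """
    hosts = [master] + [f"node{i}" for i in range(1, n_clients + 1)]
    # Each line names a host and the number of CPUs LAM may use on it.
    return "\n".join(f"{host} cpu={cpus_per_node}" for host in hosts) + "\n"


if __name__ == "__main__":
    print(lam_boot_schema(), end="")
```

A file with this content would typically be passed to `lamboot` before starting jobs with `mpirun`; the exact invocation used on Ramses is not documented here.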

Client machines booted remotely using the 3Com network boot (already in the network card BIOS), used TFTP to obtain a copy of an optimized diskless kernel, and mounted all their file systems from the master node.
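To make the boot sequence above more concrete, here is a rough sketch that generates per-client host entries in the style of an ISC dhcpd configuration, so that each client learns its address, the TFTP server, the boot file, and a root path on the master. Everything specific in it is an assumption for illustration: the text does not say which DHCP/BOOTP server was used, and the IP addresses, MAC addresses, boot file name `vmlinuz.diskless`, and `/tftpboot` root paths are all hypothetical.

```python
def dhcp_host_stanzas(n_clients=7, server_ip="192.168.1.1",
                      boot_file="vmlinuz.diskless"):
    """Return dhcpd-style host stanzas for the diskless clients.

    All addresses, MACs, and file names are hypothetical placeholders.
    """
    stanzas = []
    for i in range(1, n_clients + 1):
        ip = f"192.168.1.{100 + i}"          # hypothetical client address
        mac = f"02:00:00:00:00:{i:02x}"      # hypothetical NIC address
        stanzas.append(
            f"host node{i} {{\n"
            f"  hardware ethernet {mac};\n"
            f"  fixed-address {ip};\n"
            f"  next-server {server_ip};\n"   # TFTP server (the master node)
            f'  filename "{boot_file}";\n'    # image fetched over TFTP
            f'  option root-path "{server_ip}:/tftpboot/{ip}";\n'
            f"}}\n"
        )
    return "\n".join(stanzas)


if __name__ == "__main__":
    print(dhcp_host_stanzas(), end="")
```

Generating the entries from a loop keeps the 7 identical clients consistent, which fits the goal of confining maintenance to the master node.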

Should you wish to design your own cluster, we have compiled a step-by-step install recipe, which can be found on our Cluster Setup Page. You are also encouraged to look at the original SIRIUS Setup Page by Arthur Weaver.

There is also the former Job Submission Guide, which refers to software once installed on Ramses. The old configuration and data are still accessible because we disconnected and saved one of the RAID disks. To boot the old Ramses master node, simply open the Ramses case and switch the IDE cable and power connector to the second hard disk.
