Note: Although you can easily log into the master node with your current user account, you will not be able to reach the other nodes in the cluster until you set up SSH keys as follows (when ssh-keygen asks for a passphrase, leave it empty so that logins do not prompt for a password):
cd ~
mkdir .ssh
cd .ssh
ssh-keygen -t dsa
cp id_dsa identity
cp id_dsa.pub authorized_keys2
Once done, you must ssh to each of the machines ramses2 through ramses8 and accept the host identification prompt so that their host keys are stored in your known_hosts file.
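If you would rather not log in to each node by hand, a short loop will visit every node once (a sketch, assuming the nodes are named ramses2 through ramses8 as in the table further down; answer yes to each host-key prompt the first time):
for i in 2 3 4 5 6 7 8; do ssh ramses$i hostname; done
Logging in to a single node by hand works just as well, for example: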
ssh ramses6
That's all there is to it. Be sure to check the Ganglia Monitor and select a machine which shows the lowest load. Then log in to that machine, start your job in the background, and log out:
<run your job> &
exit
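For example, a minimal sketch with a hypothetical program my_sim (nohup and the output redirection simply keep the job and its output alive after you log out):
ssh ramses6
nohup ./my_sim input.dat > my_sim.log 2>&1 &
exit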
To run MPI programs under LAM/MPI, first copy the boot schema (the list of cluster nodes) to your home directory and boot LAM:
cp /scratch/template/cluster ~/.
lamboot ~/cluster
mpirun c<processor number> <program> <program arguments>
where c<processor number> can be a list, e.g. c0-4,10 would specify processors 0, 1, 2, 3, 4, and 10. Note that this is for actual MPI software; more information can be found just by running mpirun without any options.
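For example, a sketch of launching a hypothetical MPI program on processors 0 through 4 and 10:
mpirun c0-4,10 ./my_mpi_program input.dat
Here my_mpi_program and input.dat stand in for your own MPI binary and its arguments.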
Copy the old scripts /scratch/template/bin/sub*; they will cycle through all processors, distributing your tasks to each processor on the cluster. You may use them as follows:
submit <job script>
These are very simplistic scripts that allow up to 6 command line parameters. They do not check the status of the processors; they merely rotate through them. We therefore recommend checking the state of the cluster on the Ganglia Monitor, where you can see the current user load on any given machine and submit to a machine accordingly. To submit to a specific machine, use one of the specific sub scripts as follows:
sub<processor number>
where <processor number> is:
Node       Processor Number
ramses     0-1
ramses2    2-3
ramses3    4-5
ramses4    6-7
ramses5    8-9
ramses6    10-11
ramses7    12-13
ramses8    14-15
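For example, a sketch of the two styles of submission (myjob.sh is a placeholder for your own job script, and the sub scripts are assumed to take the same arguments as submit):
submit myjob.sh
sub4 myjob.sh
The first line lets the script rotate through the processors; the second sends the job to processor 4, which according to the table above lives on ramses3.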
A final note: it appears that the submit protocols leave some open file handles. This currently seems to be a problem only with MPI. To fix this, if you are not running any processes you can issue:
lamhalt
lamreboot (This is a script in /scratch/template/bin - copy it to your own directory)
lamboot ~/cluster
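Before issuing lamhalt above, it is worth confirming that none of your processes are still running on the nodes; a minimal sketch using standard tools (replace ramses6 with the node in question):
ssh ramses6 ps -u $USER
If anything beyond your login shell and the ps command itself shows up, let it finish or kill it before restarting LAM.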