Note: Although you can easily log into the master node with your current user account, you will not be able to reach the other nodes in the cluster until you set up SSH keys as follows (when ssh-keygen asks for a passphrase, leave it empty so that logins do not prompt for a password):
cd ~
mkdir .ssh
cd .ssh
ssh-keygen -t dsa
cp id_dsa identity
cp id_dsa.pub authorized_keys2
Once done, you must ssh to each of the machines ramses2 through ramses8 and accept the host identification prompt so that their host keys are stored in your known_hosts file.
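If you would rather not log in to each node by hand, a short loop will visit every node once (a sketch, assuming the nodes are named ramses2 through ramses8 as in the table further down; answer yes to each host-key prompt the first time):
for i in 2 3 4 5 6 7 8; do ssh ramses$i hostname; done
Logging in to a single node by hand works just as well, for example: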
ssh ramses6
That's all there is to it. Be sure to check the Ganglia Monitor and select a machine which shows the lowest load. Then log in to that machine, start your job in the background, and log out:
<run your job> &
exit
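For example, a minimal sketch with a hypothetical program my_sim (nohup and the output redirection simply keep the job and its output alive after you log out):
ssh ramses6
nohup ./my_sim input.dat > my_sim.log 2>&1 &
exit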
To run MPI programs under LAM/MPI, first copy the boot schema (the list of cluster nodes) to your home directory and boot LAM:
cp /scratch/template/cluster ~/.
lamboot ~/cluster
mpirun c<processor number> <program> <program arguments>
where c<processor number> can be a list, e.g. c0-4,10 would specify processors 0, 1, 2, 3, 4, and 10. Note that this is for actual MPI software; more information can be found just by running mpirun without any options.
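For example, a sketch of launching a hypothetical MPI program on processors 0 through 4 and 10:
mpirun c0-4,10 ./my_mpi_program input.dat
Here my_mpi_program and input.dat stand in for your own MPI binary and its arguments.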
Copy the old scripts /scratch/template/bin/sub*; they will cycle through all processors, distributing your tasks to each processor on the cluster. You may use them as follows:
submit <job script>
These are very simplistic scripts that allow up to 6 command line parameters. They do not check the status of the processors; they merely rotate through them. We therefore recommend checking the state of the cluster on the Ganglia Monitor, where you can see the current user load on any given machine and submit to a machine accordingly. To submit to a specific machine, use one of the specific sub scripts as follows:
sub<processor number>
where <processor number> is:
Node       Processor Number
ramses     0-1
ramses2    2-3
ramses3    4-5
ramses4    6-7
ramses5    8-9
ramses6    10-11
ramses7    12-13
ramses8    14-15
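For example, a sketch of the two styles of submission (myjob.sh is a placeholder for your own job script, and the sub scripts are assumed to take the same arguments as submit):
submit myjob.sh
sub4 myjob.sh
The first line lets the script rotate through the processors; the second sends the job to processor 4, which according to the table above lives on ramses3.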
A final note: it appears that the submit protocols leave some open file handles. This currently seems to be a problem only with MPI. To fix this, if you are not running any processes you can issue:
lamhalt
lamreboot (This is a script in /scratch/template/bin - copy it to your own directory)
lamboot ~/cluster
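Before issuing lamhalt above, it is worth confirming that none of your processes are still running on the nodes; a minimal sketch using standard tools (replace ramses6 with the node in question):
ssh ramses6 ps -u $USER
If anything beyond your login shell and the ps command itself shows up, let it finish or kill it before restarting LAM.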