Before you start, you should have a reasonably strong command of Linux. Furthermore, we take no responsibility for any loss of data, damage, and so on; you get the idea. The following guide should be fairly complete, but we may have overlooked something or made a typo here or there. Comments and suggestions are always appreciated.
This work was initially done by our group member Essam Metwally in 2001/2002, after extensive reading and examination of other diskless cluster setups on the net, in particular that of Arthur Weaver of the Cornell SIRIUS: MacChess Cluster. Although his installation recipe did not work perfectly for us, many of the steps were adapted from his example. We have simply refined things for our own particular configuration and fixed certain steps.
# rpm -Fvh glibc*.rpm
# tar -zxvf linux-2.4.18.tar.gz
# mv linux /usr/src/linux-2.4.18
Check whether there is a symbolic link named linux (usually there is) pointing to whatever version of the kernel is currently installed. If so, remove it and point it to the new source by entering:
# cd /usr/src
# rm linux
# ln -s linux-2.4.18 linux
Time to customize:
# cd linux
# make xconfig
We won't go too much into customization other than some essentials, but customize as you see fit. Load RamsesServer.conf and alter as necessary. In particular, check the network device, which is currently set to 3Com Vortex. Save and exit.
# make dep
# make clean
# make -j 16
Assuming no errors,
# make modules
# make install
# make modules_install
If you are using the lilo boot manager, edit /etc/lilo.conf and make sure that it includes the new kernel. Then run:
# /sbin/lilo -v
If you are using grub, edit /boot/grub/grub.conf and add the kernel as appropriate. Since grub is located in the boot partition, nothing more need be done. Restart and boot with your freshly created kernel:
# shutdown -r now
Create a custom kernel for the diskless nodes. This is a very stripped-down version because, really, all you have are processor(s), memory, a motherboard, and maybe a card or two. Feel free to use RamsesCluster.conf, but again check that the selections are appropriate for your system, paying particular attention to the network configuration.
# cd /usr/src/linux (OR WHEREVER YOU LEFT IT)
# make xconfig
Load RamsesCluster.conf.
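Once the configuration has been saved, a quick check of the resulting .config can catch a missing option before the build. The sketch below is not part of the original recipe: the function name check_node_config and the option list (IP autoconfiguration plus NFS root, as a 2.4 kernel with its root on NFS would typically need) are assumptions, and the driver option for your network card still has to be checked by hand.

```shell
#!/bin/sh
# check_node_config FILE: report options that are not built in ("=y").
# The option list is an assumption for an NFS-root node, not the authors' list.
check_node_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_IP_PNP CONFIG_NFS_FS CONFIG_ROOT_NFS; do
        # A module ("=m") is not enough for anything needed to mount the root.
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "not built in: $opt"
            missing=1
        fi
    done
    return $missing
}

# Example: check_node_config /usr/src/linux/.config
```

The function returns non-zero when anything is missing, so it can gate the build, for example check_node_config .config && make dep.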
# make clean
# make dep
# make -j 16 bzImage
Make a network-bootable image of the client kernel using the tagging utility mknbi, available at http://etherboot.sourceforge.net/. We recommend version 1.0.6. Supposedly there is an incompatibility between newer versions and imggen, a utility we use later on, but we did not experience this, so experiment with it. As far as we know, imggen is only necessary for 3COM cards, so if you don't use 3COM, don't worry about it and get the latest version of mknbi.
# rpm -ivh mknbi-1.0.6.noarch.rpm
# cd /tftpboot
# mknbi-linux --output=/tftpboot/vmlinuz-2.4.18-cluster \
    --ipaddrs=rom \
    --rootdir=/ \
    --append="ramdisk_size=1024" \
    /usr/src/linux/arch/i386/boot/bzImage
If you are using a 3COM card, you need to get imggen from the LTSP contributions webpage. The file is called imggen_v1.01.tgz.
# tar -zxvf imggen_v1.01.tgz
# chmod 755 imggen
# mv imggen /sbin
# cd /tftpboot
# /sbin/imggen -a vmlinuz-2.4.18-cluster vmlinuz-2.4.18-cluster-imggen
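Since the tagging steps have to be repeated every time the node kernel is rebuilt, they can be wrapped in a small script. This is only a sketch using the same paths and flags as the commands above; the function name tag_kernel and the RUN variable are made up for illustration. By default it only prints the commands, so nothing is executed until RUN is set to an empty string.

```shell
#!/bin/sh
# RUN defaults to "echo" (dry run); set RUN="" to really execute the commands.
RUN=${RUN-echo}

# tag_kernel VERSION: tag the freshly built bzImage for network booting.
tag_kernel() {
    kver=$1
    image=/tftpboot/vmlinuz-$kver-cluster
    $RUN mknbi-linux --output="$image" --ipaddrs=rom --rootdir=/ \
        --append="ramdisk_size=1024" /usr/src/linux/arch/i386/boot/bzImage
    # Only needed for 3COM cards; drop this line otherwise.
    $RUN /sbin/imggen -a "$image" "$image-imggen"
}

tag_kernel 2.4.18
```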
In the DHCP server configuration below, add a host definition like the one for ramses2 for each diskless machine in your cluster.
default-lease-time 21600;
max-lease-time 21600;
option subnet-mask 255.255.255.0;
option broadcast-address 192.168.0.255;
option routers 192.168.0.1;
option root-path "/";
option domain-name-servers 192.168.0.1;
option domain-name "";
shared-network CLUSTER {
    subnet 192.168.0.0 netmask 255.255.255.0 {
    }
}
group {
    use-host-decl-names on;
    option log-servers 192.168.0.1;
    host ramses2 {
        hardware ethernet XX:XX:XX:XX:XX:XX;
        fixed-address 192.168.0.2;
        filename "vmlinuz-2.4.18-cluster-imggen";
    }
    ...
}
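With more than a handful of nodes, typing the host definitions by hand gets tedious. The following sketch generates them from a list of "ID MAC" pairs. It assumes the nodes are named ramsesID with address 192.168.0.ID, continuing the pattern of the ramses2 entry; the function names and the input format are made up for this example.

```shell
#!/bin/sh
# host_entry ID MAC: print one dhcpd host block for node "ramsesID"
# with address 192.168.0.ID, modelled on the ramses2 entry.
host_entry() {
    cat <<EOF
host ramses$1 {
    hardware ethernet $2;
    fixed-address 192.168.0.$1;
    filename "vmlinuz-2.4.18-cluster-imggen";
}
EOF
}

# gen_hosts: read "ID MAC" pairs from standard input, one per line,
# and print a host block for each. Blank lines are skipped.
gen_hosts() {
    while read -r id mac; do
        if [ -n "$id" ]; then
            host_entry "$id" "$mac"
        fi
    done
}

# Example: gen_hosts < nodes.txt    (nodes.txt is a hypothetical file)
```

The output is meant to be pasted inside the group { } block, after reviewing it.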
# /etc/rc.d/init.d/dhcpd restart
# /etc/rc.d/init.d/xinetd restart
# tar -xvf ClusterNFS-3.0-rc1.tar
# cd ClusterNFS-3.0-rc1
# ./BUILD
# make -j 16 install
# echo "/ 192.168.0.0/255.255.255.0(rw,no_root_squash)" >> /etc/exports
# echo "/tftpboot/ 192.168.0.0/255.255.255.0(rw,no_root_squash)" >> /etc/exports
If the file doesn't exist, change the first ">>" to ">".
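Running the echo commands a second time would add duplicate lines to /etc/exports. A minimal sketch of a helper that appends a line only when it is not already there is shown below; the name add_export is made up, and the optional second argument exists only so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# add_export LINE [FILE]: append LINE to FILE (default /etc/exports)
# unless an identical line is already present. Creates FILE if needed.
add_export() {
    line=$1
    file=${2:-/etc/exports}
    touch "$file"
    # -x: match the whole line, -F: treat LINE as a fixed string.
    if ! grep -qxF -- "$line" "$file"; then
        echo "$line" >> "$file"
    fi
}

# Example:
#   add_export "/ 192.168.0.0/255.255.255.0(rw,no_root_squash)"
#   add_export "/tftpboot/ 192.168.0.0/255.255.255.0(rw,no_root_squash)"
```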
# /sbin/chkconfig --level 2345 atd off
# /sbin/chkconfig --level 2345 autofs off
# /sbin/chkconfig --level 2345 apmd off
# /sbin/chkconfig --level 2345 ipchains off
# /sbin/chkconfig --level 2345 sendmail off
# /sbin/chkconfig --level 2345 linuxconf on
# /sbin/chkconfig --level 2345 dhcpd on
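The same service changes can be written as two lists, which is easier to adjust if your distribution ships a different set of services. This sketch only prints the chkconfig commands (the same ones as above) so they can be reviewed; the function name chkconfig_cmds is made up for this example.

```shell
#!/bin/sh
# chkconfig_cmds STATE SERVICE...: print one chkconfig command per service,
# using the same runlevels (2345) as the commands above.
chkconfig_cmds() {
    state=$1
    shift
    for svc in "$@"; do
        echo "/sbin/chkconfig --level 2345 $svc $state"
    done
}

chkconfig_cmds off atd autofs apmd ipchains sendmail
chkconfig_cmds on linuxconf dhcpd
```

To apply the changes, pipe the output of the script to sh as root.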
# make_clusternfs_client CLIENTID
Replace CLIENTID with the client number, 2 through 254.