Wednesday, May 17, 2006

Live CD Clustering Using ParallelKnoppix

If you’ve run across Beowulf or another cluster implementation, but thought that assembling your own cluster was either too complicated or too resource-intensive, cheer up! Given five minutes, a specialized, live Linux distribution called ParallelKnoppix, and a handful of ordinary personal computers, you too can build your very own mini-mini-mini-supercomputer.
ParallelKnoppix, a remaster of the Knoppix (http://www.knoppix.org/) live CD distribution, lets you build a parallel processing cluster from off-the-shelf desktops, laptops, and servers, using the LAM-MPI or MPICH implementations of the Message Passing Interface (MPI), or PVM. Moreover, because ParallelKnoppix is a live CD, you can convert a room full of machines, even those running Windows, into a Linux cluster without affecting the natively installed operating system. Getting a cluster up and running takes about five minutes if all of your machines have PXE-capable network cards. Clusters of two to 200 machines are supported.
Download, Burn, Boot
The first thing to do is download the ParallelKnoppix ISO image from http://pareto.uab.es/mcreel/ParallelKnoppix/ and burn one CD for each computer you’d like to include in the cluster. Next, boot one of the machines you’ll be using with the CD. (Keep in mind that you’ll need at least one partition on this machine that Linux can mount read/write. If the machine in question only has NTFS partitions, you can use a USB drive formatted as FAT32 to gain the needed space.) The machine should follow the normal Knoppix boot sequence.
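Before burning, it can be worth confirming that the ISO downloaded intact. The sketch below assumes the download site publishes a checksum alongside the image, which this article does not confirm. The function names and the MD5 default are illustrative choices, not part of ParallelKnoppix.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a large ISO never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare the computed digest with a published one (assumed to exist)."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

To use it, call `matches_published()` with the path to the downloaded ISO and the checksum string copied from wherever the image was published. A result of `False` suggests downloading the image again before burning.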
Once the machine is booted, select ParallelKnoppix -> Setup ParallelKnoppix from the KDE menu to start the configuration script. Once in the configuration script, click OK to start the Terminal Server. The next dialog box will ask you how many nodes will be in the cluster, including the master node you’re using at the moment. Next, you’ll be asked to select all of the network drivers needed for the cluster. To simplify things, ensure that each slave machine is set up to PXE boot. While it’s possible to work around this, it complicates the setup and is beyond the scope of this article.
The next screen gives you a couple of cluster options. Keep the default of textmode and do not check the secure box. (See the sidebar “ParallelKnoppix Precautions.”) Next, provide additional boot options, if any. (You can normally leave this blank.)
ParallelKnoppix Precautions
ParallelKnoppix is an extremely insecure distribution. It is not intended for desktop or server use; instead, ParallelKnoppix is designed to be easy to use in an environment that can be restored quickly if disaster strikes.
It’s highly recommended that you run ParallelKnoppix and your entire cluster on a dedicated network that is disconnected from the Internet.
You’re now ready to start the terminal server, and this is the point where you’ll need a partition that can be mounted read/write. Select the partition you’d like to use and click OK. A working directory named parallel_knoppix_working is created and exported over NFS. Anything you want to be accessible to the cluster should be placed in this directory.
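Placing files in the working directory can be done from a file manager or a shell, but a small script makes it repeatable. The sketch below is one way to do it, assuming a recent Python 3 is available on the master node, which this article does not confirm. The function name is hypothetical, and the path to parallel_knoppix_working depends on which partition you selected, so it is passed in rather than guessed.

```python
import os
import shutil


def stage_for_cluster(source_dir, working_dir):
    """Copy `source_dir` into `working_dir` and return the new location.

    `working_dir` is assumed to be the NFS-exported parallel_knoppix_working
    directory; its mount point depends on the partition chosen during setup.
    """
    if not os.path.isdir(working_dir):
        raise FileNotFoundError(f"working directory not found: {working_dir}")
    if not os.access(working_dir, os.W_OK):
        raise PermissionError(f"working directory is not writable: {working_dir}")
    # Keep the source directory's own name under the working directory.
    name = os.path.basename(os.path.normpath(source_dir))
    target = os.path.join(working_dir, name)
    # copytree raises FileExistsError if the target is already present,
    # so an earlier copy is never silently overwritten.
    shutil.copytree(source_dir, target)
    return target
```

Because the directory is shared over NFS, whatever is copied there on the master should become visible to the other nodes once they have mounted it.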
Slaving Away
Now it’s time to boot each of your slaves. Once you’re sure all of the slave node machines are booted, click OK to have them mount the working directory. You should now have a working Linux cluster.
The ParallelKnoppix ISO has some example cluster applications in /home/knoppix/Desktop/ParallelKnoppix/Examples. To run one of them, copy the entire subdirectory (for example, /Octave/) into your working directory. From there, each example should have a README that explains how to run the program on the cluster. One great thing about being a Knoppix derivative is the fact that you can further remaster ParallelKnoppix to suit your needs, which could include adding your own custom application and data.
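If you go on to write your own application, a common pattern in MPI programs is for each process to work on its own share of the data. The sketch below shows only that splitting arithmetic in plain Python, with no MPI calls, so it is not taken from the bundled examples. The function name is hypothetical. In a real MPI program, `rank` and `size` would come from the MPI library (in C, `MPI_Comm_rank` and `MPI_Comm_size`).

```python
def block_range(rank, size, n_items):
    """Return (start, stop) for the slice of `n_items` handled by one process.

    `rank` is this process's index (0-based) and `size` is the total number
    of processes. Items are split as evenly as possible; when the division
    is not exact, the lowest-numbered ranks each take one extra item.
    """
    if size <= 0 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    if n_items < 0:
        raise ValueError("n_items must not be negative")
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Show how 10 work items would be divided among a 4-node cluster.
    for node in range(4):
        print(node, block_range(node, 4, 10))
```

Each process would then loop over `range(start, stop)` and send its partial result back to the master, so adding nodes shrinks every node's share.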
With this article and a couple of ParallelKnoppix discs, you should be able to have a Linux cluster up and running in no time. This is a great way to get your feet wet with clustering or to prototype your next custom clustering application. Just don’t forget about the inherent insecurities in this setup. Have fun and enjoy the rocket science.
