Saturday, May 06, 2006

Fast Compilation

 

Faster and Faster Compilation


Perl and Python may be popular scripting languages, but a great deal of software, including the Linux kernel and Samba, among many others, is still written in C and C++. Accordingly, a wide variety of tools is available to boost C/C++ programmer productivity. This month, let's explore ccache and distcc, two C/C++ tools that take different approaches to saving time. Both tools were written by members of the Samba team and are licensed under the GNU General Public License.

Written by Andrew Tridgell and available from http://ccache.samba.org, ccache is a compiler cache. It acts as a caching preprocessor for C/C++ compilers: it runs the compiler with the -E switch to obtain the preprocessed source, hashes the result, and uses that hash to detect when a compilation can be satisfied from the cache. Incorporating ccache into your builds often yields a five- to ten-fold speedup on repeated compilations. You'll gain the most from ccache if you're continually rebuilding the same source tree (via make clean && make) or if you perform a lot of RPM rebuilds. ccache produces exactly the same output that the real compiler produces, including the same object files and the same compiler warnings. The only difference is that ccache is faster.
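
To make the caching idea concrete, here is a toy sketch of a hash-keyed compile cache. It is not ccache's implementation: the function names are made up for illustration, sha256sum is a stand-in hash chosen for this sketch, and the "cache" only records that a key was seen instead of storing an object file.

```shell
# Toy compile cache: key = hash(compiler flags + preprocessed source).
# Hypothetical helper names; sha256sum is a stand-in, not ccache's own hash.
cache_dir=$(mktemp -d)

cache_key() {
  # $1 = file holding preprocessed source (what the -E switch would print)
  # remaining arguments = compiler flags, which also change the output
  src=$1; shift
  { printf '%s\n' "$*"; cat "$src"; } | sha256sum | cut -d' ' -f1
}

cached_compile() {
  src=$1; shift
  key=$(cache_key "$src" "$@")
  if [ -f "$cache_dir/$key" ]; then
    echo hit     # a real cache would copy the stored object file out here
  else
    : > "$cache_dir/$key"   # a real cache would run the compiler and store its output
    echo miss
  fi
}
```

Calling cached_compile twice with the same file and flags reports a miss and then a hit; changing either the flags or the file contents produces a new key and therefore another miss.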

Installing ccache is the typical ./configure && make && make install. Once installed, there are two ways to use ccache. First, you can prefix your compile commands with ccache. For example, change the CC=gcc line in your Makefile to CC=ccache gcc. Use this method if you'd like to test ccache or if you only plan to use it for some projects.
Alternatively, you can create symbolic links to ccache named after your compilers (gcc, g++, and so on), which allows you to use ccache without any changes to your build system. Make sure that the directory containing the symlinks appears in your PATH before the directory containing the actual compiler.
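
The symlink approach can be sketched as a small helper. The function name and the directory used in the example comment are assumptions for illustration; only ln, mkdir, and PATH are standard.

```shell
# Create a directory of compiler-named symlinks that all point at one target.
# Usage: link_compilers DIR TARGET NAME...
# (link_compilers is a hypothetical helper, not part of ccache.)
link_compilers() {
  dir=$1; target=$2; shift 2
  mkdir -p "$dir" || return 1
  for name in "$@"; do
    ln -sf "$target" "$dir/$name" || return 1
  done
}

# Example (paths are assumptions; adjust them to your system):
#   link_compilers "$HOME/bin/ccache-links" "$(command -v ccache)" gcc g++ cc c++
#   export PATH="$HOME/bin/ccache-links:$PATH"
```

Because the link directory is placed first in PATH, a plain gcc invocation finds the symlink before the real compiler.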

While ccache uses caching to speed up compilation, distcc achieves its speed increase by distributing builds across several machines on a network. Like ccache, distcc always generates the same results as a local build. Written by Martin Pool, distcc is available from http://distcc.samba.org. distcc works by sending each job's preprocessed source code across the network. It doesn't do any of the actual compilation itself; it's just a front end for gcc that takes advantage of the -j parallel build feature of make. Compilation is driven by a client machine, which runs distcc, make, the preprocessor, the linker, and other stages of the build process. Compile jobs are then distributed to any number of machines running the distccd daemon. One nice thing about distcc is that it scales nearly linearly, at least for a small number of machines, so you do not need a lot of hardware to see a benefit.
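
The division of labor described above can be reproduced by hand on a single machine. This sketch assumes a C compiler is installed as cc and uses a throwaway source file; it does not invoke distcc at all, it only shows the three stages, of which the middle one is the part distcc would hand to a remote distccd.

```shell
# Run the build stages separately to show which one can be farmed out.
# Assumes a C compiler is available as "cc"; file names are made up.
workdir=$(mktemp -d)
cat > "$workdir/stage.c" <<'EOF'
#define ANSWER 42
int main(void) { return ANSWER == 42 ? 0 : 1; }
EOF

if command -v cc >/dev/null 2>&1; then
  # 1. Preprocess (stays on the client): macros and includes are expanded.
  cc -E "$workdir/stage.c" -o "$workdir/stage.i" &&
  # 2. Compile the preprocessed source (the step a remote machine could do).
  cc -c "$workdir/stage.i" -o "$workdir/stage.o" &&
  # 3. Link (stays on the client).
  cc "$workdir/stage.o" -o "$workdir/stage"
fi
```

Since the preprocessed file no longer depends on local headers or macros, the remote machine needs only a compatible compiler, not a copy of the source tree.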

Installation of distcc is also the normal ./configure && make && make install. Install distcc on every machine that you want to distribute compilation jobs to. After installation, run distccd on each machine as follows:

$ distccd --daemon --allow 192.168.1.0/24

Replace 192.168.1.0/24 with the network address and CIDR mask of the machines that should be allowed to connect. You're now ready to distribute compiles. First, add the names of the machines you'd like to harness to the DISTCC_HOSTS environment variable:

$ export DISTCC_HOSTS="localhost dev1 dev2 dev3"

Always put the machines in order from fastest to slowest. If you're using a large number of machines, consider omitting "localhost" from the list so that the client can focus on preprocessing. You can now build over the distributed system using the following command:

$ make -j8 CC=distcc

Why 8? As a rule, double the number of CPUs in the build system and use that number for -j. You may be thinking that it'd be great if these tools worked together, allowing you to cache what can be cached and distribute the rest. You'll be happy to find out that the tools are completely compatible. Even better, getting them to work together is extremely easy. To do so, simply set the CCACHE_PREFIX environment variable to distcc, as in export CCACHE_PREFIX="distcc".
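
As a rough sketch, the -j rule and the combined setup can be written down as two small helpers. Both function names are invented for this example, and the sketch assumes that "the build system" means all machines listed in DISTCC_HOSTS, so the CPU counts of every host are summed before doubling.

```shell
# Sum the CPU counts of all build machines and double the total.
# Usage: distcc_jobs CPUS_HOST1 CPUS_HOST2 ...   (hypothetical helper)
distcc_jobs() {
  total=0
  for n in "$@"; do
    total=$((total + n))
  done
  echo $((total * 2))
}

# Print (rather than run) a build command that combines both tools:
# ccache wraps gcc, and CCACHE_PREFIX makes ccache call distcc on a cache miss.
combined_build_command() {
  printf 'CCACHE_PREFIX=distcc make -j%s CC="ccache gcc"\n' "$(distcc_jobs "$@")"
}
```

For the four single-CPU hosts in the DISTCC_HOSTS example above, distcc_jobs 1 1 1 1 gives the 8 used with make -j8.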

Using ccache and distcc, either separately or together, can save you a large amount of time during tedious rebuilds. Hopefully, it's enough time for a latte.
