In an earlier post I said that I’d eventually explain how incremental code generation works.
First, to recap: I’ve already explained how incremental compilation works in the front end. Stopping here would still yield some speedups — I regularly see 20% compile-time improvements on ordinary programs; sometimes much larger; and for C++ I expect the effect will be bigger.
But, for many compilations, particularly when compiling C using optimization, it is the back end which is the major time sink. And since we already know from the front end which functions have changed (i.e., have been re-parsed) and which have not, it seems like it should be easy to also run the back-end incrementally.
There are a few plausible ways to implement this. Essentially, we can preserve the compiled form of the program at any point in the pipeline. By choosing a point after most of the work is done, we can minimize the time spent needlessly recompiling.
My current plan is to do the work at the very latest point possible, by writing out object files. The idea here is that each top-level definition in a translation unit — each global variable or function — will get its own tiny object file. Then, these files will be linked together to form the object file that the user actually asked for.
Of course, the current linker is slow and not at all suitable for this sort of thing. So, we’ll use a new, fast, incremental linker — which we’ll need anyway, in order to preserve incrementality through the entire compile cycle. (It isn’t much good to have a fast incremental compiler if you still end up waiting for the linker all the time.)
This approach has the nice property that the gcc work is relatively easy, for the most part. I don’t know yet whether this outweighs the work that must be done elsewhere, so I’m going to implement a prototype and find out. In fact, this is the next major milestone (though it will be a while, because I’m in a bug-fixing phase).
Another fun possibility here is that we can try to be smart about managing the cache of small object files. In particular, we may be able to preserve some benefits of incrementality even across restarts of the server. For C++, this could include template instantiations.
One drawback of the plan is that it may make dealing with debug info somewhat tricky. This ties in to how we compute hunks; if we don’t include source location in the hunk checksum, then we can end up with objects that lie to the debugger about where their corresponding source appears. We’d want to fix that up somehow, but that is not simple. I’m still thinking about this particular sub-problem.
Another oddity is handling inlining. In my planned implementation, gcc would fork and run code generation in a subprocess. (This lets me avoid excessive modifications to gcc.) But, if we inline a function then the compile server needs to know about this in order to properly re-compile when the inlined function is changed.
Future posts should include information about computing hunk boundaries (pretty short, but maybe I’ll touch on ideas for C++), multi-processor support, and maybe some speculation about scalability.