
Hunk Boundaries

How do we decide the boundaries of reuse in the incremental compiler?

The one crucial requirement for reuse is that the signature you choose must be stable across compilations. It is also better if the boundaries are cheap to compute and roughly line up with declaration boundaries. In the prototype implementation, I satisfied these criteria by using file change events (conveniently noted by libcpp) as hunk boundaries. File change boundaries are roughly stable (more so for system headers), are free (libcpp already computes them), and very frequently align with declaration boundaries. This last criterion is typically broken only by constructs where a file is included in the middle of a declaration; gcc itself does this, for instance, by including a .def file in the middle of an enum to populate it. I added a special case to the reuse code to account for these situations.
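
To make this concrete, here is a rough sketch of signing a hunk by checksumming its tokens. This is purely illustrative, not the actual gcc code; the token type and the choice of FNV-1a hashing are my own stand-ins:

    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical token type; the real compiler would work with
    // libcpp's token representation.
    struct Token { std::string spelling; };

    // FNV-1a checksum over the spellings of the tokens in one hunk.
    // Source locations are deliberately left out, so the signature
    // stays stable when unrelated text above the hunk is edited.
    uint64_t hunk_signature(const std::vector<Token> &hunk)
    {
        uint64_t hash = 14695981039346656037ull;     // FNV offset basis
        for (const Token &tok : hunk) {
            for (char c : tok.spelling) {
                hash ^= static_cast<unsigned char>(c);
                hash *= 1099511628211ull;            // FNV prime
            }
        }
        return hash;
    }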

Aside from the convenience (which is very important for prototypes, but much less so for production), there was a historical reason that I chose file boundaries. When I first started thinking about writing an incremental compiler, I focused primarily on header files, and on having a single model which would encompass both the preprocessor and the compiler. After working through a number of cases, with the help of the prototype, I decided that this was “Java thinking”; in C, the primary unit of reuse is the declaration, not the file. And it is much, much simpler to reason about C if you look at it after preprocessing.

In earlier posts I described the dependency computation that is needed for sharing declarations. One consequence of this is that the bigger the unit of reuse, the less often it can actually be reused. At one extreme is gcc’s current PCH implementation, which requires a single header file to be included first in all compilations. With the file-boundary scheme, it turns out that even relatively ordinary programs fail to reuse in many situations. For instance, some headers in gcc have conditional declarations, so changing the order of includes forces an entire header’s worth of declarations to be re-parsed.
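
To see why, consider a small example (the header and macro names are invented for illustration). A file-sized hunk covering this header has a different signature depending on what was included before it, so neither version can be shared with the other:

    /* util.h: which declaration of report() is seen depends on
       whether debug.h was included first.  With file-sized hunks,
       changing the include order changes the whole header's hunk,
       so none of its declarations can be reused.  */
    #ifdef DEBUG_H_INCLUDED
    void report(const char *msg, int debug_level);
    #else
    void report(const char *msg);
    #endif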

Note that failure to reuse is not a bug. Reuse is, after all, a performance and memory optimization. The worst consequence is that gcc will run as slowly, and will use as much memory, as it does now. Still, reuse is desirable. The more we reuse, and the smaller our units of reuse, the less work we have to do when recompiling.

So, this leads to the second implementation, which tries to guess declaration boundaries without parsing. This is fast, if somewhat evil-looking, and it yields good results. It still isn’t perfect, though (it does not handle K&R-style definitions properly), so I will probably rewrite it again to do something closer to a quick parse.
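
To give the flavor of it, here is a minimal sketch of that kind of guessing; this is my own reconstruction, not the code in the branch. It tracks brace depth and treats a top-level semicolon or closing brace as the end of a declaration:

    #include <cstddef>
    #include <string>
    #include <vector>

    // Split source text at guessed top-level declaration boundaries.
    // Deliberately naive: it ignores comments, string literals, and
    // preprocessor directives, and, like the version described above,
    // it would mis-handle K&R-style definitions.
    std::vector<std::string> guess_decls(const std::string &src)
    {
        std::vector<std::string> decls;
        int depth = 0;
        std::size_t start = 0;
        for (std::size_t i = 0; i < src.size(); ++i) {
            char c = src[i];
            if (c == '{')
                ++depth;
            else if (c == '}')
                --depth;
            // At depth zero, a ';' (simple declaration) or '}' (end
            // of a function body) plausibly ends a declaration.
            if (depth == 0 && (c == ';' || c == '}')) {
                decls.push_back(src.substr(start, i - start + 1));
                start = i + 1;
            }
        }
        return decls;
    }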

I’m not sure whether this same approach makes sense for C++. I will probably start with it, because I understand it and know that it will work at least minimally. But C++ brings its own problems. For instance, suppose you have an inline function defined in a class definition. In this case, if you change the body of the function, you’d like to be able to reuse the class definition, if possible; otherwise, many kinds of small changes will yield costly compilations. Tom Fitzsimmons suggested that perhaps the hunk computation can be made recursive in some way, meaning that perhaps it could operate inside of classes and namespaces, and not just at the top level. I’ll definitely be researching this once I start on the C++ front end.
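
A made-up example shows the problem. Editing the body of scale() below changes the hunk containing the whole class, even though the declarations that other code depends on are untouched; a recursive scheme could give each inline body its own hunk:

    class Point {
    public:
        Point(int x, int y) : x_(x), y_(y) {}
        int x() const { return x_; }
        int y() const { return y_; }
        // Editing this body ideally should not invalidate the
        // enclosing class definition.
        Point scale(int factor) const {
            return Point(x_ * factor, y_ * factor);
        }
    private:
        int x_, y_;
    };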

The split between front and back end incremental processing means that each front end is free to implement its own approach to incrementality. I’m only looking at C and C++, but other people are free to hack on the other front ends. The actual changes to the middle-end interface are quite small, as it turns out.

What next? I think future posts will cover scalability, multi-processor support, and perhaps what to do about the C preprocessor. If there’s anything you want to know more about, post a comment and I will address it.

Incremental Code Generation

In an earlier post I said that I’d eventually explain how incremental code generation works.

First, to recap: I’ve already explained how incremental compilation works in the front end. Stopping there would still yield some speedups: I regularly see 20% compile-time improvements on ordinary programs, sometimes much larger ones, and for C++ I expect the effect to be bigger.

But, for many compilations, particularly when compiling C with optimization, it is the back end that is the major time sink. And since we already know from the front end which functions have changed (i.e., have been re-parsed) and which have not, it seems like it should be easy to run the back end incrementally as well.

There are a few plausible ways to implement this. Essentially, we can preserve the compiled form of the program at any point in the pipeline. By choosing a point after most of the work is done, we can minimize the time spent needlessly recompiling.

My current plan is to do the work at the very latest point possible, by writing out object files. The idea here is that each top-level definition in a translation unit — each global variable or function — will get its own tiny object file. Then, these files will be linked together to form the object file that the user actually asked for.
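
The bookkeeping might look something like the following sketch. This is entirely my illustration; the structure names and the cache layout are assumptions, not the planned design. The point is that a definition whose signature is already in the cache can be relinked as-is, with no code generation at all:

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    // One entry per top-level definition in a translation unit.
    struct DefObject {
        std::string name;       // e.g. "main" or a global variable
        uint64_t    signature;  // checksum from the front end
        std::string objfile;    // path of the tiny object file
    };

    // Decide which definitions actually need code generation: those
    // whose signature is not already in the cache of tiny objects.
    // Everything else is reused and only needs to be relinked.
    std::vector<DefObject>
    needs_codegen(const std::vector<DefObject> &defs,
                  const std::map<uint64_t, std::string> &cache)
    {
        std::vector<DefObject> stale;
        for (const DefObject &d : defs)
            if (cache.find(d.signature) == cache.end())
                stale.push_back(d);
        return stale;
    }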

Of course, the current linker is slow and not at all suitable for this sort of thing. So, we’ll use a new, fast, incremental linker — which we’ll need anyway, in order to preserve incrementality through the entire compile cycle. (It isn’t much good to have a fast incremental compiler if you still end up waiting for the linker all the time.)

This approach has the nice property that the gcc work is relatively easy, for the most part. I don’t know yet whether this outweighs the work that must be done elsewhere, so I’m going to implement a prototype and find out. In fact, this is the next major milestone (though it will be a while, because I’m in a bug-fixing phase).

Another fun possibility here is that we can try to be smart about managing the cache of small object files. In particular, we may be able to preserve some benefits of incrementality even across restarts of the server. For C++, this could include template instantiations.

One drawback of the plan is that it may make dealing with debug info somewhat tricky. This ties in to how we compute hunks; if we don’t include source location in the hunk checksum, then we can end up with objects that lie to the debugger about where their corresponding source appears. We’d want to fix that up somehow, but that is not simple. I’m still thinking about this particular sub-problem.

Another oddity is handling inlining. In my planned implementation, gcc would fork and run code generation in a subprocess. (This lets me avoid excessive modifications to gcc.) But if we inline a function, then the compile server needs to know about it in order to re-compile properly when the inlined function is changed.
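
One plausible way to record this, sketched here under my own assumptions rather than as the planned design, is a reverse map from each inlined function to the functions whose generated code embeds a copy of it:

    #include <set>
    #include <string>
    #include <unordered_map>

    // For each function that was inlined somewhere, the callers whose
    // object code now contains a copy of its body.
    std::unordered_map<std::string, std::set<std::string>> inline_deps;

    // Called by the code generator whenever it inlines callee into
    // caller.
    void note_inlined(const std::string &caller, const std::string &callee)
    {
        inline_deps[callee].insert(caller);
    }

    // When fn changes, its own tiny object file is stale, and so is
    // every cached object that inlined it.  (A real implementation
    // would also propagate this transitively.)
    std::set<std::string> invalidated_by(const std::string &fn)
    {
        std::set<std::string> stale{fn};
        auto it = inline_deps.find(fn);
        if (it != inline_deps.end())
            stale.insert(it->second.begin(), it->second.end());
        return stale;
    }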

Future posts should include information about computing hunk boundaries (pretty short, but maybe I’ll touch on ideas for C++), multi-processor support, and maybe some speculation about scalability.