I tidied up my initial draft of incremental code generation so that it no longer gratuitously lowers functions which are not being recompiled. This was enough to get some results — results which are semi-bogus, due to not relinking, but which nevertheless give some idea of what can be expected.
| Compiler | Seconds |
|---|---|
| Trunk | 33 |
| Incremental, no server | 33 |
| Server, first run | 27 |
| Server, second run | 14 |
| Preprocess | 4 |
So, the results are a bit odd. Recompiling is fast, as we’d expect: at 14 seconds it is more than twice as fast as the 33-second plain build. However, it is still far slower than preprocessing alone, which takes only 4 seconds. What is going on in there?
A look with oprofile seems to indicate that the excess is spread around. About 10% of the total time is spent in the GC; another 7% goes to computing MD5s. Other than that… if I add up the top 40 or so non-cpp functions, I get about 5 seconds’ worth, and there is a long tail after that. That’s a bummer, since that kind of problem is hard to fix.
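(For the curious: the MD5 time presumably comes from hashing the preprocessed hunks so the server can recognize code it has already compiled. The sketch below is not the actual patch; it is just a minimal illustration of that idea, assuming OpenSSL’s MD5 routines, with the `hunk` structure and `hunk_digest` name invented for the example.)

```c
/* Illustrative only: hash a hunk of preprocessed text with OpenSSL's
   MD5 so that identical hunks can be recognized and their generated
   code reused.  The struct and function names are made up and do not
   come from the incremental-compiler patch.  */
#include <stdio.h>
#include <string.h>
#include <openssl/md5.h>

struct hunk
{
  const char *text;  /* Preprocessed token text for this hunk.  */
  size_t len;
};

/* Compute the MD5 digest of HUNK; the digest would serve as the key
   for looking up previously generated code.  */
static void
hunk_digest (const struct hunk *hunk, unsigned char out[MD5_DIGEST_LENGTH])
{
  MD5_CTX ctx;
  MD5_Init (&ctx);
  MD5_Update (&ctx, hunk->text, hunk->len);
  MD5_Final (out, &ctx);
}

int
main (void)
{
  struct hunk h = { "int f (void) { return 42; }", 0 };
  h.len = strlen (h.text);

  unsigned char digest[MD5_DIGEST_LENGTH];
  hunk_digest (&h, digest);

  for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
    printf ("%02x", digest[i]);
  putchar ('\n');
  return 0;
}
```

Hashing every hunk on every compile is cheap per hunk but adds up across a whole translation unit, which is consistent with the 7% showing up in the profile.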
2 Comments
So, just using the incremental server you get a speedup of 20% for a clean compile? Cool!
The “Server, second run” is after a make clean or equivalent, I presume.
What code base do you use for this test?
How much time does relinking take approximately?
My test case was building zenity, from Gnome. I use that as my canonical “small test” since I have it sitting here and it isn’t very big.
The 20% number appears to be a best case. If I try other programs, the benefit drops or vanishes. I thought that maybe this had to do with hunk sizes, but I tested that theory yesterday with bad results. So, I don’t really know the cause 🙁
Yes, the second run is after “make clean”. This is, in theory, the best possible case for incremental codegen: no code has actually changed, so the resulting objects should not change either.
The bogus part of this is that I haven’t implemented relinking. You just get bad results right now; that’s why the patch isn’t committed. I’ve been poking at objdump and ld trying to find a way to script this part without too much work.