Distributed Compiler — The Cliffs of Inanity

Andrew MacLeod had an interesting idea yesterday — make the compile server interact nicely with distcc. I’ve been thinking about this a bit, and I don’t think there are any major problems with the idea. It would be pretty simple to run a compile server on multiple machines. Maybe it would make sense to ensure that a given file is always sent to the same server, but even this isn’t strictly necessary.

One possible problem is that this would not interact well with any indexing feature we add to the server. Also it would not allow cross-module inlining in all cases, or, worse, it might choose to inline or not depending on which jobs end up on which server — something would have to be done about that.

The point behind doing this is to try to scale to very large programs — larger than will fit in memory on your desktop.

Another idea in this space is to somehow serialize hunks to disk, GC “rarely used” hunks when memory is low, and then read hunks back in as needed. Whether this would actually be a win would depend, I think, on the speed of deserialization — it would have to be reliably faster than simply re-parsing (this is more likely if GCC adopt’s Ian’s proposed pointer-free “PC-rel-like” IR — but I don’t know how likely that is). I think I’ll wait for the LTO project to be a bit further along before investigating this any more; hopefully I’d be able to use LTO’s tree-reading and -writing capabilities to save and restore hunk contents.

Anyway, a distributed compile server is an interesting idea for the future. I’ll revisit it once the server is actually working.

Join the Discussion