I added the needed multi-hunk processing to my front end hack, so that I can try to compile parts of GCC. This means that if a construct (say, an enum) spans multiple file changes, the parser now notices this and records the whole chain of hunks. If a later parse sees the same starting hunk, it checks whether the following hunks match the recorded chain, and if so, maps in all the bindings.
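To make the chain idea concrete, here is a rough Python sketch of the lookup described above. It is not the actual GCC patch: the names `Hunk`, `HunkCache`, `record` and `lookup` are invented for illustration, a hunk is modeled as just a file name plus a tuple of tokens, and bindings are stand-in strings rather than real declarations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hunk:
    """Hypothetical model: a run of tokens between two file changes."""
    file: str
    tokens: tuple


@dataclass
class ChainEntry:
    """A recorded chain of hunks plus the bindings that parsing it produced."""
    hunks: tuple     # full chain, starting hunk first
    bindings: dict   # name -> declaration (plain strings in this sketch)


class HunkCache:
    def __init__(self):
        # starting hunk -> every chain recorded so far that begins with it
        self._by_start = {}

    def record(self, hunks, bindings):
        entry = ChainEntry(tuple(hunks), dict(bindings))
        self._by_start.setdefault(entry.hunks[0], []).append(entry)

    def lookup(self, stream, pos):
        """Return (entry, next_pos) if a recorded chain matches at pos,
        otherwise (None, pos) so the caller falls back to a real parse."""
        for entry in self._by_start.get(stream[pos], []):
            n = len(entry.hunks)
            # Every following hunk must match, not just the starting one.
            if tuple(stream[pos:pos + n]) == entry.hunks:
                return entry, pos + n
        return None, pos


# An enum whose body is split across two file changes.
a = Hunk("a.h", ("enum", "color", "{", "RED", ","))
b = Hunk("b.h", ("GREEN", "}", ";"))
c = Hunk("c.h", ("BLUE", "}", ";"))

cache = HunkCache()
cache.record([a, b], {"color": "enum color",
                      "RED": "enumerator 0",
                      "GREEN": "enumerator 1"})

scope = {}
entry, next_pos = cache.lookup([a, b], 0)
if entry is not None:
    scope.update(entry.bindings)   # "map in all the bindings"
```

In this sketch, a stream that starts with the same hunk but continues differently (for example `[a, c]`) does not match, so nothing is mapped in and a real parser would have to parse those tokens from scratch.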

Unfortunately this was not the only problem: GCC also has an enum defined in two different header files, so the token stream the parser actually sees depends on the order of header inclusion. Gross stuff! Handling this was always on the agenda, but it means writing more code, so I am putting it off a little bit.

Meanwhile I compared my hacked compiler with a roughly equivalent unmodified version. I compiled zenity (a GNOME program; the part I compiled is about 5KLOC according to wc, but 475KLOC after preprocessing) with both, using the same command-line arguments, including --combine.

The results show that the modifications are a big win, in both time and memory:

Compiler   Time       Memory (used at exit)
Hacked     11.98sec   17M
Trunk      23.87sec   62M

The hacked compiler with --combine is also faster than running make on the same program. To be fair, I didn’t try to measure any of the build overhead, but it would have to account for more than 4 seconds, since the normal build took 16.55sec.
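For anyone who wants to check the arithmetic, the ratios and the gap to the normal build can be worked out from the figures quoted above. This is only a back-of-the-envelope calculation; the variable names are mine, and the memory figures are taken as reported (17M and 62M).

```python
# Figures quoted in the post.
hacked_sec, trunk_sec, make_sec = 11.98, 23.87, 16.55
hacked_mem, trunk_mem = 17, 62   # "M" as reported

time_ratio = trunk_sec / hacked_sec   # how many times faster the hacked compiler is
mem_ratio = trunk_mem / hacked_mem    # how many times less memory it holds at exit
make_gap = make_sec - hacked_sec      # overhead make would need to explain the gap

print(f"time: {time_ratio:.2f}x, memory: {mem_ratio:.2f}x, gap: {make_gap:.2f}sec")
# prints "time: 1.99x, memory: 3.65x, gap: 4.57sec"
```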



  • Out of curiosity, are you getting paid to hack on this or doing this in your spare time? It’s impressive stuff, I just wonder how you find time to work on it.

  • Yes, I’m doing this for my job right now.
