The Other Idea

I’m still considering what to work on next. As I mentioned earlier, OpenJDK seems to be one viable option. The only other idea I have at the moment is a massive change to the C and C++ front ends to GCC.

In particular I want to change the C/C++ front ends to be incremental compilers. The basic idea is to build a model of the user’s entire program, and then as changes occur, recompile only the minimum amount required.

This approach lends itself to a number of nice things: not just faster turnaround times, but also static analysis (e.g., as plugins to the compilation server — though this may be better if done in the middle end), refactoring, better IDE integration, etc. I expect we’d also see faster compilation the first time a program is compiled (i.e., not just in the recompilation case) since we would be able to reuse parsed header files in most situations.
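To make the "recompile only the minimum" idea concrete, here is a minimal sketch that works at file granularity. It is an illustration under loud assumptions, not the planned design: the real proposal would track changes at a finer grain than whole files, and the names `build_reverse_deps`, `files_to_recompile`, and the include map are all hypothetical. The sketch records which files include which, then walks the reverse edges from the changed files to find every source file that must be rebuilt.

```python
from collections import defaultdict, deque


def build_reverse_deps(includes):
    """Map each file to the set of files that directly include it."""
    reverse = defaultdict(set)
    for source, headers in includes.items():
        for header in headers:
            reverse[header].add(source)
    return reverse


def files_to_recompile(includes, changed):
    """Return the sorted list of .c/.cc files affected by the changed files.

    `includes` maps a file name to the files it directly includes
    (a hypothetical, hand-written model of the program).
    """
    reverse = build_reverse_deps(includes)
    dirty = set(changed)
    queue = deque(changed)
    # Breadth-first walk over "is included by" edges.
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in dirty:
                dirty.add(dependent)
                queue.append(dependent)
    # Only translation units are compiled; headers are merely re-read.
    return sorted(f for f in dirty if f.endswith((".c", ".cc")))


# Hypothetical project layout, used only for illustration.
includes = {
    "main.c": ["util.h", "list.h"],
    "util.c": ["util.h"],
    "list.c": ["list.h"],
    "util.h": ["config.h"],
}

print(files_to_recompile(includes, ["list.h"]))    # ['list.c', 'main.c']
print(files_to_recompile(includes, ["config.h"]))  # ['main.c', 'util.c']
```

This is roughly what make already does with generated dependency files. The interesting part of an incremental compiler is doing the same kind of bookkeeping for individual declarations inside a file, so that an edit to one function does not force re-parsing everything around it.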

I’ve spent quite a bit of time thinking about this and I have, I think, a reasonable plan for implementing it. There are a few implementation choices that require some experimentation, but I think that could safely be considered “phase 1” of the project.

My only misgivings center on whether I really want to become a C and C++ expert, and whether it is worth putting this much effort into improving these compilers, especially since I think that C is generally overused.

The real drive behind this is that I’m interested in programmer productivity. Over the years, many of the things I’ve found it easy to get motivated to work on have fallen into this category, perhaps driven by my own annoyance with productivity impediments.

I’ll probably write a lot more about what this change would look like, how it would work, details about the design, and various oddities I’ve considered. But for now I’m mostly interested in feedback regarding its advisability and/or desirability.

15 Comments

  • […] Tom Tromey in his latest blog toys with the possibility of adding incremental compilation to gcc for c and c++. now there’s something i want.   […]

  • I can think of a lot of people who would find that so incredibly useful… So I would say hell yes

  • I also think that would be a fantastic idea!

  • Of course I’m biased. But if the #1 goal is programmer productivity and you want to only recompile necessary things, and you want refactoring and static analysis, then I’d say consider Java. Modern IDEs already handle incremental compilation and refactoring. Although it is obviously a question of taste, I think, in general, programmer productivity is seen to be higher in Java than in C/C++. But the real reason to do this is that HotSpot’s code analysis has become quite strong (based on Linear Scan Register Allocation [1]) and we now know that dynamic code analysis beats static code analysis.

    You can certainly explore the C/C++ path too, but at a minimum time-slice with the alternatives, because you have the code for HotSpot *today* under GPL, and soon the rest of the JDK, to look at and play with to see how dynamic analysis is handled.

    I would even go so far as to say that C/C++ provides the basis for Java/HotSpot which, in turn, provides the basis for the next boost in productivity. Is it scripting? Is it the rewrite of emacs in Java? Is it lisp running on HotSpot? Who knows, but my gut says these are foundational elements (and bigger productivity payoffs built on the shoulders of giants).

    –Tom

    [1] http://java.sun.com/javase/technologies/hotspot/publications/

  • I’ve gotten to use the Java incremental compiler, and it makes the test-fix-recompile cycle MUCH less painful… although not nearly as snappy as writing in an ‘interpreted’ language with an interactive D. E.
    IMO, a tool like this would make a huge difference in dev time, and would be worth its kilobytes in bars of frickin gold. Most importantly, it would probably reduce frustration when hunting for those little flaws that require ten or twenty recompiles EACH!

    Incremental compiler for C/C++ -> less frustration for programmers -> happier programmers -> better code -> world peace
    Therefore
    Incremental compiler -> world peace

    Write that incremental compiler, then wait for your Nobel Peace Prize!

    I mean, yeah… it’s got my vote.

  • Would you hack the incremental C/C++ compilation on top of GCC or on top of something else – LLVM for example?

  • I like the idea of breaking up the compilers into pieces that could be reused for other tasks like static analysis, but I’m not sure if it’s worth the effort. From what I can see compilers and static analysis tools only have the AST in common. Also the general concepts like inlining, etc are the same but they are done at different stages. Would there really be much room for reuse?

    Also, I hear horror stories about reusing the GCC AST for static analysis. Why not help improve Elsa (perhaps get it to become a GCC front end) instead? It’s a cool parser in that it can be easily extended to support different flavours of C/C++.

    On the other hand, whatever you decide to do, I’m sure it’ll end up superuseful. Good luck.

  • I work on C++ daily, and if you did that I would have your children!!! (yes I’m male)

    PURTY PLEASE11111111

  • I really hope you work on OpenJDK as your contribution to Free (alternatives to) Java over the years has been inspiring.

  • I second that, this would really be awesome!

  • Hi Tom,

    I think it’s a really cool idea, particularly if we all wind up with good tools for static analysis. Yes, C and C++ are flawed languages, but they aren’t going away any time soon, and better analysis tools could make them much safer to use.

  • Eh. I hate compilation times. But in my experience:

    * C compiles instantly on modern processors.

    * C++ takes a while, but gcc -O0 with PCH is actually pretty darn fast. Also, it’s easy to parallelize, and we have ccache, and we have distcc.

    * The bottleneck for the C++ edit/test cycle is the _linker_. Templates mean very large text size and symbol table size; throw in debugging symbols and it becomes trivial to be waiting a minute or more for your hundred megabyte executable to get smushed together. And having more processors helps not at all.

    All this based on basically one program (guess you know which one :-)), so dunno how well it generalizes, but… I only spend like 10% of my sitting-around-waiting-time actually waiting for the _compiler_, per se.

  • A few responses:

    Tom Marble, thanks for the link; I’ll read those papers. I don’t completely agree with your conclusions — perhaps I’ll write about this, or we can talk at FOSDEM :-). I suppose the crux (agreeing with Joe Buck) is that C and C++ do still exist and are used, so it makes some sense to try to help the unfortunate folks stuck using them 🙂

    Anders: yes, my plan is to make it work with GCC. However, as with gcjx, the idea is to separate the front and back ends. So, having something that works with LLVM (or whatever) is not out of the question, or even very difficult. My interest in GCC is simply because, were I to do this, I would want it to be the standard system compiler. Also I generally consider myself part of the GCC community; I know how it works, etc.

    Taras: good points. Perhaps there is not much overlap, or perhaps static analysis is best done in a more LTO-like area. I do not know. And, the static analysis bit is sort of secondary to what I’m interested in anyhow. As far as the GCC AST goes… trees have gotten *much* better in recent years, but yeah, they are still pretty gross. Part of this project would be having a “real” AST for C and C++, as opposed to the half-lowered form that currently exists. Again, look to gcjx for an idea of where I’m coming from.

    Nathaniel: yeah, for C the problem is not so severe. But for C++, my experience is very different. In my tests with gcjx (and, good idea, I will try monotone), PCH only gives about a 30% speedup. Much of the bottleneck is in parsing and semantic analysis. Java tools are hugely, mind-bogglingly faster — I’ve written on this before, and I have some new results that are as dramatic and show the potential of this approach.

    Very good point about linking. My current plan for code generation calls for an incremental linker (with a couple of additional odd features). This is something I’ve wanted for GNU since using PureLink back in 1991… sad that we don’t have it really. Anyway, there are two candidates out there as starting points for this; elfutils has a linker (I think unfinished), and also Ian Taylor is working on one.

    I’ll expand on this stuff in future posts when I describe my plan in a bit more detail.

  • With Free Java on the horizon, and ready to be shipped with distros, it really opens up Java to become the next gen programming platform for the GNU/Linux desktop. In my mind, the next piece in that puzzle is the ability to really leverage Java in that environment. The Java-Gnome bindings look like they could use some help with their next generation work. Having Java on par with Python for Linux desktop work (Cairo, HAL, DBus, etc), would be really exciting!

    And what about Java integration into Firefox? Despite all the noise over the years, I still can’t write a Firefox plugin or extension using Java.

    Both of these projects would benefit from your combination of Java/C experience. And they are both, in my mind at least, more enabling of next generation type applications (and thus more beneficial) than optimizing a waning language some people are “stuck” using.

    Either way, thanks for all your contributions to the community!

  • I’m biased because I love Java, but I think C and C++ are anachronisms. C is not object oriented at all, and C++ is rather bloated. Both suffer from pointers, potential buffer overflows and memory leaks. An increasing amount of software is meanwhile written in Python, Java or C#, so I wouldn’t put too much effort into improving gcc. It does its job very well already IMHO.
