Archive for the ‘GCC’ Category

The Deletion of gcj

I originally posted this on G+ but I thought maybe I should expand it a little and archive it here.

The patch to delete gcj went in recently.

When I was put on the gcj project at Cygnus, I remember thinking that Java was just a fad and that this was just a temporary thing for me. I wasn’t that interested in it. Then I ended up working on it for 10 years.

In some ways it was the high point of my career.

Socially it was fantastic, especially once we merged with the Classpath community — I’ve always considered Mark Wielaard’s leadership in that community the thing that made it so great. I worked with and met many great people while working on gcj and Classpath, but I especially wanted to mention Andrew Haley, who is the single best debugger I’ve ever met, and who stayed in the Java world, now working on OpenJDK.

We also did some cool technical things in gcj. The binary compatibility ABI was great, and the split verifier was very fun to come up with.  Per Bothner’s early vision for gcj drove us for quite a while, long after he left Cygnus and stopped working on it.

On the downside, gcj was never quite up to spec with Java. I’ve met Java developers even as recently as last year who harbor a grudge against gcj.

I don’t apologize for that, though. We were trying something difficult: to make a free Java with a relatively small team.

When OpenJDK came out, the Sun folks at FOSDEM were very nice to say that gcj had influenced the opening of the JDK. Now, I never truly believed this — I’m doubtful that Sun ever felt any heat from our ragtag operation — but it was very gracious of them to say so.

Since the gcj days I’ve been searching for basically the same combination that kept me hacking on gcj all those years: cool technology, great social environment, and a worthwhile mission.

This turned out to be harder than I expected. I’m still searching. I never thought it was possible to go back, though, and with this deletion, this is clearer than ever.

There’s a joy in deleting code (though in this case I didn’t get to do the deletion… grrr); but mainly this weekend I’m feeling sad about the final close of this chapter of my life.

GCC and Python

When writing an earlier post, I realized I haven’t yet written about the Python plugin for GCC.

This is awesome! It is by far the simplest way to write a GCC plugin. The primary reason is that the author, the amazing David Malcolm, has put a lot of effort into the polish: this plugin is the simplest one to build (“make” works for me, with the Fedora 15 system GCC) and also the one with the best documentation.

Why would you want to write a plugin? Pretty much every program — and especially every C program, as C has such bad metaprogramming support — has rules which cannot be expressed directly in the language. One usually resorts to various tricks, and in extremis to patch review by all-knowing maintainers, to enforce these rules. The plugin offers another way out: write a Python script to automate the checking.

I’ve already written a couple of custom checkers for use on GDB (which is a very idiosyncratic C program, basically written in an oddball dialect of C++), which have found real bugs.  These checkers cover things that no generic static analysis tool would ever correctly check, e.g., for the proper use of GDB’s exception handling system.  The exception checker, which we use to verify that we’re correctly bridging between Python’s exception system and GDB’s, took less than a day to write.
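
For flavor, here is a minimal sketch of what such a checker script can look like. The plugin calls used here (gcc.register_callback, gcc.PLUGIN_PASS_EXECUTION, gcc.GimpleCall, gcc.error) follow the plugin's documentation, but the rule itself (flagging direct calls to abort in favor of a hypothetical project_fatal_error helper) is invented for illustration; it is not one of my GDB checkers.

    import gcc

    def check_calls(pass_, fun):
        # Run per-function at a point where the CFG exists; the
        # '*warn_function_return' pass is the hook the plugin docs suggest.
        if pass_.name != '*warn_function_return' or not fun:
            return
        for bb in fun.cfg.basic_blocks:
            for stmt in bb.gimple or []:
                if isinstance(stmt, gcc.GimpleCall) and stmt.fndecl:
                    # Made-up project rule: never call abort directly.
                    if stmt.fndecl.name == 'abort':
                        gcc.error(stmt.loc,
                                  'call project_fatal_error, not abort')

    gcc.register_callback(gcc.PLUGIN_PASS_EXECUTION, check_calls)

You run it with something like gcc -fplugin=python.so -fplugin-arg-python-script=check.py foo.c; the exact plugin file name depends on how the plugin was built.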

GCC Summit News

Right now I’m at the GCC Steering Committee Q&A panel at the summit.  There’s a new draft out of the GCC runtime license (used by libgcc, libgcj, etc).  The latest text allows for the development of GCC plugins — folks are already talking about review of the patch and how soon it can be merged (answer: after the license is finalized).

This is great stuff!

C++ Standard

I love patches like this one. This is a change to libstdc++ to reflect changes made at the latest standard committee meeting. I think the meeting ended yesterday… a little slow, perhaps; I think last time I saw commits immediately after a vote concluded.

GCC Summit

Next week is the GCC Summit, in Ottawa. I’ll be giving a talk there about my work on the incremental compiler. That will be fun, but what I’m really looking forward to is seeing all the other GCC hackers I know. There are plenty of interesting talks on the schedule, and some cool BOFs; plus there is an informal gdb meetup (this should be especially interesting, as there are many cool developments in gdb-land).

In keeping with my Emacs mania, this year I wrote a special presentation mode. My presentation is just a text file, in Emacs “outline mode” format; my new mode displays it nicely in a screen-filling frame. I should say, “nicely”, since some kinds of rendering are a pain to do in Emacs, and I couldn’t be bothered.

GCC Shirt

Thanks to Elyn, my t-shirt design from last year’s GCC Summit is now available.

I made a new GCC t-shirt yesterday, but you’ll have to wait for the summit to see it.

Codegen Update

Since my last post, I’ve written a prototype implementation of relinking for the incremental compiler.

Now, the compile server will create an object file in a cache directory. If there was a previous variant of the compiled file in the cache, it will then link the two together (using ld -r). Then, the final object file is copied from the cache to the user’s requested output file.
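
Roughly, the flow looks like the following. This is only a sketch of the idea, written in Python with made-up names; the compile server itself does this in C as part of GCC.

    import os
    import shutil
    import subprocess

    def emit_object(new_obj, cache_dir, requested_output):
        # Result of the previous compilation of this file, if any.
        cached = os.path.join(cache_dir, os.path.basename(requested_output))
        if os.path.exists(cached):
            # Merge the newly generated code with the previous object into
            # a single relocatable object (ld -r), keeping it in the cache.
            merged = cached + '.tmp'
            subprocess.check_call(['ld', '-r', '-o', merged, new_obj, cached])
            os.rename(merged, cached)
        else:
            shutil.copy(new_obj, cached)
        # Finally, copy the cached object to the user's requested output.
        shutil.copy(cached, requested_output)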

So, now you can “make clean; make” and see correct results from the incremental compiler. The new results table:

Compiler                  Seconds
Trunk                     30
Incremental, no server    30
Server, first run         26
Server, second run        17

This is probably the current best (or “best worst”) case — no actual recompilation needed to be done. In terms of user scenarios, this corresponds to, say, modifying a comment in a core header file and recompiling. And, given that this is execing both ld and cp, the results aren’t too bad.

On the other hand, I had somewhat higher expectations. I’ve been pretty depressed about this project all week. Relinking is turning out to be a pain; I’m pretty much convinced now that incremental preprocessing is a necessity; and this combination makes me wonder whether I’m chasing a rat down a hole. The question of whether this project remains worthwhile is a normative one, and fairly subjective. That’s a fancy way of saying, I don’t know.

Ugh.

Mostly I try to think about it in terms of a success metric. Or, what is the minimum expected gain that would make it appear to be worthwhile? I suspect I may need to prototype the C++ compiler changes before I can really satisfy myself on that topic, though.

Back to the concrete.

The linking prototype is still pretty bogus. It arrives at an executable which works, but the intermediate object files grow over time. That’s because it is pretty hard to coerce ld (and objcopy) into doing the odd things I want: I want to link two files together, yielding another relinkable object (i.e., I need -r), where symbol name clashes are always resolved in favor of the first file. You’d think -z muldefs (I’ve gotten overly familiar with the ld manual) would work here, but it just drops the symbols — not the contents. So, maybe -ffunction-sections and --gc-sections is the way to go — but this also has problems; the former because (supposedly) it does not work with all programs, and the latter because it interacts oddly with -r.

I’m still hoping I can get by with a relatively simple linker hack, though as the week has dragged on I’ve realized that my understanding of linking is less than ideal.

First Codegen Result

I tidied up my initial draft of incremental code generation so that it no longer gratuitously lowers functions which are not being recompiled. This was enough to get some results — results which are semi-bogus, due to not relinking, but which nevertheless give some idea of what can be expected.

Compiler                  Seconds
Trunk                     33
Incremental, no server    33
Server, first run         27
Server, second run        14
Preprocess                4
So, the results are a bit odd. Recompiling is fast, as we’d expect — about twice as fast as a plain build. However, it is still nowhere near the time taken by the preprocessor alone, which ought to be the lower bound. What is going on in there?

A look with oprofile seems to indicate that the excess is spread around. About 10% of the total time is spent in the GC; another 7% is used computing MD5s. Other than that… if I add up the top 40 or so non-cpp functions, I get about 5 seconds worth, and there is a long tail after that. That’s a bummer since that kind of problem is hard to fix.

Setbacks

The last couple weeks uncovered a few problems in the incremental compiler.

First, suppose you compile a program with the incremental compiler, then recompile it. You would expect the recompilation to report the same warnings. But — whoops — I never thought about this until a week or two ago.

I hate that awful moment of realization. It reminds me of getting in trouble as a kid. “Oh shit”, I think. “What am I going to do? Does this sink the project?”

In this case, there are some options. If the set of warning flags does not change between compilations, I think I can modify GCC to store the warnings with their corresponding declarations. This is a bit of a pain, but nothing too awful — and I think I can avoid imposing a cost on the non-warning case by representing the warnings as tree objects and storing them in the hunk with the other declarations.

If the user does change the warning flags, then what? Record it and recompile, I guess. A similar idea applies to options that change the ABI — because ABI decisions get baked into the tree objects we create, if the ABI changes, we cannot reuse the trees.

My other uh-oh moment has to do with inlining. I got bored by the tedious sub-projects I was working on — integrating pragmas (by the way. If you design a language, don’t design pragmas. Thanks) into the dependency computation, fixing the remaining test suite failures — so I decided today to start looking at incremental code generation. Something fun!

I tried out a quick implementation. If a function is parsed, we arrange to compile it; if it is not parsed, we don’t bother. This won’t work on real programs, of course, since those “missing” functions have to come from somewhere, but this should give a good idea of the possible speedup.

After testing on my typical small test program (zenity), I noticed something odd, namely that recompilations were not as blazingly fast as I thought they should be. (I first estimated the absolute lower bound as the time it takes to preprocess the source files.)

Hmm. A mystery. But first, a brief aside about tools. The compile server forks and runs code generation in the subprocess. I wanted to debug this fork. So, Plan A: use gdb and set follow-fork to child. But… that fails because, although my program does not use threads, it still links in the thread library (relic of my failed threading experiment), and gdb does not seem to handle this well. So, Plan B: maybe ftrace from frysk can help me — all I want to do is see a stack trace at a particular function call, perfect for ftrace. But, the ftrace I have aborts at startup. So I update and rebuild — but there is a build error. I suppose I could have gone with Plan C: stick in a sleep() call and attach, just like I did 15 years ago. Instead I picked Plan D: printf. Not quite as good, since I still need some of that information. Somehow I didn’t feel like Plan E: rip out the threading code and start over at Plan A.

Right now I’m doing a lot of debugging and pretty much every week has a vignette like that. I didn’t do that python stuff in gdb purely for fun.

Anyway. What is going on in the compile server?

What I found is that the code generation process still does some processing on every function, even functions that we intend to drop. In particular it is lowering each function to GIMPLE. I think what is going on here is that GCC is lowering functions and running local optimizations on them so that they can be considered as candidates for inlining. At least, that’s my working theory until I get back to Plan C and dig around a bit.

I’m not totally sure yet what to do about this. I think I will have to go back and rip out the decl re-smashing pass I wrote a while back, and instead find a way to perform gimplification in the server. That way, the compile server can keep the gimplified form for use by the back end. Other than the work involved, and some tricky details in lowering without smashing, I think this will work.

This isn’t going to be pretty, but at least it isn’t a total disaster. I’d like to think this isn’t totally an accident. GCC has undergone a lot of changes in the last five years to make it more flexible internally, and I’ve pushed a little bit more in that direction on the branch. This makes it a bit simpler to change the point at which we put a fork in the pipeline.

It feels a bit strange to write about the mistakes I make. On the plus side, I know how to fix these problems; writing about really unknown problems would, of course, be beyond the pale.