Archive for the ‘software’ Category

Friday GCC Update

I haven’t been blogging much lately, at least partly because I’ve been having fun hacking on GCC. Mark pointed out, though, that I should blog regularly about my progress so people know what is happening.

I’ve got the basic compile server running. Recently I’ve been fixing bugs in it. I’m not seeing the mind-blowing 2x speedup I once was; I need to look into that a bit. This doesn’t matter, though, because the real benefits will come from the next step, which is incremental code generation.

This week, to get away from the bugs a bit, I made GCC’s garbage collector thread-aware. Following the implementation methodology known as “Biting Off More Than You Can Chew”, I also dove in to making the whole front end work multi-threaded. So far this has yielded an evil-looking patch that marks a couple hundred global variables with __thread.

My current plan is to try to merge what I’ve got during the next Stage 1. My thinking is, the current code provides benefits, even if it not every planned benefit. It would be better to merge as frequently as possible — not just for testing but also because smaller patches are simpler to understand, and an early merge would get other GCC maintainers used to the new stuff.

Actually… why don’t I just post my road-map here

  1. Fix bugs. Make it bootstrap again.
  2. Clean up the hacks I put in. Implement the things I skipped (anti-dependencies, maybe semantic dependencies, fix PCH).
  3. Test on bigger programs.
  4. Make sure --enable-intermodule bootstrap works. Measure performance improvement.
  5. Stage 1 merge.

At this point there are multiple options for what to do next. I’m not sure what order I will tackle them — as the thread experiment shows, I’m prone to following what looks fun, in the absence of other constraints.

  1. Incremental code generation. May require linker changes.
  2. C++ front end. (I’m thinking we should make the front ends into shared libraries that gcc will dlopen.)
  3. Finish multi-threading work (if I don’t do this next week).
  4. If there isn’t one yet, finish incremental linker.

After this comes other fun stuff, like adding a way to query the server for information about your program. I’m thinking http…

I’ve also gotten interested in other aspects of GCC’s C front end. For example, it is not ideal for some kinds of static analysis because it folds too early. It might be fun to fix this.

Fan-out

Most of my big projects seem to go through a “fan out” phase, where the to-do list grows very quickly. This is a scary phase of the project — one where it feels as though things are accelerating out of control, and where success seems as though it could be in doubt. The compiler seems to be entering this phase now.

While this phase feels dangerous, I think the more dangerous phases come earlier, when I’m more confident.

Do you experience fan-out? I’m curious whether this is an artifact of how I approach projects, or whether it is a more universal experience.

First Success

I just managed to get through an ordinary make of a real program (zenity, which for some reason is my standard test) using gcc as a server. Yay! It is, as predicted, about twice as fast as an ordinary make.

Now pile on the caveats.

Compiler Progress

I’ve been working steadily on the incremental compiler project these last couple of months, but not blogging about it. So, time for a status report.

I made a branch in the GCC repository, cleaned up my existing patches, and checked it all in. There’s a short web page describing the project on the GCC wiki (branch details there).

I’ve also written a rough draft of the support code for “decl smashing”, and also changed the parser to consider “per-declaration hunks”.

The idea behind per-declaration hunks is that, for code written by the user, we get better incremental behavior if we consider each hunk to hold a single declaration or definition, rather than the previous rule of “everything between two file change events”. The former is better because it means that a change is isolated to its containing declaration and uses of that declaration (and we can do even better here, perhaps, with a bit more analysis); the latter is better for system headers and the like because it has lower overhead.

Finally, I’ve started server-izing GCC. This part should be ready for real testing pretty soon.

DVR

Comcast had a trial offer where you could get a DVR machine rent-free for a year, so we dropped by their office and picked one up. These machines are as nice as people say — much, much friendlier than plain TV or VCRs. It only took a couple of weeks for it to change how I watch TV.

While playing with it, though, I’m reminded once again why I first turned to free software all those years ago. I’d like to be able to hack the machine a little… say, upload my favorite DVDs, or add more disk space, or get a second one and be able to share videos between the two.

Maybe I should get a Neuros OSD box. Anybody try one of these? Or of course I could set up MythTV, though that seems more expensive (given that I have basically no usable hardware).

GCC Summit 2007

The 2007 GCC Summit just ended; all that’s left is the wrap party. Lately I haven’t been too interested in writing up comprehensive trip reports (but if you liked that style, post a comment — I’d like to know), and instead I’m just going to write about the bits I found particularly interesting. Overall the summit was good; I think the proceedings are online if you want to read the papers.

I had a few personal goals for the summit and I think I’ve accomplished all of them. I wanted to meet up with the various GCC hackers I’ve met over the years and sort of get them used to the idea that I might be around in GCC-land a bit more than previously. This was the easy part — I talked to tons of people and met many more.

I also wanted to pitch my incremental compiler to anybody who would listen, and get feedback about the idea in general and about a few specific potential problem areas. Thanks here go to Ian Taylor and Alan Modra for some assembler and linker advice; and to Mark Mitchell and Nathan Sidwell for telling me some awful truths about C++.

Nobody really complained much about my plans, not even the evil part where GCC will generate a single object per top level definition, or the even eviler part where code generation is done by fork()ing the server and doing the work in the child, to avoid having to clean up a million global variables. Now I’m left hoping I’m not in a tragedy of manners, where everyone holds back their knowledge of impending doom.

Diego, appropriately, gave the opening remarks. He showed how GCC’s SpecINT performance has remained flat over the last five years, inducing widespread depression. SpecFP looks a bit better, though, and plus everyone can agree that GCC’s internals are much nicer than they used to be.

Most of the talks were about optimizations. This is fun stuff but not my current area of interest. There were a few talks touching on the idea of writing plugins for GCC — one paper discussing a custom version of Lisp, and another having impressive amounts of working code including Python bindings and a GIMPLE visualizer. I’m hoping one or both of these go into GCC eventually; I especially like the idea of writing custom code analysis passes in Lisp.

A couple talks were about debugging. I missed one of these but the other had Daniel Jacobowitz demonstrating gdb stepping backward in time. This is awesome stuff! … though at the moment there are several restrictions and not all the patches are available. Lots of people are concerned about GCC’s debuginfo, which I hear from many sources is rather sucky.

It’s a bit strange to embark on a project that seems so unrelated to the rest of the work in GCC. I keep doing mental management on myself so as not to obsessively worry about this. Anyway the GCC crowd is pretty nice and I’m not worrying about how people will react any more.

Several people asked me about gcj of course. I tried to be clear about the fact that the situation is murky: my own gcj hacking has dropped off precipitously, though I still review patches, and what to do about gcj really depends on the level of involvement of other parties in the future (it is hard to pick a time period here but I would say at least two major releases).

GCC Summit + Fedora 7

I’m headed to the GCC Summit this week, with a trip to the office in Toronto thrown in.

In preparation for the trip, I posted a short note about the incremental compiler project to the GCC list. These posts are always a bit stressful, since I spend a few hours checking email and waiting anxiously for flames. But, as usual, little happened; you can check the (non-) responses for yourself. Wouldn’t the wisdom not to worry be nice?

I also prepared by upgrading my laptop to Fedora 7. The pre-preparation for this was a pain — but it was also all my fault, since I’d installed an ill-advised mishmash of random things either for development testing (new Eclipse and gcj) or for my own enjoyment (Emacs 22).

Fedora 7 is quite nice. The fonts look prettier. The icons are prettier (except, strangely, the rhythmbox icon). powertop is cool. This time around I couldn’t find a CD download (only DVD), and I didn’t want to risk a yum upgrade, so I downloaded the network install CD, booted it, and did an upgrade that way. This went smoothly; my only complaint was that I picked the wrong mirror at first (one seemingly without the necessary tree) and spent some time checking and rechecking URLs before catching on. This could perhaps be done a bit more smoothly somehow.

I haven’t had a chance yet to use Fedora 7 in anger, but so far I’m already happy with the upgrade. It went smoothly and I already see benefits; for instance madwifi seems to work with NetworkManger now.

Distributed Compiler

Andrew MacLeod had an interesting idea yesterday — make the compile server interact nicely with distcc. I’ve been thinking about this a bit, and I don’t think there are any major problems with the idea. It would be pretty simple to run a compile server on multiple machines. Maybe it would make sense to ensure that a given file is always sent to the same server, but even this isn’t strictly necessary.

One possible problem is that this would not interact well with any indexing feature we add to the server. Also it would not allow cross-module inlining in all cases, or, worse, it might choose to inline or not depending on which jobs end up on which server — something would have to be done about that.

The point behind doing this is to try to scale to very large programs — larger than will fit in memory on your desktop.

Another idea in this space is to somehow serialize hunks to disk, GC “rarely used” hunks when memory is low, and then read hunks back in as needed. Whether this would actually be a win would depend, I think, on the speed of deserialization — it would have to be reliably faster than simply re-parsing (this is more likely if GCC adopt’s Ian’s proposed pointer-free “PC-rel-like” IR — but I don’t know how likely that is). I think I’ll wait for the LTO project to be a bit further along before investigating this any more; hopefully I’d be able to use LTO’s tree-reading and -writing capabilities to save and restore hunk contents.

Anyway, a distributed compile server is an interesting idea for the future. I’ll revisit it once the server is actually working.

Another FE Test

I added “prerequisite” support to my front end prototype recently — this is code that ensures that a hunk’s contents are only re-used if all the things that the hunk depends on are also re-used. This turned out to be much simpler to implement than I had anticipated.

With this I still can’t compile the C front end with my compiler, because GCC has many ordering dependencies in its headers (ugly stuff — you should avoid this) and so ends up requiring the decl smashing fix. The decl smashing fix looks like a tricky project, especially given the multi-threading constraint, so I haven’t done it yet.

Meanwhile I took a simpler approach to trying to get some results from this test case: I reordered all the #include statements in the C front end to ensure consistency. Yay hacks!

Even with the additional prerequisites bookkeeping, the results are quite nice, though a bit more modest than the zenity case:

Compiler Time (M:S) Memory at exit
Prototype 0:49 43M
Trunk 1:29 83M

PCH Again

The other day Roland asked me why PCH is not a bigger win. While discussing this I realized that I didn’t try to time the very best case for PCH. So, I looked at a couple more things.

First, I spent some time munging the output of -ftime-report and making graphs with gnuplot. I thought it might be interesting to visualize the problem areas. All this told me, though, is that in my earlier tests, the PCH build still spent an inordinate amount of time in the parser.

Back when I wrote gcjx (my test case here), I tried to make it PCH-friendly. I made a master file, "typedefs.hh", which declared most things and which included a number of other files. But, it turns out, it didn’t include absolutely everything — which the "all.cc" approach does. So, I made a .gch file that did include everything (not entirely trivial, as there are ordering dependencies between headers to consider) and re-ran my test. This did speed up the PCH case — but the all.cc case is still much faster. A look at the graph here shows the same thing, namely that the PCH build is spending a lot of time in the parser.

I didn’t look much deeper than this. I’ve learned that PCH is not excellent even in its best case, but I don’t know why. Given PCH’s many restrictions, I don’t think it is too interesting to pursue this further. The approach I’m investigating is faster and does not have as many limitations as PCH. Also it will not require changes to your Makefiles.