Dark Tort versus The Little Sister

A while ago I read The Little Sister, by Raymond Chandler, and also Dark Tort, by Diane Mott Davidson.

I had sworn not to read any more of Davidson’s books, but, like Marlowe, I let boredom and angst get the better of me. It was just sitting there, on Elyn’s side of the bed, promising relief. “Look at me,” said the cover. “I am not the terrible books you have already read. I am different.”

Naturally it lied.

I tried to picture Marlowe living in Aspen Meadows, working for a caterer. Anything to make it through the book, the reading of which, for some reason, had become like a duel. I could best Davidson: her bad writing, her undistinguished observations of Colorado life.

She tried to wear me down. First she had all the characters phrase statements as questions? Over and over? As if she had learned a new writing trick? I persevered.

Next she enumerated the many ultimate comfort foods — a specialized torture which had successfully broken George Will. I was stronger than that, more flexible. I can accept that Apple Betty is the ultimate one day, but Mac and Cheese the next. I have three ultimate comfort foods before breakfast.

Wily, evil Davidson tried repetition as well. Perhaps she could lull me into complacency with warm, fresh bread. Never just bread, only warm, fresh bread, a mantra to destroy my reading skills.

But what drove me to picturing Marlowe was a vignette about Boulderites. It’s as if she were writing for me, trying to probe my pet peeves. We Boulderites are flighty. We’re paranoid. We think that garbage trucks are evil. We’re little old ladies. Sure, Boulder has its whatevers and et ceteras; but wasn’t Traven rumored to live on Spruce Street? That should be dark enough for anybody.

Someday, I hope, Davidson will lose it a little and write her own anti-novel, something that will annihilate her previous work. We’ll see Aspen Meadows as it truly is: perhaps a corrupt small town with a Machiavellian caterer pulling the strings. Marlowe will move there from Los Angeles to cure his vapors, and proceed to confront the yokel sociopaths and fight and shoot his way through the cafes and dog-washing businesses. Someday.

Miss Pettigrew Lives for a Day

We had read a lukewarm review of this on IMDb, but we went anyway — and loved it. The plot is a bit thin, perhaps, but the movie has a sweet heart, the cast is good, and the sets and costumes are fantastic.

Gold is released

Ian Taylor checked in the long-awaited “gold”. Gold is a new ELF-only linker written in C++. It is designed for performance and is much faster than the current binutils ld.

I’m very happy about this for a few reasons. First, we’ve needed a new linker for a long, long time. Second, this will help the incremental compiler.

I looked through the gold sources a bit. I wish everything in the GNU toolchain were written this way. It is very clean code, nicely commented, and easy to follow. It shows pretty clearly, I think, the ways in which C++ can be better than C when it is used well.

Congratulations, Ian!

Compile Server Scalability

There are a few aspects to compile server scalability that are important to address.

First, and most obviously, memory use. Because we want to be able to send real programs through the compile server, and because we want it to remain live for relatively long periods of time, it is important that memory use be “acceptably bounded”. Naturally, the server process will grow with each additional compilation unit. At least in the straightforward implementation, there’s no way around that (but see below). However, it is important that the server not leak memory, and that recompilations generally not increase memory use. Also, ideally, all that work on decl sharing will keep memory use in check.

For the most part, this did not take any effort to achieve. GCC has a built-in garbage collector, and most nontrivial data structures are allocated using the GC. This is not a silver bullet, of course, but it has yielded good results with little effort in practice.

In the case of recompilation, we employ a simple heuristic — we store all parsed hunks keyed off the name of the requested object file (note: not the input file; it is common for a project to compile a given source file multiple times, but it is rare to see the same object file name more than once). When recompiling an object, we assume that there will be a lot of reuse against the object’s previous version, so we store those hunks temporarily, but then discard the old ones at the end of compilation. This way, we reuse, but we can also free hunks which are no longer in use.
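To make the heuristic concrete, here is a minimal sketch in Python. The server itself is C code inside GCC, so none of these names are real; they only illustrate the reuse-then-discard bookkeeping described above.

```python
# Hypothetical sketch of the hunk-reuse heuristic; all names invented.

class HunkCache:
    def __init__(self):
        # Parsed hunks, keyed by the *object file* name (not the input
        # file): the same source may be compiled many times, but an
        # object file name rarely repeats.
        self.hunks_by_object = {}
        self.old_hunks = {}
        self.new_hunks = {}
        self.object_file = None

    def begin_compilation(self, object_file):
        # Keep the previous compilation's hunks around; recompiling an
        # object usually reuses most of them.
        self.object_file = object_file
        self.old_hunks = self.hunks_by_object.get(object_file, {})
        self.new_hunks = {}

    def lookup(self, key):
        # Reuse a hunk from this or the previous compilation, if any.
        hunk = self.new_hunks.get(key)
        if hunk is None:
            hunk = self.old_hunks.get(key)
            if hunk is not None:
                self.new_hunks[key] = hunk
        return hunk

    def add(self, key, hunk):
        self.new_hunks[key] = hunk

    def end_compilation(self):
        # Hunks that were not reused are dropped here, so repeated
        # recompilation does not grow the cache.
        self.hunks_by_object[self.object_file] = self.new_hunks
        self.old_hunks = {}
```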

Results from a few tests are very encouraging here. I compiled gdb with the compile server, then deleted the object files and re-compiled. Memory use (as reported by -fmem-report) stayed flat at around 51M — meaning that recompilation doesn’t grow the image, and the collection approach is working as desired.

I also built gdb using the compiler in “normal” mode, and looked at the -fmem-report totals. If you sum them up, which I naively expect gives a rough idea of how much memory --combine would use, you get 1.2G. Or, in other words, decl sharing appears to make a huge difference (I’m not completely confident in this particular number).

If memory use does become a problem for very large compiles, we could look at scaling another way: writing out hunks and reading them back in. Maybe we could use machinery from the LTO project to do this. This would only be useful if it is cheaper to read decls via LTO than it is to parse the source; if this is not cheaper then we could instead try to flush out (and force re-parsing of) objects which are rarely re-used. One special case of this is getting rid of non-inlineable function bodies — when we have incremental code-generation, we’ll never compile a function like that more than once anyway.
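If we went the flushing route, the bookkeeping might look something like this sketch (again hypothetical, with an invented reuse_count field): hunks that rarely pay for themselves get dropped, and are simply re-parsed if they are ever needed again.

```python
from dataclasses import dataclass, field

@dataclass
class Hunk:
    decls: list = field(default_factory=list)
    reuse_count: int = 0   # bumped each time the hunk is reused

def flush_cold_hunks(hunks_by_object, min_reuse=2):
    # Keep only hunks reused at least min_reuse times; anything colder
    # is discarded and will be re-parsed on demand.
    for object_file, hunks in hunks_by_object.items():
        hunks_by_object[object_file] = {
            key: hunk for key, hunk in hunks.items()
            if hunk.reuse_count >= min_reuse
        }
```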

Another scalability question is how to exploit multiple processors, either multi-core machines or compile farms. In an earlier post, I discussed making the compile server multi-threaded. However, that interacts poorly with our code generation approach (fork and do the work in the child), so I am probably not going to pursue it. Instead, for the multi-core case, it looks straightforward to simply run multiple servers — in other words, you would just invoke “gcc --server -j5”. Something similar can be done for compile farms.
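The dispatch side of that is easy to picture. Here is a hedged Python sketch of the scheduling idea only; the real driver would live in the gcc wrapper, and the worker here just prints instead of talking to a long-lived server process.

```python
# Hypothetical sketch: one server per core, units dealt out round-robin.
import multiprocessing

def compile_unit(job):
    server_id, source = job
    # A real worker would hand its unit to its own compile server;
    # here we only show which server would get which unit.
    print("server %d: compiling %s" % (server_id, source))

def compile_all(sources, jobs=5):
    work = [(i % jobs, source) for i, source in enumerate(sources)]
    with multiprocessing.Pool(jobs) as pool:
        pool.map(compile_unit, work)

if __name__ == "__main__":
    compile_all(["a.c", "b.c", "c.c", "d.c"], jobs=2)
```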

An ideal result for this project would be for small changes to result in compilation times beneath your perceptual threshold. I doubt that is likely to happen, but the point is, the absolute turnaround time is important. (This is not really a question of scalability, but I felt like talking about it anyway.)

In the current code, though, we always run the preprocessor for any change. So, even once incremental code generation is implemented, the turnaround time will be bounded below by the time it takes to preprocess the source. This might turn out to be a problem.

In an earlier design (and in some other designs I have heard of), this is handled by making a model of compilation that includes preprocessing. That seems too complicated to me, though, and instead I think that it should be possible to also make an incremental preprocessor (say, one that uses inotify to decide what work must be re-done), and then use it without excessive cooperation from the parser.
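As a sketch of the invalidation logic I have in mind, using file modification times as a crude stand-in for inotify (none of this is code from the server; all names are invented):

```python
import os

_pp_cache = {}  # source file -> (dependency mtimes, preprocessed text)

def _stamps(source, includes):
    # A real implementation would let inotify report exactly which
    # files changed; stat'ing every dependency stands in for that.
    return tuple(os.stat(path).st_mtime for path in [source] + includes)

def preprocess(source, includes, run_cpp):
    stamps = _stamps(source, includes)
    cached = _pp_cache.get(source)
    if cached is not None and cached[0] == stamps:
        return cached[1]       # nothing changed; reuse the old output
    output = run_cpp(source)   # something changed; preprocess again
    _pp_cache[source] = (stamps, output)
    return output
```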

Python and Gdb

Recently I’ve been hacking on integrating Python scripting support into gdb. For years now I’ve been wanting better scripting in gdb, but until I saw Volodya’s patch I never did anything about it. So, the other night I made a git repository (thanks gitorious!) and started hacking away. Thiago Bauermann did some nice updates on Volodya’s value-inspecting code, too.

A decent number of things are working. See the wiki page for details on cloning the repository.
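To give a flavor of what Python in gdb buys you, here is a small pretty-printer sketch. It is written against the interface gdb eventually settled on (a lookup function on gdb.pretty_printers and a to_string method); the code in this early repository certainly differed, and struct point is just a made-up example type.

```python
import gdb

class PointPrinter:
    """Render a `struct point` value as (x, y)."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # gdb.Value supports indexing by field name.
        return "(%s, %s)" % (self.val["x"], self.val["y"])

def point_lookup(val):
    # `tag` is the struct/union tag, e.g. "point" for `struct point`.
    if val.type.strip_typedefs().tag == "point":
        return PointPrinter(val)
    return None

gdb.pretty_printers.append(point_lookup)
```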

Since I basically live in Emacs nowadays, I wanted to install the Python documentation in info form. Am I the only person who still loves info? It isn’t beautiful, to be sure, but it is amazingly convenient inside Emacs — simple to navigate, call up, and dismiss; with info-lookup it can function as low-rent context-sensitive help; no messy fussing with the mouse.

Anyway, I couldn’t find this readily available anywhere, so in the end I checked out Python myself and built the docs. That was sort of a pain… I’m half considering making an ELPA package out of the info pages. Come to think of it, there are probably a number of potential info-only packages out there.

Tools Happenings

There are some fun things going on in the tools arena right now.

Do you read Taras Glek’s blog? He’s working on GCC Dehydra, which lets you write GCC passes in JavaScript. I think his work is one of the most interesting developments in the GCC world today.

There are a few similar projects in the works right now. The plugin branch lets you write GCC plugins; the authors of this branch have a Python plugin, but as far as I’m aware this is not publicly available.

On a slightly more idiosyncratic note, Basile Starynkevitch made a branch for his MELT project. This is also a plugin system, but it uses a Lisp dialect for which he’s written his own Lisp-to-C translator. I’m kind of partial to this idea — I think it would be fun to write parts of GCC in Lisp, at least if it compiled down to something reasonable.

I’m quite interested in pushing GCC more into source analysis and refactoring. Right now the front ends have some problems that make this difficult, but I think these are surmountable without too much trouble. Maybe when I finish this pesky incremental compiler…

With all this going on I wonder why GCC-XML is not in the tree, at least on a branch.

Vladimir Prus recently made available his prototype which integrates Python into gdb. This is promising work — we’ve needed something like this for years. Maybe we can finally print complex data structures in sensible ways.

Finally, Benjamin Kosnik has checked in a restructuring of the libstdc++ documentation. I browsed the new stuff a bit and I found it much simpler to navigate. I’m very happy about this; good, clear documentation is critical to the success of free software.

Using Quagmire

Over the last two weeks I’ve spent some time hacking on Quagmire. I’ve tried to add the features I think are most commonly needed, and I think now it is ready for early adopters to come on board. It isn’t at feature parity with Automake, but it does implement a large subset of Automake’s functionality:

  • Build and install C/C++ programs and static libraries, with Automake-style dependency tracking.
  • Initial (i.e., gcc-only) shared library support. (This is still fairly lame… “libtool in two lines of code”. My current plan here is to do something much more minimal than what libtool provides.)
  • Automatic support for the standard clean targets.
  • Install and uninstall, including DESTDIR.
  • dist and distcheck support (a bit incomplete; and FWIW this is just ripped right out of Automake).
  • Support for some minor (but standard) targets: TAGS, texinfo stuff, rebuilding configure-generated files.

It also has some features that Automake does not. One long-requested Automake feature is to be able to turn off its verbose output; Quagmire does this dynamically depending on whether -s was passed to make.

Quagmire also has some initial support for build-time function and header checks, and pkg-config support. This is not fully baked, but fun to play with. One way this is nicer than using configure is that if you add a new function to the list, the next invocation of make will run that test only.
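Quagmire implements this in GNU make, but the idea is simple to show in a Python sketch (all names hypothetical): record each finished check in a stamp file, so a newly added check is the only one without a stamp and thus the only one that runs.

```python
import os

def run_checks(checks, stamp_dir=".checks"):
    # `checks` maps a name like "HAVE_MMAP" to a function that performs
    # the test. A stamp file records a finished check, so adding a new
    # entry to the list runs only that one test on the next build.
    os.makedirs(stamp_dir, exist_ok=True)
    results = {}
    for name, test in checks.items():
        stamp = os.path.join(stamp_dir, name)
        if os.path.exists(stamp):
            with open(stamp) as f:
                results[name] = f.read() == "1"
            continue
        results[name] = bool(test())
        with open(stamp, "w") as f:
            f.write("1" if results[name] else "0")
    return results
```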

There’s some example code in the repository that shows how to use most of the features.

Currently Quagmire does some things differently from Automake. For instance, it does not use directory prefixes for things like PROGRAMS. However, I recently figured out that it could (and in some situations, like _DATA, this is really what you want), so I’m wondering whether I should change all this to follow Automake more closely. Let me know what you think.

I’ve also been wondering whether it would be appropriate to post an announcement of Quagmire on the Automake mailing list.

Charlie Bartlett

We went to an early screening (theater release Feb 22) of Charlie Bartlett last night. It was part of BIFF, playing at the Boulder Theater.

This is a pretty good teen film — funny without being a comedy. It meandered at times but I enjoyed it quite a bit. The actors are very good.

Afterward the director held a Q&A on stage. This was interesting, and even inspiring. He mentioned that this script had been in development since at least 1985, and was voted one of the best scripts not likely to be made. He also talked about his process of choosing a script (he read over 100), his filmmaking, and his move from editing to directing… he made a good enough impression that I now want to push his film a bit. So, go see it.

Soon I Will Be Invincible

Never say pomo didn’t build anything — there’s a small genre of books that take fresh looks at old stories, usually adopting the outsider point of view. Soon I Will Be Invincible is the latest such book, the story of a supervillain (Doctor Impossible) and his attempts to take over the world. I stayed up late to read this book in one sitting.

This book follows all the comic book conventions — evil genius trying to take over the world, superheroes, etc. The universe seems to be Marvel-esque; there are alternate dimensions, and magic, and aliens, and gods. Soon makes funny references to the comic book world as well, with a mention of the “golden age”, and it generally draws implicit parallels between the life stories of its characters and the evolution of comics themselves.

It reminded me of Lint in that it is a bit repetitive at times. It is a nice read, though, and I recommend it.

27 Dresses

This film was almost completely formulaic. Perhaps, someday, an enterprising quantitative movie reviewer will come up with a way to measure the deviation of a film from its genre ideal. For instance, this movie would not score a perfect 1.0, because the boy-girl fight was a bit too brief.

Don’t get the wrong impression. There are many venues where boring conformance to the norm is to be preferred — perhaps, regrettably, including programming. Romantic comedies, I suspect, are in this zone, or nearly so. Which is a way of saying that I had an ok time; at least I didn’t spend the hours crying.

While watching this I did have enough spare time to wonder whether this kind of movie represents a particular phase of an actress’ career — similar to the way that writing an editor or window manager marks a particular development in a programmer’s career. Someone out there makes a living advising aspiring actors on these topics. I find that interesting, maybe even comforting.