Archive for January, 2005

Binary Compatibility How-To

Inspired by the GCC wiki appearing on gcc.gnu.org, I wrote a short
how-to about using the new binary compatibility mode and the libgcj
database (aka caching jit) feature.

This is by far the simplest way to deploy existing applications
using gcj. It works for Eclipse, and this is how I tried out Derby
the other day. This is also the way we’re taking to get jonas to
work.

Derby and gcj

Today I gave Derby a quick try using gcj. I followed the same basic
approach as I used when compiling Eclipse.

That is, I compiled all the jar files with
-findirect-dispatch, stuck them in a libgcj class mapping
database, and ran gij using the compiled shared
libraries. This all compiled without trouble and the command-line
interpreter started without problems.

I haven’t done much testing, mostly since I don’t know much about
Derby and only had a short amount of time to play with it. Still,
looks like another easy success for gcj.

Wish lists

I updated my Eclipse wish list a little.

God of Cookery

Kurth told me about this movie about ten years ago, but I forgot
all about it until I happened to see the box on the counter at Video
Station a couple of weeks ago. Even then I had to wait since,
apparently, it is checked out frequently.

This movie — a kind of riches-to-rags-to-riches story about a
food critic slash chef — is every bit as great as K said it was all
those years ago. It is stunningly random, as if, at every juncture
while filming, the director considered the strangest next possible
direction to take. It really hit my funny bone, and I recommend it.

Garbage Collection

Casey recently wrote about the woes of garbage collection. Here’s my
unsolicited take on the subject.

The big plus for GC is that it enables better software
engineering. A bit of global information (whether or not an object
is potentially in use) is handled globally, and no particular user
module is responsible for its deallocation. This makes it much
simpler to write APIs; simply pass around objects as you like and the
system handles it.

Nothing is free, of course. You can usually expect to pay a speed
penalty with GC (though measuring how large it is may be complicated).
The presence of GC changes the programming system in other ways as
well; for instance, it ordinarily necessitates the presence of weak
references.

And, no discussion of GC would be complete without mentioning
that certain kinds of memory leaks will still persist. If you
continue to have a live reference to an object which will not be used
in the future, it won’t be collected. Explicit deallocation proponents
often erroneously point to this as a kind of GC failure, either
explicitly, or secondarily, as in “if you must explicitly null a
pointer, you might as well introduce a free() in the same
spot”. There are (at least) two points here: first, that in a GC
environment this is a local problem which can be fixed locally, and
second, the important point of GC is not that it reclaims memory, but
that it does not reclaim live memory.

That said, it is pretty easy to write C++ classes that basically
automate memory management. And, with a little planning in one’s
program, it is easy to avoid memory leaks altogether.

For gcjx, I wrote a simple “owning pointer” class that does
reference counting (you can find better ones in Boost). I’ve run into one or two
memory management bugs, mostly due to little design flaws in my API.
So, the situation in C++ really needn’t be that bad.
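
For illustration, here is a minimal sketch of what a reference-counting
owning pointer can look like. This is not the gcjx class: the name
counted_ptr and all of its members are invented for this example, it is
not thread-safe, and in real code std::shared_ptr (or the Boost
equivalent) is the better choice.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical sketch of a reference-counted owning pointer.
// The name counted_ptr is invented here; this is not the gcjx class.
template <typename T>
class counted_ptr {
public:
  counted_ptr() : ptr_(nullptr), count_(nullptr) {}

  // Take ownership of a freshly allocated object.
  explicit counted_ptr(T *p)
    : ptr_(p), count_(p ? new std::size_t(1) : nullptr) {}

  // Copying shares the object and bumps the count.
  counted_ptr(const counted_ptr &other)
    : ptr_(other.ptr_), count_(other.count_) {
    if (count_)
      ++*count_;
  }

  // Copy-and-swap: the by-value parameter holds the new reference,
  // and its destructor releases the old one.
  counted_ptr &operator=(counted_ptr other) {
    std::swap(ptr_, other.ptr_);
    std::swap(count_, other.count_);
    return *this;
  }

  // The last owner to go away deletes the object.
  ~counted_ptr() {
    if (count_ && --*count_ == 0) {
      delete ptr_;
      delete count_;
    }
  }

  T &operator*() const { assert(ptr_ != nullptr); return *ptr_; }
  T *operator->() const { assert(ptr_ != nullptr); return ptr_; }
  std::size_t use_count() const { return count_ ? *count_ : 0; }

private:
  T *ptr_;
  std::size_t *count_;
};

// Usage sketch: two owners share one int, so the count seen is 2.
inline std::size_t shared_count_demo() {
  counted_ptr<int> a(new int(42));
  counted_ptr<int> b = a;
  return a.use_count();
}
```

Plain counting like this cannot reclaim cycles, which is one reason it
suits tree-shaped data with clear ownership better than arbitrary
graphs.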

But then, gcjx is a fairly self-contained program, and the data
structures it builds are largely trees (with sole ownership). I
think the situation gets worse for explicit allocation when you start
looking at very large programs with modules over which you have
little control.

The ease of writing C++ wrapper classes is a minus, though, not a
plus, when it comes to this topic. Suppose you use a collection of
several libraries. Either you will end up using plain pointers and
missing out on the benefits of C++, or you’ll have to find ways to mix
and match various ownership approaches, potentially a fragile affair.
This is one of the big benefits of GC as I see it: not the technology
per se, but the API unification it implies across libraries.

Of course, the real reason I like GC is that I’m just lazy and it
makes the hacking go quicker.

gcjx now in gcc

A few days ago I finally moved gcjx development from sourceforge
to gcc.gnu.org. The branch is named gcjx-branch. It
isn’t fully hooked up to the build system yet, but you can build the
gcjx directory standalone and have a bytecode compiler.

I also recently ran jacks tests of both gcj and gcjx. The
results are overwhelmingly in gcjx’s favor:

        Total  Passed  Skipped  Failed
gcjx:    4928    4711       45     172
gcj:     4928    4166       44     718

What’s funny is that their failures don’t overlap very much, and yet
they both manage to compile all of Classpath. Partly this can be
explained by the fact that compilers tend to do better on correct
code than incorrect code, but partly I just observe that even a
fairly buggy java compiler is still useful.

Andrew points out that, of course, gcjx will come with its own new
undiscovered bugs as well — and he said that without even looking at
the incomplete tree-generating back end. Still, at this point we seem
to have a lot of interesting code out there to use as test cases; I’m
sure at merge time (I think optimistically it will be sometime this
year) we’ll have confidence in the result.

Eclipse in Fedora

A gcj-compiled Eclipse RPM is now in Fedora Core Rawhide; for
instance look for “eclipse” here.
This hasn’t shown up in the FC info feed yet, but it
should soon. Thanks to Andrew Overholt, Tom Fitzsimmons, Bryce
McKinlay (and probably others) for getting this all running.

Visitors and Multimethods

gcjx uses a simple version of the Visitor pattern for
code generation. I’ve been thinking about this a bit lately, as
experience with gcjx and random discussions with Graydon have been
piquing my interest in language design.

For those who don’t know, visitors are basically a way to achieve
dispatch on the dynamic type of an argument to a method. This is
very handy for doing things like walking the model of a program that
is built up inside a compiler.

In gcjx this takes a very simple form. There is an abstract
visitor base class which has one abstract method for each kind of
object in the model, like:

class visitor {
public:
  virtual ~visitor () {}
  // One pure virtual method per kind of object in the model.
  virtual void visit_block (model_block *,
                            const std::list<ref_stmt> &) = 0;
  ...
};

The arguments here are ad hoc, according to the particular object
being visited (it need not be done this way, but it was convenient
for gcjx).

Then each class in the model has its own visit method:

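class model_block {
  std::list<ref_stmt> statements;
public:
  // Dispatch to the visitor method matching this object's type.
  void visit (visitor *v) {
    v->visit_block (this, statements);
  }
};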

As you can see this results in a straightforward way to achieve
multiple dispatch. You simply call the visit method on
any element of the model, and the appropriate method in your visitor
will be called.

One nice thing about this approach is that the compiler will tell
you if your visitor is incomplete, since that can only happen if you
didn’t implement some abstract method. This also means it is easy to
add a new class to the model — all existing visitors will break,
making it simple to figure out where to add new methods.
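
The fragments above can be assembled into a small standalone program.
This is only a sketch under some assumptions: the base class
model_stmt, the second model class model_return, and the printer
visitor are made up for the example, and ref_stmt is taken to be a
std::shared_ptr, which may not match what gcjx actually uses.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>

class model_stmt;
class model_block;
class model_return;  // hypothetical second model class

// Assumption: ref_stmt is some owning pointer; shared_ptr stands in.
typedef std::shared_ptr<model_stmt> ref_stmt;

class visitor {
public:
  virtual ~visitor() {}
  virtual void visit_block(model_block *,
                           const std::list<ref_stmt> &) = 0;
  virtual void visit_return(model_return *) = 0;
};

// Hypothetical common base so that blocks can hold any statement.
class model_stmt {
public:
  virtual ~model_stmt() {}
  virtual void visit(visitor *v) = 0;
};

class model_return : public model_stmt {
public:
  void visit(visitor *v) override { v->visit_return(this); }
};

class model_block : public model_stmt {
public:
  void add(const ref_stmt &s) { statements.push_back(s); }
  void visit(visitor *v) override { v->visit_block(this, statements); }

private:
  std::list<ref_stmt> statements;
};

// A concrete visitor that renders the model as text. If one of the
// two visit_* methods were missing, printer would stay abstract and
// the declaration "printer p;" below would fail to compile.
class printer : public visitor {
public:
  std::string out;

  void visit_block(model_block *,
                   const std::list<ref_stmt> &stmts) override {
    out += "{ ";
    for (const ref_stmt &s : stmts) {
      assert(s != nullptr);
      s->visit(this);  // second dispatch, on the statement's type
    }
    out += "} ";
  }

  void visit_return(model_return *) override { out += "return; "; }
};

// Usage sketch: a block holding a return and a nested empty block.
inline std::string print_example() {
  auto inner = std::make_shared<model_block>();
  auto outer = std::make_shared<model_block>();
  outer->add(std::make_shared<model_return>());
  outer->add(inner);
  printer p;
  outer->visit(&p);
  return p.out;
}
```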

The downside of this approach is that it is inflexible in a few
ways. For instance, consider the tree-generating back end in gcjx.
When compiling to trees, we want to build a new GCC tree object
representing each object in the model of the program. So, the obvious
way to do that would be to have the visit method
return a tree.

This is unsatisfactory, though, because it means you have to
modify every class in the model to allow this. This in turn means
that the declaration of tree must be visible globally —
it can no longer be segregated to a single back end. Of course this
could be worked around; e.g., visit could return
void*… but then you lose type safety and have to add
casts all over.
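
Another common workaround, shown here only as a sketch and not as what
gcjx does, is to keep visit returning void and have the visitor store
its "return value" in a member. The model classes then never mention
the result type. Everything below is hypothetical: fake_tree stands in
for GCC's tree, and the expression classes are invented to keep the
example small.

```cpp
#include <cassert>
#include <string>

class model_expr;
class model_literal;  // hypothetical model classes
class model_negate;

class visitor {
public:
  virtual ~visitor() {}
  virtual void visit_literal(model_literal *, int value) = 0;
  virtual void visit_negate(model_negate *, model_expr *operand) = 0;
};

class model_expr {
public:
  virtual ~model_expr() {}
  virtual void visit(visitor *v) = 0;  // still returns void
};

class model_literal : public model_expr {
  int value;

public:
  explicit model_literal(int v) : value(v) {}
  void visit(visitor *v) override { v->visit_literal(this, value); }
};

class model_negate : public model_expr {
  model_expr *operand;  // not owned, to keep the sketch short

public:
  explicit model_negate(model_expr *e) : operand(e) {}
  void visit(visitor *v) override { v->visit_negate(this, operand); }
};

// Stand-in for GCC's tree type; only this back end ever sees it.
struct fake_tree {
  std::string text;
};

class tree_builder : public visitor {
  fake_tree result;  // "return value" of the most recent visit

public:
  // Visit e and hand back whatever the visit left in result.
  fake_tree build(model_expr *e) {
    assert(e != nullptr);
    e->visit(this);
    return result;
  }

  void visit_literal(model_literal *, int value) override {
    result.text = std::to_string(value);
  }

  void visit_negate(model_negate *, model_expr *operand) override {
    fake_tree inner = build(operand);  // copy before overwriting result
    result.text = "(neg " + inner.text + ")";
  }
};

// Usage sketch: build a "tree" for the negation of the literal 7.
inline std::string build_example() {
  model_literal seven(7);
  model_negate neg(&seven);
  tree_builder builder;
  return builder.build(&neg).text;
}
```

The cost is that the result travels through mutable state, so each
visit method has to remember to set it.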

Another approach to this problem is multi-methods, which means
doing dispatch on the runtime type of the arguments. This way you can
use generic functions instead of visitors, and then easily add new
kinds of visitors without modifying the classes in the model.

C++ doesn’t directly support this, though apparently it can be
done. One drawback I do see here is that it doesn’t seem
possible to determine when you haven’t written a method. The
compiler, seemingly, can’t tell you… a classic sort of
static/dynamic tradeoff. I’m not really all that familiar with
existing multimethod implementations; maybe there is some nice way to
inform compilers of one’s intent here.
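
To make the tradeoff concrete, here is one hedged sketch of how
multimethods can be emulated in C++: a table of handlers keyed on the
dynamic types of both arguments. The class generic_function and the
node and backend hierarchies are invented for this example, and the
lookup only matches exact types. Note how a missing method shows up as
a runtime exception rather than a compile error.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical hierarchies; both must be polymorphic so that typeid
// reports the dynamic type.
struct node { virtual ~node() {} };
struct block : node {};
struct literal : node {};

struct backend { virtual ~backend() {} };
struct bytecode : backend {};
struct trees : backend {};

// A generic function dispatching on the runtime types of both
// arguments. Exact-type match only; no inheritance-aware lookup.
class generic_function {
public:
  typedef std::function<std::string(node &, backend &)> handler;

  // Register the method for the pair (A, B).
  template <typename A, typename B>
  void add(handler h) {
    assert(h != nullptr);
    table_[key(typeid(A), typeid(B))] = std::move(h);
  }

  // Look up the method for the dynamic types of a and b.
  std::string call(node &a, backend &b) const {
    auto it = table_.find(key(typeid(a), typeid(b)));
    if (it == table_.end())
      throw std::logic_error("no method for this pair of types");
    return it->second(a, b);
  }

private:
  typedef std::pair<std::type_index, std::type_index> key_type;

  static key_type key(const std::type_info &a, const std::type_info &b) {
    return key_type(std::type_index(a), std::type_index(b));
  }

  std::map<key_type, handler> table_;
};

// Usage sketch: (literal, trees) is deliberately left unregistered.
inline generic_function make_generate() {
  generic_function gen;
  gen.add<block, bytecode>(
      [](node &, backend &) { return std::string("block/bytecode"); });
  gen.add<literal, bytecode>(
      [](node &, backend &) { return std::string("literal/bytecode"); });
  gen.add<block, trees>(
      [](node &, backend &) { return std::string("block/trees"); });
  return gen;
}
```

New operations and new back ends can be added without touching the
model classes, but nothing here tells you at compile time that the
(literal, trees) case was forgotten.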

A third approach, taken in GCC, is to simply switch
on the type of the object. One advantage of this approach is that it
is often simpler to keep track of local state — you can write
iterative code instead of recursive code in some places, you don’t
have to invert a lot of logic to put things in separate functions,
etc. This approach also has trouble when you add a new class, since
nothing points you at the switch statements that need updating.
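
A small sketch of the switch style follows. It is loosely modelled on
the idea of a type code stored in each object, but the names
(node_kind, node, evaluate) are invented and this is not GCC code. It
shows both the convenience of iterating with local state and the
weakness when a new kind appears.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical type codes. KIND_STRING plays the role of a kind
// added later that evaluate() was never taught about.
enum node_kind { KIND_LITERAL, KIND_NEGATE, KIND_STRING };

struct node {
  node_kind kind;
  int value;            // used by KIND_LITERAL
  const node *operand;  // used by KIND_NEGATE
};

// Evaluate a chain of negations ending in a literal. The sign is
// plain local state, and nested negations are handled by looping
// instead of recursing.
inline int evaluate(const node *n) {
  int sign = 1;
  for (;;) {
    assert(n != nullptr);
    switch (n->kind) {
    case KIND_NEGATE:
      sign = -sign;
      n = n->operand;
      continue;  // next iteration of the for loop
    case KIND_LITERAL:
      return sign * n->value;
    default:
      // A newly added kind lands here, and only at run time.
      throw std::logic_error("unhandled node kind");
    }
  }
}

// Usage sketch: the double negation of 5 evaluates to 5.
inline int evaluate_example() {
  node lit = {KIND_LITERAL, 5, nullptr};
  node neg1 = {KIND_NEGATE, 0, &lit};
  node neg2 = {KIND_NEGATE, 0, &neg1};
  return evaluate(&neg2);
}
```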

Coding styles that substitute programmer discipline for compiler
errors don’t seem to work that well for me. The ideal approach would
look somewhat like multimethods, but would let me have the compiler
check self-imposed constraints about which methods must exist.

First Edit

Today I got my first simple edit working in medi8. This just means that now
you can drag video clips onto tracks and, if they overlap, you can
make a simple cut between them.

For some reason whenever I talk about progress in a program I find
that I wind up talking about everything that isn’t working yet. Right
now I’m resisting that urge.

The Life Aquatic with Steve Zissou

I liked this movie in the end. Anderson usually makes interesting
sets, and this movie was no exception. Also there were nice little
“magical realism” tidbits put in here and there. This was really
much, much better than The Royal Tenenbaums (which I hated). Bill
Murray, who is probably everybody’s favorite actor by now (I’ve been a
fan since The Razor’s Edge), was as good as usual. Cate Blanchett was
good, and Willem Dafoe stood out for me. Many of Anderson’s movies
fail to engage me… his style doesn’t usually leave me empathizing
with the characters very much. This movie was different, kind of a
surprise. I recommend it.

Meet the Fockers

Not as funny as Meet the Parents, but still only moderately
disappointing. Due to the usual marketing thing, Zoolander was on TV
recently. It seemed funnier the second time and sort of primed me for
Fockers. In a way I suppose it makes sense to go watch sequels just
so you can occasionally be surprised by a good one. Thin rationale,
that.