Visitors and Multimethods

gcjx uses a simple version of the Visitor pattern for
code generation. I’ve been thinking about this a bit lately, as
experience with gcjx and random discussions with Graydon have been
tweaking my interest in language design.

For those who don’t know, visitors are basically a way to achieve
dispatch on the dynamic type of an argument to a method. This is
very handy for doing things like walking the model of a program that
is built up inside a compiler.

In gcjx this takes a very simple form. There is an abstract
visitor base class which has one abstract method for each object in
the model, like:

class visitor {
  virtual void visit_block (model_block *,
			    const std::list<ref_stmt> &) = 0;
  ...
};

The arguments here are ad hoc, according to the particular object
being visited (it need not be done this way, but it was convenient
for gcjx).

Then each class in the model has its own visit method:

class model_block {
  void visit (visitor *v) {
    v->visit_block (this, statements);
  }
};

As you can see this results in a straightforward way to achieve
multiple dispatch. You simply call the visit method on
any element of the model, and the appropriate method in your visitor
will be called.

One nice thing about this approach is that the compiler will tell
you if your visitor is incomplete, since that can only happen if you
didn’t implement some abstract method. This also means it is easy to
add a new class to the model — all existing visitors will break,
making it simple to figure out where to add new methods.

The downside of this approach is that it is inflexible in a few
ways. For instance, consider the tree-generating back end in gcjx.
When compiling to trees, we want to build a new GCC tree object
representing each object in model of the program. So, the obvious
way to do that would be to have the visit method
return a tree.

This is unsatisfactory, though, because it means you have to
modify every class in the model to allow this. This in turn means
that the declaration of tree must be visible globally —
it can no longer be segregated to a single back end. Of course this
could be worked around; e.g., visit could return
void*… but then you lose type safety and have to add
casts all over.

Another approach to this problem is multi-methods, which means
doing dispatch on the runtime type of the arguments. This way you can
use generic functions instead of visitors, and then easily add new
kinds of visitors without modifying the classes in the model.

C++ doesn’t directly support this, though apparently it can
be done
. One drawback I do see here is that it doesn’t seem
possible to determine when you haven’t written a method. The
compiler, seemingly, can’t tell you… a classic sort of
static/dynamic tradeoff. I’m not really all that familiar with
existing multimethod implementations, maybe there is some nice way to
inform compilers of one’s intent here.

A third approach, taken in GCC, is to simply switch
on the type of the object. One advantage of this approach is that it
is often simpler to keep track of local state — you can write
iterative code instead of recursive code in some places, you don’t
have to invert a lot of logic to put things in separate functions,
etc. This also suffers from the problems that arise if you add a new
class.

Coding styles that substitute programmer discipline for compiler
errors don’t seem to work that well for me. The ideal approach would
look somewhat like multimethods, but would let me have the compiler
check self-imposed constraints about which methods must exist.

First Edit

Today I got my first simple edit working in medi8. This just means that now
you can drag video clips onto tracks and, if they overlap, you can
make a simple cut between them.

For some reason whenever I talk about progress in a program I find
that I wind up talking about everything that isn’t working yet. Right
now I’m resisting that urge.

The Life Aquatic with Steve Zissou

I liked this movie in the end. Anderson usually makes interesting
sets, and this movie was no exception. Also there were nice little
“magical realism” tidbits put in here and there. This was really
much, much better than The Royal Tenenbaums (which I hated). Bill
Murray, who is probably everybody’s favorite actor by now (I’ve been a
fan since The Razor’s Edge), was as good as usual. Cate Blanchett was
good, and Willem Dafoe stood out for me. Many of Anderson’s movies
fail to engage me… his style doesn’t usually leave me empathizing
with the characters very much. This movie was different, kind of
surprise. I recommend it.

Meet the Fockers

Not as funny as Meet the Parents, but still only moderately
disappointing. Due to the usual marketing thing, Zoolander was on TV
recently. It seemed funnier the second time and sort of primed me for
Fockers. In a way I suppose it makes sense to go watch sequels just
so you can occasionally be surprised by a good one. Thin rationale,
that.

Lots more gcjx hacking

I’ve been doing a lot more gcjx hacking over break. It is now
down to around 180 Jacks failures (still some definite assignment
stuff to clean up). I’ve been compiling Classpath using it quite
regularly, and finally checked in a Classpath build patch to make it
simple to try this out.

My kind-of-wacky plan of using fields that define
operator() to make them act like methods was successful
recently. When I refactored warning handling to add support for
SuppressWarnings, I turned all those fields into real methods and
pushed them into a superclass (which is then reused elsewhere). This
required no changes to any users, which of course was the goal. It’s
all about the textual form.

Sometimes I think refactoring is over-hyped. One problem with it
is that you have to hope that either the IDE writer thought to include
the refactoring you need, or that you can hack the IDE to provide it
(assuming it is difficult enough to do that it warrants this). A more
flexible approach would combine something like pattern matching on the
AST with Emacs keyboard macros. That way you could use semantic
information to find the areas to change (useful), but then use plain
old text hacking to do the actual modification.

I plan to spend the rest of vacation lazing about and occasionally
hacking on medi8. I’d like to be able to actually edit something
before work starts again.

gcjx bug fixing

I went through gcc bugzilla and counted PRs that are fixed by
gcjx. I came up with 72, though a few of those are maybes. Chances
are, if you have found a front end bug in gcj, it is already fixed in
gcjx.

This weekend I also changed gcjx to optionally run the bytecode
verifier on the bytecode it creates, before writing it out. This
caught a couple of code generation bugs (and a funny buglet in the
verifier — it failed to special-case Object.<init>).

gcjx update

Annotations basically work, reading 1.5 class files (i.e.,
Signature parsing) works, annotations are generated in
the bytecode output, and SuppressWarnings mostly works.
That’s a lot for the last couple of weeks.

Elliott Hughes has been doing some profiling to find out why gcjx
is slow. It looks like there are some relatively simple ways to speed
it up. As guessed, the parser is a trouble spot.

All that seems to remain for 1.5 are generic methods. I had
written a lot of the unification code, but then deleted it by mistake.
:-(. It turns out not to be so hard to write though.

It is kind of demoralizing to work on an insane gcj
PR
, all the while knowing that gcjx is a hundred times simpler to
hack on, and this bug doesn’t exist there anyway. I wonder how many
PRs gcjx will fix when it goes in… fifty? One hundred?

Ocean’s 12

First, what’s up with the remake fever Hollywood seems to have
caught? They’ve remade everything except Zardoz and My Dinner With
Andre.

Ocean’s 12 was ok, though in place of old-style slick and cool
they decided to substitute dialog that ends mid-sentence. That’s not
great… and then the zooming around in time didn’t work so well and
the plot wasn’t nearly as engaging as in 11. All in all, a
disappointment. Prediction: no number 13.

Zatoichi

I must be the last person around to see this movie, because
whenever I tell people about it they just nod and agree that it was
really great. Plot-wise it is pretty basic, about a blind samurai who
is reminiscent of characters Eastwood played in spaghetti westerns.
It is a little gory, but the blood seemed to be computer generated and
was slightly off, color-wise, so I didn’t have to close my eyes or
anything. There are some nice twists, the film itself is beautiful,
and there are a couple of short “poetic” moments that reminded me of
the final scene (which I love) in The Seven Samurai.

San Francisco

I just got back from a short vacation in San Francisco, visiting
an old friend of Elyn’s. We did a lot of walking. I seem to mostly
go on vacations that drain me more than they refresh; San Francisco is
so visually stimulating and interesting architecturally that I spent
most of my time silently gawking. This gets hard after a couple of
days. One notable building: some bank building downtown which is
basically a roman corinthian temple. Wonderful.

SF also has a weird effect on my brain. I find myself thinking
crazy thoughts like, “now I must start a corporation and make ten
million dollars”, or “I must buy and renovate a turn of the century
mansion”. If I moved there I would probably never sleep again.