Archive for April, 2006

gcj optimization pass

Today I wrote a GCC optimizer pass. The new pass is specific to
gcj and does a simple form of devirtualization. The idea here is
that if we have some extra information about a method call, we can
turn an indirect virtual call into a direct call.

My pass does this in one particular case. If the “receiver”
object of a virtual call was allocated with new in the
current method, then we know its exact type, and we can devirtualize.
This is conceptually trivial on the SSA form.
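
To make the case concrete, here is a minimal Java sketch of the kind of call the pass targets. The class names (Shape, Circle, DevirtExample) are hypothetical and chosen only for illustration; this is not code from the pass or from libgcj.

```java
// Illustrative only: Shape, Circle, and DevirtExample are hypothetical names.
abstract class Shape {
    abstract double area();
}

class Circle extends Shape {
    private final double r;

    Circle(double r) {
        this.r = r;
    }

    double area() {
        return Math.PI * r * r;
    }
}

public class DevirtExample {
    // The receiver is allocated with `new` in this method, so its exact
    // type (Circle) is known here. A devirtualizing pass could turn the
    // virtual call s.area() into a direct call to Circle.area().
    static double known() {
        Shape s = new Circle(1.0);
        return s.area();
    }

    // The receiver arrives as a parameter, so its exact type is not known
    // in this method and the call has to stay virtual.
    static double unknown(Shape s) {
        return s.area();
    }

    public static void main(String[] args) {
        System.out.println(known());
        System.out.println(unknown(new Circle(2.0)));
    }
}
```

Only known() qualifies for the case described above; unknown() would need some other source of type information.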

I wrote this pass since I had been playing with similar code for
my LLVM-based JIT, and I wanted to compare LLVM and GCC here — I’d
never written a GCC optimization pass and was curious about the
effort involved.

It turns out to be simple. This pass is about 200 lines of code.
And, when building libgcj, there were more than 6000 cases where it
triggered. I’m encouraged by this and now I’m considering writing
more gcj-specific optimization passes. Some ideas:

  • “Strength reduce” interface calls to ordinary virtual calls.
    The only difficulty here is in the bookkeeping; it could be
    inserted into the current pass.
  • Special handling for StringBuffer and StringBuilder.
  • A simple gcj-specific VRP-like pass, to handle array bounds
    checks, null pointer checks, redundant checkcast calls,
    and the like.
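
As a rough illustration of the source patterns these ideas are aimed at, here is a small Java sketch. The class and method names are hypothetical, and the comments describe what such passes might be able to prove, not anything gcj is known to do today.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: PassTargets and its methods are hypothetical names.
public class PassTargets {
    // Interface call whose receiver was just allocated: the exact class
    // (ArrayList) is known, so the interface dispatch could in principle
    // be reduced to a cheaper form of call.
    static int interfaceCall() {
        List<String> list = new ArrayList<String>();
        list.add("x");
        return list.size();
    }

    // A StringBuffer that never escapes the method, the sort of use a
    // special-cased pass might look for.
    static String concat(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a).append(b);
        return sb.toString();
    }

    // The cast cannot fail after the instanceof test, so the checkcast
    // is redundant.
    static int redundantCast(Object o) {
        if (o instanceof String) {
            String s = (String) o;
            return s.length();
        }
        return -1;
    }

    // The loop condition already bounds i to [0, a.length), so the
    // per-access array bounds check carries no new information.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }
}
```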

I guess some of these would make ok SoC projects.

LLVM JIT now accessible

I got a nice bug report about the LLVM JIT from Haren Visavadia
the other day; his one short test case found 4 or 5 bugs.

I decided to stop hoping that sourceforge would start working
well, and instead I just moved the JIT cvs repository to sourceware.
It is now on sourceware.org, repository /cvs/rhug, module
gcj-jit. Instructions on how to build it are included.
You can also see it via cvsweb now.

If you give it a try, please drop me a line, especially if you hit
a bug.

LLVM Update

Last night I found the buglet in the JIT preventing “hello world”
from working. Now it is time to start more serious testing; first the
libgcj test suite and then Mauve.

LLVM Thoughts

On Friday I translated my libjit-based JIT to use LLVM.
This took a good part of the day; then I spent a chunk of Saturday
debugging it.

LLVM has a few drawbacks, as compared to libjit. There’s not
really any documentation for how to use LLVM as a JIT, so I ended up
reading the header files quite a bit; libjit is much better here.
LLVM’s API is quite a bit bulkier than libjit’s, and it is more
idiosyncratic. For instance, in LLVM many objects can have a name,
and many classes require a name in their constructors; this seems a
bit bloaty in a JIT context — but I didn’t measure it. Finally, LLVM
is installed strangely; it is mostly static libraries, but with a few
random object files thrown in for good measure. This is unfriendly to
say the least… also, link times with LLVM are much longer than with
libjit, reducing my efficiency.

Some of these I would like to see fixed — either in LLVM or in
whatever ends up, someday, in GCC. Names could perhaps be handled
optionally. Other oddities in the interface could be fixed (not that
I have a list or anything…). Shared libraries should be made.

All this is gas, though, in a way. LLVM is generally more
functional than libjit: it has many more ports, more optimizers, and a
friendlier license. It probably represents a better long-term
approach.

With a little help on irc from Chris Lattner I got the LLVM JIT
running a couple of very simple Java programs; with the optimizers
enabled it appears that LLVM correctly notices that the empty loops
are empty and removes them… so, it is working. There’s still a lot
of debugging to do (“hello world” still crashes), but this isn’t a big
deal.

Naturally, exception handling remains a problem. I’m hoping to
get Bryce or Andrew to solve that problem :-)

libjit and gcj

Last weekend I wrote a JIT for libgcj using libjit.
Well, 90% of a JIT anyway.

libjit is remarkably simple to use. It took me about a day to
write a functioning (if not completely debugged) JIT for Java
bytecode. On some microbenchmarks it was between 2 and 6 times faster
than the existing bytecode interpreter.

I’ve checked it in to the old gcjx repository… but you won’t be
able to see it; I heard that sourceforge has stopped updating its
anonymous CVS. Email me if you want a copy. The repository includes
a patch for libgcj; the needed modifications there are very minor.

Note that exception handling doesn’t work. This is somewhat hard
to do, since it requires modifying the JIT and also (probably)
patching libgcc. And…

Unfortunately libjit is pure GPL, so I doubt we’ll be including
this in libgcj, or even finishing it. Instead I think I’ll
investigate rewriting this JIT using LLVM. I’ve been thinking
of generalizing my existing patch to libgcj to make it possible to
dynamically load a JIT. That would make it easier to experiment here.

99 percent

According to our nightly JAPI run, we hit 99% of 1.4 the other day.
Finally!
We’ve been hovering above 98% for quite a while, and checking in the
patch to make stubs disappear from the JAPI score (thus making it more
accurate) didn’t help.

At this point it looks like 1.4 completion is mostly about filling
in a few missing pieces here and there, and finishing the HTML
support.

The 1.5 scores still look like a disaster, since the trunk doesn’t
have generics and the generics branch doesn’t have all the most recent
patches. We’ll be fixing this problem this year when we merge the
generics branch back to the trunk.