
NCLUG

Last week I drove up to Fort Collins to give a talk about gcj at
NCLUG. I thought it went pretty
well… I gave an updated version of my old talk from FOSDEM 2004, but
then deleted the slides by mistake when I was trying to upload them.
The problem with my computer (and me!) assuming that I’m a power user
is that, occasionally and unpredictably, I am not.

Afterward a bunch of us went next door for Chinese food. I talked
to Evelyn from tummy.com a bit.
Apparently Fedora has let them retire KRUD, a local RH-based
distro. From the KRUD page it isn’t clear if this is a plus or a
minus, but in my mind it is a plus — it means Fedora is successfully
addressing needs that were not addressed by the old Red Hat
Linux.

Evelyn also had an experience similar to mine — and everybody’s,
I suppose — when installing linux for desktop use. I can’t just
install Fedora, I must also download flash (mozilla makes this easy,
but of course yum would be nicer), java (I didn’t on my FC5 box, but
partly because I’m keeping up appearances), and various sound and
video things. Evelyn also needed acroread, to my surprise; but
apparently only acroread can handle editing PDF forms.

Add to this the messy situation with proprietary drivers (my
laptop came with the atheros wifi stuff, which I still can’t get to
work on FC5) and the lack of ipod support, and you’d think that Linux
sucked.

I’m still hopeful though. We’ll outgrow this annoying phase.

I also learned about Night Vision for Java, a planetarium program
written by Brian Simpson (he was
sitting across from me at dinner). Apparently this runs ok if you
enable the java2d stuff in Classpath; he tried it without success
during my talk but I’m told that things are all fixed in cvs (which, I
hope, we’ll be shipping in FC6).

Finally, I got to meet Bob Proulx. Bob does a lot of stuff in
GNU-land and I had seen his
name before on the automake list, but I embarrassingly failed to
connect all the dots until after I had left. I hate those awkward
social moments. They seem to occur more often to me than to other
people.

I’ll be back in Fort Collins in a couple months to talk about
autoconf and automake. A little weird, since I haven’t worked on
these for so long.

Eclipse Plugins

I was also in Raleigh last week for a speaker training class, and
I caught up with Andrew Overholt there. We talked a bit about Eclipse
packaging, a hell we’ve both had to live in.

Whenever I think about what it was like to try to build that
thing, or its various plugins, I start thinking: why bother with this
at all? It’s just a huge mess!

But then I remember more. Of course we have to build it. We’re
building the OS, which changes. We need a reliable process from start
to finish so we can make and ship bug fixes. These are, btw, the same
reasons that open source java is needed — compatibility is desirable,
even necessary; but it is meaningless if you have no power to fix the
bugs preventing it.

As a user, it is convenient to just use the eclipse update manager
to download things. (Well, sort of convenient. The update manager UI
sucks and it has zero integration with mozilla or anything else.)
And I do use it for a number of plugins. But installing an OS
reminded me why this approach sucks — it is a lot friendlier to have
a single way to install everything. The Eclipse approach means yet
another step in setting up a machine.

I suppose one answer here is to set up a site that provides a
bridge.

I’ve often thought about making an Eclipse meta-update site, which
would mirror every plugin available. The idea here is, why bother
copying those URLs to the update manager, navigating its brainless UI
once again? Instead, let one person do this and let Eclipse users
just point at this site. (The only problem with actually doing this
is that I couldn’t think of a way to make money off it. No ad revenue
via the update manager 🙂)

Anyway, in conjunction with that I suppose you could auto-generate
RPMs from binary plugins, and from there a convenient yum repository.
This would solve the problem on the user end. Distros would still be
screwed, of course. Annoying binary distributions are the Java
standard, and Eclipse would just keep on contributing to the problem.

Happenings

So the buzz is that Sun will really actually truly free
Java sometime. Details, timeline, license, etc.: TBD.

This makes me feel very weird. I assume for a moment that it
is true and that it happens under acceptable conditions: it comes
pretty soon, it is complete, it is under a non-crazy license. On the
one hand, hallelujah! This is what we’ve wanted these 10 years.

On the other hand… I wonder what I’ll do with myself. I suppose
there are plenty of interesting things to work on. Even the Sun JDK I
suppose.

But the dislocation goes far beyond my future to-do list. What
does this mean about all the work I’ve done? Is it a waste?

I probably should’ve come up with answers to that back when we
merged libgcj into Classpath and nuked a lot of code. Sometimes I
feel bad about that process.

I do have my own answers for those questions. Everything is born,
lives for a while, and dies; our programs are no different. That they
die early or late doesn’t render them meaningless — only dead. And
meaning itself is something we bring, in interpretation; it isn’t an
intrinsic quality. Of course it is one thing to think that and
another to know.

Whew. Back to reality, we’re still hacking away on gcj. It
makes no sense to change course based on a maybe as big as this one.

Miguel’s blog pointed to a nice entry on this topic.

Danese Cooper says we’re too poorly organized, or at least
thought of that way. I think
she is using “organized” to mean “backed by IBM” or something like
that. Anyway there’s not much correspondence between that idea and
what we’ve actually done.

It is true that Harmony has been a notable winner in lining up IBM
and Intel behind it. I often think of Harmony as a consortium in the
guise of an ASF project. I suspect our failure here was our license;
but it is difficult to say whether this was really a mistake per se.

She also writes: “I’m wondering how long it will take the
various Linux distros to figure out that they can ship Harmony”.
We already know about Harmony. When shipping it isn’t a
big regression from shipping gcj, we’ll probably ship it. What does
that mean? It means that platform coverage and library coverage
matter. Meanwhile gcj remains the best free VM on my list of metrics:
platforms, performance, debuggability, and community.

gcj details

I’ve got the eclipse front end plugged into gcj here. It consists
of a new driver for ecj and a patch to the gcj specs to invoke it.
I’m debugging some .class compilation bugs that this
found, but I should be able to build everything soon. (I’ve already
built 1.5 code with it.) Next step: a branch in the gcc repository.

gnash

Last night when I couldn’t sleep I became bizarrely interested in
gnash and flash
software. First, I found the gnash source code kind of unreadable —
pretty messy. I read a bit about SWF; what a weird setup this thing
has.

A flash plugin is a classic example of what not to write
in C or C++. You end up reimplementing the world. Instead, start
with a library-rich language like java and it looks much simpler. I
found JSwiff for SWF
reading. Am I deluded when I think that this plus Java2d (and sound
and I guess JMF — yuck) plus a bit of glue would make it all happen?

Another One

Today I wrote another optimization pass for gcj. This one
collapses equivalent vtable references and array length references.
You’d think that GCC itself would do this, but there’s no way to tell
the optimizers that a given field is write-once.
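
To make the idea concrete, here is a rough source-level sketch of the kind of redundancy involved. It is only an illustration: the class and method names are hypothetical, and the real pass works on GCC's intermediate representation rather than on Java source. The assumption it leans on is the one stated above, namely that an array's length and an object's vtable never change once the object exists.

```java
// Illustrative only: all names here are hypothetical, and the actual pass
// operates on GCC's internal representation, not on Java source.
class Counter {
    private int value;
    void add(int n) { value += n; }
    int get() { return value; }
}

class WriteOnceDemo {
    // As written: data.length is consulted by every loop test and again by
    // the implicit bounds check on each data[i].
    static int sum(int[] data) {
        int total = 0;
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    // Hand-collapsed equivalent: the length is fixed when the array is
    // allocated, so a single read can serve the whole loop.
    static int sumCollapsed(int[] data) {
        final int n = data.length;
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += data[i];
        }
        return total;
    }

    // Both calls to add() dispatch through c's vtable. The object's class
    // never changes, so the second lookup could reuse the first.
    static int addTwice(Counter c, int n) {
        c.add(n);
        c.add(n);
        return c.get();
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sum(data));
        System.out.println(sumCollapsed(data));
        System.out.println(addTwice(new Counter(), 5));
    }
}
```

The two sum methods compute the same result; the second just spells out by hand what collapsing equivalent length references amounts to.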

Really I should fix GCC to do this… but writing a new pass is
easy to do, and fixing the generic code looks daunting.

The other day I also rewrote my devirtualization pass to use the
SSA propagation engine. Again, simple to do, and it improved the
results a bit.

Hacking GCC these days, while still tricky in some details, is
just enormously simpler than it was 5 years ago. Kudos to all the
tree-ssa folks who made this happen.

ecj

I spent some time this week hooking ecj up to gcj, as threatened.
I’ve got a new driver for the eclipse compiler that eases the argument
processing a bit. This is working well enough now that I was able to
successfully compile some source code using generics by running the
gcj driver.

If only I had a decent place to check this in. I wonder if the SC
would let me make a branch for this, even though it is in political
limbo.

JIT etc.

I started writing my GCC optimizer passes because I was curious
about writing a devirtualization pass for LLVM. I wrote about half of
it and then thought that surely this would be just as simple for
tree-ssa.

I’ve been thinking a bit about heuristics for when the libgcj JIT
should recompile. The easy ones are things like: recompile when
classes are initialized, so we can remove initialization calls from
inside loops; and recompile when constant pool references are
resolved, so we can replace expensive indirect accesses with cheap
direct ones.
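
As an illustration of the first heuristic, the sketch below shows a loop that calls a static method of another class. The names are hypothetical and nothing here is taken from the JIT itself; it only shows, at the Java level, why such a call needs an "is this class initialized yet?" check until initialization has happened, and why that check is dead weight afterwards.

```java
// Hypothetical example; none of these names come from libgcj or the JIT.
class InitDemo {
    // Counts how many times Table's static initializer has run.
    static int initCount = 0;

    static class Table {
        static final int[] SQUARES = new int[16];
        static {
            InitDemo.initCount++;
            for (int i = 0; i < SQUARES.length; i++) {
                SQUARES[i] = i * i;
            }
        }
        static int square(int i) { return SQUARES[i]; }
    }

    static int sumSquares(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            // Code compiled before Table is initialized has to guard this
            // call with an initialization check on every iteration. Once
            // Table is initialized, recompiled code could drop the check.
            total += Table.square(i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumSquares(4));
        System.out.println(sumSquares(4));
        System.out.println(initCount);
    }
}
```

The static initializer runs only on the first use of Table, however many times the loop executes, which is what makes "recompile after initialization" a safe trigger.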

There’s probably a lot of literature out there that I should be
reading on other times this is worthwhile — detecting when partial
specialization is worthwhile, profile-directed runtime optimization,
etc. Maybe HLVM will help.

Actually doing the recompilation is simple; LLVM provides the
needed hooks. For things like constant pool references, I think I
will take the simple approach of re-lowering from bytecode to
LLVM. If this proves to be too expensive, it can always be changed, I
think. But I suspect it won’t be. And, anyway, it will be fun
finding out.

gcj optimization pass

Today I wrote a GCC optimizer pass. The new pass is specific to
gcj and does a simple form of devirtualization. The idea here is
that if we have some extra information about a method call, we can
turn an indirect virtual call into a direct call.

My pass does this in one particular case. If the “receiver”
object of a virtual call was allocated with new in the
current method, then we know its exact type, and we can devirtualize.
This is conceptually trivial on the SSA form.
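
A small Java sketch of the case described above, with hypothetical class names chosen purely for illustration. It shows one call site where the receiver's exact type is visible in the method and one where it is not; the pass itself works on GCC's SSA form, not on source like this.

```java
// Hypothetical classes for illustration; not taken from gcj or libgcj.
class Shape {
    int sides() { return 0; }
}

class Triangle extends Shape {
    @Override
    int sides() { return 3; }
}

class DevirtDemo {
    // The receiver is allocated with `new` right here, so its exact type
    // is Triangle. The virtual call could be bound directly to
    // Triangle.sides() without going through the vtable.
    static int knownReceiver() {
        Shape s = new Triangle();
        return s.sides();
    }

    // The receiver comes from the caller, so its exact type is unknown in
    // this method and the call has to stay virtual.
    static int unknownReceiver(Shape s) {
        return s.sides();
    }

    public static void main(String[] args) {
        System.out.println(knownReceiver());
        System.out.println(unknownReceiver(new Shape()));
    }
}
```

Only the first method fits the "allocated with new in the current method" condition; the second would need information from the caller before anything could be done.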

I wrote this pass since I had been playing with similar code for
my LLVM-based JIT, and I wanted to compare LLVM and GCC here — I’d
never written a GCC optimization pass and was curious about the
effort involved.

It turns out to be simple. This pass is about 200 lines of code.
And, when building libgcj, there were more than 6000 cases where it
triggered. I’m encouraged by this and now I’m considering writing
more gcj-specific optimization passes. Some ideas:

  • “Strength reduce” interface calls to ordinary virtual calls.
    The only difficulty here is in the bookkeeping; it could be
    inserted into the current pass.
  • Special handling for StringBuffer and
    StringBuilder.
  • A simple gcj-specific VRP-like pass, to handle array bounds
    checks, null pointer checks, redundant checkcast calls,
    and the like.
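
For the last idea, here is a rough sketch of the sort of checks that are provably redundant at the source level. The names are hypothetical, and this says nothing about how such a pass would actually be written; it only shows what "redundant" means for a checkcast, a null check, and a bounds check.

```java
// Hypothetical example of checks that earlier tests already guarantee.
class ChecksDemo {
    static int lengthIfString(Object o) {
        if (o instanceof String) {
            // The cast compiles to a checkcast, but the instanceof test
            // above has already established that it cannot fail.
            String s = (String) o;
            return s.length();
        }
        return -1;
    }

    static int sumFirstTwo(int[] a) {
        if (a == null || a.length < 2) {
            return 0;
        }
        // Past the guard, a is non-null and indices 0 and 1 are in range,
        // so the implicit null and bounds checks on a[0] and a[1] can
        // never fire.
        return a[0] + a[1];
    }

    public static void main(String[] args) {
        System.out.println(lengthIfString("gcj"));
        System.out.println(lengthIfString(Integer.valueOf(7)));
        System.out.println(sumFirstTwo(new int[] {4, 5, 6}));
        System.out.println(sumFirstTwo(null));
    }
}
```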

I guess some of these would make ok SoC projects.

LLVM JIT now accessible

I got a nice bug report about the LLVM JIT from Haren Visavadia
the other day; his one short test case found 4 or 5 bugs.

I decided to stop hoping that sourceforge would start working
well, and instead I just moved the JIT cvs repository to sourceware.
It is now on sourceware.org, repository /cvs/rhug, module
gcj-jit. Instructions on how to build it are included.
You can also see it via cvsweb now.

If you give it a try, please drop me a line, especially if you hit
a bug.

LLVM Update

Last night I found the buglet in the JIT preventing “hello world”
from working. Now it is time to start more serious testing; first the
libgcj test suite and then Mauve.

LLVM Thoughts

On Friday I translated my libjit-based JIT to use LLVM.
This took a good part of the day; then I spent a chunk of Saturday
debugging it.

LLVM has a few drawbacks, as compared to libjit. There’s not
really any documentation for how to use LLVM as a JIT, so I ended up
reading the header files quite a bit; libjit is much better here.
LLVM’s API is quite a bit bulkier than libjit’s, and it is more
idiosyncratic. For instance, in LLVM many objects can have a name,
and many classes require a name in their constructors; this seems a
bit bloaty in a JIT context — but I didn’t measure it. Finally, LLVM
is installed strangely; it is mostly static libraries, but with a few
random object files thrown in for good measure. This is unfriendly to
say the least… also, link times with LLVM are much longer than with
libjit, reducing my efficiency.

Some of these I would like to see fixed — either in LLVM or in
whatever ends up, someday, in GCC. Names could perhaps be handled
optionally. Other oddities in the interface could be fixed (not that
I have a list or anything…). Shared libraries should be made.

All this is gas, though, in a way. LLVM is generally more
functional than libjit: it has many more ports, more optimizers, and a
friendlier license. It probably represents a better long-term
approach.

With a little help on irc from Chris Lattner I got the LLVM JIT
running a couple very simple Java programs; with the optimizers
enabled it appears that LLVM correctly notices that the empty loops
are empty and removes them… so, it is working. There’s still a lot
of debugging to do (“hello world” still crashes), but this isn’t a big
deal.

Naturally, exception handling remains a problem. I’m hoping to
get Bryce or Andrew to solve that problem 🙂

libjit and gcj

Last weekend I wrote a JIT for libgcj using libjit.
Well, 90% of a JIT anyway.

libjit is remarkably simple to use. It took me about a day to
write a functioning (if not completely debugged) JIT for java
bytecode. On some microbenchmarks it was between 2 and 6 times faster
than the existing bytecode interpreter.

I’ve checked it in to the old gcjx repository… but you won’t be
able to see it; I heard that sourceforge has stopped updating its
anonymous CVS. Email me if you want a copy. The repository includes
a patch for libgcj; the needed modifications there are very minor.

Note that exception handling doesn’t work. This is somewhat hard
to do, since it requires modifying the JIT and also (probably)
patching libgcc. And…

Unfortunately libjit is pure GPL, so I doubt we’ll be including
this in libgcj, or even finishing it. Instead I think I’ll
investigate rewriting this JIT using LLVM. I’ve been thinking
of generalizing my existing patch to libgcj to make it possible to
dynamically load a JIT. That would make it easier to experiment here.

99 percent

According to our nightly JAPI run, we hit 99% of 1.4 the other
day. Finally!
We’ve been hovering above 98% for quite a while, and checking in the
patch to make stubs disappear from the JAPI score (thus making it more
accurate) didn’t help.

At this point it looks like 1.4 completion is mostly about filling
in a few missing pieces here and there, and finishing the HTML
support.

The 1.5 scores still look like a disaster, since the trunk doesn’t
have generics and the generics branch doesn’t have all the most recent
patches. We’ll be fixing this problem this year when we merge the
generics branch back to the trunk.