Archive for August, 2004

FindBugs is Cool

I was talking to Dalibor and he pointed me at the FindBugs paper,
which is super-cool.

Figure 14 (page 10) is particularly interesting. This is the one
where they show Classpath as having 724 flagged bugs in 457 KLOC, and
the 1.5 JDK as having 3,314 flagged bugs in 1,183 KLOC.

First, the size disparity is interesting here. Classpath is about
60% complete (by a naive count that excludes a lot of libraries that
exist but aren’t integrated). This suggests that our code is 64% of the
size of the corresponding JDK code. (These numbers are a little
fuzzy, since we aren’t doing comparisons against 1.5 yet, so perhaps
that first 60% is a little high.)

Second, our bug rate is much better: 1.6 bugs per KLOC versus 2.8
for Sun. My theory here is that bugs in free libraries tend to be
fixed, whereas bugs in Sun’s library tend to be worked around. Even
though the JDK source is “available”, it is only available under a
very restrictive license that provides a strong disincentive to bother
fixing problems in it.
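
The arithmetic behind the size and bug-rate figures can be sketched as
follows. The inputs are the numbers quoted above from Figure 14; the
class and method names are made up for illustration, and the size
estimate assumes (naively) that JDK size scales linearly with how much
of it Classpath covers.

```java
// A minimal sketch of the arithmetic; BugDensity is a hypothetical name.
final class BugDensity {
    // Flagged bugs per thousand lines of code.
    static double perKloc(int flaggedBugs, int kloc) {
        return (double) flaggedBugs / kloc;
    }

    // Size of our code relative to the part of the JDK it covers,
    // assuming JDK size scales linearly with completeness.
    static double relativeSize(int ourKloc, int jdkKloc, double completeness) {
        return ourKloc / (jdkKloc * completeness);
    }

    public static void main(String[] args) {
        System.out.printf("Classpath: %.1f bugs/KLOC%n", perKloc(724, 457));
        System.out.printf("JDK 1.5:   %.1f bugs/KLOC%n", perKloc(3314, 1183));
        System.out.printf("Relative size: %.0f%%%n",
                          100 * relativeSize(457, 1183, 0.60));
    }
}
```

Rounded, this gives the 1.6 and 2.8 bugs per KLOC and the 64% size
ratio mentioned above.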

I’d love to see us running FindBugs nightly against Classpath, and
mandating that code be “FindBugs clean” (to the extent reasonably
possible).

75%

As I mentioned above, there are free Java libraries out there,
like Tritonus, that implement parts of the standard library but aren’t
part of Classpath. If you roll all these into the nightly comparison,
it turns out that we’re above 75% complete (versus 1.4). That’s quite
good, especially when you consider where we were a year ago.

Sad Day

A hard lesson from the vet: we’re merely a syringe away from the
grave.

Today was sad. Elyn’s cat had cancer and was put to sleep. We
buried her and planted a tree.

Eclipse and gcj

One of the big projects in gcj land this year is implementing the
new binary compatibility ABI. I spent today trying it out on
Eclipse.

First I tried to set a baseline, so I ran Eclipse with gij. I had
to add Mark’s little Makefile tweak to make this work (now on the
branch); unit-at-a-time breaks the interpreter at the moment. BTW I
used Eclipse 2.x, since I knew it “should” work — but this technique
doesn’t require application changes, and Eclipse 3 is definitely on my
to-do list.

Then I compiled startup.jar like so:

gcj -fPIC -fjni -findirect-dispatch -shared -g -o GCJLIBS/startup.jar.so startup.jar

After a little tweak to URLClassLoader (also checked in),
this Eclipse started up fine too. Finally I compiled a few more jar
files from Eclipse using the same approach, and that worked too — I
looked in /proc/.../maps to see that the shared libraries
were actually loaded.
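
The check against the maps file can also be automated. This is only a
sketch: the class name LoadedLibs and the ".jar.so" suffix filter are
assumptions for illustration, and by default it inspects its own
process via /proc/self/maps rather than some other process. It assumes
the Linux maps layout, where the pathname (if any) is the last field
on each line, and it does not handle paths containing spaces.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists mapped files whose path ends in a suffix.
final class LoadedLibs {
    // Returns the distinct mapped file paths ending in the given suffix.
    static List<String> mappedFiles(List<String> mapsLines, String suffix) {
        List<String> result = new ArrayList<>();
        for (String line : mapsLines) {
            String[] fields = line.trim().split("\\s+");
            String path = fields[fields.length - 1];
            // Anonymous mappings have no path; skip anything not absolute.
            if (path.startsWith("/") && path.endsWith(suffix)
                    && !result.contains(path)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: default to our own process; pass another maps
        // file as the first argument to inspect a different one.
        String maps = args.length > 0 ? args[0] : "/proc/self/maps";
        List<String> lines = Files.readAllLines(Paths.get(maps));
        for (String lib : mappedFiles(lines, ".jar.so")) {
            System.out.println(lib);
        }
    }
}
```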

This, or something like it, is definitely the way forward. It has
turned precompiling Eclipse from a labor-intensive effort into something
that is basically trivial, and that also applies well to other
applications. There’s still some dispute about where we should look
for the shared libraries corresponding to jar files, but that’s just a
minor detail.

FindBugs

Take a look at FindBugs. They have a Java bug-finder that found
some things in Classpath; it is pretty cool. I’d love it if this
was being run on Classpath every night.
Maybe next time I hack my nightly build infrastructure…

Language

While writing gcjx, I’ve come to think that our current language
choices are really too weak for a programming task like this. For
most of the work, C++ does just fine, but I’ve run into a couple of
situations where a different approach would be notably better.

First, consider generating synthetic methods. Java compilers have
to do this in a few situations, the simplest one being an implicit
constructor. This involves creating a constructor, setting its return
type and some other attributes, creating a block, adding a
super call to the block, and then adding the block to the
constructor. In all this is about 30 lines of code, which isn’t too
bad in this particular case — but in a lisp-like language, this could
be done much more simply with a single backquoted expression.
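
To make the steps concrete, here is a rough sketch of building an
implicit constructor by hand. The AST classes are hypothetical and far
simpler than anything in gcjx (which is C++, not Java); statements are
plain strings just to keep the example short. It shows the shape of
the work: create the method, set its attributes, create a block, add
the super call, attach the block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, minimal AST for illustration; not gcjx's actual model.
final class ImplicitCtorSketch {
    static final class Block {
        final List<String> statements = new ArrayList<>();
    }

    static final class Method {
        final String owner;       // class the method belongs to
        final String name;        // constructors are named <init> in class files
        final String returnType;  // constructors return void at the bytecode level
        final boolean implicit;   // true when the compiler supplied it
        Block body;

        Method(String owner, String name, String returnType, boolean implicit) {
            this.owner = owner;
            this.name = name;
            this.returnType = returnType;
            this.implicit = implicit;
        }
    }

    // Builds what the compiler must supply for `class Foo { }`:
    // the equivalent of the hand-written `Foo() { super(); }`.
    static Method implicitConstructor(String className) {
        Method ctor = new Method(className, "<init>", "void", true);
        Block body = new Block();
        body.statements.add("super()");
        ctor.body = body;
        return ctor;
    }
}
```

Even in this toy form the construction is all imperative plumbing,
which is the part a quasi-quoting language would collapse into one
expression.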

Second, suppose you want to add a warning like “warn if
a binary operator on the right hand side of an assignment uses a
narrower type than the assignment itself”. This lets you catch bugs
like double = int / int. In this case, you’d really
like a language that makes tree-matching very easy, say something
with unification.
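
Here is a small sketch of both the bug and the check. The mini AST
(Type, BinaryOp) and the shouldWarn method are invented for
illustration; this is hand-written matching, not unification, and it
only models the four numeric types needed for the example.

```java
// Hypothetical sketch of the "narrower operator type" warning.
final class NarrowOpCheck {
    // Ordered narrowest to widest, matching Java's widening order.
    enum Type { INT, LONG, FLOAT, DOUBLE }

    static final class BinaryOp {
        final char op;
        final Type left, right;

        BinaryOp(char op, Type left, Type right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }

        // Binary numeric promotion: the result has the wider operand type.
        Type type() {
            return left.compareTo(right) >= 0 ? left : right;
        }
    }

    // True if "target = rhs" computes rhs in a narrower type than target.
    static boolean shouldWarn(Type target, BinaryOp rhs) {
        return rhs.type().compareTo(target) < 0;
    }

    public static void main(String[] args) {
        int a = 1, b = 2;
        double d = a / b;        // integer division happens before widening
        System.out.println(d);   // prints 0.0, not 0.5
        System.out.println(
            shouldWarn(Type.DOUBLE, new BinaryOp('/', Type.INT, Type.INT)));
    }
}
```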

In both these cases C++ doesn’t fit the bill, and there’s no nice
way that I know of to add the needed syntactic support. I could use
lisp, of course, but then I’m sacrificing static typing. I’m not a
language wonk, though; probably there’s some language out there that
does everything and I just need someone to tell me what it is. Not
that I’ll switch 🙂

Random Week

It was a sort of random week, where I did little bits of things
all over. I’m switching projects here, from Eclipse back to gcj, and
this week was a kind of transition period.

I haven’t been hacking gcjx much lately. It’s been sort of hard
to keep my morale up for it. However, Dalibor gave me
some motivation to run Mauve through gcjx. That
found a couple of easily-fixed bugs. I ran the resulting Mauve and it
seemed to do ok — no verification errors, so it looks like code
generation is in good shape for the most part.

A bunch of C++ bootstrap patches went into GCC recently. This is
great news for gcjx since it
means I basically don’t have to do any GCC hacking to hook up gcjx.
So, I spent a little time on this, but not really enough. It does
now compile without error, but it doesn’t link yet.

As I’ve mentioned before, writing a GCC front end is basically
easy now. The documentation situation is still a bit unfortunate
though. I’ll probably write something longer about my experience
writing a front end from scratch once I’m a bit further along gluing
gcjx to the middle end, basically a document showing exactly how easy
it is to write your own compiler nowadays.

I also made a generics branch for Classpath this week, and
genericized a
pretty good part of the library (as well as added some other stuff
needed for full 1.5 support). Without this code there is really no
way to test all the new language features in gcjx — in Java, unlike
C++, the compiler has some knowledge of the standard class libraries.
For instance gcjx, at present, knows about 34 different classes in the
standard library.

The C++ wisdom is that this sort of dependency is a mistake. That
is probably true for some uses, but this approach does make it easier
to add language features that interact well with the library. For
instance, I don’t think you could add a nice foreach
construct to C++ without this. (You can do it with macros or with
function-like objects, but neither of those is really “nice”.) As it
is, gcjx is littered with needlessly verbose code using STL iterators.
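
The Java foreach statement is a good illustration of the compiler
knowing about the library. The sketch below (class and method names
are made up) shows a foreach loop next to roughly what the compiler
expands it to for an Iterable: calls to iterator(), hasNext(), and
next(), which only works because the compiler knows about
java.lang.Iterable and java.util.Iterator.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical names; the two methods are meant to behave identically.
final class ForeachSketch {
    // What the programmer writes.
    static int sumForeach(List<Integer> xs) {
        int total = 0;
        for (int x : xs) {
            total += x;
        }
        return total;
    }

    // Roughly what the compiler generates for the loop above.
    static int sumDesugared(List<Integer> xs) {
        int total = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) {
            int x = it.next();   // unboxing, as in the foreach version
            total += x;
        }
        return total;
    }
}
```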

GDirect

This morning I did a little GDirect hacking. It now actually
compiles and a short example program works.
I checked it in to the gcjx repository,
since that was convenient for me. It is in the “gdirect” module if
you want to take a look.

I think the next step will probably be developing a way to
represent native pointers in GDirect. Most likely I’ll do this by
boxing and unboxing pointer arguments somehow. Offer suggestions if
you like, my ideas in this area are pretty vague at the moment.
Earlier I was just going to use RawData, but I find I’m
more interested in cross-VM portability than I was before.

Also on the table is the idea of having a notion of classes that
wrap some native object. This would let us wrap non-static methods in
a natural way.
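
One possible shape for the boxing idea, purely as a sketch: none of
these names exist in GDirect, and this is an assumption about how it
might look rather than a design. A pointer is boxed into an opaque
object holding the address as a long (wide enough for 64-bit
pointers), and a wrapper class keeps one such box as its "self"
pointer so that non-static native methods have something to pass
along.

```java
// Hypothetical sketch; NativePointer and NativeWrapper are not GDirect API.
final class PointerBoxSketch {
    // Opaque, immutable box around a native address.
    static final class NativePointer {
        private final long address;

        private NativePointer(long address) {
            this.address = address;
        }

        // A native NULL is represented as a Java null.
        static NativePointer box(long address) {
            return address == 0 ? null : new NativePointer(address);
        }

        static long unbox(NativePointer p) {
            return p == null ? 0 : p.address;
        }
    }

    // Base class for Java objects that wrap a native object. A wrapped
    // non-static method would pass unbox(self) as its first native argument.
    static class NativeWrapper {
        protected final NativePointer self;

        protected NativeWrapper(NativePointer self) {
            this.self = self;
        }

        long selfAddress() {
            return NativePointer.unbox(self);
        }
    }
}
```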

Free Java. Take 7.

So, there was another “should Sun free Java?” debate, this time at
OSCon. Once again, as far as I can
tell, no one actually working on a free Java implementation was
actually invited. Sigh. I tried to get Red Hat to send me, but
getting that to happen is like pushing peas uphill.

Reading the summary, it sounds like all the familiar arguments on
both sides. Ho hum.

Here’s a somewhat interesting, but flawed, anti-freeing-Java post.
I’ve got some replies to it; in each item his words are quoted first,
followed by my reply:

  • “Well, you can already download the source and fix a bug if you
    want to. You can even submit the fix to Sun.”
    Yeah, that’s true.
    Of course, you can’t actually do anything else with that fix of yours;
    the license doesn’t come close to meeting the OSD. And, now you’re
    tainted by Sun’s obnoxious license, so you can’t help out on a free
    Java implementation. Plus, in practice we’ve seen that Sun is not
    always very good about even properly understanding bug reports, let
    alone replying to them intelligently. Critics will say that free
    software projects are also all over the map on responsiveness to bug
    reports, but high-quality projects like gcc are much better than Sun
    in this regard.
  • “Open sourcing would mean we’d have even more people bickering
    about trivial issues.”
    Yeah, this is a price of openness. Some bickering (and really, on
    most projects this doesn’t dominate communications) is worth it to
    have real openness.
  • “I think that Java would be fine without Sun. It’s a language,
    carefully spec’d out, with multiple implementations from many
    organizations, some with billions of dollars.”
    False.
    First, if Sun goes under, someone will buy up all that IP; it isn’t
    just going to float around out there. It could very well end up in
    the hands of someone who wants to kill Java. Second, the
    specifications are spotty. Some parts are really good (the JLS),
    some less good (the JVMSpec), and some aren’t really “specifications”
    but rather “programmer documentation”, which is ok but not really
    suitable for writing a replacement; the class libraries all fall
    into this category. For some libraries the situation is even worse.
    Third, as far as I’m aware, there is only one Swing implementation in
    existence anywhere, the one Sun owns. So, contrary to what he says,
    I believe IBM does not have its own complete Java implementation.
  • “So now we can see that opensourcing Java doesn’t solve
    anything.”
    Sure it does. First, open sourcing Java
    almost immediately puts a lot more software into Debian. That’s
    “something”. Second, it helps align the free software community as
    allies of Java against MS. Third, it might help breathe some life into
    Java on the desktop.

That was basically the same old stuff, just with different words.
At least he didn’t take the familiar and bogus approach of accusing
all free software people of being Evil Communist GPL Zealots.

Wouldn’t it be interesting if, just once, a keep-Java-closed
advocate were to actually talk to and ask questions of someone
actually involved with working on a free Java implementation? There
are lots of us out here, mostly friendly.

JDWP

I started doing some work on a free JDWP implementation yesterday,
but today on a whim I looked, and it turns out there is a complete
implementation of it in Eclipse already, complete with
com.sun.jdi and a test suite, rewritten from scratch by
IBM. That’s great news since it means debugging gcj-built programs
with Eclipse is one big step closer — basically just hooking this
code up to the gdb/MI wrapper that is also in Eclipse.