Archive for May, 2004

Random Musings

Bastien’s post set off a chain of memories for me today. I saw Soy Cuba
a couple years
ago at the library, and loved it. Once you get past the propaganda,
and some of the goofy characterizations (the portrayal of Americans is
particularly funny), you begin to realize what an interesting piece it
truly is. There’s a scene in a bar where the camera work is just
poetic.

And of course it was co-written by Yevgeny Yevtushenko. I saw him
once in Pasadena. He and the American translator of his work co-read
The City of Yes and the City of No, the translator in English and
Yevtushenko in Russian. We all assumed he was drunk, but perhaps he
was just dramatic, dancing about maniacally while his translator read
in English like an automaton.

That evening he also screened his film Stalin’s Funeral, which
was pretty wonderful, as I recall (this was more than ten years ago).
There’s a scene where a prisoner practices piano on the edge of a table
in prison; he told us that this was a true story and that the real
prisoner, a concert pianist, maintained his training this way and gave
a concert the week after he was let out.

I’ve also been thinking today about the transience of things, and
the difficulty, even pain, of letting go. You’d think after all these
years, and all the various projects I’ve worked on and left, that it
would be easy. Nevertheless, despite my awareness (however vague) of
this process, I find I put a part of myself into whatever I work on,
and pulling it out again hurts.

STL and gcjx

gcjx is written in
C++ and uses the STL pervasively. It is in C++ as a compromise: it
isn’t C, and I thought I could convince the GCC maintainers to allow a
C++ front end. I felt that the Ada experience had probably soured
most of them on using a more esoteric language, like Java.

Bryce pointed out to me that the STL is not cheap. I already knew
that building STL-using programs took a long time — literally hours
to build gcjx on my wimpy laptop that hits swap whenever it builds a
big C++ program. That’s sort of an outlier, though; my desktop
machine builds it reasonably quickly. However, the result is also
huge:

$ ls -lh gcjx
-rwxr-xr-x    1 tromey   tromey       1.1M May 25 13:54 gcjx*

That’s after running strip, too.

Sigh. Perhaps we can replace our use of the STL with something
more light-weight — I left all the std:: tags scattered
all over the source for an eventuality like this (driven by my
perception of the GCC maintainers’ future requests, not any
engineering concern). I’m not even convinced this replacement would
help the size problem, though.

So, this is one identifiable advantage of Java’s otherwise goofy
erasure-based generics — there is only one implementation of any
given generic class, so you never have template-induced code bloat.
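
To make the erasure point concrete, here is a minimal sketch (the class and method names are made up for illustration). It shows that two differently parameterized lists share a single runtime class, which is why there is nothing to duplicate the way a C++ compiler duplicates template instantiations.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    // Returns true when both lists are instances of the same runtime class.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        List<Integer> counts = new ArrayList<Integer>();

        // The type arguments are erased at compile time, so both objects
        // are plain java.util.ArrayList instances at run time.
        System.out.println(sameRuntimeClass(names, counts));
        System.out.println(names.getClass().getName());
    }
}
```

The flip side is that the element type is not available at run time, which is a large part of why erasure feels goofy.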

Shrek 2

Shrek 2 was about what I expected — funny but not hilarious, nice
graphics, simple storyline, and some clever subtle jokes. I enjoyed
it quite a bit, and I recommend it. Elyn liked it better than the
original, mostly due to less hype, I think.

Mozilla

Today I installed Mozilla on my girlfriend’s Windows box. What a
pleasant experience that was — congratulations Mozilla hackers.
She’s already using OpenOffice, and now she will be using free
software for email and browsing. Free software has come a long way
since those days ten years ago when a Tcl-based browser seemed like a
good idea and we were all waiting for XWord to work.

Refactoring

Today’s gcjx refactoring is like the work I had done on my
foundation: necessary, but in the end there’s just the same old dirt
showing.

Verifier

Last night I spent some time making the libgcj bytecode verifier a
bit more generic. This is still preliminary, but basically the
idea is to make it so that the verifier can be plugged into other
environments. We plan to put it in the gcj front end as well, but
other VM implementors might want to write the necessary glue code to
make it work for them — this is a bit ugly, but not really very
difficult.

gcjx

Sometimes you just need a short break to reset your mind a bit.
The other day I was struggling with one particular gcjx bug. I came
back to it tonight, after a grueling day hacking the Eclipse 3 build
system, and it was suddenly trivial.

So, good news. gcjx can now once again parse all of Classpath,
only this time, it is handling inner and anonymous classes more
correctly (still not perfectly…), and it survives through code
generation:

$ find /tmp/gcjx-out -name '*.class' -print | wc -l
   2391

gcjx update

This weekend I finished enough of anonymous and inner class
handling to check it in. gcjx is now back to generating code. There
are some more fixes to come in this area. Recently I also added a few
warnings; usually this is really simple in the new framework.

Bryce had a nice idea, which was to replace gcjh with a new gcjx
back end. This was inspired by the gcjh Miranda method bug, which is
particularly ugly because it will require a
major infrastructural improvement to gcjh. So, I started writing the
JNI header back end, which predictably is really simple. gcjx holds
the promise of being able to write class files, header files, and an
object file all from a single compilation; strange but also
convenient.
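
For anyone who hasn't run into the term: a Miranda method is an interface method that an abstract class inherits without declaring it. The sketch below uses made-up names and is not taken from gcjh or gcjx. It shows the situation with reflection: the method is not among the class's own declarations, but it is still part of the class's public interface. Presumably a tool that works only from a class's own declarations has to account for such methods separately.

```java
import java.lang.reflect.Method;

public class MirandaDemo {
    interface Drawable {
        void draw();
    }

    // Declares no draw() of its own, yet inherits the abstract method
    // from Drawable. That inherited method is the "Miranda method".
    static abstract class Shape implements Drawable {
    }

    // True if the class itself declares a method with this name.
    static boolean declares(Class<?> c, String name) {
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    // True if the class exposes a public method with this name,
    // including ones inherited from superclasses and superinterfaces.
    static boolean exposes(Class<?> c, String name) {
        for (Method m : c.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("declared by Shape: " + declares(Shape.class, "draw"));
        System.out.println("exposed by Shape:  " + exposes(Shape.class, "draw"));
    }
}
```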

gdb

Today I helped Anthony debug a libgcj problem over irc. I sure
wish there were an easy way to hook an irc session up to an
already-running gdb so we could interactively share the debugger.
Anyone have a cool hack for this? Preferably one that has some sense
built into it so that not just anybody can hijack the debugger…

GDirect

I spent a little time yesterday working on an easy way to create
native methods for Java. I call it “GDirect”. The idea is pretty
simple: have a package that creates native method
implementations on the fly, based on declarations in your code. So,
e.g., you would write something like this:

public class example {
  static {
    // Second argument is base name of shared library.
    GDirect.link (example.class, "m");
  }

  public static native double atan (double x);
}

In this example, example.atan will call the C math
library function atan; nothing more is needed.

Issues

There are a few issues with this approach. One is pointers, but
with gcj we can just represent these with RawData.
Another is structures, which I’m just punting on. A third issue is
type mismatch — if the Java int and the C
int are not the same, you need translation at runtime.
So far I’m not handling this, but it is actually pretty easy to deal
with. There are some other type-mapping issues to resolve as well;
I’ll look at those a bit later.

Since this uses JNI under the hood, it is pretty slow. With the
upcoming binary compatibility ABI in gcj, though, we can do better
there. In some situations we can completely eliminate the need for a
runtime stub and just have direct calls to the C functions from
Java.

Opportunities

This code can be made completely portable across JVMs without much
effort. It works by creating new functions on the fly with libffi,
and it registers these functions with the JVM via the JNI
RegisterNatives call. For gcj, we’ll have some CNI
equivalent of this in the not-too-distant future, so we’ll be able to
make GDirect both portable and more efficient when running on our
implementation.

libffi’s closure API is what makes this possible. It works on
most platforms, though not all yet. I think a similar technique could
be used to make bridges in other places, e.g. interfacing Java to
Mozilla or to OOo.

Another random idea is to make it simpler to wrap libraries that
already have an object-oriented interface. One way we could do this
would be to tell GDirect about a field which holds a native pointer.
Then we would treat non-static native methods specially, looking up
the native pointer at invocation time and passing it to the underlying
C API. Combined with a simple facility to rewrite method names when
linking, this would make it very easy to create and maintain, say, a
Gtk wrapper library. For instance, something like
gtk_window_set_type_hint would become:

public class GtkWindow extends ... {
  // This would actually be in a superclass somewhere.
  private RawData nativeInfo;

  static {
    GDirect.link (GtkWindow.class, "gtk", "nativeInfo",
                  GtkRewriter.getInstance ());
  }

  public native void setTypeHint (int hint);
}

Here, GtkRewriter would be a special class that would know how to
transform “GtkWindow.setTypeHint” into “gtk_window_set_type_hint”.
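
The name transformation itself is mechanical. As a rough sketch (GtkNameSketch and its methods are hypothetical stand-ins, not the actual GtkRewriter, which is only an idea at this point), a mixed-case to lower_snake_case conversion is enough for this example:

```java
public class GtkNameSketch {
    // Converts a mixed-case name such as "GtkWindow" or "setTypeHint"
    // into lower_snake_case by putting '_' before each interior capital.
    static String toSnakeCase(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) {
                    out.append('_');
                }
                out.append(Character.toLowerCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Joins the converted class name and method name to form the C symbol.
    static String rewrite(String className, String methodName) {
        return toSnakeCase(className) + "_" + toSnakeCase(methodName);
    }

    public static void main(String[] args) {
        // Prints gtk_window_set_type_hint
        System.out.println(rewrite("GtkWindow", "setTypeHint"));
    }
}
```

A real rewriter would need exceptions for names that don't follow the pattern. Runs of capitals, for instance, would each get their own underscore here.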

JHBuild

Tom Fitzsimmons has modified JHBuild to build gcj. This is pretty cool;
check it out. I’m hoping to convince him to add more packages, like GNU
Crypto, to the mix.

gcc

The tree-ssa branch was merged into mainline today. Congratulations
are in order to Diego and everyone else who worked on this; it’s been a
long time coming. Not
to be ungrateful, but now someone should write that VRP pass… 🙂

gcjx

I’m still working on making anonymous, inner, and local classes
behave properly. It’s been slow going, but mostly because it is
spring and things are more social now.

Nightly builds

I finally investigated a couple of nightly build problems. Now the
API comparison pages should be working again. I still haven’t
investigated the GNU Crypto
check failure or the Kawa build failure; maybe later.