Archive for February, 2005

Libc oops

Ulrich almost immediately wrote me to say that my previous blog
entry was wrong and that glibc has had that functionality for quite
some time. I didn’t see anything here, so I went to glibc cvsweb,
dug around, and found dlmopen.c, which seems to do what I want.

Bogus that I didn’t see this before. Now to wire it into libgcj.

First Code

Last night, just in time for FOSDEM, I got working assembly code
out of gcjx for the first time. It was a do-nothing program, of
course, but nevertheless this is a big milestone. In particular this
means that a fair amount of tree lowering works; the driver works;
various lang hooks and interconnects with GCC work; and gcjx can write
out Class objects, vtables, and other forms of metadata.

So, what remains on this front is a long debugging war. Along the
way I’ll need to fix up some details; e.g. the current class format
needs an upgrade to understand the new forms of metadata.

glibc wish

In Java it is possible to use class loaders to define multiple
classes from a given representation of a class — you can just pass
the same bytes around; each class loader essentially has its own
universe of types.
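To make that concrete, here is a minimal sketch of two class loaders defining a class from the very same bytes. Everything named here (LoaderUniverses, BytesLoader, the class name "Payload") is made up for illustration, and the class file is a hand-assembled do-nothing class rather than anything gcj or javac produced.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LoaderUniverses {
    /** Hypothetical loader that defines a class straight from a byte array. */
    static final class BytesLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Hand-assembles the smallest useful class file: "public class <name>" with no members. */
    static byte[] minimalClassBytes(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);              // magic
            out.writeShort(0);                     // minor version
            out.writeShort(52);                    // major version (Java 8 format)
            out.writeShort(5);                     // constant pool count (entries + 1)
            out.writeByte(7); out.writeShort(2);   // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name);  // #2 Utf8: this class name
            out.writeByte(7); out.writeShort(4);   // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8: superclass name
            out.writeShort(0x0021);                // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                     // this_class = #1
            out.writeShort(3);                     // super_class = #3
            out.writeShort(0);                     // no interfaces
            out.writeShort(0);                     // no fields
            out.writeShort(0);                     // no methods
            out.writeShort(0);                     // no attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines the same bytes once in each of two fresh loaders. */
    static Class<?>[] defineTwice() {
        byte[] bytes = minimalClassBytes("Payload");
        Class<?> a = new BytesLoader().define("Payload", bytes);
        Class<?> b = new BytesLoader().define("Payload", bytes);
        return new Class<?>[] { a, b };
    }

    public static void main(String[] args) {
        Class<?>[] pair = defineTwice();
        System.out.println(pair[0].getName().equals(pair[1].getName())); // prints "true"
        System.out.println(pair[0] == pair[1]);                          // prints "false"
    }
}
```

Same name, same bytes, two distinct Class objects: each loader really does have its own universe.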

This doesn’t translate too well to gcj at the moment, since
dlopen() doesn’t do what we want when you try to open a
library more than once.

What would be cool (and I’ve heard that Solaris has this) is to be
able to create new “dlopen contexts” that would allow us to load a
given library once per context. Then in libgcj we could simply
associate a context with each class loader, and avoid the nasty hack
we have to do right now.

Jonas

Thanks to the patient help of Andrew Haley, the other day I
finally got to see Jonas
running pre-compiled with gcj. Right now I have a couple of “special”
hacks in my tree to make this work, but today I found out that they
can be blessed as real patches in short order. I’m planning to demo
this at FOSDEM, assuming I finish installing all those packages on my
laptop.

Sometimes I wonder whether these huge J2EE servers are really all
that great. They seem to have an awful lot of code and, maybe, don’t
really provide all that much leverage. Still, it is another N million
lines of code that run on libgcj.

… and that is the point. We’re much better these days about
being able to run existing Java code.

gcj

I can tell gcj is being used more, because the rate of new bug
reports has shot up quite steeply. I keep telling myself, this is
good, this is good, this is good…

gcjx

A couple weeks ago I did more gcjx hacking. Now it can mostly
generate Class objects; this means it is quite close to generating
working object code. This will probably get pushed off a bit due to
FOSDEM though.

Why LLVM Matters

One of my wish-list items for libgcj hacking is to port the whole
mess to LLVM. LLVM is a low
level virtual machine; I think of it as a rough equivalent of the GCC
middle and back ends, only written in C++ and with a more flexible
design. For instance, LLVM can be used as a JIT as well as
ahead-of-time, and it can also do things like whole program
optimizations.

Hooking gcj and libgcj to LLVM doesn’t look particularly hard,
though it would require a larger block of free time than I seem to be
able to dig up. And, LLVM may not be quite ready for the
adventure… its exception handling is different (which is ok in a
closed world, but may matter more if you want real interoperability
with gcc-compiled C++ code), plus LLVM has notably fewer back ends,
and is missing some that matter. These are just minor bumps
though.

Today I read an interesting PowerPoint presentation about the
future of language design. It is full of
nice observations, for instance the idea that the implementation of
the next big programming language will probably be slower than what
we’re using now — since what we’re using now will have been heavily
optimized over the years.

This is where LLVM comes in. I think, these days, aspiring
language designers don’t really need to skimp on performance in order
to get their tools up and running. In the old days you would write an
interpreter or generate C code; but with LLVM it looks just as easy to
simply write a JIT. (It is also surprisingly easy to write a GCC
front end these days, so that is another viable approach to
implementing your language.)

The point of all of this is that free software lowers
institutional barriers, making division of labor more possible. In
other words, you write your language front end, and somebody else does
most of the worrying about turning it into efficient code.

New Laptop

Red Hat sent me a new laptop, replacing my ancient PowerBook. It
is an excellent machine, notably more powerful than its predecessor.
For instance, I can actually build gcjx on it in a reasonable amount
of time. Finally a machine I can give Eclipse demos on 🙂

The FC3 install went very smoothly, though I haven’t yet worked
out every detail (getting wireless working looks fiddly).
Unfortunately, installing the OS is only the first step in really
configuring a new machine. Copying over my customizations is kind of
painful, especially random things I use but can’t be bothered to
properly package. Then there is also the task of configuring yum,
installing apt, and installing all that extra software I use that
isn’t in the OS itself.

This process is way simpler than it was back in the bad old days.
Configuring yum and apt is ridiculously easy. The significant barrier
right now seems to be simply remembering everything that I know I want
installed. Still, there’s room for growth in the “making it easy”
department here.

gcc.gnu.org

gcc.gnu.org is taking a little vacation, after having unexpected
and serious problems a couple of days ago. To make it possible to
get work done in the meantime, I’ve imported my working tree into monotone. This will make
it easy for me to keep my various patches separate for later copying
to the gcc repository. This is working out quite well… maybe we
should have just immediately set up a more public server for people to
use in the meantime. If gcc used monotone, I wouldn’t care nearly as
much when the machine crashed.

distcc and java

Some folks are working on modifying javac to work with distcc.
I really should reply to this
on-list… what they are doing sort of overlaps with Anthony’s earlier
efforts in this area. Besides, modifying a proprietary compiler is
ugly.

I’m not so sure that having distcc work for ordinary Java
compilation makes that much sense anyway. The current compilers are
all plenty fast on current hardware. It is hard to believe you could
get any substantial speedup by shipping compilations around the local
network.

On the other hand, there is a nice distcc improvement that would
help gcj. With the new binary compatibility ABI, when you compile
from .class to object, gcj no longer needs to read any dependencies
— it compiles the class file in isolation. This means it is
feasible to distribute these compilations.

So why won’t the ordinary distcc work? The most common way to
build using the new ABI is to compile entire jar files at once. So,
to be most useful, distcc would have to unpack the jar, distribute the
jobs itself, and then link the results. This is slightly trickier
than it sounds because non-.class files have to be treated as
resources, requiring a different invocation of gcj.
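
The first step of that could look something like the sketch below, which only does the sorting: walk the jar and split its entries into .class files (each one an independent compile job under the new ABI) and everything else (resources). JarJobPlanner, Plan, and the sample entry names are all hypothetical, and this makes no attempt to show the actual gcj or distcc invocations.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

class JarJobPlanner {
    /** Hypothetical result type: which entries compile in isolation, which are resources. */
    static final class Plan {
        final List<String> classEntries = new ArrayList<>();
        final List<String> resourceEntries = new ArrayList<>();
    }

    /** Reads a jar from the stream and sorts its non-directory entries into the two buckets. */
    static Plan plan(InputStream jarStream) {
        Plan plan = new Plan();
        try (JarInputStream jar = new JarInputStream(jarStream)) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories produce no job
                }
                if (entry.getName().endsWith(".class")) {
                    plan.classEntries.add(entry.getName());
                } else {
                    plan.resourceEntries.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return plan;
    }

    /** Builds a tiny in-memory jar with made-up entries, so the sketch needs no files on disk. */
    static byte[] sampleJar() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (JarOutputStream out = new JarOutputStream(buf)) {
            String[] names = {
                "org/example/Foo.class",
                "org/example/Bar.class",
                "org/example/messages.properties"
            };
            for (String name : names) {
                out.putNextEntry(new JarEntry(name)); // contents left empty; only names matter here
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        Plan plan = plan(new ByteArrayInputStream(sampleJar()));
        System.out.println("compile in isolation: " + plan.classEntries);
        System.out.println("treat as resources:   " + plan.resourceEntries);
    }
}
```

The entries in the first list are the ones that could be farmed out to other machines; the second list is the part that needs the separate resource handling before the final link.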