International Java Developers Conference

In a week or so I’ll be at the International Java Developers
Conference in Sao Paulo. I’m very excited about
this! It is in Brazil, which is exotic and cool, and this year I’ve
just missed a few opportunities to give a talk there — it is good to
finally make it. But more importantly, Brazil seems to have a lively
free software scene and a lively Java development scene — so it is a
great place to talk about what we’re doing.

The conference program looks pretty interesting. I’m looking forward
to meeting the Sun guys.

Operator Overloading

This weekend I thought a bit about operator overloading in Java.
I thought I’d write up some of the things I considered.

Basic Approach

My first idea was something similar to what C# does — allow
essential operators to be overloaded, but not unusual ones like
|| or the member-access dot (.). In C#, operators must be public
and static, so I would just copy that too. It is probably best to
introduce a new operator keyword, though we could just as
easily simply anoint magic method names like
operatorPlus. Since all operator uses are unqualified
(a Foo.+ b just looks too ugly), users would use
inheritance or static import to introduce operators into the scope.

If we use the operator keyword, then we have to
augment static import a bit to allow importing operators. I think
this argues slightly for simply picking special method names.

Since we would simply be translating operators into method calls,
no serious binary compatibility issues would arise. We could simply
define the rules for operators to map directly to the rules for
ordinary methods.

C# also synthesizes the compound assignment operators like
+= from primitive operators, if the compound operators
are not explicitly declared. This seems like a good idea to me.
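
As a minimal sketch of the translation described above, here is what
the compiler might emit for a + b and a += b. The Vec2 class is
hypothetical, and operatorPlus is just the magic method name floated
earlier, not an existing API.

```java
class DesugarDemo {
    // Hypothetical value type used only for illustration.
    static final class Vec2 {
        final int x, y;

        Vec2(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Under the proposal, "a + b" on two Vec2 values would resolve here.
        public static Vec2 operatorPlus(Vec2 a, Vec2 b) {
            return new Vec2(a.x + b.x, a.y + b.y);
        }

        @Override
        public String toString() {
            return "(" + x + ", " + y + ")";
        }
    }

    public static void main(String[] args) {
        Vec2 a = new Vec2(1, 2);
        Vec2 b = new Vec2(3, 4);

        // Proposed source:  Vec2 c = a + b;
        Vec2 c = Vec2.operatorPlus(a, b);

        // Proposed source:  c += b;
        // Synthesized from the primitive operator, since no compound
        // operator was declared.
        c = Vec2.operatorPlus(c, b);

        System.out.println(c);
    }
}
```

Because the result is an ordinary static call, overload resolution,
access checks, and binary compatibility all follow the existing rules
for methods.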

We must also define how this interacts with boxing, but that is
also simple to do. The rule should be modeled after ordinary method
invocations, with the additional note that if the left hand side of a
binary operator is a primitive type, then we simply do not consider
non-static methods (i.e., we don’t box initially).

Additions

This would be pretty useful, but I can think of two possible
problems. The first problem is that it just seems nicer to allow
non-static methods as well. This is easily added, along with a rule
to search first for an instance method (in the left hand argument for
binary operators, or in the sole argument for unary operators), followed
by searching for a static method.

The second problem is that some operators are commutative, but
ordinary method definitions are not. Suppose you define
BigInteger.operator+(int). Now you can write
bigint+5 — but 5+bigint is an error. This
means you will end up writing a number of fairly redundant operators;
but it would be nice to be able to remove this redundancy.

C# and C++ seem to simply punt on this issue, which I suppose
makes sense. You might think about adding commutativity rules, but
that introduces an asymmetry for the situations where operators are
not commutative. Also, it makes searching for the operator method
strangely complicated.
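
To make the redundancy concrete, here is a small sketch using a
hypothetical wrapper class named Big (BigInteger itself cannot be
modified here). The calls in main are written the way the compiler
would translate the operator expressions.

```java
import java.math.BigInteger;

class CommutativityDemo {
    // Hypothetical wrapper; operatorPlus is the assumed magic name.
    static final class Big {
        final BigInteger value;

        Big(long v) {
            this.value = BigInteger.valueOf(v);
        }

        Big(BigInteger v) {
            this.value = v;
        }

        // Handles "big + 5".
        public Big operatorPlus(long rhs) {
            return new Big(value.add(BigInteger.valueOf(rhs)));
        }

        // Handles "5 + big". Without this second, nearly identical
        // definition, that expression would not resolve to anything.
        public static Big operatorPlus(long lhs, Big rhs) {
            return rhs.operatorPlus(lhs);
        }
    }

    public static void main(String[] args) {
        Big big = new Big(10);
        System.out.println(big.operatorPlus(5).value);     // big + 5
        System.out.println(Big.operatorPlus(5, big).value); // 5 + big
    }
}
```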

More Ideas

Groovy apparently maps some operators onto already existing
methods, for instance mapping == to the
equals method. This is a cute idea, but I think it is
dangerous in practice. Sometimes you really do want to be able to
tell if two objects are identical, and equals is
frequently overridden. We could resurrect this idea by having a
special interface that indicates that we want this sort of overriding
to happen automatically. We would then add new methods somewhere
(e.g., System) to allow un-overloaded equality
comparisons.
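
A rough sketch of how the opt-in could behave follows. The helper
names operatorEquals and identical are made up for illustration, and
the check is done at run time here only to keep the example short; a
compiler would decide from the static type of the operands.

```java
class EqualityDemo {
    // Marker interface: a class opts in to "==" meaning equals().
    interface ComparisonOperator {}

    static final class Point implements ComparisonOperator {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // What "a == b" would mean under the rule (hypothetical helper).
    static boolean operatorEquals(Object a, Object b) {
        if (a instanceof ComparisonOperator)
            return a.equals(b);
        return a == b;
    }

    // Stand-in for the un-overloaded comparison that would live
    // somewhere like System.
    static boolean identical(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        System.out.println(operatorEquals(p, q)); // value comparison
        System.out.println(identical(p, q));      // identity comparison
    }
}
```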

Groovy’s idea of mapping some comparison operators to
compareTo seems like a good one to me. It doesn’t suffer
from the same special problems as equality; we could automatically
override operators like > for any class which
implements Comparable. In fact, I think this should be
the only way to overload these comparisons.
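
The mapping onto compareTo is mechanical. A sketch, with made-up
helper names standing in for what the compiler would generate inline:

```java
import java.math.BigInteger;

class CompareDemo {
    // "a > b" would become a.compareTo(b) > 0.
    static <T extends Comparable<? super T>> boolean greater(T a, T b) {
        return a.compareTo(b) > 0;
    }

    // "a <= b" would become a.compareTo(b) <= 0.
    static <T extends Comparable<? super T>> boolean lessOrEqual(T a, T b) {
        return a.compareTo(b) <= 0;
    }

    public static void main(String[] args) {
        BigInteger small = BigInteger.ONE;
        BigInteger large = BigInteger.TEN;
        System.out.println(greater(large, small));     // large > small
        System.out.println(lessOrEqual(large, small)); // large <= small
    }
}
```

Since every comparison funnels through the one compareTo method, a
class cannot end up with a > that disagrees with its <.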

One other thing to think about is whether the special operator
method names should simply be taken from BigInteger.
This approach would allow retrofitting of some operator overloading
onto this existing code without any library changes. Note that due to
the commutativity problem this would not be completely seamless — so
for best results we would need to add new methods regardless. In my
opinion, though, this would not be a good idea, as methods
named add are not uncommon; they are used all over the
collections API, and turning all of these into + doesn’t
seem smart.

Example

Here’s an example of how we would add an operator to
BigInteger.

    // New special interface indicating that == and != should be
    // overridden.
    public interface ComparisonOperator
    {
    }

    public class BigInteger extends Number
      implements Comparable, ComparisonOperator
    {
      ...

      // Implement the smallest number of operators to let "+"
      // work.

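      public BigInteger operatorPlus(BigInteger val)
      {
        return add(val);
      }

      // Rely on widening primitive conversion.
      public BigInteger operatorPlus(long val)
      {
        ...
      }

      // Cover the case where the primitive is on the left, e.g. 5 + x.
      public static BigInteger operatorPlus(long val, BigInteger bi)
      {
        // add() only accepts a BigInteger, so convert the long first.
        return bi.add(BigInteger.valueOf(val));
      }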
    }

    // Use it.
    BigInteger x = whatever();
    System.out.println(x + 5);

Efficiency

Currently, Java compilers will take an expression using the
special String addition operator and turn it into a series of method
calls on a compiler-generated StringBuffer object. In
a case with multiple additions, e.g. a+b+c, the compiler
will generate a single buffer and make multiple calls on it.
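
Written out by hand, the translation of a+b+c looks roughly like this.
It follows the StringBuffer strategy described above; the exact code a
given compiler emits may differ.

```java
class ConcatDemo {
    static String concat(String a, String b, String c) {
        // Source form:  return a + b + c;
        // One buffer, three appends, one final conversion.
        return new StringBuffer().append(a).append(b).append(c).toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("x", "y", "z"));
    }
}
```

A naive translation into pairwise method calls would instead build an
intermediate String for a+b and then a second one for the final
result.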

Unfortunately, I don’t see a simple way to recapture this
efficiency for user-defined operators. It would be possible for the
compiler to notice a series of overloaded operators where each call
is resolved to the same method. And, for example, this could be
turned into a call to a varargs method:

    public static String operatorPlus(Object... args)
    {
      StringBuffer result = new StringBuffer();
      for (Object o : args)
        result.append(o);
      return result.toString();
    }

However, in this situation you end up creating a new garbage array.
Maybe this idea is the way to go; I don’t really know. I suppose in
theory object creation is supposed to be cheap.

One thing I haven’t considered here is the interaction between this
idea and type conversion operators. I think adding implicit type
conversion to the language is, most likely, a bad idea. It certainly
seems to hurt in C++. (If we did have type conversion, we could
handle this case by having a different type for all the intermediate
operators; in the String case this would simply
be StringBuffer.)

Implementation

I gave some thought to implementing this in gcjx. I think it
would be straightforward.

If magic method names were used, the parser would not need any
changes. Otherwise, we would have to add a new keyword (trivial) and
a change to static import (reasonably easy).

For semantic analysis, we would have to update the code for the
operators to handle this properly. That is quite simple; most of this
code is in two files. For instance, all of the simple binary
operators are handled in a single method.

For code generation, we could use a trick to make it relatively
simple. The idea would be that each operator class in the model would
hold a pointer to a method invocation object. If a particular
operator is in fact a call to an overloaded operator, this pointer
would be non-null. Then, an operator’s implementation of the visitor
API would simply forward to this method invocation instead:

    template<binary_function OP, operator_name NAME>
    void
    model_arith_binary<OP, NAME>::visit (visitor *v)
    {
      if (overloaded)
        overloaded->visit (v);
      else
        v->visit_arith_binary (this, lhs, rhs);
    }

I believe this approach would let us introduce this feature without
any changes to the back ends. (It would make some uses of the
visitor a bit odd — for instance the model dumper would print method
calls rather than operator uses here. I don’t think that is a major
problem. In any case this is fixable if we care enough, by adding
default implementations of a new visitor method to the visitor class
itself.)
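
The same forwarding trick, transliterated into Java so it can be run
on its own. None of these class names come from gcjx; they are
stand-ins, and operands are plain strings to keep the sketch small.
It also shows the dumper oddity mentioned above.

```java
class VisitorDemo {
    interface Visitor {
        void visitArithBinary(String op, String lhs, String rhs);
        void visitMethodInvocation(String method, String[] args);
    }

    static final class MethodInvocation {
        final String method;
        final String[] args;

        MethodInvocation(String method, String... args) {
            this.method = method;
            this.args = args;
        }

        void visit(Visitor v) {
            v.visitMethodInvocation(method, args);
        }
    }

    static final class ArithBinary {
        final String op, lhs, rhs;
        // Non-null only when semantic analysis resolved this operator
        // to a user-defined operator method.
        MethodInvocation overloaded;

        ArithBinary(String op, String lhs, String rhs) {
            this.op = op;
            this.lhs = lhs;
            this.rhs = rhs;
        }

        void visit(Visitor v) {
            if (overloaded != null)
                overloaded.visit(v);
            else
                v.visitArithBinary(op, lhs, rhs);
        }
    }

    // A toy "model dumper" back end; it needs no knowledge of overloading.
    static final class Dumper implements Visitor {
        final StringBuilder out = new StringBuilder();

        public void visitArithBinary(String op, String lhs, String rhs) {
            out.append(lhs).append(' ').append(op).append(' ').append(rhs);
        }

        public void visitMethodInvocation(String method, String[] args) {
            out.append(method).append('(').append(String.join(", ", args)).append(')');
        }
    }

    public static void main(String[] args) {
        ArithBinary node = new ArithBinary("+", "x", "y");
        node.overloaded = new MethodInvocation("operatorPlus", "x", "y");
        Dumper dumper = new Dumper();
        node.visit(dumper);
        System.out.println(dumper.out); // a method call, not "x + y"
    }
}
```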

Anti-overloading

Operator overloading remains a contentious issue. While in some
situations (namely, math-like things such as BigInteger or matrix
classes) it is clearly an improvement, in other situations it is
prone to abuse.

This is a big discussion, and I have a lot of thoughts on it, but
I want to actually finish writing this today; I’ll write more on the
topic of the future of programming languages later. Meanwhile, I
think one common anti-overloading argument, namely that overloaded
operators are obscure, is
definitively overturned by today’s IDEs. If overloading were part of
Java, you wouldn’t have to be confused about the meaning of a
+= appearing somewhere in the code — F3 in Eclipse would
take you directly to the proper definition.

In sum, this was a fun thought experiment for a Saturday morning.
I don’t think operator overloading is the most important feature
missing from Java, but it is often requested. Adding this to the
language would not break any existing code, would not greatly
complicate the language, the compiler or typical programs, and would
be clearly useful in some situations.

More Ajax

It turns out that a lot of folks are doing the ajax-based web
front page thing. Alexander Larsson told me about Netvibes, but there
is also fyuze, Protopage, and (how could I have missed this?) Google
Reader. I would imagine that My Yahoo will move this direction
someday too.

Netvibes in particular looks amazingly similar to start.com.
Protopage is a bit different in that it appears to be more of a sticky
note and links holder, which is a nice idea. I haven’t investigated
whether the pages are shareable. Anyway, the others should add
this functionality.

Services Versus Software

I found out about start.com last
week and I’ve been playing with it (even though it is written by MS,
boo). I think it compares very favorably to older “newspaper”
applications like My Yahoo. For one thing, the UI is notably cooler
— nice clean design. It seems to be a new-style Ajax application,
though as an end user I don’t really care much about its
implementation, just the result; namely that it is much more
interactive than Yahoo. I can click and rearrange things and it
happens immediately, rather than taking me to an intermediate edit
page. Also, with start.com it is immediately obvious how to go about
adding random RSS feeds — I didn’t even notice this feature of Yahoo
until writing this entry.

In addition to the UI improvements, it is better than My Yahoo in
another important way. start.com supplies a way to write gadgets,
which I gather are non-passive ways to integrate new feeds. And, as
gadgets are themselves just web entities of some kind, you can use
ones supplied by third parties.

So, that is cool.

The Problems

However, it occurs to me that this mode of deploying applications
has a major flaw: it can still only show me the world as viewed from
the server — but this is noticeably different from the world as
viewed from my machine. For instance, start.com can’t access the Red
Hat intranet. Nor can it access my bank accounts, or whatever
feed-like things I might happen to run privately (nightly test results
come to mind).

I see this as somewhat of an analog to the programming rule that
says that the structure of a program is a model of the structure of
the group that wrote it. In this case, my view is determined by the
relationships that start.com has, not by the relationships I have.

Analogies

Another analogous case is Google. Google does a great job of
searching the web and as a result we all use it dozens of times a day.
Rock on! But google.com only knows the public web, which is not the
same as my web.

Google-the-company, being smart, tries to solve this problem too,
by providing Google Appliances and Google Desktop Search.

But… once you accept this you’ve crossed that magical boundary
between services and applications, and all the usual rules for
software once again apply. In particular, why install some random box
on your internal network unless it is running free software under your
control? We just spent the last 15 years getting away from that!

Differentiate And Abstract

I think we need to clearly differentiate between data sources,
presentation, and UI implementation.

Data sources are the classic kind of “service”, and that is really
all I want from the web. In particular I don’t want a UI mixed in
with a service, as it is not going to be able to integrate all the
data sources I need.

What this means is that the presentation must be local, not
remote. So, the aggregator has to be a local program under my
control.

However, this does not necessarily mean that this program has to
be some typical C/Gtk thing. Ajax is perfectly fine — provided it
meets the usual user expectation requirements. I don’t see any reason
why the aggregator couldn’t be running inside a local web server.

Some recent stuff

This week on a whim I wrote javax.sound.sampled.
I’ve been testing it a bit against Tritonus and the handy code from jsresources.org.
This is going somewhat slowly, and I have yet to write the javadoc, so
it will be a little while before it goes in.

I’ve also been plodding along on gcjx, working toward getting the
Classpath generics branch to compile. Lately, though, I’ve been
thinking of shifting my emphasis back to the tree generator, so that I
can merge it into the gcc trunk sooner… the 1.5 work doesn’t have to
be complete for this to happen.

Frysk

I haven’t seen much hype for frysk, though it certainly
deserves some. (And, by the way, no, the “K.” mentioned on that page
is not Franz Kafka’s compiler-writing alter ego. Though, to be sure,
there are parallels between programming and The Castle.)

Having a programmable monitor and debug tool would be super handy.
There’s a kind of convergence here between tools like valgrind,
simulators, LLVM, and now frysk that is worth some investigation… it
is a rich period for execution tools, just as it is for
next-generation version control systems.

The other day I thought of an odd frysk application: sandboxing.
The idea is that you could run untrusted executable code under frysk.
frysk would intercept system calls and force failures for those you want to
disallow. I picture it as being a bit like the java sandbox, where
you could configure the wrapper program to allow or disallow certain
events (allow X connections, disallow other network connections, allow
file reads, disallow writes outside of pwd, etc). The UI would be a
bit like valgrind: sandbox --whatever program arg arg.
Think of it as SELinux on the cheap.

Come to think of it, this would be very useful for testing as well
— you could easily do controlled failure of certain system calls;
e.g., inject EIO at certain points of the test.
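
To give the configuration idea some shape, here is a toy policy table.
It is purely hypothetical: it does not use frysk's API, the event
categories are invented, and a real tool would have to map individual
system calls onto them.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.EnumMap;
import java.util.Map;

class SandboxPolicyDemo {
    // Invented event categories for illustration.
    enum Event { X_CONNECT, NET_CONNECT, FILE_READ, FILE_WRITE }

    static final class Policy {
        private final Map<Event, Boolean> rules = new EnumMap<>(Event.class);
        private final Path writableRoot;

        Policy(Path writableRoot) {
            this.writableRoot = writableRoot.toAbsolutePath().normalize();
        }

        Policy allow(Event e) { rules.put(e, true); return this; }
        Policy deny(Event e) { rules.put(e, false); return this; }

        // True if the event may proceed; false means "force a failure".
        boolean permits(Event e, Path target) {
            if (e == Event.FILE_WRITE && target != null) {
                // Writes are confined to the configured directory tree.
                return target.toAbsolutePath().normalize().startsWith(writableRoot);
            }
            // Anything without an explicit rule is denied.
            return rules.getOrDefault(e, false);
        }
    }

    public static void main(String[] args) {
        Policy policy = new Policy(Paths.get("."))
            .allow(Event.X_CONNECT)
            .deny(Event.NET_CONNECT)
            .allow(Event.FILE_READ);
        System.out.println(policy.permits(Event.FILE_READ, Paths.get("/etc/passwd")));
        System.out.println(policy.permits(Event.NET_CONNECT, null));
        System.out.println(policy.permits(Event.FILE_WRITE, Paths.get("out.log")));
    }
}
```

For the testing use, the same table could return an error code to
inject rather than a plain yes or no.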

Trying out KDE

I had a few frustrations with Gnome after upgrading my main
machine to FC4, so I’ve been giving KDE a try this week.

I have to say, amaroK is very
nice, the best music player I’ve tried. I find, to my surprise, that
I love all the weird little gui things they do. I don’t understand
why the panel part of it shows up in the notification area though; I’d
rather have a full set of player controls directly on the panel.

It also seems like the KDE panel is a bit more configurable — it
can act the way the Gnome panel could before FC2.

On the other hand, in some ways KDE seems less nice. The KDE
terminal doesn’t seem to turn URLs into links (that I can see). The KDE
panel sometimes moves applets around without my intervention (this one
seems to be a plain old bug). I had to install kbiff separately (the
whole point behind this exercise), and it is kind of clunky.

I was pleasantly surprised to find that the x-chat notification
thing I have also works in the KDE panel. I guess freedesktop.org is
earning its keep :-).

In general the two desktops feel pretty similar, much more than I
would have thought.

Anyway, I think I will switch back. I’m more used to gnome, and
now that I found out about mailnotify, my most
pressing irritation has gone away. I really wish Gnome had just
automatically run this applet for me instead of giving me a useless
error about the mailcheck applet going away. I literally went for a
couple of weeks thinking that they had simply dropped a useful applet
without providing a replacement. Foolish me, I guess, for not reading
the release notes; though honestly I think this easily falls into the
“just works” category. I usually put off upgrades because history has
taught me that in amongst the new goodies will be some unexplained
deletion of a feature I rely on. Upgrades are a fact of life;
handling them with some grace will make people happier.

Japi Update

A quick update on Classpath… a few days ago we hit 96% of 1.4
without much fanfare. Seeing as we were at 92% just a
month ago… you do the math.

Of course, that is 1.4. How are we doing against 1.5, the target
that really matters these days? That is harder to say. Comparing
cvs head to 1.5 shows us at 87%, while comparing the generics branch
shows us at 85%. These measure different
things, though — the latter includes generics information in its
comparison, but on the other hand the trunk tends to have fewer
missing methods (most development takes place there and merges to the
generics branch are relatively infrequent). I’ll take another look at
this after the next merge. Note also that one of the biggest “red”
chunks for 1.5, javax.management, is implemented by mx4j (at least, for the
purposes of folks who don’t mind mixing the Apache and Classpath
licenses).

Tools Matter

gcjx is a moderately-sized C++ program, 73,000 lines as counted by
wc. On this machine, even when taking advantage of g++’s
precompiled header feature, it takes 14 minutes to build.

By way of contrast, using jikes to build the 900,000 lines of java
code in Classpath takes 15 seconds on the same machine.

Now, in some ways this is not a fair comparison. Although I
compiled gcjx with -g and thus avoided many
optimizations, it is still more expensive to generate object code than
bytecode. Also, as plenty of people will tell you, C++ is a more
expressive language than Java, and so you should expect to pay for
that. Nevertheless we’re talking about a more than 650x slowdown for
using C++.
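
The arithmetic behind that figure, spelled out using only the numbers
quoted above (the helper name is made up):

```java
class BuildSpeedDemo {
    // Throughput in source lines compiled per second.
    static double linesPerSecond(long lines, double seconds) {
        return lines / seconds;
    }

    public static void main(String[] args) {
        double gcjx = linesPerSecond(73_000, 14 * 60); // g++ building gcjx
        double jikes = linesPerSecond(900_000, 15);    // jikes building Classpath
        System.out.printf("g++:   %.0f lines/s%n", gcjx);
        System.out.printf("jikes: %.0f lines/s%n", jikes);
        System.out.printf("ratio: %.0fx%n", jikes / gcjx);
    }
}
```

That works out to roughly 87 lines per second against 60,000, a ratio
of about 690.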

For me there is no question: the added power of C++ does not come
close to compensating for the time spent waiting for my builds to
complete. Some changes will introduce a 10 minute bubble into my
workflow — this has a noticeable effect on my concentration.

On the other hand, when working on Classpath and Mauve in Eclipse,
my changes are essentially ready immediately. For the last buglet I
worked on, I would simply work on the code in Classpath, or the test
case in Mauve, save my changes, and then, without pausing, switch
windows and run jamvm on the test.

I’m not really familiar enough with other C++ compilers on other
platforms to know where the fault for this lies — maybe other
compilers are a lot faster than g++. Though once you fix g++ you also
have to look at the linker, which seems to be another big time waster.

Naturally, tool performance is just one part of the whole
productivity story. That said, fast tools matter a lot, a lot more
than I used to think.

Lint

“Civilization is the agreement to have gaps between wars,” writes
Jeff Lint, subject of the biography Lint. This is an unusual
book that I picked up at the library on a whim. Parts of it are
outrageously funny, for instance the whole “Belly” sequence, Lint’s
cartoon TV show, and the description of his proposed Star Trek
episode. At the same time parts of this book disturbed me in a way, as
though reading the description of someone unknowingly in the midst of
a mental breakdown, and then slowly realizing that, hey, maybe that is
me.

In any case, I think Aylett has achieved something remarkable
here, in that he was able to write a “writerly” book which did not
make me want to retch. While at times the relentless
other-worldliness and just-barely-incoherence of Lint was tiring, at
other points I was struck by the sustained creativity that was
required to have written this book — like a sci-fi version of The
Passion, only not so deep. Several times I thought, “I wish I
had thought of that.”

I recommend reading this book, though I wouldn’t classify it as
an SF essential (say, The Female Man or so).

Oh, by the way, due to this book otters have now been added to the
list of animals which may no longer be used in jokes. You may
remember that penguins, platypuses, weasels, and newts are already on
this list.