This weekend I thought a bit about operator overloading in Java.
I thought I’d write up some of the things I considered.
Basic Approach
My first idea was something similar to what C# does: allow
essential operators to be overloaded, but not unusual ones like ||
or . (member access). In C#, operators must be public and static, so
I would just copy that too. It is probably best to introduce a new
operator keyword, though we could just as easily anoint magic method
names like operatorPlus. Since all operator uses are unqualified
(a Foo.+ b just looks too ugly), users would use inheritance or
static import to bring operators into scope. If we use the operator
keyword, then we have to augment static import a bit to allow
importing operators. I think this argues slightly for simply picking
special method names.
Since we would simply be translating operators into method calls,
no serious binary compatibility issues would arise. We could simply
define the rules for operators to map directly to the rules for
ordinary methods.
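To make the translation concrete, here is a minimal sketch of what a compiler might emit under the magic-method-name variant. The Vec2 class is hypothetical, and the proposed operator syntax appears only in comments because it is not valid Java today.

```java
// Hypothetical value class, invented for this illustration.
final class Vec2 {
    final int x;
    final int y;

    Vec2(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Under the proposal, "a + b" on two Vec2 values would resolve here.
    public static Vec2 operatorPlus(Vec2 a, Vec2 b) {
        return new Vec2(a.x + b.x, a.y + b.y);
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

class DesugarDemo {
    public static void main(String[] args) {
        Vec2 a = new Vec2(1, 2);
        Vec2 b = new Vec2(3, 4);
        // Proposed source:      Vec2 c = a + b;
        // Assumed translation:  an ordinary static method call.
        Vec2 c = Vec2.operatorPlus(a, b);
        System.out.println(c); // prints (4, 6)
    }
}
```

Because the result is a plain method call, the usual rules for overload resolution, access control and linking would apply unchanged.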
C# also synthesizes the compound assignment operators like += from
primitive operators, if the compound operators are not explicitly
declared. This seems like a good idea to me.
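As a rough sketch of that synthesis, a compound assignment could be rewritten into the simple operator followed by an assignment. The Meters class and the total helper below are hypothetical names used only for illustration.

```java
// Hypothetical immutable quantity type.
final class Meters {
    final long value;

    Meters(long value) {
        this.value = value;
    }

    // Only the simple "+" operator is declared; no "+=" exists.
    public static Meters operatorPlus(Meters a, Meters b) {
        return new Meters(a.value + b.value);
    }
}

class CompoundAssignDemo {
    static Meters total(Meters... legs) {
        Meters sum = new Meters(0);
        for (Meters leg : legs) {
            // Proposed source:   sum += leg;
            // Synthesized form:  sum = sum + leg, which in turn becomes:
            sum = Meters.operatorPlus(sum, leg);
        }
        return sum;
    }
}
```

One consequence of this design is that += on an immutable type rebinds the variable rather than mutating the object, just as it does for String today.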
We must also define how this interacts with boxing, but that is
also simple to do. The rule should be modeled after ordinary method
invocations, with the additional note that if the left hand side of a
binary operator is a primitive type, then we simply do not consider
non-static methods (i.e., we don’t box initially).
Additions
This would be pretty useful, but I can think of two possible
problems. The first problem is that it just seems nicer to allow
non-static methods as well. This is easily added, along with a rule
to search first for an instance method (in the left hand argument for
binary operators, or in the sole argument for unary operators), followed
by searching for a static method.
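The search order can be sketched with reflection standing in for the compiler's lookup. This is a simplification under stated assumptions: a real compiler would do this at compile time, would pick the most specific overload, and would also see static methods brought in by static import, whereas this sketch only looks at public methods of the left operand's class. Celsius, Kelvin and OperatorLookup are hypothetical names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical class declaring both an instance and a static operator.
class Celsius {
    public Celsius operatorPlus(Celsius other) { return this; }
    public static Celsius operatorPlus(Celsius a, Celsius b) { return a; }
}

// Hypothetical class declaring only a static operator.
class Kelvin {
    public static Kelvin operatorPlus(Kelvin a, Kelvin b) { return a; }
}

class OperatorLookup {
    /** Returns the method "lhs + rhs" would call, or null if none applies. */
    static Method resolvePlus(Class<?> lhs, Class<?> rhs) {
        Method staticMatch = null;
        for (Method m : lhs.getMethods()) {
            if (!m.getName().equals("operatorPlus")) {
                continue;
            }
            Class<?>[] params = m.getParameterTypes();
            boolean isStatic = Modifier.isStatic(m.getModifiers());
            if (!isStatic && params.length == 1
                    && params[0].isAssignableFrom(rhs)) {
                // Step 1: an instance method on the left operand wins.
                return m;
            }
            if (isStatic && params.length == 2
                    && params[0].isAssignableFrom(lhs)
                    && params[1].isAssignableFrom(rhs)) {
                // Step 2 candidate: used only if no instance method matches.
                staticMatch = m;
            }
        }
        return staticMatch;
    }
}
```

For Celsius the instance method is chosen even though a static one also applies; for Kelvin the static method is the fallback.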
The second problem is that some operators are commutative, but
method lookup is not. Suppose you define BigInteger.operator+(int).
Now you can write bigint+5, but 5+bigint is an error, because nothing
declares an operator taking (int, BigInteger). This means you will
end up writing a number of fairly redundant operators; it would be
nice to be able to remove this redundancy.
C# and C++ seem to simply punt on this issue, which I suppose
makes sense. You might think about adding commutativity rules, but
that introduces an asymmetry for the situations where operators are
not commutative. Also, it makes searching for the operator method
strangely complicated.
More Ideas
Groovy apparently maps some operators onto already existing
methods, for instance mapping == to the equals method. This is a cute
idea, but I think it is dangerous in practice. Sometimes you really do
want to be able to tell if two objects are identical, and equals is
frequently overridden. We could resurrect this idea by having a
special interface that indicates that we want this sort of mapping
to happen automatically. We would then add new methods somewhere
(e.g., System) to allow un-overloaded equality comparisons, that is,
plain reference identity.
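Here is one way the opt-in could behave, as a sketch only. The marker interface name ComparisonOperator is the one used later in this post; Tag, EqualityDesugar, operatorEquals and identical are hypothetical. The sketch checks the marker at run time for brevity, while a real compiler would presumably decide from the static type of the operands.

```java
// Marker interface: opts a class in to "==" meaning equals().
interface ComparisonOperator {
}

// Hypothetical opted-in class with value-based equality.
final class Tag implements ComparisonOperator {
    private final String name;

    Tag(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Tag && ((Tag) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

class EqualityDesugar {
    // Assumed meaning of "a == b" under the proposal.
    static boolean operatorEquals(Object a, Object b) {
        if (a instanceof ComparisonOperator) {
            return a.equals(b);
        }
        return a == b; // classes that did not opt in keep identity semantics
    }

    // Stand-in for the new identity method the post suggests adding
    // somewhere such as System; the name is an assumption.
    static boolean identical(Object a, Object b) {
        return a == b;
    }
}
```

Classes that do not implement the marker are unaffected, which is what keeps existing code from changing meaning.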
Groovy’s idea of mapping some comparison operators to compareTo
seems like a good one to me. It doesn’t suffer from the same special
problems as equality; we could automatically define operators like >
for any class which implements Comparable. In fact, I think this
should be the only way to overload these comparisons.
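The mapping from relational operators to compareTo is mechanical. A minimal sketch, with CompareDesugar and its method names invented for illustration:

```java
import java.math.BigInteger;

class CompareDesugar {
    // Proposed source "a > b" becomes a.compareTo(b) > 0, and so on.
    static <T extends Comparable<? super T>> boolean greaterThan(T a, T b) {
        return a.compareTo(b) > 0;
    }

    static <T extends Comparable<? super T>> boolean greaterOrEqual(T a, T b) {
        return a.compareTo(b) >= 0;
    }

    static <T extends Comparable<? super T>> boolean lessThan(T a, T b) {
        return a.compareTo(b) < 0;
    }

    static <T extends Comparable<? super T>> boolean lessOrEqual(T a, T b) {
        return a.compareTo(b) <= 0;
    }

    public static void main(String[] args) {
        BigInteger small = BigInteger.ONE;
        BigInteger big = BigInteger.TEN;
        // Proposed source: big > small
        System.out.println(greaterThan(big, small)); // prints true
    }
}
```

Since all four operators derive from one method, a class cannot define > and < inconsistently, which is part of the appeal of making this the only route.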
One other thing to think about is whether the special operator
method names should simply be taken from BigInteger (add, subtract,
multiply, and so on). This approach would allow retrofitting of some
operator overloading onto this existing code without any library
changes. Note that due to the commutativity problem this would not be
completely seamless, so for best results we would need to add new
methods regardless. In my opinion, though, this would not be a good
idea, as methods named add are not uncommon; they are used all over
the collections API, and turning all of these into + doesn’t seem
smart.
Example
Here’s an example of how we would add an operator to BigInteger.
    // New special interface indicating that == and != should be
    // overridden.
    public interface ComparisonOperator
    {
    }

    public class BigInteger extends Number
      implements Comparable<BigInteger>, ComparisonOperator
    {
      ...

      // Implement the smallest number of operators to let "+"
      // work.
      public BigInteger operatorPlus(BigInteger val)
      {
        return add(val);
      }

      // Rely on widening primitive conversion.
      public BigInteger operatorPlus(long val)
      {
        ...
      }

      // Static form, needed when the left hand side is a primitive
      // (e.g. 5 + x). add() takes a BigInteger, so convert first.
      public static BigInteger operatorPlus(long val, BigInteger bi)
      {
        return bi.add(BigInteger.valueOf(val));
      }
    }

    // Use it.
    BigInteger x = whatever();
    System.out.println(x + 5);
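Since java.math.BigInteger cannot be edited and the operator syntax does not compile today, here is a runnable approximation of the same idea. It assumes the operators live as static methods in a hypothetical helper class, BigOps, which under the proposal would be brought into scope with static import; the calls in main are what the operator expressions would presumably translate to.

```java
import java.math.BigInteger;

// Hypothetical holder for the operator methods.
class BigOps {
    static BigInteger operatorPlus(BigInteger a, BigInteger b) {
        return a.add(b);
    }

    // Right operand is a primitive; int arguments widen to long.
    static BigInteger operatorPlus(BigInteger a, long b) {
        return a.add(BigInteger.valueOf(b));
    }

    // Left operand is a primitive; this mirrors the overload above and
    // exists only because of the commutativity problem.
    static BigInteger operatorPlus(long a, BigInteger b) {
        return BigInteger.valueOf(a).add(b);
    }

    public static void main(String[] args) {
        BigInteger x = BigInteger.valueOf(37);
        // Proposed source: x + 5
        System.out.println(operatorPlus(x, 5)); // prints 42
        // Proposed source: 5 + x
        System.out.println(operatorPlus(5, x)); // prints 42
    }
}
```

Ordinary overload resolution picks the (BigInteger, long) method for the first call and the (long, BigInteger) method for the second, with no boxing involved.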
Efficiency
Currently, Java compilers will take an expression using the
special String addition operator and turn it into a series of method
calls on a compiler-generated StringBuffer object. In a case with
multiple additions, e.g. a+b+c, the compiler will generate a single
buffer and make multiple calls on it.
Unfortunately, I don’t see a simple way to recapture this
efficiency for user-defined operators. It would be possible for the
compiler to notice a series of overloaded operators where each call
is resolved to the same method. And, for example, this could be
turned into a call to a varargs method:
    public static String operatorPlus(Object... args)
    {
      StringBuffer result = new StringBuffer();
      for (Object o : args)
        result.append(o);
      return result.toString();
    }
However, in this situation you end up creating a new garbage array.
Maybe this idea is the way to go, I don’t really know. I suppose in
theory object creation is supposed to be cheap.
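To show the difference the rewrite would make, this sketch compares the naive pairwise translation of a+b+c with the collapsed varargs call. ChainDemo, pairwise, collapsed and the calls counter are hypothetical and exist only to make the number of invocations visible.

```java
class ChainDemo {
    // Counts invocations so the two translations can be compared.
    static int calls = 0;

    static String operatorPlus(Object... args) {
        calls++;
        StringBuffer result = new StringBuffer();
        for (Object o : args) {
            result.append(o);
        }
        return result.toString();
    }

    // Naive translation of a+b+c: two calls, one intermediate String,
    // and two argument arrays.
    static String pairwise(Object a, Object b, Object c) {
        return operatorPlus(operatorPlus(a, b), c);
    }

    // Collapsed translation: one call, one argument array, no
    // intermediate result.
    static String collapsed(Object a, Object b, Object c) {
        return operatorPlus(a, b, c);
    }
}
```

Both forms produce the same string; the collapsed form trades the intermediate results for a single array allocation.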
One thing I haven’t considered here is the interaction between this
idea and type conversion operators. I think adding implicit type
conversion to the language is, most likely, a bad idea. It certainly
seems to hurt in C++. (If we did have type conversion, we could
handle this case by having the intermediate operators return a
different type; in the String case this would simply be
StringBuffer.)
Implementation
I gave some thought to implementing this in gcjx. I think it
would be straightforward.
If magic method names were used, the parser would not need any
changes. Otherwise, we would have to add a new keyword (trivial) and
a change to static import (reasonably easy).
For semantic analysis, we would have to update the code for the
operators to handle this properly. That is quite simple; most of this
code is in two files. For instance, all of the simple binary
operators are handled in a single method.
For code generation, we could use a trick to make it relatively
simple. The idea would be that each operator class in the model would
hold a pointer to a method invocation object. If a particular
operator is in fact a call to an overloaded operator, this pointer
would be non-null. Then, an operator’s implementation of the visitor
API would simply forward to this method invocation instead:
    template<binary_function OP, operator_name NAME>
    void
    model_arith_binary<OP, NAME>::visit (visitor *v)
    {
      if (overloaded)
        overloaded->visit (v);
      else
        v->visit_arith_binary (this, lhs, rhs);
    }
I believe this approach would let us introduce this feature without
any changes to the back ends. (It would make some uses of the
visitor a bit odd — for instance the model dumper would print method
calls rather than operator uses here. I don’t think that is a major
problem. In any case this is fixable if we care enough, by adding
default implementations of a new visitor method to the visitor class
itself.)
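For readers who don't know the gcjx sources, the forwarding trick can be mimicked in a few lines of Java. This is an analogue written for illustration, not gcjx code: ModelVisitor, ArithBinary, MethodInvocation and Dumper are all hypothetical names, and the model is reduced to the bare minimum.

```java
interface ModelVisitor {
    void visitArithBinary(ArithBinary op);
    void visitMethodInvocation(MethodInvocation call);
}

class MethodInvocation {
    final String methodName;

    MethodInvocation(String methodName) {
        this.methodName = methodName;
    }

    void visit(ModelVisitor v) {
        v.visitMethodInvocation(this);
    }
}

class ArithBinary {
    final String operator;
    // Non-null only when semantic analysis resolved this operator to
    // an overloaded operator method.
    final MethodInvocation overloaded;

    ArithBinary(String operator, MethodInvocation overloaded) {
        this.operator = operator;
        this.overloaded = overloaded;
    }

    void visit(ModelVisitor v) {
        if (overloaded != null) {
            overloaded.visit(v); // back ends only ever see a method call
        } else {
            v.visitArithBinary(this);
        }
    }
}

// Stand-in for a back end or the model dumper.
class Dumper implements ModelVisitor {
    final StringBuilder out = new StringBuilder();

    public void visitArithBinary(ArithBinary op) {
        out.append("operator ").append(op.operator);
    }

    public void visitMethodInvocation(MethodInvocation call) {
        out.append("call ").append(call.methodName);
    }
}
```

Running the Dumper over an overloaded + yields "call operatorPlus" rather than "operator +", which is exactly the oddity mentioned above.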
Anti-overloading
Operator overloading remains a contentious issue. While in some
situations (namely, math-like things such as BigInteger or matrix
classes) it is clearly an improvement, in other situations it is
prone to abuse.
This is a big discussion, and I have a lot of thoughts on it, but
I want to actually finish writing this today; I’ll write more on the
topic of the future of programming languages later. Meanwhile, I
think one common anti-overloading argument, namely that overloaded
operators are obscure, is definitively overturned by today’s IDEs. If
overloading were part of Java, you wouldn’t have to be confused about
the meaning of a += appearing somewhere in the code: F3 in Eclipse
would take you directly to the proper definition.
In sum, this was a fun thought experiment for a Saturday morning.
I don’t think operator overloading is the most important feature
missing from Java, but it is often requested. Adding this to the
language would not break any existing code, would not greatly
complicate the language, the compiler or typical programs, and would
be clearly useful in some situations.