Archive for the ‘software’ Category

Hotspot

I took my first glance through Hotspot recently. I’m vaguely curious to see what it would take to port it to PowerPC, not because I have a particular love for that architecture, but because it is important for switching Fedora and RHEL from GCJ to OpenJDK. Also, I’m spending a bit of time researching what I want to work on next, and I think OpenJDK is one of the available options.

Hotspot seems like pretty nice C++ code, overall. The structure is easy to navigate. There are comments pretty much where I expect them.

So far I’ve only looked at simple bits. In particular I tried to read the code that corresponds to code in libgcj with which I am very familiar — the interpreter and the verifier.

The Hotspot verifier is a bit more “C++-y” than mine. Stuff is spread out a bit more and a little more abstract, so I find it somewhat harder to read. But, that’s to be expected. I didn’t try to delve into the deeper things (how do they really handle jsr), since I was only looking at this while waiting for a build today. Some other time.

There seem to be two bytecode interpreters in Hotspot. First, a simple C-ish one, roughly like the libgcj interpreter before I added direct threading. This one is not the default; I suppose it is only used when bootstrapping on a new platform or something like that.
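
To make "simple C-ish" concrete: that style of interpreter is basically one big dispatch loop over the bytecode stream. Here is a toy sketch in Java (my own illustration, not Hotspot code, covering only a couple of opcodes):

    // Toy switch-based bytecode loop, in the style of the "simple"
    // interpreter described above.
    final class ToyInterpreter {
        static final int ICONST_1 = 0x04, IADD = 0x60, IRETURN = 0xac;

        static int run(byte[] code) {
            int[] stack = new int[16];
            int sp = 0, pc = 0;
            while (true) {
                switch (code[pc++] & 0xff) {
                case ICONST_1: stack[sp++] = 1; break;
                case IADD:     sp--; stack[sp - 1] += stack[sp]; break;
                case IRETURN:  return stack[--sp];
                default: throw new IllegalStateException("bad opcode");
                }
            }
        }

        public static void main(String[] args) {
            // iconst_1; iconst_1; iadd; ireturn  ==>  prints 2
            System.out.println(run(new byte[] { 0x04, 0x04, 0x60, (byte) 0xac }));
        }
    }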

The other interpreter is trickier and implements something which is sort of like early Kaffe JITs. The idea is that most bytecodes have an associated function that generates bits of machine code. Before executing a method, a very simple form of JITting is applied where these functions are called to compile code, naively, into a buffer which is then executed.

Well, I hope that is what happens… I’m not trusting myself much this week and the code is fairly convoluted.
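
Assuming I read it right, the shape is a table mapping each bytecode to a little generator function. A very hand-wavy sketch (every name below is invented, and a real version would emit machine instructions into an executable buffer rather than strings):

    // Hand-wavy sketch of "each bytecode has a function that generates
    // bits of machine code". Assembler here just collects strings.
    interface Template {
        void generate(Assembler asm);
    }

    final class Assembler {
        private final StringBuilder buf = new StringBuilder();
        void emit(String insn) { buf.append(insn).append('\n'); }
        String code() { return buf.toString(); }
    }

    final class TemplateTable {
        private final Template[] table = new Template[256];

        TemplateTable() {
            table[0x04] = new Template() {        // iconst_1
                public void generate(Assembler asm) { asm.emit("push 1"); }
            };
            table[0x60] = new Template() {        // iadd
                public void generate(Assembler asm) { asm.emit("pop b; pop a; push a+b"); }
            };
        }

        // Naive "compilation": concatenate the snippet for each bytecode;
        // the resulting buffer would then be executed directly.
        String compile(byte[] bytecodes) {
            Assembler asm = new Assembler();
            for (byte b : bytecodes) {
                Template t = table[b & 0xff];
                if (t == null)
                    throw new IllegalStateException("no template for opcode");
                t.generate(asm);
            }
            return asm.code();
        }
    }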

I haven’t made much progress on my goal of seeing how hard a port is. The existing ports look pretty small… from 28 KLOC (amd64) to 36 KLOC (sparc).

How useful are generics?

Recently Classpath’s generics branch was merged to become the main line. This means all future Classpath releases will use generics, and now we’re free to use other Java 5 features in the Classpath source.

When we started the generics branch we made a conscious decision to do a “shallow” translation to generics — we rewrote method signatures and visible field signatures, but not the bodies of methods. This was done so that we could more easily merge changes on the trunk to the generics branch, a smart decision considering that the generics branch lived for two years.
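
To illustrate the difference (a made-up example, not Classpath code): a shallow conversion fixes the signature but leaves the body raw, while a deep conversion propagates the type parameter through the body and eliminates the unchecked warnings.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    class Conversions {
        // Shallow: generic signature, raw-typed body. This compiles, but
        // with raw-type and unchecked-conversion warnings.
        static <T> List<T> copyShallow(List<T> in) {
            List out = new ArrayList();
            for (Iterator i = in.iterator(); i.hasNext(); )
                out.add(i.next());
            return out;
        }

        // Deep: the body uses the type parameter too; no warnings.
        static <T> List<T> copyDeep(List<T> in) {
            List<T> out = new ArrayList<T>();
            for (T t : in)
                out.add(t);
            return out;
        }
    }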

This weekend I spent some time adding generics in a deep way, that is, modifying the bodies of methods to properly use generics, and attempting to remove all the warnings related to raw types, unchecked casts, etc. Aside from random warning removal, I had a specific question in mind: how much reliability do generics add?

I completely converted about 5 components (meaning some core package plus all its support code). In all of this I found 2 actual bugs. (Also I once found a bug in imageio on the generics branch during a shallow conversion, bringing the known total of bugs found by generics to 3.)

In one case we were making an invalid assumption about the actual return type of Collection.toArray() (and, btw, this particular API can only be truly fixed by reified generics; in a sense we were lucky to catch this bug).
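
The classic shape of that bug, with made-up surrounding code: toArray() with no argument always returns Object[], so a cast to a more specific array type compiles cleanly but fails at run time.

    import java.util.Collection;

    class ToArrayPitfall {
        static String[] names(Collection<String> c) {
            // Broken: toArray() returns Object[]; the cast compiles
            // without any warning but throws ClassCastException at run
            // time. Only reified generics could catch this statically.
            // return (String[]) c.toArray();

            // Fixed: pass a typed array so the result has the right class.
            return c.toArray(new String[0]);
        }
    }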

In another case a protocol implementation had made an incorrect assumption about the types of the contents of a collection it returned.

I have to say I was very surprised by this result. Generics are somewhat tricky to use, add a lot of verbiage to the source (especially since Java doesn’t yet have C#’s var, aka C++’s auto type inference feature), and now, apparently, don’t really catch very many bugs.

There are a couple of theories to consider besides “generics aren’t worth it”.

One is that Classpath, being a core library, is less susceptible to bugs of this kind than typical Java programs. I don’t consider this very likely, but more experience with other programs would be useful.

Another is that the Classpath development process is unusually good and catches more bugs than normal. This would be nice if it were true, but I doubt this is very likely either.

Finally, one could argue that catching even a small number of bugs in legacy code is good, and that the real worth of generics comes when writing new code.

I’ll collect more data as we convert more of Classpath to use generics deeply. I’m curious to know whether other folks have had more positive experiences during conversion, or whether there’s something I’m missing about all this. At the moment generics appear to be “nice to have” but hardly worth the substantial upgrade effort across the toolchain and the large body of existing Java code…

Open javac

Today I checked out the newly-free javac and built it with (heresy?) Eclipse. Aside from waiting for it to build (this machine is a bit over-taxed at the moment), this only took a few minutes to set up.

I read through it a little bit, but not extensively. Overall it seems well done… easy to read, has a reasonable amount of javadoc, the big-picture layout is easy to follow, it uses generics throughout, etc.

I’m not sure what I was expecting really. Something more painful I suppose; I think I assume that all programs will be a pain to set up and build. It helps, of course, that Andrew and others got there earlier and put some needed support code into Classpath 🙂

It doesn’t look outrageously hard to add code to support using javac as a front end to gcj, the way we’re using ecj right now.

Sun Frees Java

I was out of town on my honeymoon (it was excellent, thank you very much) when Sun announced their choice of license for open source — or should I say free software? — Java. A few people have written to ask me my opinion on the topic, and rather more people have asked about the relevance of this change for Fedora, for gcj, and for Classpath. So, I’ve spent some time writing a post detailing my thoughts.

First Thoughts

This news is wonderful! It is the culmination of a 10-year dream to have a free software Java implementation, and not only that, it is the best possible way to achieve the goal: having the reference implementation be the open implementation.

I’m also delighted by Sun’s choice of licenses.

Furthermore, Sun is executing well on their promises. The people I’ve talked to have responded well to feedback (for instance on fixing some details in the contributor agreement), and they were even clear about the distinction between free software and open source.

I’m very optimistic that they will continue in this vein.

What about Fedora?

Naturally folks out there were eager to see OpenJDK in Fedora immediately. These initial responses are always a bit funny because the posters seem to send their messages before getting to the part of the press release explaining that not all the software is publicly available yet. Silly.

Still, the instinct is the right one. In my opinion, OpenJDK is the way to go. When we can build a complete implementation, we ought to package it and make it the default JVM on Fedora.

I think this won’t happen until at least Fedora Core 8.

Meanwhile, for Fedora Core 7 we should ship the new ecj-based gcj with support for all the new 1.5 features. This requires a bit more work, and perhaps a backport, depending on the base gcc chosen for the OS.

One issue down the road for OpenJDK adoption on Fedora is that it does not have a PPC port. Community hackers, please write this.

Oh, and when we ship OpenJDK, someone should take a good look at shipping NetBeans as well. Eclipse is wonderful and in some ways definitely better than NetBeans, but there are areas where NetBeans is superior as well.

What about Classpath?

I think our experience over the years has shown that every time we merged code bases, we gained. We gained developers, we gained quality, we gained completeness. In other words, it is better to work on a single good implementation than on multiple competing implementations. (I don’t think this is a universal law, but it is true in the specific case of Java class libraries: a program where the result is relatively well-defined).

So, it is hard for me to see what role Classpath will play in the future. OpenJDK’s library will have the same license and will be the reference implementation. That argues strongly for moving Classpath development work time to OpenJDK hacking instead. (At least assuming contributor agreement stuff is cleared up … a detail.)

Classpath does have some code that the JDK does not. For instance we have some interesting AWT peers, we have a GConf back end for prefs, our HTML implementation is better, and a few other things. For the peers I think we could do separate releases (contributing these to OpenJDK is difficult or impossible — it may violate the FSF’s charter). For “pure library” bits, I’m not sure what to do 🙁

What about GCJ?

For many cases I think it will be preferable to switch entirely to OpenJDK, and not use GCJ at all.

Don’t get me wrong — GCJ is cool, and I’m proud of what we’ve done. I think we’ve solved some hard problems in creative ways. But, for many uses of Java, compatibility is the name of the game, and it is simpler to get this using the reference platform.

I think there are still areas where GCJ makes a lot of sense — embedded systems and the various platforms where there is a GCJ port but no other VM. I don’t expect it will die, but I do expect that the effort going into it will subside quite a bit. That will matter less, though, since it will be somewhat simpler for GCJ to keep pace now that the libraries will need much less work than in the past.

Don’t Panic

Naturally this is all still contingent on things working out as they should. I’m not worried about this, and I don’t think you should worry either.

Also some patience is required. We’ll know much more in 6 months.

FOSDEM

Some folks from Sun will be at FOSDEM this year to talk about OpenJDK. And, of course, the usual Classpath and GCJ contingent will be there. I hope to see you there.

More on synchronization

I started a draft implementation of my identity synchronization idea. In the course of doing this I ran across a minor problem.

My initial idea was to keep things very simple: a shared resource would be a collection of bits, and the users of the API would handle all aspects of interpretation. The library would handle bookkeeping details but would defer conflict resolution to the caller — handing back two versions of a file (three-way merge is a possibility, too, but I’m not convinced it is worth the effort) and having the caller respond with a merged version.

Now suppose you log in on your laptop and make some changes while disconnected. When you subsequently reconnect, the ideal would be for all the identity data to automatically resynchronize with the server. This seems like the most useful thing to do — I don’t want to have to actually restart RSSOwl for my saved blog-reading sessions to be available to my other machines.

Unfortunately this throws a wrench into the naive approach outlined above. If the synchronized data is uninterpreted bytes, then any conflict defeats generic synchronization: there is no way to merge without running the particular application that understands the data.

One way to fix this would be to mandate a file format so that the merge code can be generic. I’m not very fond of this. The other, and I think superior, plan would be to handle conflicts monotone-style: let all commits succeed, and handle merging on demand. In this model we can upload a file (with some ancestry info); when downloading updated files from the server we would simply download all available “head” files and the application would perform multiple merges.
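
To pin down the shape of that plan, here is a sketch of the API I have in mind — every name and signature is hypothetical:

    import java.util.List;

    // Hypothetical monotone-style store: commits always succeed, and
    // conflicting versions simply coexist as multiple heads until some
    // application merges them.
    interface SyncStore {
        // Upload a new version, recording its ancestry. This never
        // fails with "conflict"; at worst it creates another head.
        void commit(String resource, byte[] data, String parentVersion);

        // All current head versions of a resource. One element means
        // no conflict; several mean the application must merge.
        List<byte[]> heads(String resource);
    }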

This means having a smart server, but I think that was inevitable anyhow. (Probably Nathaniel is on target, as usual, and I should be reusing monotone’s code for this…)

Another little problem that came up is keyring management. In particular, the gnome keyring will be needed to decrypt data downloaded from the server. But, we want to store the keyring itself as a shared resource. I think the only answer here is a special API used only by gnome-keyring that lets us skip over the download step the first time the keyring file is needed.

Finally, there is a completely different design available. We could use a file-based service like monotone, check things in, and then provide merge utilities for each different file type. One reason I don’t like this is that I suspect current programs don’t differentiate between the different types of stored data; changing the programs themselves ensures a clean separation of identity-related data and transient data. At least for the time being I’m sticking with my first approach.

Recent Hacking

Lately I’ve been doing some random Gnome hacking. I’ve been a bit bored with what I’ve been working on recently (SRPM hacking — I hate this stuff) so I’ve cast around a bit for other things to do.

At first I was playing with frysk, but I promised myself that I would be a dilettante and hack only the fun bits. Unfortunately this means I’m now stuck waiting for other folks to write some important infrastructure, and also to fix a few mildly contentious bugs. (The bugs would be fun, but I’m also not interested in navigating the controversies in my spare time…)

So, I inconsistently turned to hacking on gnome-session — something both un-fun and contentious. I’m not always sure what I’m up to…

I’d forgotten exactly how awful XSMP is. It pretends to be policy-free, but when you look deeper you see that it makes many implicit assumptions about how things will actually work. It also uses the egregiously nasty ICE. And, of course, gnome-session’s implementation is no great joy, either, which is more than a little my fault.

Anyway, I fixed a number of problems and now I’m waiting for patch review.

Gnome hacking has gotten better since the bad old days. The documentation (glib in particular) has improved, for one thing. The newer APIs are also better thought out.

I looked a bit more into a minimal implementation of my state-sharing idea. Gnome seems to already have most of the bits I’ll need: I can lift some encryption code from gnome-keyring, and gnome-vfs allows easy access to remote sites (the latter isn’t perfect for my purposes but will do for a proof of concept).

Dancing with myself

Every time I go on a trip I plan to spend half a day “setting up my laptop”. Mostly this means copying files back and forth until it somewhat resembles my main machine, and hoping I haven’t forgotten anything (say, for instance, the time I went to the RH Summit and forgot to bring the VPN key. Sigh).

I’m much too lazy to set up rsync for my desktop configuration for this purpose, and anyway I suspect that I’d spend as much time tweaking any rsync script as I do just copying things around. Instead I’ve spent my time more usefully, thinking about how this could be automated for KDE or Gnome.

What I’m thinking about in particular is a new role for kde.org or gnome.org: give everybody who uses the desktop a little space on the web site, and automatically encrypt and upload certain bits of data; then automatically download and synchronize it when the user logs in and has a network connection.

Initially this could just be a few things relating to my identity: the contents of my keyring (including passwords for web sites), information about my various mail accounts, my gaim accounts, newsgroups I use and information about what articles I’ve read, my blog list in RSSOwl, my calendar, my sticky notes, my .cvspass file, probably even my mozilla bookmarks.

I think that is actually pretty minimal and that over time we’d find more things to add — just go through the list of programs you use daily and it is pretty easy to see bits of configuration that relate to your identity, as opposed to machine- or session-local ephemera. But initially just starting with one or two of these to work out the concept would be fine.

The synchronization phase is the only tricky point. I’m often using both my machines at once, and if I make an account on some website using my main machine, and make another account around the same time using my laptop, I think it is reasonable to expect that the results will be merged — that I won’t lose one or the other.

This means that simply copying config files around in a naive manner is probably not workable. I don’t think there are any big problems solving this; in most cases I care about, it boils down to computing the union of two lists.
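
For instance (a toy sketch), merging two diverged versions of a list:

    import java.util.ArrayList;
    import java.util.LinkedHashSet;
    import java.util.List;

    class ListMerge {
        // Merge two versions of, say, an account list by taking the
        // union, keeping the order of first appearance.
        static <T> List<T> union(List<T> a, List<T> b) {
            LinkedHashSet<T> merged = new LinkedHashSet<T>(a);
            merged.addAll(b);
            return new ArrayList<T>(merged);
        }
    }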

In the unusual case where the two configurations refer to one account with different passwords, I suppose we could defer to the user at synchronization time. I doubt this happens often.

In the bigger view I think this is just the first step toward integrating the desktops with their respective web sites. There are plenty more ways this could be done.

Bumptop again

Up until now I’ve thought that bumptop (and lowfat, and to a lesser extent croquet) was pretty but not very useful. The demos show a lot of nice file manipulation tools, but I use nautilus very rarely; so bumptop has seemed like it put a lot of energy into the wrong place.

I’ve been thinking about it a bit more and I’m not as sure as I once was. It took a while, but I finally realized that my desktop is littered with things that, on a bumptop desktop, would be active objects subject to manipulation — that bumptop actions are not limited to “files” but instead “desktop objects (some of which are files)”.

So, for instance, all the launchers on the gnome menu and on my various panels would fit into this category. So would things like URLs I dragged to the desktop, notification area icons, other panel applets like the mail checker, Mathusalem-based task trackers, etc.

Something like a panel full of launchers, or the gnome menu itself for that matter, would instead become a bumptop container of some kind. Menu editors would just go away, deferring to a more general (and, if this whole idea makes any sense at all, easier to use) organizational technique.

Technologically it all seems like a good time… groovy physics-based 3d-ness; downloading hibernating python bits from the web to live like fireflies on the desktop, telling us the state of our imap account or calendar or remote build, until we delete them or bury them under a pile of crumpled PDFs; treating our applications like the Edsels they are and, finally, stripping them of their chrome, choosing instead to pull bits from youtube and google video and our mythtv servers to remix them in place in a specialized desktop… a google of hackers could spend years having fun on this.

I’m still skeptical though: the extended definition of “desktop objects” still only covers a small minority of what I do daily. I only occasionally use launchers (and those I have set up nicely on an auto-hide side panel), and the whole point of applets and notifiers is that they are typically passive.

Silly desktop idea

A silly desktop idea: .desktop files should start with “#! some-new-program” and be executable. Then I could run them directly from the command line. Naturally the script interpreter would know how to parse command line arguments and pass them on to the underlying program appropriately.
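
Something like this, where some-new-program is the hypothetical interpreter and the rest of the entry is made up:

    #!/usr/bin/some-new-program
    [Desktop Entry]
    Type=Application
    Name=My Editor
    Exec=myeditor %F

Then running ./myeditor.desktop foo.txt from a shell would launch myeditor on foo.txt, with the interpreter substituting the arguments into the Exec line’s %F.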

Gnome Deployment

I was reading about Gnome 3.0 recently (and I’m working on a long post about it), when I ran across a post by Luis Villa about Gnome versus web development. (BTW you should read his blog if you don’t already.)

I found this pretty compelling in general — it is a nice analysis and it is also a list of things that can actually be addressed without going crazy.

I’m interested in the deployment issue, since I’ve done some playing in this area with Java Web Start. And, though I think I don’t agree that this is really a blocker, it is still fun to think about a bit — the technology is fun and cool, I’m just not convinced it solves a pressing problem. (Your response here…)

First, yes, with C or C++ I think deployment is just going to continue to suck.

Java already solved this problem in a pretty nice way. It is no big deal to write a single-click install, with automatic updating, good security (either a sandbox or signatures for apps that need more permissions), etc. This could use a little work for better Gnome integration, and of course we should be shipping this in Fedora, but the amount of needed work here is quite small. (I assume C# has something similar but I really don’t know.)
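
For reference, the Web Start side of this is just a small XML file served from a web site; a minimal JNLP descriptor (names and URLs made up) looks like:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Minimal JNLP descriptor; the URLs and names are made up. -->
    <jnlp spec="1.0+" codebase="http://example.org/app" href="app.jnlp">
      <information>
        <title>Example App</title>
        <vendor>Example Org</vendor>
      </information>
      <resources>
        <j2se version="1.5+"/>
        <jar href="app.jar"/>
      </resources>
      <application-desc main-class="org.example.Main"/>
    </jnlp>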

I looked into this for Python a little this weekend. And, it looks like Python Eggs and the Python cheese shop provide a nice basis to make this work. Finishing this is pretty concrete:

  • Define an XML document describing an application (see the sketch after this list). This would include the same kinds of things that ended up in JNLP: icons, splash screens, the required python version, a list of python egg URLs, etc.
  • Add some kind of signing to python egg downloads. Unfortunately I don’t think sandboxing is an option. One idea would be to restrict eggs to python.org and gnome.org URLs — but that interferes with the idea of easy deployment.
  • Make a new URL type so that Mozilla will know to run the appropriate Python script to parse the XML, install the eggs, and launch the application.
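
Here is the kind of descriptor I have in mind for the first bullet — every element name is invented, just to show the JNLP-like shape:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Hypothetical application descriptor, by analogy with JNLP. -->
    <python-app spec="0.1">
      <information>
        <title>Example App</title>
        <icon href="http://example.org/app/icon.png"/>
        <splash href="http://example.org/app/splash.png"/>
      </information>
      <python version="2.4+"/>
      <resources>
        <egg href="http://example.org/app/ExampleApp-1.0-py2.4.egg"/>
      </resources>
      <launch module="exampleapp.main"/>
    </python-app>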

For best results this would incorporate some things I was looking at for NetX: having the ability to download an application without running it; dbus integration so that a previously cached application won’t try to update itself in a net-less environment; drag-and-drop of entire applications from Mozilla to the panel; and now, integration with Mathusalem when downloading eggs.