JIT Compilation for Emacs

There have been a few efforts at writing an Emacs JIT — the original one, Burton Samograd’s, and also Nick LLoyd’s. So, what else to do except write my own?

Like the latter two, I based mine on GNU libjit. I did look at a few other JIT libraries: LLVM, gcc-jit, GNU Lightning, MyJit.  libjit seemed like a nice middle ground between a JIT with heavy runtime costs (LLVM, GCC) and one that is too lightweight (Lightning).

All of these Emacs JITs work by compiling bytecode to native code.  Now, I don’t actually think that is the best choice — it’s just the easiest — but my other project to do a more complete job in this area isn’t really ready to be released.  So bytecode it is.

Emacs implements a somewhat weird stack-based bytecode.  Many ordinary things are there, but seemingly obvious stack operations like “swap” do not exist; and there are bytecodes for very specialized Emacs operations like forward-char or point-max.

Samograd describes his implementation as “compiling down the spine”.  What he means by this is that the body of each opcode is implemented by some C function, and the JIT compiler emits, essentially, a series of subroutine calls.  This used to be called “jsr threading” in the olden days, though maybe it has some newer names by now.

Of course, we can do better than this, and Lloyd’s JIT does.  His emits instructions for the bodies of most bytecodes, deferring only a few to helper functions.  This is a better approach because many of these operations are only one or two instructions.

However, his approach takes a wrong turn by deferring stack operations to the compiled code.  For example, in this JIT, the Bdiscard opcode, which simply drops some items from the stack, is implemented as:

 CASE (Bdiscard):
 {
   JIT_NEED_STACK;
   JIT_INC (ctxt.stack, -sizeof (Lisp_Object));
   JIT_NEXT;
   NEXT;
 }

It turns out, though, that this isn’t needed — at least, for the bytecode generated by the Emacs byte-compiler, the stack depth at any given PC is a constant.  This means that the stack adjustments can be done at compile time, not runtime, leading to a performance boost.  So, the above opcode doesn’t need to emit code at all.

(And, if you’re worried about hand-crafted bytecode, it’s easy to write a little bytecode verifier to avoid JIT-compiling invalid things.  Though of course you shouldn’t worry, since you can already crash Emacs with bad bytecode.)

So, naturally, my implementation does not do this extra work.  And, it inlines more operations besides.

Caveat

I’ve only enabled the JIT for bytecode that uses lexical binding.  There isn’t any problem enabling it everywhere, I just figured it probably isn’t that useful, and so I didn’t bother.

Results

The results are pretty good.  First of all, I have it set up to automatically JIT compile every function, and this doesn’t seem any slower than ordinary Emacs, and it doesn’t crash.

Using the “silly-loop” example from the Emacs Lisp manual, with lexical binding enabled, I get these results:

Mode Time
Interpreted 4.48
Byte compiled 0.91
JIT 0.26

This is essentially the best case for this JIT, though.

Future Directions

I have a few ideas for how to improve the performance of the generated code.  One way to look at this is to look at Emacs’ own C code, to see what advantages it has over JIT-compiled code.  There are really three: cheaper function calls, inlining, and unboxing.

Calling a function in Emacs Lisp is quite expensive.  A call from the JIT requires marshalling the arguments into an array, then calling Ffuncall; which then might dispatch to a C function (a “subr”), the bytecode interpreter, or the ordinary interpreter.  In some cases this may require allocation.

This overhead applies to nearly every call — but the C implementation of Emacs is free to call various primitive functions directly, without using Ffuncall to indirect through some Lisp symbol.

Now, these direct calls aren’t without a cost: they prevent the modification of some functions from Lisp.  Sometimes this is a pain (it might be handy to hack on load from Lisp), but in many cases it is unimportant.

So, one idea for the JIT is to keep a list of such functions and then emit direct calls rather than indirect ones.

Even better than this would be to improve the calling convention so that all calls are less expensive.  However, because a function can be redefined with different arguments, it is tricky to see how to do this efficiently.

In the Emacs C code, many things are inlined that still aren’t inlined in the JIT — just look through lisp.h for all the inline functions (and/or macros, lisp.h is “unusual”).  Many of these things could be done in the JIT, though in some cases it might be more work than it is worth.

Even better, but also even more difficult, would be inlining from one bytecode function into another.  High-performance JITs do this when they notice a hot spot in the code.

Finally, unboxing.  In the Emacs C code, it’s relatively normal to type-check Lisp objects and then work solely in terms of their C analogues after that point.  This is more efficient because it hoists the tag manipulations.  Some work like this could be done automatically, by writing optimization passes for libjit that work on libjit’s internal representation of functions.

Getting the Code

The code is on the libjit branch in my Emacs repository on github.  You’ll have to build your own libjit, too, and if you want to avoid hacking on the Emacs Makefile, you will need my fork of libjit that adds pkg-config files.

FOSDEM, Rust, and Debugging

I’ve recently switched groups at Mozilla to start working full-time on improving Rust debugging.  To kick this off and to meet people from the various projects I’m working on — the Rust compiler, lldb, llvm, gdb, and (eventually) the DWARF standard — I will speak about this work at FOSDEM.  If you’re going and want to meet up, drop me a line.

 

Shaggy Dogs and SpiderMonkey Unwinders

A year or so ago I was asked to debug a crash in the Firefox devtools.  Crashes are easy!  I fired up gdb and reproduced the crash… which turned out to be in some code JITted by SpiderMonkey.  I was immediately lost; even a simple bt did not work.  Someone more familiar with the JIT — hi Shu — had to dig out the answer :-(.

I did take the opportunity to get some information from him about how he found the result, though.  He pointed me to the code responsible for laying out JIT stack frames.  It turned out that gdb could not unwind through JIT frames, but it could be done by hand — so I resolved then to eventually fix this.

Phase One

I knew from my gdb hacking that gdb has a JIT unwinding API.  Actually — and isn’t this the way most programs end up working? — it has two.

The first JIT API requires some extra work on the part of the JIT: it constructs an object file, typically ELF and DWARF, in memory, then calls a hook.  GDB sets a breakpoint on this hook and, when hit, it reads the data from the inferior.  This lets the JIT provide basically any kind of information — but it’s pretty heavy.

So, I focused my attention on the second API.  In this mode, the JIT author would provide a shared library that used some callbacks to inform gdb of the details of what was going on.  The set of callbacks was much more limited, but could at least describe how to unwind the registers.  So, I figured that this is what I would do.

But… I didn’t really want to write this in C.  That would be a real pain!  C is fiddly and hard to deal with, and it would mean constant rebuilding of the shared library while debugging, and SpiderMonkey already had a reasonable number of gdb-python scripts — surely this could be done in Python.

So I took the quixotic approach, namely writing a shared library that used the second gdb JIT API but only to expose this API to Python.

Of course, this turned out to be Rube Goldbergian.  Various parts of the gdb Python API could not be called from the JIT shared library, because those bits depended on other state in gdb, which wasn’t set properly when the JIT library was being called.  So, I had gdb calling into my shared library, which called my Python code, which then invoked a new gdb command (written in Python and supplied by my package) — that existed solely for the purpose of setting this internal state properly — and that in turn invoked the code I wanted to run, say to fetch memory or a register or something.

Computer Science!

Well, that took a while.  But it sort of worked!  And maybe I could just keep it in github and not put it in Mozilla Central and avoid learning about the Firefox build system and copying in some gdb header file and license review and whatnot.

So I started writing the actual Python code… OMG.  And see below since you will totally want to know about this.  But meanwhile…

… while I was hacking away on this crazy idea, someone implemented the much more sane idea of just exposing gdb’s unwinder API to gdb’s Python layer.

Hmm… why didn’t I do that?  Well, I left gdb under a bit of a cloud, and didn’t really want to be that involved at the time.  Plus, you know, gdb is a high quality project; which means that if you write a giant patch to expose the unwinding API, you have to be prepared for 17 rounds of patch review (this really happened once), plus writing documentation and tests.  Sometimes it’s just easier to channel one’s inner Rube.

Phase Two

The integrated Python API was a great development.  Now I could delete my shared library and my insane trampoline hacks, and focus on my insane unwinding code.

A lot of this work was straightforward, in the sense that the general outline was clear and just the details remained.  The details amount to things like understanding the SpiderMonkey frame descriptor (which partly describes the previous frame and partly the new frame; there’s one comment explaining this that somehow eluded me for quite a while); duplicating the SpiderMonkey JIT unwinding code in Python; and of course carefully reading the SpiderMonkey code that JITs the “entry frame” code to understand how registers are spilled.

Naturally, while doing this it turned out that I was maybe the first person to use these gdb APIs in anger.  I found some gdb crashes, oops!  The docs would have been impenetrable, except I already knew the underlying C APIs on which they were based… whew!  The Python API was unexpectedly picky in other areas, too.

But then there was also some funny business, one part in gdb, and one part in SpiderMonkey.

GDB is probably more complicated than you realize.  In this case, the complexity is that, in gdb, each stack frame can have its own architecture.  This seemingly weird functionality is actually used; I think it was invented for the SPU, but some other chips have multiple modes as well.  But what this means is that the question “what architecture is this program?” is not well-defined, and anyway gdb’s Python layer doesn’t provide you a way to find whatever approximation it is that would make sense in your specific case.  However, when writing the SpiderMonkey unwinder, it kind of actually is well-defined and we’d like to know the answer to know which unwinder to choose.

For this problem I settled on the probably terrible idea of checking whether a given register is available.  That is, if you see “$rip“, you can guess it’s x86-64.

The other problem here is that gdb thinks that, since you wrote an unwinder, it should get the first stab at unwinding.  That’s very polite!  But for SpiderMonkey, deciding “hey, is this PC in some code the JIT emitted?” is actually a real pain, or at least outside the random bits of it I learned in order to make all this work.

Aha!  I know, there’s probably a Python API to say “is this address associated with some shared library?”  I remembered reading and/or reviewing a patch… but no, gdb.solib_name is close but doesn’t do the right thing for addresses in the main executable.  WAT.

I tried several tricks without success, and in the end I went with parsing /proc/maps to get the mappings to decide whether a given frame should be handled by this unwinder or by gdb.  Horrible.  And fails with remote debugging.

Luckily, nobody does remote debugging.

Remote Debugging

Oh, wait, people do remote debugging at Mozilla all the time.  They don’t call it “remote debugging” though — they call it “using RR“, which while it runs locally, appears to be remote to gdb; and, importantly, during replay mode fakes the PID, and does other deep magic, though not deep enough to extend to making a fake map file that could be read via gdb’s remote get command.

By the way, you should be using RR.  It’s the best advance in debugging since, well, gdb.  It’s a process record-and-replay program, but unlike gdb’s built-in reverse debugging, it handles threads properly and has decent performance.

Oh Well

Oh well.  It just won’t work remotely.  Or at least not until fellow Mozillian (this always seems like it should be “Mozillan” to me, but it’s not, there really is that extra “i”) and all-star Nicolas Pierron wrote some additional Python to read some SpiderMonkey tables to make the decision in a more principled way.  Now it will all work!

Though looking now I wonder if I dreamed this, because the code isn’t checked in.  I know he had a patch but my memory is a bit fuzzy — maybe in the end it didn’t work, because RR didn’t implement the qGetTLSAddr packet, which gdb uses to read thread-local storage.  Did I mention the thread-locals?

The Real Start of the Story

So, way back at the beginning, during my initial foray into this code, I found that a crucial bit of information — the appropriately-named TlsPerThreadData — was stashed away in a thread-local variable.  Information stored here is needed by the unwinder in order to unwind from a C++ frame into a JIT frame.

Only, Firefox didn’t use “real” thread-local variables, the things that so many glibc and gcc hackers put so much effort into micro-optimizing.  No, it just used a template class that wrapped pthread_setspecific and friends in a relatively ergonomic way.

Naturally, for an unwinder this is a disaster.  Why?  Unwinding is basically the dissection of the stack; but in order to compute the value of one of these thread-local-storage objects, the unwinder would have to make some function calls in the inferior (in fact this prevents it from working on OSX).  But these would affect the stack, and also potentially let other inferior code (in other threads — remember, gdb is complicated and you can exert various unusual kinds of control like this) run as well.

So I neglected to mention the very first step: changing Firefox to use __thread.  (Ok, I didn’t really neglect to mention it, I was just being lazy and anyway it’s a shaggy dog story.)

Do Not Use libthread_db

RR did not implement qGetTLSAddr, which we needed, because  lots of people at Mozilla use RR.  So I set out to implement that.  This meant a foray into the dangerous world of libthread_db.

For reasons I do not know, and suspect that I do not want to know, glibc has historically followed many Solaris conventions.  One such Solaris innovation was libthread_db — a library that debuggers use to find certain information from libc, information like the address of a thread-local variable

On the surface this seems like a great idea: don’t bake the implementation details of the C library into the debugger.  Instead, let the debugger use a debugging library that comes with the C library.  And, if you designed it that way, it would be a good idea.

Sadly, though, libthread_db was not designed that way.  Oh no.

For example, libthread_db has a callback interface.  The calling program — gdb or rr — must provide some functions that libthread_db can call, to do some simple things like “read some memory”; or some very complicated things like “find the address of a symbol given its name”.  Normal C programmers might implement these callbacks using a structure containing function pointers.  But not libthread_db!  Instead it uses fixed symbol names that must be provided by the calling application.  Not all of these are required for it to work (you get to figure out which, yay!), but some definitely are.  And, you have to dlopen a libthread_db that matches the libc of the inferior that you’re debugging (or link against it, but that’s also obviously bad).

Wait, you say.  Doesn’t that mess up cross-debugging?  Why yes!  Yes it does!  Which is why qGetTLSAddr has to be in the gdb remote serial protocol to start with.

Hey, maybe the Linux vendors should fix this.  They are — see Gary Benson’s Infinity project — but unfortunately that’s still in development and I wanted RR to work sooner.

Ok, so whew.  I wrote qGetTLSAddr support for RR.  This was a small patch in the end, but an unusual pain in an already painful series.  Hopefully this won’t spill out into other programs.

glibc

Hahaha, you are so funny.  Of course it spills out: remember how you have to define a bunch of functions with specific names in your program in order to use libthread_db?  Well, how do you know you got the types correct?

Yeah, you include <proc_service.h> (a name deliberately chosen to confuse, I suppose, why not, it doesn’t bear any obvious relationship to the library).  Only, that was never installed by glibc.  Instead, gdb just copied it into the source tree.

So naturally I went and fixed this in glibc.  And, even more naturally, this broke the gdb build, which was autoconf’d to check for a file that never existed in the past.  LOL.

Thank You Cthulhu

At this point I figured it was only a matter of time until I had to patch the kernel.  Thankfully this hasn’t been necessary yet.

It Says What

In gdb the actual unwinding and the display of frames are separate concerns.

And let me digress here to say that gdb’s unwinder design is excellent.  I believe it was redone by Andrew Cagney (this was well before my active time in gdb, so apologies if you’re reading this and you did it and I’ve misattributed it).  Like much of gdb, many of the details are bizarre and take one back to the byte-counting days of 1987; but the high level design is very solid and has endured with, I think, just one significant change (to support inline functions) in the intervening 15 or so years.  I’ve long thought that this is a remarkable accomplishment in the programming world.

So, yes.  It’s not enough to just unwind.  Simply having an unwinder yields backtraces with lines like:

#5 0xfeefee ???

Better than nothing!  But not yet great.

The second part of the SpiderMonkey unwinder is, therefore, a gdb “frame filter”.  This is an object that takes raw frames and decorates them with information like a function name, or a file name, or arguments.

Work to add this information is ongoing — I landed one patch just yesterday, and another one, to add more information about interpreted frames, is still in the works.  And there are two more bugs filed… maybe this project, like this blog post, will never conclude.  It will just scroll endlessly.

But now, with all the code in place, bt can show something like:

#6 0x00007ffff7ff20f3 in <<JitFrame_BaselineJS "f1">> (this=JSVAL_VOID, arg1=$jsval(4700))

This is the call f1(4700).

Let’s Just Have One More

Of course we still couldn’t enable this unwinder by default.  You have to enable it by hand.

And by the way, in the first release of gdb’s Python unwinder feature, enabling or disabling an unwinder didn’t flush the frame cache, so it wouldn’t actually take effect until some invisible-to-the-user state change took place.  I fixed this bug, but here Pedro Alves also taught me the secret gdb command flushregs, which in fact just flushes the frame cache. (I’m going to go out on a limb and guess that this command predates the already ancient maint prefix command, hence its weird name.)

Anyway, you have to enable it by hand because the unwinder itself doesn’t work properly if the outermost frame is in JIT code.  The JIT, in the interest of performance, doesn’t maintain a frame pointer.  This means that in the outermost frame, there’s no reliable way to find the object that describes this frame and links to the previous frame.

Now, normally in this case gdb would either resort to debug info (not available here), or in extremis its encyclopedic suite of prologue analyzers (yes, gdb can analyze common function prologues for all architectures developed in the last 25 years to figure out stuff) — but naturally JIT compilers go their own way here as well.

Humans, like Shu back at the start of this story, can do this by dumping parts of the stack and guessing which bytes represent the frame header.

But, I’ve been reluctant and a bit afraid to hack a heuristic into the unwinder.

To sum up — in case you missed it — this means that all the code written during this entire saga would still not have helped with my original bug.

The End

The Deletion of gcj

I originally posted this on G+ but I thought maybe I should expand it a little and archive it here.

The patch to delete gcj went in recently.

When I was put on the gcj project at Cygnus, I remember thinking that Java was just a fad and that this was just a temporary thing for me. I wasn’t that interested in it. Then I ended up working on it for 10 years.

In some ways it was the high point of my career.

Socially it was fantastic, especially once we merged with the Classpath community — I’ve always considered Mark Wielaard’s leadership in that community as the thing that made it so great.  I worked with and met many great people while working on gcj and Classpath, but I especially wanted to mention Andrew Haley, who is the single best debugger I’ve ever met, and who stayed in the Java world, now working on OpenJDK.

We also did some cool technical things in gcj. The binary compatibility ABI was great, and the split verifier was very fun to come up with.  Per Bothner’s early vision for gcj drove us for quite a while, long after he left Cygnus and stopped working on it.

On the downside, gcj was never quite up to spec with Java. I’ve met Java developers even as recently as last year who harbor a grudge against gcj.

I don’t apologize for that, though. We were trying something difficult: to make a free Java with a relatively small team.

When OpenJDK came out, the Sun folks at FOSDEM were very nice to say that gcj had influenced the opening of the JDK. Now, I never truly believed this — I’m doubtful that Sun ever felt any heat from our ragtag operation — but it was very gracious of them to say so.

Since the gcj days I’ve been searching for basically the same combination that kept me hacking on gcj all those years: cool technology, great social environment, and a worthwhile mission.

This turned out to be harder than I expected. I’m still searching. I never thought it was possible to go back, though, and with this deletion, this is clearer than ever.

There’s a joy in deleting code (though in this case I didn’t get to do the deletion… grrr); but mainly this weekend I’m feeling sad about the final close of this chapter of my life.

GDB Preattach

In firefox development, it’s normal to do most development tasks via the mach command. Build? Use mach. Update UUIDs? Use mach. Run tests? Use mach. Debug tests? Yes, mach mochitest --debugger gdb.

Now, normally I run gdb inside emacs, of course. But this is hard to do when I’m also using mach to set up the environment and invoke gdb.

This is really an Emacs bug. GUD, the Emacs interface to all kinds of debuggers, is written as its own mode, but there’s no really great reason for this. It would be way cooler to have an adaptive shell mode, where running the debugger in the shell would magically change the shell-ish buffer into a gud-ish buffer. And somebody — probably you! — should work on this.

But anyway this is hard and I am lazy. Well, sort of lazy and when I’m not lazy, also unfocused, since I came up with three other approaches to the basic problem. Trying stuff out and all. And these are even the principled ways, not crazy stuff like screenify.

Oh right, the basic problem.  The basic problem with running gdb from mach is that then you’re just stuck in the terminal. And unless you dig the TUI, which I don’t, terminal gdb is not that great to use.

One of the ideas, in fact the one this post is about, since this post isn’t about the one that I couldn’t get to work, or the one that is also pretty cool but that I’m not ready to talk about, was: hey, can’t I just attach gdb to the test firefox? Well, no, of course not, the test program runs too fast (sometimes) and racing to attach is no fun. What would be great is to be able to pre-attach — tell gdb to attach to the next instance of a given program.

This requires kernel support. Once upon a time there were some gdb and kernel patches (search for “global breakpoints”) to do this, but they were never merged. Though hmm! I can do some fun kernel stuff with SystemTap…

Specifically what I did was write a small SystemTap script to look for a specific exec, then deliver a SIGSTOP to the process. Then the script prints the PID of the process. On the gdb side, there’s a new command written in Python that invokes the SystemTap script, reads the PID, and invokes attach. It’s a bit hacky and a bit weird to use (the SIGSTOP appears in gdb to have been delivered multiple times or something like that). But it works!

It would be better to have this functionality directly in the kernel. Somebody — probably you! — should write this. But meanwhile my hack is available, along with a few other gdb scxripts, in my gdb helpers github repository.

Emacs hint for Firefox hacking

I started hacking on firefox recently. And, of course, I’ve configured emacs a bit to make hacking on it more pleasant.

The first thing I did was create a .dir-locals.el file with some customizations. Most of the tree has local variable settings in the source files — but some are missing and it is useful to set some globally. (Whether they are universally correct is another matter…)

Also, I like to use bug-reference-url-mode. What this does is automatically highlight references to bugs in the source code. That is, if you see “bug #1050501”, it will be buttonized and you can click (or C-RET) and open the bug in the browser. (The default regexp doesn’t capture quite enough references so my settings hack this too; but I filed an Emacs bug for it.)

I put my .dir-locals.el just above my git checkout, so I don’t end up deleting it by mistake. It should probably just go directly in-tree, but I haven’t tried to do that yet. Here’s that code:

(
 ;; Generic settings.
 (nil .
      ;; See C-h f bug-reference-prog-mode, e.g, for using this.
      ((bug-reference-url-format . "https://bugzilla.mozilla.org/show_bug.cgi?id=%s")
       (bug-reference-bug-regexp . "\\([Bb]ug ?#?\\|[Pp]atch ?#\\|RFE ?#\\|PR [a-z-+]+/\\)\\([0-9]+\\(?:#[0-9]+\\)?\\)")))

 ;; The built-in javascript mode.
 (js-mode .
     ((indent-tabs-mode . nil)
      (js-indent-level . 2)))

 (c++-mode .
	   ((indent-tabs-mode . nil)
	    (c-basic-offset . 2)))

 (idl-mode .
	   ((indent-tabs-mode . nil)
	    (c-basic-offset . 2)))

)

In programming modes I enable bug-reference-prog-mode. This enables highlighting only in comments and strings. This would easily be done from prog-mode-hook, but I made my choice of minor modes depend on the major mode via find-file-hook.

I’ve also found that it is nice to enable this minor mode in diff-mode and log-view-mode. This way you get bug references in diffs and when viewing git logs. The code ends up like:

(defun tromey-maybe-enable-bug-url-mode ()
  (and (boundp 'bug-reference-url-format)
       (stringp bug-reference-url-format)
       (if (or (derived-mode-p 'prog-mode)
	       (eq major-mode 'tcl-mode)	;emacs 23 bug
	       (eq major-mode 'makefile-mode)) ;emacs 23 bug
	   (bug-reference-prog-mode t)
	 (bug-reference-mode t))))

(add-hook 'find-file-hook #'tromey-maybe-enable-bug-url-mode)
(add-hook 'log-view-mode-hook #'tromey-maybe-enable-bug-url-mode)
(add-hook 'diff-mode-hook #'tromey-maybe-enable-bug-url-mode)

Emacs Modules

I’ve been working on an odd Emacs package recently — not ready for release — which has turned into more than the usual morass of prefixed names and double hyphens.

So, I took another look at Nic Ferrier’s namespace proposal.

Suddenly it didn’t seem all that hard to implement something along these lines, and after a bit of poking around I wrote emacs-module.

The basic idea is to continue to follow the Emacs approach of prefixing symbol names — but not to require you to actually write out the full names of everything.  Instead, the module system intercepts load and friends to rewrite symbol names as lisp is loaded.

The symbol renaming is done in a simple way, following existing Emacs conventions.  This gives the nice result that existing code doesn’t need to be updated to use the module system directly.  That is, the module system recognizes name prefixes as “implicit” modules, based purely on the module name.

I’d say this is still a proof-of-concept.  I haven’t tried hairier cases, like defclass, and at least declare-function does not work but should.

Here’s the example from the docs:

(define-module testmodule :export (somevar))
(defvar somevar nil)
(defvar private nil)
(provide 'testmodule)

This defines the public variable testmodule-somevar and the “private” function testmodule--private.

Emacs verus notification area, again

Ages and ages I wrote about letting Emacs code access the notification area.  I have more to say about it now, but first I want to bore you with some rambling thoughts and some history.

The “notification area” is also called the “status icon area” or the “systray” — it is a spot that holds some icons that are under control of various applications.

I was a fan of the notification area since it first showed up in Gnome.  I recognized it instantly as the thing I wanted that I hadn’t realized I wanted.

Now, as you know, the notification area has fallen on hard times.  It’s been removed in Gnome 3… I searched a bit for the rationale for this deletion, which as far as I can tell is just that some applications abused it, whatever that means; or that it was used inconsistently, which I think the web has conclusively proven is fine by users.  Coming from the Emacs perspective, where one can customize the somewhat-equivalent of the status area (see those recent posts on diminishing minor-mode lighters in the mode line…), and where a certain amount of per-mode idiosyncrasy is the norm, these seem like an inadequate reasons.

However, the reason doesn’t really matter.  I love the notification area!  When I moved more of my daily desktop use back into Emacs (the tides are strong but slow, and take years to come in or go out), I hooked Emacs up to it, and made it a part of my basic configuration.

It’s indispensable now.  What I particularly like about it is that it is both noticeable and unobtrusive — the former because I can have the icons blink on important events, and the latter because the icons don’t move around or obscure other windows.

Ok!  You should use it!  And I totally plan to tell you how, but first some boring history.

My original post relied on a hacked version of the Gnome zenity utility.  This turned out to be a real pain over time.  I had to rebuild it periodically, adding hacks (once removing chunks), etc.  Sharing it with others was hard.  And, for whatever reason, the patches in Gnome bugzilla were completely ignored.  Bah.

A bit later I wrote a big patch to Emacs to put all this into the core.  That patch was rejected, more or less.  Bah two.

Then even later I flirted with KDE for a bit.  Yes.  KDE had the nice idea to expose the notification area via dbus, and Emacs could talk dbus… so I did the obvious thing in elisp.  However the KDE notification area was pretty buggy and in the end I had to abandon it as well.

So, it was back to zenity… until this week, during my funemployment.  I rewrote my hacks in Python.  This was so easy I wish I’d done it years and years ago.

I’m not sure what the moral of this story is.  Maybe that my obsession is your gain.  Or maybe that I have trouble letting go.

Anyway, the result is here, on github, or in marmalade.  You’ll need Python and the new (introspection-based) Python Gtk interfaces.  This of course is no trouble to install.  The package includes the base status icon API, plus basic UIs for ERC and EMMS.  Try it out and let me know what you think.

Another Mode Line Hack

While streamlining my mode line, I wrote another little mode-line feature that I thought of ages ago — using the background of the mode-line to indicate the current position in the buffer. I didn’t like this enough to use it, but I thought I’d post it since it was a fun hack.

First, make sure the current mode line is kept:

(defvar tromey-real-mode-line-format mode-line-format)

Now, make a little function that format the mode line using the standard rules and then applies a property depending on the current position in the buffer:

(defun tromey-compute-mode-line ()
  (let* ((width (frame-width))
     (line (substring 
        (concat (format-mode-line tromey-real-mode-line-format)
            (make-string width ? ))
        0 width)))
    ;; Quote "%"s.
    (setq line
      (mapconcat (lambda (c)
               (if (eq c ?%)
               "%%"
             ;; It's absurd that we must wrap this.
             (make-string 1 c)))
             line ""))

    (let ((start (window-start))
      (end (or (window-end) (point))))
      (add-face-text-property (round (* (/ (float start)
                       (point-max))
                    (length line)))
                  (round (* (/ (float end)
                       (point-max))
                    (length line)))
                  'region nil line))
    line))

We have to do this funny wrapping and “%”-quoting business here because the :eval form returns a mode line format — not just text — and because the otherwise appealing :propertize form doesn’t allow computations.

Also, I’ve never understood why mapconcat can’t handle a character result from the map function.  Anybody?

Now set this to be the mode line:

(setq-default mode-line-format '((:eval (tromey-compute-mode-line))))

The function above changes the background of the mode line corresponding to the current window’s start and end positions.  So, for example, here we are in the middle of a buffer that is bigger than the window:

Screenshot - 08222014 - 12:52:19 PM

I left this on for a bit but found it too distracting.  If you like it, use it. You might like to remove the mode-line-position stuff from the mode line, as it seems redundant with the visual display.

Streamlined Mode Line

The default mode line looks like this:

Screenshot - 08112014 - 01:57:07 PM

At least, it looks sort of like this if you ignore the lamenesses in the screenshot. If you’re like me you probably don’t remember what all these things mean, at least not without looking them up.  What’s that “U”?  Why the “:” or why three hyphens?

At a local Emacs meetup with Damon Haley and Greg Pfeil, Greg mentioned that he’d done some experiments on using unicode characters in his mode-line.  I decided to give it a try.

I took a good look at the above.  I rarely use any of it — I normally don’t care about the coding system or the line ending style.  I can’t remember the last time I had a buffer that was both read-only and modified.  The VC information, when it appears, is generally too verbose and doesn’t show me the one thing I need to know (see below).  And, though I do like to see the name of the major mode, I don’t really need to see most minor mode names; furthermore I like to have a bit of extra space so that I can use other modes that display information that I do want to see in the mode line.

What’s that VC thing?  Well, ordinarily you may see something like Git-master in the mode line. But, usually I already know the version control system being used — or even if I don’t know, I probably don’t care if I am using VC. By default the branch name is in there too. This can be quite long and seems to get stale when I switch branches; and anyway because I do a lot of work via vc-dir, I don’t really need this in every buffer anyway.

However, what is missing is that the mode-line won’t tell me if a buffer should be registered with version control but is not.  This is a pretty common source of errors.

So, first the code to deal with the VC state.  We need a bit more code than you might think, because the information we need isn’t already computed, and my tries to compute it while updating the mode line caused weird behavior.  Our rule for “should be registered” is “a VC back end claims this file, but the file isn’t actually registered”:

(defvar tromey-vc-mode nil)
(make-variable-buffer-local 'tromey-vc-mode)

(require 'vc)
(defun tromey-vc-command-hook (&rest args)
  (let ((file-name (buffer-file-name)))
    (setq tromey-vc-mode (and file-name
                  (not (vc-registered file-name))
                  (ignore-errors
                (vc-responsible-backend file-name))))))

(add-hook 'vc-post-command-functions #'tromey-vc-command-hook)
(add-hook 'find-file-hook #'tromey-vc-command-hook)

(defun tromey-vc-info ()
  (if tromey-vc-mode
      (propertize (string #x26c3 32) 'face 'error)
    " "))

We’ll use that final function in the mode line. Note the odd character in there — my choice was U+26C3 (BLACK DRAUGHTS KING), since I thought it looked disk-drive-like — but you can easily replace it with something else. (Also note the weirdness of using string rather than a string constant. This is just for WordPress’ benefit as its editor kept mangling the actual character.)

To deal with minor modes, I used diminish. This made it easy to remove any display of some modes that I don’t care to know about, and replace the name of some others with a single character:

(require 'diminish)
(diminish 'abbrev-mode)
(diminish 'projectile-mode)
(diminish 'eldoc-mode)
(diminish 'flyspell-mode (string 32 #x2708))
(diminish 'auto-fill-function (string 32 #xa7))
(diminish 'isearch-mode (string 32 #x279c))

Here flyspell is U+2708 (AIRPLANE), auto-fill is U+00A7 (SECTION SIGN), and isearch is U+279C (HEAVY ROUND-TIPPED RIGHTWARDS ARROW).  Haha, Unicode names.

I wanted to try out which-func-mode, now that I had extra space on the mode line, so:

(setq which-func-unknown "")
(which-function-mode)

Finally, we can use all the above and remove some other things from the mode line at the same time:

(setq-default mode-line-format
	      '("%e"
		(:eval (if (buffer-modified-p)
			   (propertize (string #x21a7) 'face 'error)
			 " "))
		(:eval (tromey-vc-info))
		" " mode-line-buffer-identification
		"   " mode-line-position
		"  " mode-line-modes
		mode-line-misc-info))

The “modified” character in there is U+21A7 (DOWNWARDS ARROW FROM BAR).

Here’s how it looks normally (another badly cropped screenshot):

Screenshot - 08112014 - 08:42:33 PM

Here’s how it looks when the file is not registered with the version control system:

Screenshot - 08112014 - 08:43:04 PM

And here’s how it looks when the file is also modified:

Screenshot - 08112014 - 08:43:39 PM

Occasionally I run into some other minor mode I want to diminish, but this is easily done by editing it into my .emacs and evaluating it for immediate effect.