Archive for the ‘Emacs’ Category

Fun with rewriting

There’s a fun source rewriting trick that I’ve wanted to try out for a long time — and I finally got a chance to do it while working on the multi-threading patch for Emacs.

The Problem

In the multi-threaded Emacs, a let binding must be thread-local, because this is really the only way to manage dynamic binding in the presence of threads.  Emacs also has a notion of a buffer-local variable, and furthermore some buffer-local variables are stored directly in the internal struct buffer — that is, assignments to the variable in lisp are transformed by the lisp implementation into a field assignment in C.  These fields are freely used elsewhere in the C code.

Our implementation of thread-locals, though, is an alist mapping a thread object to the variable’s value.  So, to keep the C code working properly, we need to rewrite every field access to use a function that finds the proper per-thread value.
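To make the data structure a bit more concrete, here is a toy model in Python rather than the real Emacs C code.  The function name `thread_local_value` and the sample values are made up for illustration; the only thing taken from the patch is the idea of an alist keyed by thread object.

```python
import threading


def thread_local_value(alist, thread=None, default=None):
    """Return the value bound for `thread` in an alist of (thread, value) pairs.

    This is only a model of the idea: every read of a formerly plain
    field has to go through a lookup like this one.
    """
    if thread is None:
        thread = threading.current_thread()
    for owner, value in alist:
        if owner is thread:
            return value
    # No per-thread binding; fall back to whatever the caller supplies.
    return default


# Hypothetical example: one variable, bound differently in two threads.
main = threading.current_thread()
other = threading.Thread(target=lambda: None)
mark_active = [(main, "t"), (other, "nil")]
```

The point of the sketch is just that a bare field read is no longer enough once the field holds an alist, which is why every access in the C code needs rewriting.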

The Idea

The idea, of course, is automated rewriting.  However, like many other GNU programs, Emacs is heavily macroized, and furthermore may be the last program in the whole distro that uses K&R-style function definitions.  For these reasons I assumed that existing refactoring tools would not work well.

Luckily, though, this problem doesn’t require a very sophisticated refactoring tool.  Really all we need to do is find the location of each field reference, and then find the start of the left-hand-side, and then rewrite that into the new form.

The Hack

All we really need is to find a series of locations — the rest we can handle with some straightforward elisp scripting.  And what simpler way is there to get locations than to get the compiler to give them to us?

I wrote a batch script in elisp to automate the whole procedure.  Why elisp?  Not only is it a natural, perhaps even required, fit when hacking on Emacs, it also has some nice “sexp” functions which allow skipping over properly-parenthesized expressions.  This means I could do without a whole parser.  And why automate the whole process?  I expected it wouldn’t work properly the first time; having a single script let me git reset after each test run and simply re-run from scratch.

This elisp script first edits struct buffer to rename each field.  Then it runs make to rebuild Emacs.  This causes the compiler to emit an error message for each bad field access.
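My script was elisp, but the step of harvesting locations is easy to sketch in Python.  This assumes the usual `file:line:column: error: message` shape of a GCC diagnostic; the sample output below is invented for illustration, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches the "file:line:column: error: message" shape of a diagnostic.
ERROR_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def collect_error_locations(make_output):
    """Group error columns by (file, line), ignoring all other output."""
    locations = defaultdict(list)
    for text in make_output.splitlines():
        m = ERROR_RE.match(text)
        if m:
            key = (m.group("file"), int(m.group("line")))
            locations[key].append(int(m.group("col")))
    return dict(locations)


# Invented sample; real output would come from running make.
SAMPLE_OUTPUT = """\
gcc -c buffer.c
buffer.c:120:18: error: 'struct buffer' has no member named 'mark_active'
buffer.c:120:47: error: 'struct buffer' has no member named 'mark_active'
editfns.c:88:9: warning: unused variable 'tem'
editfns.c:301:22: error: 'struct buffer' has no member named 'filename'
"""
```

Grouping by file and line matters because several bad accesses often sit on the same source line, and they have to be fixed together.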

A critical point here is that I used GCC svn trunk.  Only recent versions of GCC emit correct column numbers in error messages.  GCC 4.4 might have worked, I am not sure; in the end I also needed a small libcpp patch to deal with a certain macro case.

The elisp script reads the output of make and pulls out the error messages.  For the errors on a given line, it works in reverse order, right to left (so that multiple fixes on one line will work properly without the bother of inserting markers), rewriting the field accesses.  I wrote a bit of ad hoc code to back up to the start of the left-hand-side of the field access.  Doing this well is a bit funny, like writing a parser that works backwards, but in my case I knew I could get away with something relatively simple (I think this little sub-hack caused the script to miss fewer than 10 rewrites, i.e., tolerable).
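Here is a rough Python sketch of the right-to-left rewriting and the backwards scan.  It is not my elisp script.  It assumes each reported column points at the first character of the field name, that there is no whitespace around `->`, and that the replacement is a macro I am calling `BUF_FIELD` purely for illustration.

```python
C_KEYWORDS = {"return", "if", "while", "for", "switch", "sizeof", "case", "else"}


def is_ident(ch):
    return ch.isalnum() or ch == "_"


def skip_group_backward(line, i):
    """line[i - 1] is ")" or "]"; return the index of its matching opener."""
    depth = 0
    while i > 0:
        i -= 1
        if line[i] in ")]":
            depth += 1
        elif line[i] in "([":
            depth -= 1
            if depth == 0:
                return i
    return 0  # unbalanced: give up at the start of the line


def lhs_start(line, end):
    """Back up from `end` to the start of the expression that ends there."""
    i = end
    while i > 0:
        ch = line[i - 1]
        if is_ident(ch) or ch == ".":
            i -= 1
        elif ch == ">" and i >= 2 and line[i - 2] == "-":
            i -= 2
        elif ch in ")]":
            i = skip_group_backward(line, i)
            # GNU style writes "XBUFFER (b)": include the function name,
            # but not a keyword such as "return".
            j = i
            while j > 0 and line[j - 1] == " ":
                j -= 1
            k = j
            while k > 0 and is_ident(line[k - 1]):
                k -= 1
            if k < j and line[k:j] not in C_KEYWORDS:
                i = k
        else:
            break
    return i


def rewrite_line(line, columns, macro="BUF_FIELD"):
    """Rewrite each `expr->field` at the given 1-based columns."""
    # Right to left, so earlier columns stay valid after each edit.
    for col in sorted(columns, reverse=True):
        field_start = col - 1
        field_end = field_start
        while field_end < len(line) and is_ident(line[field_end]):
            field_end += 1
        arrow = field_start - 2
        if arrow < 0 or line[arrow:field_start] != "->":
            continue  # unexpected shape: leave it for hand fixing
        start = lhs_start(line, arrow)
        if start == arrow:
            continue  # no left-hand side found
        lhs = line[start:arrow]
        field = line[field_start:field_end]
        line = line[:start] + f"{macro} ({lhs}, {field})" + line[field_end:]
    return line
```

For example, given `x = current_buffer->undo_list; y = XBUFFER (b)->undo_list;` and the columns of both `undo_list` occurrences, this produces `x = BUF_FIELD (current_buffer, undo_list); y = BUF_FIELD (XBUFFER (b), undo_list);`.  The scan knows nothing about string literals, comments, or casts, which is the sort of corner that ends up needing a manual fix.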

I would guess that this script got 90% of the field accesses.  I had to fix up a few by hand, mostly in macro definitions in header files.  And, I had to revert a few changes as well, mostly in the garbage collector (which wants to see the real underlying alist, not the per-thread value).  Still, diffstat says: 49 files changed, 1305 insertions(+), 1021 deletions(-) — in other words, not something you’d want to do by hand.

So, ok, this is horrible.  But fun!  I think I will end up doing it again, for frame- and keyboard-local variables.  Maybe someday I’ll finish my patch to make libcpp properly track locations through macros, and then the script can even fix up macro definitions for me.

I’m not extremely interested in Eclipse-style refactoring — where the tool provides a couple dozen refactorings for you.  Instead, I think I want my refactoring tool to answer queries for me, so I can feed that information to a customized rewriting script.

Another way I could have done this was writing a GCC plugin with treehydra or MELT, but unfortunately my free time is so limited that I haven’t managed to even build either one yet.  Once plugins are in the Fedora GCC, I think it would be very worthwhile to package up treehydra…

package.el and Emacs

It looks like package.el is going to go into Emacs after all.  I’m psyched!

I made a git branch (still local) to hack on this.

Of course, now I realize that there are bugs to fix and features to add and cleanups to make before this is really feasible…

Emacs and Threading, Take 2

I’ve recanted. Contrary to my earlier post on this topic, I now think implementing threading in Emacs is possible. A patch from Giuseppe Scrivano inspired me, and I started my own patch to do it.

This was sort of fun. I wrote a batch script in elisp to rewrite some of the Emacs sources — yay semantic patching!

Thanks to Giuseppe, this is now hosted on Gitorious. We’re both working there, on different branches, merging code back and forth. I’ve mostly been working on variable bindings, and he’s very active, both with low-level changes and with cool things like getting Gnus to work in a separate thread.

If you’re interested in helping out, we discuss it on emacs-devel, but really we’d welcome any sort of contact.

Emacs 23

Much to my surprise, the Fedora Emacs maintainers pushed Emacs 23 into the (ostensibly stable) Fedora 11 repository.  I was a bit afraid to upgrade, since Emacs really is the cornerstone of my entire workflow.  My desire for new features quickly overcame my fear, though.

The first thing you will notice is that Emacs is much prettier.  It now uses Xft to render, so you get antialiasing.  For normal work, I don’t really care much, but this is why I used CVS Emacs last year for presentations: it makes a huge difference in situations where prettiness matters.  Unfortunately this seems to have negatively affected redisplay performance.

Another major feature I have been loving is support for multiple terminals.  I use this in two ways.

I run my Emacs on my main machine, of course.  This is the centerpiece of my desktop: I use it for hacking, for mail and news, and for irc.  Previously, if I used my laptop, I couldn’t easily access all this state; but now I can ssh to my main machine, run emacsclient -t, and have access to everything.

I’ve also set EDITOR to emacsclient -t.  This means that when I run git commit in a shell, the commit message shows up in a new emacs frame on that terminal.  This is very convenient for “quickie” edits, because it means not having to switch my focus. (If I had to pick a single reason that Emacs improves my productivity, this would be it: it makes it very easy to keep one’s focus.)

Funnily, though, I don’t actually run git commit in a shell very often any more, because the new vc-dir mode is good enough that I can do some common git operations without leaving Emacs.  If you tried VC in earlier versions of Emacs, then you probably remember it as a horrible joke — it worked fine for RCS, but was miserable at anything else.  vc-dir is something like a generalized pcl-cvs, so you can work on a whole directory tree at once (and do so efficiently, unlike the old vc-dired).  vc-dir is still pretty new, and there are some necessary operations that aren’t exposed (git push), but it is still a very nice step forward.

This release is definitely worth upgrading to.

Wish List Item

I’ve been trying for a while to figure out how best to read blogs.

Right now I use three different methods: iGoogle for some things, plain old web browsing for some, and then gnus for one feed.  What a pain!  I’ve also tried other readers in the past: a couple of web-based ones, Azureus, maybe something else.

None of these are ideal for me.  I think what I would really like is to use Gnus for everything, except Gnus blocks annoyingly while fetching the feeds.  So, I could use nntp//rss.  But then I am setting up and configuring yet another program, setting it up to run when I log in, forgetting to copy its configuration to my laptop, etc.

I wish there were “gmane for rss”, a site that ran nntp//rss for me and let me subscribe to any old feed using my news reader.  Anybody know of one?

Wait!  I have other complaints too!  I’ll save those for later… I’m turning into the sort of person who wishes RSS were NNTP and that Common Lisp were popular again.  What is happening to me?!?

Gnome Tip

As usual, my upgrade to F9 brought with it some behind-the-scenes changes.  Sometimes these lurk for quite a while before I discover them.

Tonight I clicked on a “mailto:” URL.  A while back I had configured firefox to open a new message buffer in Emacs when I did this; but to my surprise instead it launched evolution.

I wasted a lot of time trying to see what I did wrong in my firefox config (which is amazingly obscure, by the way, for something that seems like a basic configuration tweak).  The answer: nothing was wrong.  Instead, now I must also configure Gnome to know how to do this.

This meant a short side trip to install gconf-editor… as with seahorse, I was surprised to find out that a generally useful tool like this was not already installed for me.

After successfully editing the proper key, it turns out that “mailto:” is actually handled by the “Mail Reader” in “Preferred Applications” — something I would not have guessed, given that I am trying to send mail.  I guess I read that a bit too literally.  (The tip from the title: just edit this and save yourself a lot of time.)

I’m not even sure where all these little changes get made.  Was it a Gnome change?  A firefox change?  An integration patch from Fedora?  I couldn’t say.  Over time, these little annoyances do add up and leave a bad impression.

Off the top of my head, I don’t have a good idea for how Gnome, or whoever, should solve this kind of problem.  I just felt like venting a bit.

ELPA Update

I’ve been extremely flaky about ELPA lately, but the dam finally broke today, and I went through all my saved-up email and uploaded a bunch of packages.  Check it out.

I found out recently that ELPA has a competitor, ELM.  Anybody tried this?  If so, let me know what you think — is it better than ELPA?  Worse?  Are there ideas I should steal?

GCC Summit

Next week is the GCC Summit, in Ottawa. I’ll be giving a talk there about my work on the incremental compiler. That will be fun, but what I’m really looking forward to is seeing all the other GCC hackers I know. There are plenty of interesting talks on the schedule, and some cool BOFs; plus there is an informal gdb meetup (this should be especially interesting, as there are many cool developments in gdb-land).

In keeping with my Emacs mania, this year I wrote a special presentation mode. My presentation is just a text file, in Emacs “outline mode” format; my new mode displays it nicely in a screen-filling frame. I should say, “nicely”, since some kinds of rendering are a pain to do in Emacs, and I couldn’t be bothered.

Parsing

I was reading about PEG recently, and thinking “that is pretty interesting” — and of course it turns out that there is an Emacs implementation.

It is a bit odd how primitive parsing support is in Emacs. It is one of those mysteries, like how window configuration and manipulation support can be so weak. Peculiar.

CEDET includes a parser generator, called wisent. That is long overdue… though even it is a bit odd, apparently preferring a yacc-ish input syntax. I don’t know about you, but when I have a lisp system just sitting there, I reflexively reach for sexp-based formats. Well, ok, it is a port of bison. But still.

I did a little parser hacking in gdb recently. In gdb, if you complete an expression involving a field lookup, it will currently print every matching symbol in your program — when all you really wanted was the completion of a field name. This is what I set out to fix.

My first idea was: hey, the parser knows what tokens are valid. I can just ask it! But, I don’t think there’s a way to do that with bison parsers. At least, no documented way — boo. And anyway, as it turns out, this is not what you want.

For instance, consider the simple case of “p pointer->field”. This is syntactically valid as-is, so the parser would indicate that the desired completions are whatever can come next, say, an operator. But if the cursor is just after the “d”, you want to continue completing on the field name. So, you have to differentiate this case based on whitespace.

I ended up hacking the lexer as well as the parser. The lexer can now return a special COMPLETE token, which it does depending on the previous tokens and the presence or absence of whitespace. I also added some new productions like:

expression: expression '.' name COMPLETE
expression: expression '.' COMPLETE

From here it is pretty simple to solve the rest of the problem.
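The lexer side of this can be modeled in a few lines of Python.  This is a toy, not gdb’s lexer: the token kinds, the function name, and the exact rule (always complete right after an accessor; complete a partial name only when no whitespace follows it) are my assumptions for the sake of the example.

```python
import re

# Accessors, numbers, identifiers, then any other single character.
TOKEN_RE = re.compile(r"->|\d+(?:\.\d*)?|\.|[A-Za-z_]\w*|\S")


def lex_for_completion(text):
    """Tokenize `text`, appending a COMPLETE token where a field name
    should be completed at the end of the input."""
    tokens = []
    for tok in TOKEN_RE.findall(text):
        if tok in ("->", "."):
            tokens.append(("ACCESS", tok))
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append(("NAME", tok))
        else:
            tokens.append(("OTHER", tok))

    at_word_end = bool(text) and not text[-1].isspace()
    kinds = [kind for kind, _ in tokens]
    if kinds[-1:] == ["ACCESS"]:
        # "pointer->": complete over all fields.
        tokens.append(("COMPLETE", ""))
    elif at_word_end and kinds[-2:] == ["ACCESS", "NAME"]:
        # "pointer->fie": the cursor is still inside the field name.
        tokens.append(("COMPLETE", ""))
    return tokens
```

With this rule, `pointer->fie` ends in a COMPLETE token while `pointer->field ` (trailing space) does not, which is the whitespace distinction described above; the grammar then only has to accept COMPLETE in the two new productions.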

I don’t remember reading about this anywhere, but I’m sure it has been done before. I thought it was a pretty fun hack 🙂 – I love problems that start with the user experience and end up someplace much deeper.

Emacs News

An awesome color theme for Emacs.

Steve Yegge wrote a new javascript mode for Emacs. I’m interested in modes that push Emacs a little and do real parsing rather than the typical (and goofy) parse-partial-sexp stuff; this one falls into that category. Semantic also does this, though more generically and perhaps with more sophistication; I think semantic parsers are born incremental. I hope to push this mode into ELPA as soon as he responds to my patch…