Emacs and Threading

While once again waiting for Gnus to contact a news server, I thought: I’ll never be able to move my RSS reading into Gnus, because the delays will skyrocket. Sure, there’s nntp//rss, but that means configuring a separate program and keeping it running, and I’ve heard that this program can have a memory footprint as big as Emacs itself. (As an aside: remember the old days when Emacs was routinely the program with the largest footprint on your desktop? For me, these days, it is never number one, and sometimes it even slips to third or fourth.)

Maybe some savior will come along and make Gnus fetch RSS feeds in the background, using a process filter. I assume, without looking, that retrofitting this into Gnus would be very hard. For new Emacs code, though, this is the way to go; you can set things up so that most mode-specific operations report “working…” back to the user while background operations are running, and still let the user switch buffers and work on other things. For instance, nowadays vc-annotate works this way, which is very nice, since annotate is fairly slow in most version control systems.
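
To make the process-filter idea concrete, here is a rough sketch of the same pattern outside Emacs, in Python rather than elisp. It assumes a Unix-like system, and the names run_with_filter, process_filter, on_idle and demo are mine, made up for illustration; this is not how Emacs implements filters, just the shape of the idea: start an external program, keep the main loop running, and hand each chunk of output to a callback as it arrives.

```python
import os
import selectors
import subprocess
import sys


def run_with_filter(argv, process_filter, on_idle):
    """Run argv in the background and pass each output chunk to process_filter."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    sel = selectors.DefaultSelector()
    sel.register(fd, selectors.EVENT_READ)
    try:
        while True:
            if not sel.select(timeout=0.05):
                on_idle()  # stand-in for "the user keeps working"
                continue
            chunk = os.read(fd, 4096)
            if not chunk:  # EOF: the program has closed its output
                break
            # Assumes the output is plain ASCII; a real filter would have
            # to cope with multi-byte characters split across chunks.
            process_filter(chunk.decode())
    finally:
        sel.unregister(fd)
        sel.close()
        proc.stdout.close()
    return proc.wait()


def demo():
    chunks = []
    # Hypothetical stand-in for a slow fetcher: a child Python printing two lines.
    argv = [sys.executable, "-c", "print('item 1'); print('item 2')"]
    exit_code = run_with_filter(argv, chunks.append, on_idle=lambda: None)
    return exit_code, "".join(chunks)
```

Calling demo() collects whatever the child printed without ever blocking the loop for more than the select timeout. The important design point is that the callback only sees text, so all the state the background program needs has to be passed to it up front.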

Even better would be to make Emacs capable of multi-threading. Most people arrive at this idea eventually. Unfortunately, I think it is just not possible. This is partly due to bad language choices: dynamic scope is very handy, but having only dynamic scope is terrible. It is also partly due to consequent design choices in the rest of Emacs: buffers are big global objects, maintaining compatibility with the enormous body of existing lisp is crucial, and auditing even the built-in body of elisp is, in difficulty, somewhere between daunting and impossible.

A few weeks ago I heard a funny idea in this area. Instead of trying to handle multi-threading, how about old-fashioned multi-process support, with some kind of message passing? Emacs could fork(), and the child could wander off with its own copy of everything; the subprocess could then send up messages and data, which would be integrated into the Emacs event loop. This is basically the same idea as process filters, only with the benefit that the work of the process could be expressed in the same lisp form as the handler, and the subprocess would have access to all the relevant lisp state.

Naturally, most of these messages would just be elisp; but perhaps it would be worthwhile to add a way to transfer the contents of a buffer wholesale.
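
Here is a minimal sketch of that fork-and-message idea, again in Python rather than elisp and again assuming a Unix-like system. Everything named here (spawn_worker, event_loop, fetch_feeds, run_demo) is hypothetical, and JSON lines stand in for the elisp forms the messages would really be. The child inherits a copy of the parent's state at fork time, does batch-like work only, and writes messages to a pipe that the parent's loop polls alongside everything else.

```python
import json
import os
import select


def spawn_worker(task, state):
    """Fork a child that runs task(state) and streams messages to the parent.

    Returns (pid, read_fd). The child works on its own copy of `state`.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: batch-like work only, no display access.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "w") as out:
                for message in task(state):
                    out.write(json.dumps(message) + "\n")
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code
    os.close(write_fd)
    return pid, read_fd


def event_loop(read_fd, handler, idle):
    """Toy event loop: dispatch child messages, otherwise stay responsive."""
    pending = b""
    while True:
        ready, _, _ = select.select([read_fd], [], [], 0.05)
        if not ready:
            idle()  # the user could keep editing here
            continue
        chunk = os.read(read_fd, 4096)
        if not chunk:  # EOF: the child closed its end of the pipe
            break
        pending += chunk
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            handler(json.loads(line))
    os.close(read_fd)


def fetch_feeds(state):
    """Hypothetical slow job; a real one would contact the servers."""
    for name in state["feeds"]:
        yield {"feed": name, "status": "fetched"}


def run_demo():
    received = []
    state = {"feeds": ["feed-a", "feed-b"]}
    pid, fd = spawn_worker(fetch_feeds, state)
    event_loop(fd, received.append, idle=lambda: None)
    os.waitpid(pid, 0)
    return received
```

The task and the handler live in the same program and the same language, which is the whole attraction over an external helper. The cost is that anything the child changes in its copy of the state is invisible to the parent unless it is sent back as a message.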

11 Comments

  • You basically can’t fork() without a corresponding exec() in a desktop application. Many things will break, starting from the X connection.

    The architecture that I think almost all toolkits have arrived at, from Swing to GTK+ to Qt, is to have worker threads which perform things like I/O to sockets, and pass messages back to the main thread; in GTK+ this is done via g_idle_add. The worker threads have no access to global state, which in Emacs would be the buffer objects. If you wanted a thread to operate on a buffer, you’d pass it a conceptual copy of the buffer state. This doesn’t have to actually be a memory copy – you can do tricks like copy-on-write which is what GStreamer does.

  • Yeah, the X connection needs special treatment; in this approach the subprocess would be strictly limited to batch-like work, not display. I’m not sure anything else needs special treatment… what are you thinking of? Also, Emacs’ internal architecture in this area is pretty different from most other applications. Adding more gross hacks for the X connection would not be a big deal.

    FWIW you really can’t write an elisp program that does anything useful without using buffers. And, in Emacs, all state is global — that is the fundamental problem really.

    One idea I was kicking around is to only let the subprocess inherit a single file descriptor, chosen by the parent code.

    Maybe in the end it would be simpler to just fix Gnus. Right now that seems to be the only problem child.

  • > You basically can’t fork() without a corresponding exec() in a desktop application. Many things will break, starting from the X connection.

    We’ve just been looking at this for OLPC. Our application (“activity”) launch time has been ridiculously slow, because each new app launches its own python process and imports the same old modules (gtk, dbus, telepathy, our Sugar libraries) and does their initialization dances every time.

    We just switched to a method of activity loading that is “preimport the modules we’re likely to use, fork(), fix up dbus/X/filehandles, and import the new code the user wants us to run”. It is scary but so far good, and has the expected huge effect on startup times.

    So, tricky but possible, and we can tell you exactly how if you care.

    - Chris.

  • Yeah, using fork() as a startup time trick makes sense; I think KDE has been doing it for a long time, or at least they were before prelink was written, because their startup times took a large hit from (among other things) linking.

    But for Tom’s use case, it doesn’t seem that much different to me to exec() a new emacs process, and pass the state that you want carried over (perhaps which IMAP server to connect to) over stdin.

    Tom, I think elisp just needs (if it doesn’t have it already) a java.lang.StringBuilder equivalent; the problem with buffers is that they’re tied to the UI, but a thread can’t be updating the UI, it should be passing messages back to the main thread.

  • The Emacs problem is worse than just buffers. In Emacs everything is global; even a let binding modifies globally-visible variables. There’s really no sensible way for threads to work in this kind of situation. You could copy all bindings or something, I suppose, but that’s really no different than fork — just harder to implement.

    Starting a new emacs is tempting except that you want the sub-emacs to know some things from .emacs (e.g., load-path additions) but not others (e.g., calls to server-start).

    IOW, Emacs is firmly rooted in its 1980s design. That sucks of course, though perhaps oddly it is only a problem at the margins.

  • Maybe multiple elisp interpreters in the same process? Separate bindings, garbage collectors, etc., but with message channels being possible between interpreters. A message could just be a sexp represented as a string.

    Gross, but…yeah. It’d be nice to travel back in time and hand a free implementation of the JVM to whoever created C and shell scripts…

  • Yeah, I haven’t looked at how hard it would be to make a wholly separate elisp interpreter in the same process. It might be interesting. These days you can get pretty far just by marking globals with __thread.

  • My solution to this problem is to use rss2email in a cron job on my VPS. I also avoid using Gnus’ built-in ability to fetch email via POP/IMAP etc; instead, I grab it with fetchmail (yuck) or getmail and let Gnus get it from the local mail spool. This speeds things up considerably; the only downside is that it doesn’t solve the problem for NNTP, IMAP or other non-local groups. Ah well.

    I agree that emacs internals are fairly old-fashioned (unsurprisingly, for a program with such a long and distinguished history) but I’m not at all sure that’s such a big drawback. For instance, Gnus is able to open a group with several hundreds of thousands of messages (no, I’m not exaggerating) and present me with a nicely sorted, threaded view of that group in a matter of seconds, while using very little memory. Not bad for bytecode-interpreted Maclisp (essentially).

    I think that most of the use cases where people want some sort of concurrency in emacs are usually better solved by doing the work in some external program, communicating with emacs using the existing process interface. The desire to retrieve news feeds or email in the background is one instance of this. Of course, that makes it harder to customise the work that that program does via emacs lisp, but c’est la vie.

  • Gaute — yes, I tend to agree with all of that. (And, I also use fetchmail instead of having Gnus fetch my mail — that speeds things up a lot.)

    FWIW Gnus is just about the only program that regularly bugs me by locking up Emacs. It is pretty fast overall, surprisingly so given elisp’s implementation, but it can still be a pain if, say, a news server fails to respond in a timely way.

    The big picture future I want to see for Emacs is first, a better display model (Emacs is ok but not super; it should at least do everything OO.o can do, or preferably anything gecko can do); and second, smarter “big” programming modes, like semantic or js2-mode. The latter may need a faster elisp; maybe with the lexbind branch we can do that, or maybe we can move it all to CL somehow.

    That’s all wishful thinking of course :)

  • [...] recanted. Contrary to my earlier post on this topic, I now think implementing threading in Emacs is possible. A patch from Giuseppe Scrivano inspired [...]
