Recently I’ve been thinking about how to rebase Emacs on Common Lisp.
First, why rebase? Performance is the biggest reason. Emacs Lisp is a very basic lisp implementation. It has a primitive garbage collector and basic execution model, and due to how it is written, it is quite hard to improve this in place.
Second, why Common Lisp? Two reasons: first, Emacs Lisp resembles Common Lisp in many ways; elisp is like CL’s baby brother. Second, all of the hard problems in Lisp execution have already been solved excellently by existing, free-software CL implementations. In particular, the good CL implementations have much better garbage collectors, native compilation, threads, and FFI; we could expose the latter two to elisp in a straightforward way.
By “rebase” I mean something quite ambitious — rewrite the C source of Emacs into Common Lisp. I think this can largely be automated via a GCC plugin (e.g., written using David Malcolm’s Python plugin). Full automation would let the CL port be just another way to build Emacs, following the upstream development directly until all the Emacs maintainers can be convinced to drop C entirely (cough, cough).
Part of the rewrite would be dropping code that can be shared with CL. For example, we don’t need to translate the Emacs implementation of “cons”, we can just use the CL one.
Some CL glue would be needed to make this all work properly. These days it can’t be quite as small as elisp.lisp, but it still would not need to be very big. The trickiest problem is dealing with buffer-local variables; but I think that can be done by judicious use of define-symbol-macro in the elisp reader.
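To make the buffer-local problem concrete, here is a rough model of the lookup that every reference to such a variable would have to perform. It is a Python sketch of the semantics only, not CL and not the symbol-macro mechanism itself; `Buffer`, `Variables`, `get`, and `set_local` are made-up names for illustration, not Emacs or CL APIs.

```python
class Buffer:
    """A toy buffer that carries its own table of buffer-local values."""

    def __init__(self, name):
        self.name = name
        self.locals = {}


class Variables:
    """Global defaults plus per-buffer overrides (hypothetical model)."""

    def __init__(self):
        self.defaults = {}
        self.current = None  # the current buffer, if any

    def get(self, symbol):
        # A buffer-local value in the current buffer wins over the default.
        buf = self.current
        if buf is not None and symbol in buf.locals:
            return buf.locals[symbol]
        return self.defaults[symbol]

    def set_local(self, symbol, value):
        # Only the current buffer sees this value.
        self.current.locals[symbol] = value


env = Variables()
env.defaults["fill-column"] = 70
a, b = Buffer("a.txt"), Buffer("b.txt")

env.current = a
env.set_local("fill-column", 100)
in_a = env.get("fill-column")   # the local value set in a.txt

env.current = b
in_b = env.get("fill-column")   # falls back to the global default
```

The point of the sketch is that a plain variable reference is not enough: each read has to consult the current buffer first, which is the indirection a symbol macro could hide.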
Emacs might be the only program in the world that would see a performance improvement from rewriting in CL :-). The reason for this is simple: Emacs’ performance is largely related to how well it executes lisp code, and how well the GC works.
44 Comments
interesting. have you talked about it with the crew on #emacs? it would be a fun project even if it failed to be widely accepted. wish you luck.
This is also something the GNU Guile (Scheme) guys have talked about for years. I’m not sure if they’ve started work in earnest, but part of their plan was that Guile would be used to implement other languages (Elisp, CL, JavaScript) for people that preferred them or need them for backwards compatibility.
Sounds pretty awesome, even though there are long odds against it
I know about the Guile stuff. I think it is a really bad idea — while it seems similar to what I am proposing, it is actually quite different and won’t provide the same benefits at all. I occasionally rant about this on #emacs, maybe someday I will turn that rant into a blog post explaining what is so wrong about it.
“I know about the Guile stuff. I think it is a really bad idea — while it seems similar to what I am proposing, it is actually quite different and won’t provide the same benefits at all. ”
The main motivation for this re-write you presented was ‘Performance’ (?) and if you think guile won’t provide that, I seriously doubt you have looked at guile2. Also, AFAIK they have already invested a lot of efforts/time on that re-write so I am just hoping you have thought this through and not just doing it *purely* based on your choice of language.
Why not trying to improve the existing alternative to Emacs in Common Lisp like Climacs[1] or Hemlock[2] ?
[1] http://www.cliki.net/Climacs
[2] http://www.cliki.net/Hemlock
I’ll help if you start this sounds a great idea! my english not too gooood.
“Emacs might be the only program in the world that would see a performance improvement from rewriting in CL”
What?!! I would’ve thought that about the only programs in the world which *wouldn’t* see a performance improvement would be the relative few which have been deliberately “hand-tuned”.
I would rather see an Emacs implementation on Ruby or another modern scripting language that fully supports the functional programming paradigm.
I doubt that. If the reason for rebase is performance, I seriously question Ruby as a viable “implementing” language
I think you have a slightly skewed view on the Emacs C source. Interpreting Emacs Lisp is a pretty tiny part. Just do
cd emacs/trunk/src
wc *.c | sort -n
and see what the C code deals with. It’s mostly redisplay, and it would surely not profit from a CL rewrite at all (if that’s even possible, which I doubt). That code is so furiously complicated that almost no one dares to touch it. The other main parts are the event loop, X communication, character coding, image support, subprocesses, and so on.
Zeenix writes: “if you think guile won’t provide that, I seriously doubt you have looked at guile2”
I know enough about Guile to know it is not a very good fit. In addition to its various impedance mismatches (Lisp-1 versus Lisp-2, nil versus #f versus ‘()), it also does not have, and never will have, a truly first-class GC (don’t get me wrong — the Boehm GC is best of its class; but it will never be as good as what a “pure” Lisp system can achieve), and Guile’s execution model is still not top-notch. Even when Guile achieves native compilation, if it ever does, it will *still* not be as good as what I am proposing, because their plan does not include recoding Emacs into Lisp; so native compilation will stop at the boundary of the implementation. Experience with native compilation of Emacs Lisp (yes, this has been done) shows that this doesn’t help much.
So, I am very skeptical.
Also, dig around and read stuff like:
http://lists.gnu.org/archive/html/guile-devel/2010-04/msg00146.html
Even the Guile developers don’t think it is going to help much. There’s another note somewhere, IIRC, explaining that Emacs Lisp is going to be a second-class citizen on the Guile platform as well.
I don’t know why this plan sounds good to anybody.
Daimrod wrote: “Why not trying to improve the existing alternative to Emacs in Common Lisp”
I am heavily, heavily invested in many details of the existing Emacs. That is — Gnus, ERC, BBDB, all the various code editing features, I’m starting to use Semantic, etc, etc.
I think switching to a CL Emacs means rewriting all this. I’m not just reluctant to do this, but truly incapable. There isn’t enough time.
David wrote: “I think you have a slightly skewed view on the Emacs C source. Interpreting Emacs Lisp is a pretty tiny part.”
I assure you that I know a lot about the Emacs C code.
In my profiling of Emacs Lisp programs, the profiles are relatively flat and have a long tail. Experience with native compilation of elisp has shown that it doesn’t help much.
I think there are 3 things that can be done to improve Emacs Lisp performance.
First, a better GC. IIRC the GC was the top time-waster in my profiles. But, adding a better GC to the existing Emacs is nontrivial.
Second, speeding up regular expressions. In my profiles this was higher than the bytecode interpreter. I haven’t really looked into this one yet. We could perhaps JIT compile regular expressions, but I think the tradeoffs are unclear, and it would need profiling.
Third, improving Lisp execution performance in a deep way — exposing all the innards of Lisp functions to the compiler, for inlining and JITting and other optimizations. This is mostly already done by CL implementations, but there are some Emacs-specific things that would benefit as well.
I don’t get it… I’ve used emacs for 20+ years and performance has never been an issue for me.
Hi Tom,
I liked your article, but I think that in the comments you are still misunderstanding the advantages of rebasing Emacs on top of Guile.
Elisp under Guile will be faster than Emacs’ current implementation. Some time has passed since that note, and both Guile and Guile’s elisp implementation have gotten better.
Regarding GC: you are correct, the Boehm GC won’t be better than e.g. SBCL’s collector. But either would be better than what Emacs currently has.
Having a more performant Elisp would make it possible to write more of Emacs in Elisp, something I think we both agree would be good. It seems to me that Elisp all the way down is better than Elisp on Common Lisp — so this conversion of yours is, like replacing the Elisp implementation with Guile, merely an intermediate step towards a larger goal — and it’s not clear to me that it’s the right step.
CL will not be a panacea for Emacs. It either poses greater portability problems than Guile, because the performant implementations are not available on all platforms that Emacs is used on; or, it won’t be as performant as Guile, which will offer good AOT performance on platforms for which native compilation is supported, and bytecode execution on the others.
CL is a great language, but it is bad for the GNU system, because it is its own system.
What computer do you have? Are you sure you are talking about emacs and not eclipse?
Emacs is as fast as notepad on my 5+ year old computer.
Emacs is just an editor, is not something that needs phenomenal performance, look at eclipse, is slower like a snail and people still use it.
Better use your energy making something cool in lisp.
Lisp needs cool projects to show how cool the language is, rewriting emacs just for fun doesn’t bring anything useful to anyone except some programming fun for yourself.
Some others have thought of this as well:
http://www.cliki.net/CL-Emacs
The only way to prove or disprove the comments thus far (here and I assume elsewhere) is to go ahead and ‘do it!’ I’ve no particular complaint with Emacs as is, but it would be of great interest to compare the two. I also like the idea of CL being built in in terms of a repl—which is one of the things that I use Emacs for (aprox. 50% at a guess) In sum, we will know neither the benefits nor the problems unless someone gives this plan a shot—go for it!
Thanks Andy.
As you know I am very much opposed to the rebase on Guile idea. I still think it will not really yield the claimed benefits; and furthermore that some of those benefits are anti-features. Basically I think what I wrote on emacs-devel last year remains true.
In particular, what I have seen of the Guile performance plans will mean that Emacs will always have second-class performance. The GC matters more than native compilation, and the compiling-bytecode experiment proves that AOT compilation is no good.
“It seems to me that Elisp all the way down is better than Elisp on Common Lisp”
I don’t understand this statement, since this isn’t what you are actually working on.
“CL is a great language, but it is bad for the GNU system, because it is its own system”
I have really given up on the GNU Project as a vision. The decision-making and discussion in GNU is irredeemably broken. It depresses me that this is so, since GNU has been the foundation of my career; but recently I have learned, painfully, that I was only part of GNU by denying important truths about its nature. I realize this isn’t the case for you though.
Also, GNU has a CL implementation. So, I don’t understand this statement from that angle as well.
“Emacs might be the only program in the world that would see a performance improvement from rewriting in CL”
I’m not sure this is necessarily true. I’ve been doing some deeper reading of Lisp recently, plus looking at some examples online, and Lisp doesn’t look that inefficient. Performance similar to C in many cases.
Thanks for your thoughts, Tom. I’m probably retracing part of your GNU arc; perhaps I’ll come to the same conclusions too, but I think I need to walk it a bit farther.
I am not certain that the compiling-bytecode experiment is the last word in AOT elisp compilation. But, I can’t back this up right now, so I won’t insist.
Regarding GC, one further point: Racket was able to make the switch from a conservative collector to a generational moving collector, largely (AFAIK) via mechanical transformations on their C source. That could work with Guile as well, or it could work with Emacs’ C code on some other base. I am currently OK with the Boehm GC’s performance, but it is possible to contemplate a switch if the benefits are large enough.
Regarding Elisp all the way down, I was a bit unclear. The future that I have in mind is one in which Emacs’ elisp implementation is replaced with Guile’s. Elisp would then be fast enough to implement lower-level things, and then the C part of Emacs would shrink over time, through refactoring, being replaced by Elisp, and probably some Scheme.
Happy hacking,
Andy
“I am not certain that the compiling-bytecode experiment is the last word in AOT elisp compilation. But, I can’t back this up right now, so I won’t insist.”
Actually, I am quite curious to hear your thoughts on why this might be, even if you don’t have evidence.
“then the C part of Emacs would shrink over time, through refactoring, being replaced by Elisp, and probably some Scheme”
Yeah. I propose to skip the intermediate steps. Also, I don’t see Scheme as a particularly viable target. Lisp and Scheme are just too different. I haven’t really looked at the details, but it seems like you must either make a mess of Scheme or of Lisp to fully bridge the two.
CLISP doesn’t support threading and native compilation. GCL is abandoned, pre-ANSI, doesn’t support threading and produces native code only by compiling to C first. The “good CL implementations” you’re aiming for are neither GPL nor part of GNU, so I doubt they would ever be seriously considered.
I think we can all agree that the three major platforms are Linux, Windows and OSX x86 and amd64 (maybe ARM in the future).
ClozureCL is an excellent CL implementation that has been extensively tested and used for commercial applications and also works on all these platforms (32 and 64bit).
SBCL works fine on Linux and OSX (after recent improvements) and Windows is getting better every day.
I hope that we can all agree that the “CL is its own platform” argument is completely bogus.
Guile is a bad idea because it is not and never will be state of the art, and it has a minimal following of users and developers compared to Common Lisp.
“Emacs might be the only program in the world that would see a performance improvement from rewriting in CL”
Pretty much every Python and Ruby program, too, for starters, and Perl, and Basic, and … well, anything short of simple C code!
Threads in CLISP
http://www.clisp.org/impnotes.html#mt
Back in the 1970’s Bernie Greenberg got fascinated by Lisp and did a clean implementation of Teco EMacs in Lisp on Multics.
GNU Emacs was a C version based on the Lisp version. So what you’re saying is that you want to go back the original version in Lisp.
[…] writing an earlier post, I realized I haven’t yet written about the Python plugin for […]
Tom, the same goals (performance, better threading model, garbage collection) could be met by rebasing on Clojure. We’d get access to a huge library and a STM for free. Thoughts?
@Matt: Clojure has mainly immutable data and I think that’s going to be a real mess to use it in place of emacs-lisp which uses dynamic scope.
As per Bob’s accurate entry, I did implement an early Emacs in Lisp, starting in 1978. I was inspired by (meaning, “saw and admired what you could do with, not studied or adapted code of”) Stallman’s EMACS on ITS PDP-10, and Dan Weinreb’s EINE on the MIT Lisp Machine (ditto). All these efforts predating Common Lisp, my effort used Multics MacLisp. As (unfortunately) with Multics itself, portability was never a desideratum, and my editor exploited many Multics-specific features, some created explicitly to support it. A contemporary (1979) document (with 1996 preface) is online at http://www.multicians.org/mepap.html . Please keep in mind the phrase “33 years ago” when reading.
There is an extensive discussion of Multics MacLisp on the same site. Thanks.
I think it’s great idea to upgrade elisp to Common Lisp and would like to point out that performance considerations would be shadowed by the leap in language expressive power if rebasement would take place.
Consider this – CL is a mature language with threads/lexical scoping/nice GC/ffi/standard and bunch of libraries (sadly this bunch isn’t huge).
I agree that Guile isn’t viable alternative to CL mostly because it either would put Scheme as extension language and then it’s not clear where already written elisp code should go or it would leave elisp as extension language and improve its performance in which case C code from emacs core would be replaced by C code from guile core – hardly an improvement.
And I seriously doubt the idea that emacs C core would be replaced by interpreted Scheme.
But I’m not going to criticize guile here, it’s wrong place to do that.
Also regarding performance of regular expressions, which is said to be poor.
Well that’s due to implementation of regexp engine which directly interprets them instead of compiling them to dfa (deterministic finite automaton).
The main problem with dfa-based implementation is that they don’t allow backreferences, e.g. ‘(fo+).*\1’, but I’m sure that they are rarely used anyway and can be sacrificed in 99% cases.
IIRC lookahead would be impossible, but emacs doesn’t have this so no one would miss it.
More on dfa vs direct interpretation here http://swtch.com/~rsc/regexp/regexp1.html.
Also since font-lock is all about regexps improving regexps a bit might cause good effect on overall subjective responsiveness of emacs.
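As a rough illustration of what “compiling to a dfa” means, the sketch below applies the textbook subset construction to a small hand-written NFA and then matches input in a single left-to-right pass with no backtracking. It is Python, not Emacs code; there is no regexp parser and no epsilon transitions, and the names `nfa_to_dfa` and `dfa_match` are invented for this example. The NFA encodes the classic pattern `(a|b)*abb`.

```python
def nfa_to_dfa(nfa, start, accepting):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset([start])
    dfa = {}
    todo = [start_set]
    while todo:
        state = todo.pop()
        if state in dfa:
            continue
        # Collect, per input character, every NFA state reachable from
        # any member of the current set.
        moves = {}
        for s in state:
            for ch, targets in nfa.get(s, {}).items():
                moves.setdefault(ch, set()).update(targets)
        dfa[state] = {ch: frozenset(t) for ch, t in moves.items()}
        todo.extend(dfa[state].values())
    accept = {s for s in dfa if s & accepting}
    return dfa, start_set, accept


def dfa_match(dfa, start, accept, text):
    """Run the DFA over the whole text: one table lookup per character."""
    state = start
    for ch in text:
        state = dfa[state].get(ch)
        if state is None:  # no transition on this character
            return False
    return state in accept


# Hand-written NFA for (a|b)*abb; state 3 is the only accepting state.
nfa = {
    0: {"a": {0, 1}, "b": {0}},
    1: {"b": {2}},
    2: {"b": {3}},
}
dfa, start, accept = nfa_to_dfa(nfa, 0, {3})

matches_abb = dfa_match(dfa, start, accept, "abb")
matches_babb = dfa_match(dfa, start, accept, "babb")
matches_abab = dfa_match(dfa, start, accept, "abab")
```

Each input character costs one dictionary lookup, which is where the predictable running time comes from. The price is the one mentioned above: the DFA carries no memory of what an earlier group matched, so backreferences do not fit this model.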
I would like to suggest another alternative to translation of C code to CL.
Instead of complete translation right from the start, which would be hard but definitely should happen, probably in small steps, I propose to skim through C sources and make C code operate on CL structures from embeddable common lisp compiler.
Also parts of C code which deal with e.g. bytecode should be eliminated but all lisp part would be in ECL’s hands and all existing C code which deals with gtk and redisplay would remain accessible and need not be changed right away.
While writing this I realized that your idea of complete translation leads to much better results than use of ECL in C source and isn’t so hard as I believed before.
Parts of C that deal with elisp probably could be translated automatically without much effort.
Hope to hear some work going on in this direction.
@daimrod – Clojure’s native data structures are immutable, true. But we’re discussing a new implementation language for an elisp interpreter, not the possibility of replacing elisp with Clojure. Dynamic scope is a separate issue and I don’t see any reason a Clojure implementation wouldn’t support dynamic scope in elisp.
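To show why dynamic scope does not depend on the host language's own scoping rules, here is a minimal sketch of how an interpreter can implement it with a per-symbol stack of bindings. It is written in Python rather than Clojure, and `DynamicEnv`, `defvar`, `let`, and `value` are illustrative names, not any real interpreter's API.

```python
from contextlib import contextmanager


class DynamicEnv:
    """Dynamic bindings kept as one stack of values per symbol."""

    def __init__(self):
        self.bindings = {}  # symbol name -> list used as a stack

    def defvar(self, symbol, value):
        # Establish the global (outermost) value.
        self.bindings[symbol] = [value]

    def value(self, symbol):
        # The innermost active binding is always on top of the stack.
        return self.bindings[symbol][-1]

    @contextmanager
    def let(self, symbol, value):
        # Push a binding for the dynamic extent of the `with` body,
        # and pop it again even if the body raises.
        stack = self.bindings.setdefault(symbol, [])
        stack.append(value)
        try:
            yield
        finally:
            stack.pop()


env = DynamicEnv()
env.defvar("case-fold-search", True)


def search_is_case_insensitive():
    # Sees whatever binding is active at call time, not at definition time.
    return env.value("case-fold-search")


before = search_is_case_insensitive()
with env.let("case-fold-search", False):
    during = search_is_case_insensitive()
after = search_is_case_insensitive()
```

The callee never receives the variable as an argument, yet it observes the caller's temporary binding, which is the behaviour existing elisp relies on. The binding stack is ordinary mutable interpreter state, so immutable host data structures do not rule it out.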
A downside to choosing any JVM-based language would be the temptation to write emacs’s UI in a Java GUI toolkit. Emacs should still support a console UI and I have no idea how much trouble that would be from a JVM.
actually when I am thinking more about this, I am thinking actually add Lua support (embed Lua) directly into emacs could be a more interesting alternative … so that emacs can still use whatever it has, the raw c function and lisps, but with Lua, it may be easier for a lot of developers who are not fluent in lisp …
just some thought experiment and another hurdle is that Lua is MIT licence, vs GNU licence
@Isaac
There is no license conflict between MIT and GNU GPL, you’re free to add MIT-licensed stuff to a GPL’d prog, and the result will still be under the GPL.
Would all the thirdparty emacs extensions keep working in common lisp?
Yeah, the idea is to change the Emacs implementation while preserving the vast body of existing elisp.
[…] is a followup to my earlier post on converting the Emacs C code into Common Lisp. This one is a bit more technical, diving into […]
I can’t imagine a project this large ever getting finished, and I don’t believe the advantages are as great as you think they are.
First, performance really is not an issue anymore. Does anyone experience Emacs slowdown? I have pretty crappy computers and I’ve never ever experienced performance issues with Emacs.
Second, as others have said, a lot of the Emacs source code is not to do with Elisp but to do with highly performance-intensive areas such as redrawing. This code is complicated and I’m skeptical you could do a mostly automated and still performant conversion.
Third, it just isn’t worth it. OK, Elisp isn’t great, it’s not CL, but it’s good enough for what it does. There’s more important problems to solve.
Rob: There’s a lot of subjectivity in the experience of performance. If you have “pretty crappy computers”, my guess (and I freely admit this is just a guess) would be that your expectations are low for performance — and with low expectations, they’ll get met.
(For more on subjective versus objective performance measurements, see, e.g., http://www.sigchi.org/chi95/proceedings/shortppr/gvk_bdy.htm )
I’m fairly new to emacs myself, but I can definitely say there are times when I find myself waiting for it to do something. To me, that’s essentially how I define a case of “experiencing performance issues” — if I’m waiting for the computer, it’s got a performance issue.
So, if you’ve “never ever experienced performance issues with Emacs”, I’d like to know what that means. Or, just to have you know that others of us have, and so for Tom, this may well be a “worth it” pursuit.
And as for which problems to solve, well, there are a lot of us humans about. You’re welcome to go solve others. I’m glad that Tom is working on the ones that *he sees* as a problem. If you’re trying to recruit his help on one of yours, that’s one thing… though perhaps he’ll be more effective at helping if he’s not waiting for his editor. 😉
Anyway, Mostly what I want to say is Tom: Kudos on working on this. I’m excited to see what comes of it. I hope you’ll keep going… This seems like a lot of work, and I hope you have the persistence (and/or help, encouragement, etc.) to stick with it.
Perl, not CL, as we all know, is Perfect Emacs Rewriting Language.
Rob, one problem I have with emacs that could be solved is the entire process waiting for blocking i/o
re #41: Yes! Waiting on I/O is not only a problem, but is illustrative of a kind of problem. Emacs is not only a programming editor. It has become a middle-ware layer, or execution environment for major modes that serve in similar ways to desktop applications. So we have folks running org-mode, bbdb, a mail reader, an IRC client, maybe a web browser all together. Improvements that would help all those bits of software play nicely together would be welcome.