The thesis that underlies my project to translate the Emacs C code to Common Lisp is that Emacs Lisp is close enough to Common Lisp that the parts of the Emacs C code that implement Lisp can be dropped in favor of the generally superior CL implementation. This is generally true, but there are a few difficult bits.
Symbols
The primary problem is the translation of symbols when used as variable references. Consider this code:
(defvar global 73)
(defun function (argument)
  (let ((local (something-else)))
    (+ local argument global)))
More is going on here than meets the eye.
First, Emacs Lisp uses dynamic binding by default (optional lexical binding is a new feature in Emacs 24). This applies to function arguments as well as other bindings. So, you might think you could translate this straightforwardly to:
(defvar global 73)
(declaim (special global))
(defun function (argument)
  (declare (special argument))
  (let ((local (something-else)))
    (declare (special local))
    (+ local argument global)))
This was the approach taken by elisp.lisp; it defined macros for let and let* (but forgot defun) to do the dirty work:
(defmacro el::let* ((&rest vars) &rest forms)
  "Emacs-Lisp version of `let*' (everything special)."
  `(let* ,vars
     (declare (special ,@(mapcar #'from-list vars)))
     ,@forms))
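For comparison, here is a minimal sketch of what a matching defun wrapper might look like. The names el-defun, show-argument and outer are hypothetical and do not come from elisp.lisp, and the sketch assumes only required parameters (no &optional or &rest).

```lisp
;; Hypothetical wrapper: binds every required parameter dynamically.
;; Lambda-list keywords such as &optional and &rest are NOT handled.
(defmacro el-defun (name (&rest args) &body body)
  `(defun ,name ,args
     (declare (special ,@args))
     ,@body))

;; ARGUMENT is not a parameter here. The free special declaration makes
;; the reference resolve to whatever dynamic binding the caller set up.
(defun show-argument ()
  (declare (special argument))
  argument)

;; OUTER binds ARGUMENT dynamically, so SHOW-ARGUMENT can see it.
(el-defun outer (argument)
  (show-argument))
```

With these definitions, (outer 42) returns 42, which is the Emacs Lisp behavior that a lexically bound argument would not reproduce.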
But not so fast! Emacs also has buffer-local variables. These are variables where the value is associated with the current buffer; switching buffers makes a different binding visible to Lisp. These require no special syntax, and a variable can be made buffer-local at any time. So, we can break the above translation simply by evaluating:
(make-local-variable 'global) (setq global 0)
Whoops! Now the function will return the wrong result, because the translation has no way to know that it should refer to the buffer-local value. (Well, ok, pretend that the setq magically worked somehow…)
My idea for implementing this is pretty convoluted. Actually I have two ideas, one “user” and one “kernel”:
User
I think it is possible to use define-symbol-macro on all symbols that come from Elisp, so that we can tell the CL compiler about the real implementation. However, a symbol can either be treated as a variable, or it can be treated as a symbol-macro, but not both at the same time. So, we will need a second location of some kind to store the real value. Right now I’m thinking a symbol in another package, but maybe a cons or some other object would work better. In either case, we’d need a macro, a setf method for its expansion, and some extra-tricky redefinitions of let and defun to account for this change.
This would look something like:
(define-symbol-macro global (elisp:get-elisp-value 'global))
(defsetf elisp:get-elisp-value elisp:set-elisp-value)
;; Details left as an exercise for the reader.
This solution then has to be applied to buffer-, keyboard-, and frame-local variables.
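To make the idea concrete, here is a rough sketch of the symbol-macro approach for the buffer-local case only. Everything in it is an assumption made for illustration: *current-buffer* stands in for Emacs's current buffer, the two hash tables are the "second location" for values, and make-local-variable-sketch and add-global are hypothetical names. The accessors are written without the elisp: package prefix so that the block stands alone. It does not attempt the redefinitions of let and defun, so dynamic rebinding is not covered.

```lisp
;; Hypothetical stand-in for the current buffer; any object works as a key.
(defvar *current-buffer* :buffer-a)

;; Default (global) values, keyed by symbol.
(defvar *default-values* (make-hash-table :test #'eq))

;; Buffer-local values, keyed by (buffer . symbol) conses.
(defvar *buffer-local-values* (make-hash-table :test #'equal))

(defun get-elisp-value (symbol)
  "Return the buffer-local value of SYMBOL if one exists, else the default."
  (multiple-value-bind (value found)
      (gethash (cons *current-buffer* symbol) *buffer-local-values*)
    (if found
        value
        (gethash symbol *default-values*))))

(defun set-elisp-value (symbol value)
  "Set the buffer-local value of SYMBOL if one exists, else the default."
  (let ((key (cons *current-buffer* symbol)))
    (if (nth-value 1 (gethash key *buffer-local-values*))
        (setf (gethash key *buffer-local-values*) value)
        (setf (gethash symbol *default-values*) value))))

;; Short-form DEFSETF: (setf (get-elisp-value s) v) calls (set-elisp-value s v).
(defsetf get-elisp-value set-elisp-value)

(defun make-local-variable-sketch (symbol)
  "Give SYMBOL a value local to the current buffer, seeded from its current value."
  (let ((key (cons *current-buffer* symbol)))
    (unless (nth-value 1 (gethash key *buffer-local-values*))
      (setf (gethash key *buffer-local-values*) (get-elisp-value symbol)))
    symbol))

;; GLOBAL is now a symbol macro, not a variable; SETQ on it behaves like SETF.
(define-symbol-macro global (get-elisp-value 'global))
(setq global 73)

;; Translated code can keep referring to GLOBAL by name.
(defun add-global (argument)
  (+ argument global))
```

In this sketch, (add-global 1) gives 74 in :buffer-a. After binding *current-buffer* to :buffer-b, calling (make-local-variable-sketch 'global) and then (setq global 0), the same call gives 1 there, while :buffer-a still sees 73.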
Kernel
The kernel method is a lot simpler to explain: hack a Common Lisp implementation to directly know about buffer-locals. SMOP! But on the whole I think this is the less preferable approach.
Other Problems
Emacs Lisp also freely extends other typical data types with custom attributes. I consider this part of the genius of Emacs; a more ordinary program would work within the strictures of some defined, external language, but Emacs is not so cautious or constrained. (Emacs is sort of a case study in breaking generally accepted rules of programming, which makes one wonder whether those rules are any good at all.)
So, for example, strings in Emacs have properties as a built-in component. The solution here is simple — we will just translate the Emacs string data type as a whole, something we probably have to do anyway, because Emacs also has its own idiosyncratic approach to different encodings.
In elisp, aref can be used to access elements of other vector-like objects, not just arrays; there are some other odd little cases like this. This is also easily handled, but it left me wondering why things like aref aren’t generic functions in CL. It often seems to me that a simpler, more orthogonal language lies inside of CL, struggling to get free. I try not to think these thoughts, though, as that way lies Scheme and the ridiculous fragmentation that has left Lisp unpopular.
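As an illustration of how the aref case might be handled, here is a small sketch that wraps it in a generic function. The names el-aref and char-table-sketch are hypothetical; the struct is only a stand-in for an Emacs vector-like object that is not a CL array, not a model of any real Emacs type.

```lisp
;; Hypothetical stand-in for a vector-like Emacs object that is not a CL array.
(defstruct char-table-sketch
  (slots (make-array 4 :initial-element nil)))

(defgeneric el-aref (object index)
  (:documentation "Sketch of an Elisp-style aref that dispatches on the object's type."))

;; Ordinary CL vectors (including strings) defer to the built-in AREF.
(defmethod el-aref ((object vector) index)
  (aref object index))

;; The vector-like object forwards to its underlying storage.
(defmethod el-aref ((object char-table-sketch) index)
  (aref (char-table-sketch-slots object) index))
```

Translated Elisp code would then call el-aref wherever the original called aref, at the cost of a generic dispatch on each access.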