Parsing

I was reading about PEGs (parsing expression grammars) recently and thinking “that is pretty interesting”, and of course it turns out that there is an Emacs implementation.

It is a bit odd how primitive parsing support is in Emacs. It is one of those mysteries, like how window configuration and manipulation support can be so weak. Peculiar.

CEDET includes a parser generator called Wisent. That is long overdue… though even it is a bit odd, apparently preferring a yacc-ish input syntax. I don’t know about you, but when I have a Lisp system just sitting there, I reflexively reach for sexp-based formats. Well, ok, it is a port of Bison. But still.

I did a little parser hacking in gdb recently. In gdb, if you complete an expression involving a field lookup, it will currently print every matching symbol in your program — when all you really wanted was the completion of a field name. This is what I set out to fix.

My first idea was: hey, the parser knows what tokens are valid. I can just ask it! But I don’t think there’s a way to do that with Bison parsers. At least, no documented way. Boo. And anyway, as it turns out, this is not what you want.

For instance, consider the simple case of “p pointer->field”. This is syntactically valid as-is, so the parser would indicate that the desired completions are whatever can come next: say, an operator. But if the cursor is just after the “d”, you want to continue completing on the field name. Only if there is whitespace after the “d” is the name finished. So, you have to differentiate these cases based on whitespace.

I ended up hacking the lexer as well as the parser. The lexer can now return a special COMPLETE token, which it does depending on the previous tokens and the presence or absence of whitespace. I also added some new productions like:

expression: expression '.' name COMPLETE
expression: expression '.' COMPLETE

From here it is pretty simple to solve the rest of the problem.
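
As a rough illustration of the lexer half of this, here is a minimal Python sketch. It is not gdb's code (gdb's lexer is written in C and handles far more than this). The function name lex_for_completion, the token names, and the rule that COMPLETE always follows a trailing '.' or '->' are assumptions made up for the example. The end of the input string stands in for the cursor position.

```python
import re

# Hypothetical token set: identifiers plus the two member-access operators.
TOKEN_RE = re.compile(r"\s*(?:(?P<NAME>[A-Za-z_]\w*)|(?P<ARROW>->)|(?P<DOT>\.))")

MEMBER_OPS = ("ARROW", "DOT")


def lex_for_completion(text):
    """Tokenize text; append a COMPLETE token if the cursor (end of
    text) sits where a field name is being typed or is expected."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            break  # only trailing whitespace or an unknown character is left
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()

    rest = text[pos:]
    if rest.strip():
        return tokens  # unrecognized input: do not guess at a completion
    trailing_space = bool(rest)

    if not tokens:
        return tokens
    last_kind = tokens[-1][0]
    if last_kind in MEMBER_OPS:
        # "ptr->" : a field name must come next (assumed: whitespace is irrelevant here)
        tokens.append(("COMPLETE", ""))
    elif (last_kind == "NAME" and not trailing_space
          and len(tokens) >= 2 and tokens[-2][0] in MEMBER_OPS):
        # "ptr->fie" with the cursor right after the name: still typing the field
        tokens.append(("COMPLETE", ""))
    return tokens


if __name__ == "__main__":
    for expr in ("ptr->fie", "ptr->fie ", "ptr->", "ptr"):
        print(repr(expr), lex_for_completion(expr))
```

With this, “ptr->fie” and “ptr->” both end in COMPLETE, matching the two productions above, while “ptr->fie ” (note the trailing space) and a bare “ptr” do not, so the parser treats them as ordinary expressions.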

I don’t remember reading about this anywhere, but I’m sure it has been done before. I thought it was a pretty fun hack 🙂 – I love problems that start with the user experience and end up someplace much deeper.

One Comment

  • Hi,
    We’ve got this in KDevelop; maybe you would like to take a look at it. The KDevelop4 C++ code completion is very complete.
