Archive for May, 2008

Parsing

I was reading about PEG (parsing expression grammars) recently, and thinking “that is pretty interesting” — and of course it turns out that there is an Emacs implementation.

It is a bit odd how primitive parsing support is in Emacs. It is one of those mysteries, like how window configuration and manipulation support can be so weak. Peculiar.

CEDET includes a parser generator, called wisent. That is long overdue… though even it is a bit odd, apparently preferring a yacc-ish input syntax. I don’t know about you, but when I have a lisp system just sitting there, I reflexively reach for sexp-based formats. Well, ok, it is a port of bison. But still.

I did a little parser hacking in gdb recently. In gdb, if you complete an expression involving a field lookup, it will currently print every matching symbol in your program — when all you really wanted was the completion of a field name. This is what I set out to fix.

My first idea was: hey, the parser knows what tokens are valid. I can just ask it! But, I don’t think there’s a way to do that with bison parsers. At least, no documented way — boo. And anyway, as it turns out, this is not what you want.

For instance, consider the simple case of “p pointer->field”. This is syntactically valid as-is, so the parser would indicate that the desired completions are whatever can come next — say, an operator. But if the cursor is just after the “d”, you want to continue completing on the field name. So, you have to differentiate this case based on whitespace.

I ended up hacking the lexer as well as the parser. The lexer can now return a special COMPLETE token, which it does depending on the previous tokens and the presence or absence of whitespace. I also added some new productions like:

expression: expression '.' name COMPLETE
expression: expression '.' COMPLETE

From here it is pretty simple to solve the rest of the problem.
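
In rough C, the lexer side of the idea looks something like this. To be clear, this is a minimal sketch with made-up names, not gdb's actual lexer; it just shows how end-of-input plus a "was the last token a field access, with no space after it?" flag turns into the special COMPLETE token that the productions above consume.

/* Hypothetical sketch: return COMPLETE instead of ordinary end-of-input
   when the cursor sits right after '.', "->", or a partial field name
   with no intervening whitespace.  */

enum token { TOK_NAME, TOK_FIELD_ACCESS, TOK_COMPLETE, TOK_EOF };

static int after_field_access;  /* last token was '.' or "->" */

static enum token
lex_one_token (const char **pp)
{
  const char *p = *pp;

  if (*p == '\0')
    {
      /* End of the expression being completed.  "p ptr->" or
         "p ptr->fiel" should complete a field name; a trailing space
         means the field access is already finished.  */
      return after_field_access ? TOK_COMPLETE : TOK_EOF;
    }

  if (*p == ' ')
    {
      after_field_access = 0;
      *pp = p + 1;
      return lex_one_token (pp);
    }

  if (*p == '.' || (p[0] == '-' && p[1] == '>'))
    {
      after_field_access = 1;
      *pp = p + (*p == '.' ? 1 : 2);
      return TOK_FIELD_ACCESS;
    }

  /* Scan a name; the flag stays set across the name, so "ptr->fiel"
     followed by end of input still yields COMPLETE.  */
  while (*p != '\0' && *p != ' ' && *p != '.'
         && !(p[0] == '-' && p[1] == '>'))
    p++;
  *pp = p;
  return TOK_NAME;
}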

I don’t remember reading about this anywhere, but I’m sure it has been done before. I thought it was a pretty fun hack 🙂 – I love problems that start with the user experience and end up someplace much deeper.

Would you do it again for free?

Thursday night I finally made it to a BLUG meeting. Stormy Peters from OpenLogic gave a talk titled “Would you do it again for free?”

Her talk covered some familiar ground — intrinsic versus extrinsic motivation, a list of motivations that free software developers claim (or that are claimed by others), the various methods of payment. Her slides were beautiful; she seemed a bit nervous though not overly so.

She also talked a bit about inequality in projects. She claimed that 40% of developers on free software projects are paid to do so; a show-of-hands at the meeting showed similar results.

OpenLogic is running the Open Source Census — kind of a cross-platform popcon. If you read her blog a bit you’ll see that she uses this information when talking to VCs and the like. That’s a smart idea and I’m generally in favor of hard data over speculation anyhow.

She was using an Asus, kinda cool. And Neil, sitting next to me, was using an XO. Weird times we live in.

Motivation, of course, is a psychological phenomenon, one with which we all have direct experience. That is, everybody has an opinion… so one commenter from the audience rejected most of her list of motivations in favor of — you guessed it — his. I suppose this is the bikeshed effect in a different form.

I didn’t agree with everything in Stormy’s talk. At one point she gave a sort of economic history of mankind which, I think, was badly mistaken on the facts, though perhaps not our experience of them.

After the talk I asked her about the pretty photos and consistent palette in her presentation. She said they were CC-licensed works from flickr and from some stock photo site… nice. (Also I noticed her slowly backing away while we talked. Whoa! Like, I’ve always been afraid of being that person. And now … hard data. Crap.)

She also talked a bit about the relationship developers have with open source. One idea was that a hacker might leave a project (suppose the project dies) — but will just switch to another project and keep working. Also, supposedly nowadays open source developers make more money than proprietary developers; but, conversely, they often claim that they would take a pay cut to work on open source (the intrinsic motivation thing). Let’s hope our bosses stop midway through that sentence.

I’m fascinated by the social dimension of programming. Partly this is defensive; over the years I’ve developed some heuristics that I use to evaluate developers (sorry. But it is true. And of course I like you.) and projects, mostly to try to keep away from painful experiences. But, I’m also interested in a more general taxonomy of projects — my suspicion is that many of the things we think we know about running projects either aren’t so, or are “don’t care” boxes in the Karnaugh map of administration. What is cool is that the free software movement is so big, now, that we have an excellent laboratory in which to study.

Codegen Update

Since my last post, I’ve written a prototype implementation of relinking for the incremental compiler.

Now, the compile server will create an object file in a cache directory. If there was a previous variant of the compiled file in the cache, it will then link the two together (using ld -r). Then, the final object file is copied from the cache to the user’s requested output file.
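
Concretely, the per-object flow is roughly the following. This is only a sketch under the assumptions above (the server simply execs ld and cp); the helper name and paths are invented, error handling and the empty-cache case are omitted, and it ignores the symbol-clash question that comes up below.

#include <stdio.h>
#include <stdlib.h>

/* Merge the freshly compiled object with the previous cached variant,
   keeping the result relinkable for next time, then install it where
   the user asked.  */
static int
relink_and_install (const char *new_obj, const char *old_obj,
                    const char *merged, const char *user_output)
{
  char cmd[4096];

  /* Link the two objects together, still relocatable (-r).  */
  snprintf (cmd, sizeof cmd, "ld -r -o %s %s %s", merged, new_obj, old_obj);
  if (system (cmd) != 0)
    return -1;

  /* Copy the merged object from the cache to the requested output.  */
  snprintf (cmd, sizeof cmd, "cp %s %s", merged, user_output);
  return system (cmd) == 0 ? 0 : -1;
}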

So, now you can “make clean; make” and see correct results from the incremental compiler. The new results table:

Compiler                 Seconds
Trunk                    30
Incremental, no server   30
Server, first run        26
Server, second run       17

This is probably the current best (or “best worst”) case — no actual recompilation needed to be done. In terms of user scenarios, this corresponds to, say, modifying a comment in a core header file and recompiling. And, given that this is execing both ld and cp, the results aren’t too bad.

On the other hand, I had somewhat higher expectations. I’ve been pretty depressed about this project all week. Relinking is turning out to be a pain; I’m pretty much convinced now that incremental preprocessing is a necessity; and this combination makes me wonder whether I’m chasing a rat down a hole. The question of whether this project remains worthwhile is a normative one, and fairly subjective. That’s a fancy way of saying, I don’t know.

Ugh.

Mostly I try to think about it in terms of a success metric. Or, what is the minimum expected gain that would make it appear to be worthwhile? I suspect I may need to prototype the C++ compiler changes before I can really satisfy myself on that topic, though.

Back to the concrete.

The linking prototype is still pretty bogus. It arrives at an executable which works, but the intermediate object files grow over time. That’s because it is pretty hard to coerce ld (and objcopy) into doing the odd things I want: I want to link two files together, yielding another relinkable object (i.e., I need -r), where symbol name clashes are always resolved in favor of the first file. You’d think -z muldefs (I’ve gotten overly familiar with the ld manual) would work here, but it just drops the symbols — not the contents. So, maybe -ffunction-sections and --gc-sections are the way to go — but this also has problems; the former because (supposedly) it does not work with all programs, and the latter because it interacts oddly with -r.

I’m still hoping I can get by with a relatively simple linker hack, though as the week has dragged on I’ve realized that my understanding of linking is less than ideal.

First Codegen Result

I tidied up my initial draft of incremental code generation so that it no longer gratuitously lowers functions which are not being recompiled. This was enough to get some results — results which are semi-bogus, due to not relinking, but which nevertheless give some idea of what can be expected.

Compiler                 Seconds
Trunk                    33
Incremental, no server   33
Server, first run        27
Server, second run       14
Preprocess               4

So, the results are a bit odd. Recompiling is fast, as we’d expect — about twice as fast as a plain build. However, it still takes far longer than the preprocessor alone, which is the lower bound I’d hoped to approach. What is going on in there?

A look with oprofile seems to indicate that the excess is spread around. About 10% of the total time is spent in the GC; another 7% is used computing MD5s. Other than that… if I add up the top 40 or so non-cpp functions, I get about 5 seconds worth, and there is a long tail after that. That’s a bummer since that kind of problem is hard to fix.

Setbacks

The last couple weeks uncovered a few problems in the incremental compiler.

First, suppose you compile a program with the incremental compiler, then recompile it. You would expect to get the same warnings both times; but a reused hunk is not re-parsed, so its warnings are never emitted again. And — whoops — I never thought about this until a week or two ago.

I hate that awful moment of realization. It reminds me of getting in trouble as a kid. “Oh shit”, I think. “What am I going to do? Does this sink the project?”

In this case, there are some options. If the set of warning flags does not change between compilations, I think I can modify GCC to store the warnings with their corresponding declarations. This is a bit of a pain, but nothing too awful — and I think I can avoid imposing a cost on the non-warning case by representing the warnings as tree objects and storing them in the hunk with the other declarations.

If the user does change the warning flags, then what? Record it and recompile, I guess. A similar idea applies to options that change the ABI — because ABI decisions get baked into the tree objects we create, if the ABI changes, we cannot reuse the trees.
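
To make that concrete, the bookkeeping I have in mind is shaped roughly like this. It is purely a sketch with invented names, not GCC code: each hunk carries its warnings plus the relevant option strings, the warnings get replayed when the hunk is reused, and a flag mismatch simply disqualifies the hunk.

#include <stddef.h>
#include <string.h>

struct saved_warning
{
  const char *message;     /* rendered diagnostic text */
  unsigned int location;   /* where to report it on replay */
  int option_index;        /* which -W option controlled it */
};

struct hunk
{
  /* ... the declarations (tree objects) making up the hunk ... */
  struct saved_warning *warnings;
  size_t n_warnings;
  const char *warning_flags;  /* canonicalized -W options in effect */
  const char *abi_flags;      /* options that affect the ABI */
};

/* A hunk may only be reused when the options baked into it still match;
   otherwise, record the change and recompile from scratch.  */
static int
hunk_reusable_p (const struct hunk *h,
                 const char *current_warning_flags,
                 const char *current_abi_flags)
{
  return strcmp (h->warning_flags, current_warning_flags) == 0
         && strcmp (h->abi_flags, current_abi_flags) == 0;
}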

My other uh-oh moment has to do with inlining. I got bored by the tedious sub-projects I was working on — integrating pragmas (by the way. If you design a language, don’t design pragmas. Thanks) into the dependency computation, fixing the remaining test suite failures — so I decided today to start looking at incremental code generation. Something fun!

I tried out a quick implementation. If a function is parsed, we arrange to compile it; if it is not parsed, we don’t bother. This won’t work on real programs, of course, since those “missing” functions have to come from somewhere, but this should give a good idea of the possible speedup.
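
The rule the prototype uses really is that simple; something along these lines, with hypothetical names:

/* Sketch of the prototype's gate: only functions that were actually
   re-parsed get sent to code generation; functions whose hunks were
   reused are silently dropped.  (In a real build the dropped functions'
   code would have to come from the previous object file, which is why
   these numbers are only indicative.)  */
struct function_info
{
  int parsed_this_compilation;
  /* ... */
};

static int
should_generate_code_p (const struct function_info *fn)
{
  return fn->parsed_this_compilation;
}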

After testing on my typical small test program (zenity), I noticed something odd, namely that recompilations were not as blazingly fast as I thought they should be. (I first estimated the absolute lower bound as the time it takes to preprocess the source files.)

Hmm. A mystery. But first, a brief aside about tools. The compile server forks and runs code generation in the subprocess. I wanted to debug this fork. So, Plan A: use gdb and set follow-fork-mode to child. But… that fails because, although my program does not use threads, it still links in the thread library (relic of my failed threading experiment), and gdb does not seem to handle this well. So, Plan B: maybe ftrace from frysk can help me — all I want to do is see a stack trace at a particular function call, perfect for ftrace. But, the ftrace I have aborts at startup. So I update and rebuild — but there is a build error. I suppose I could have gone with Plan C: stick in a sleep() call and attach, just like I did 15 years ago. Instead I picked Plan D: printf. Not quite as good, since I still need some of that information. Somehow I didn’t feel like Plan E: rip out the threading code and start over at Plan A.

Right now I’m doing a lot of debugging and pretty much every week has a vignette like that. I didn’t do that python stuff in gdb purely for fun.

Anyway. What is going on in the compile server?

What I found is that the code generation process still does some processing on every function, even functions that we intend to drop. In particular it is lowering each function to GIMPLE. I think what is going on here is that GCC is lowering functions and running local optimizations on them so that they can be considered as candidates for inlining. At least, that’s my working theory until I get back to Plan C and dig around a bit.

I’m not totally sure yet what to do about this. I think I will have to go back and rip out the decl re-smashing pass I wrote a while back, and instead find a way to perform gimplification in the server. That way, the compile server can keep the gimplified form for use by the back end. Other than the work involved, and some tricky details in lowering without smashing, I think this will work.

This isn’t going to be pretty, but at least it isn’t a total disaster. I’d like to think this isn’t totally an accident. GCC has undergone a lot of changes in the last five years to make it more flexible internally, and I’ve pushed a little bit more in that direction on the branch. This makes it a bit simpler to change the point at which we put a fork in the pipeline.

It feels a bit strange to write about the mistakes I make. On the plus side, I know how to fix these problems; writing about really unknown problems would, of course, be beyond the pale.