Archive for the ‘software’ Category

Using Python in Gdb

Today I used the Python support in gdb to do real work. Until now I’ve just been playing around, adding functionality that looked fun.

My application was something I’ve wanted to be able to do in gdb for a long, long time: set a breakpoint in one function, but have the breakpoint be conditional on the caller. In my case, I’m debugging the compile server, and I want to examine calls to c_parser_lookup_callback which pass a complicated test (that is, the breakpoint is in the true branch of an if statement); but only calls originating in declspecs_add_type are interesting.

Roland taught me that you can fake this in plain gdb by using convenience variables and breakpoint commands. However, it is rather painful to get this right, as you have to remember to reset the convenience variable in some cases (in my case, if we reach the outer breakpoint but not the inner one).

With the Python support, this kind of thing is easy. First I define a Python function to do a little work:

(gdb) python
Type python script
End with a line saying just "end".
>def check():
>  frame = gdb.current_frame().get_prev()
>  return frame.get_name() == "declspecs_add_type"
>end

As you can see this just checks the parent frame’s function name.

Now, I make the breakpoint conditional on this function’s value:

(gdb) cond 1 $(check())

That’s all there is to it!

I suppose I could have written this as a one-liner, like:

(gdb) cond 1 $(gdb.current_frame().get_prev().get_name() == "declspecs_add_type")

Both of the above formulations still seem a little clunky to me. But, it is still early days for the Python integration; I think we’ll find some nice ways to simplify common tasks.

What’s your missing feature?

Emacs and Threading

While once again waiting for Gnus to contact a news server, I thought: I’ll never be able to move my RSS reading into Gnus, because the delays will skyrocket. Sure, there’s nntp//rss, but that means configuring a separate program and keeping it running — and I’ve heard that this program can have a memory footprint as big as Emacs itself. (As an aside: remember the old days when Emacs was routinely the program with the largest footprint on your desktop? For me it is never number one, and sometimes even slips to 3 or 4.)

Maybe some savior will come along and make Gnus fetch RSS feeds in the background, using a process filter. I assume, without looking, that retro-fitting this into Gnus would be very hard. For new Emacs code, though, this is the way to go; you can set things up so that most mode-specific operations report “working…” back to the user when background operations are happening — while still letting the user switch buffers and work on other things. For instance, nowadays vc-annotate works this way, which is very nice, since annotate is fairly slow in most version control systems.

Even better would be to make Emacs capable of multi-threading. Most people arrive at this idea eventually. Unfortunately, I think it is just not possible; partly due to bad language choices: dynamic scope is very handy, but having only dynamic scope is terrible; but also partly due to consequent design choices for the rest of Emacs: buffers are big global objects, maintaining compatibility for the enormous body of existing lisp is crucial, and auditing even the built-in body of elisp is, in difficulty, somewhere between daunting and impossible.

A few weeks ago I heard a funny idea in this area. Instead of trying to handle multi-threading, how about old fashioned multi-process support, with some kind of message passing? Emacs could fork(), and then the child could wander off with its own copy of everything; and then the subprocess could send up messages and data which would be integrated into the Emacs event loop. This is basically the same idea as process filters, only with the benefit that the process could be expressed in the same lisp form as the handler, and the subprocess would have access to all the relevant lisp state.

Naturally, most of these messages would just be elisp; but perhaps it would be worthwhile to add a way to transfer the contents of a buffer wholesale.

Gold is released

Ian Taylor checked in the long-awaited “gold“. Gold is a new ELF-only linker written in C++. It is designed for performance and is much faster than the current binutils ld.

I’m very happy about this for a few reasons. First, we’ve needed a new linker for a long, long time. Second, this will help the incremental compiler.

I looked through the gold sources a bit. I wish everything in the GNU toolchain were written this way. It is very clean code, nicely commented, and easy to follow. It shows pretty clearly, I think, the ways in which C++ can be better than C when it is used well.

Congratulations, Ian!

Compile Server Scalability

There are a few aspects to compile server scalability that are important to address.

First, and most obviously, memory use. Because we want to be able to send real programs through the compile server, and because we want it to remain live for relatively long periods of time, it is important that memory use be “acceptably bounded”. Naturally, the server process will grow with each additional compilation unit. At least in the straightforward implementation, there’s no way around that (but see below). However, it is important that the server not leak memory, and that recompilations generally not increase memory use. Also, ideally, all that work on decl sharing will keep memory use in check.

For the most part, this did not take any effort to achieve. GCC has a built-in garbage collector, and most nontrivial data structures are allocated using the GC. This is not a silver bullet, of course, but it has yielded good results with little effort in practice.

In the case of recompilation, we employ a simple heuristic — we store all parsed hunks keyed off the name of the requested object file (note: not the input file; it is common for a project to compile a given source file multiple times, but it is rare to see the same object file name more than once). When recompiling an object, we assume that there will be a lot of reuse against the object’s previous version, so we store those hunks temporarily, but then discard the old ones at the end of compilation. This way, we reuse, but we can also free hunks which are no longer in use.

Results from a few tests are very encouraging here. I compiled gdb with the compile server, then deleted the object files and re-compiled. Memory use (as reported by -fmem-report) stayed flat at around 51M — meaning that recompilation doesn’t grow the image, and the collection approach is working as desired.

I also built gdb using the compiler in “normal” mode, and looked at the -fmem-report totals. If you sum them up, which I naively expect gives a rough idea of how much memory --combine would use, you get 1.2G. Or, in other words, decl sharing appears to make a huge difference (I’m not completely confident in this particular number).

If memory use does become a problem for very large compiles, we could look at scaling another way: writing out hunks and reading them back in. Maybe we could use machinery from the LTO project to do this. This would only be useful if it is cheaper to read decls via LTO than it is to parse the source; if this is not cheaper then we could instead try to flush out (and force re-parsing of) objects which are rarely re-used. One special case of this is getting rid of non-inlineable function bodies — when we have incremental code-generation, we’ll never compile a function like that more than once anyway.

Another scalability question is how to exploit multiple processors, either multi-core machines, or compile farms. In an earlier post, I discussed making the compile server multi-threaded. However, that interacts poorly with our code generation approach (fork and do the work in the child), so I am probably not going to pursue it. Instead, for the multi-core case, it looks straightforward to simply run multiple servers — in other words, you would just invoke “gcc --server -j5“. Something similar can be done for compile farms.

An ideal result for this project would be for small changes to result in compilation times beneath your perceptual threshold. I doubt that is likely to happen, but the point is, the absolute turnaround time is important. (This is not really a question of scalability, but I felt like talking about it anyway.)

In the current code, though, we always run the preprocessor for any change. So, even once incremental code generation is implemented, the turnaround time will be bound by the time it takes to preprocess the source. This might turn out to be a problem.

In an earlier design (and in some other designs I have heard of), this is handled by making a model of compilation that includes preprocessing. That seems too complicated to me, though, and instead I think that it should be possible to also make an incremental preprocessor (say, one that uses inotify to decide what work must be re-done), and then use it without excessive cooperation from the parser.

Python and Gdb

Recently I’ve been hacking on integrating Python scripting support into gdb. For years now I’ve been wanting better scripting in gdb, but until I saw Volodya’s patch I never did anything about it. So, the other night I made a git repository (thanks gitorious!) and started hacking away. Thiago Bauermann did some nice updates on Volodya’s value-inspecting code, too.

A decent number of things are working. See the wiki page for details on cloning the repository.

Since I basically live in Emacs nowadays, I wanted to install the Python documentation in info form. Am I the only person who still loves info? It isn’t beautiful, to be sure, but it is amazingly convenient inside Emacs — simple to navigate, call up, and dismiss; with info-lookup it can function as a low-rent context-sensitive help; no messy fussing with the mouse.

Anyway, I couldn’t find this readily available anywhere, so in the end I checked out python myself and built the docs. That was sort of a pain… I’m half considering making an ELPA package out of the info pages. Come to think of it there are probably a number of potential info-only packages out there.

Tools Happenings

There are some fun things going on in the tools arena right now.

Do you read Taras Glek’s blog? He’s working on GCC Dehydra, which lets you write GCC passes in javascript. I think his work is one of the most interesting developments in the GCC world today.

There are a few similar projects in the works right now. The plugin branch lets you write GCC plugins; the authors of this branch have a Python plugin, but as far as I’m aware this is not publicly available.

On a slightly more idiosyncratic note, Basile Starynkevitch made a branch for his MELT project. This is also a plugin system, but it uses a lisp dialect for which he’s written his own lisp-to-C translator. I’m kind of partial to this idea — I think it would be fun to write parts of GCC in lisp, at least if it compiled down to something reasonable.

I’m quite interested in pushing GCC more into source analysis and refactoring. Right now the front ends have some problems that make this difficult, but I think these are surmountable without too much trouble. Maybe when I finish this pesky incremental compiler…

With all this going on I wonder why GCC-XML is not in the tree, at least on a branch.

Vladimir Prus recently made available his prototype which integrates Python into gdb. This is promising work — we’ve needed something like this for years. Maybe we can finally print complex data structures in sensible ways.

Finally, Benjamin Kosnik has checked in a restructuring of the libstdc++ documentation. I browsed the new stuff a bit and I found it much simpler to navigate. I’m very happy about this; good, clear documentation is critical to the success of free software.

Using Quagmire

Over the last two weeks I’ve spent some time hacking on Quagmire. I’ve tried to add the features I think are most commonly needed, and I think now it is ready for early adopters to come on board. It isn’t at feature parity with Automake, but it does implement a large subset of Automake’s functionality:

  • Build and install C/C++ programs and static libraries, with Automake-style dependency tracking.
  • Initial (aka, gcc-only) shared library support. (This is still fairly lame… “libtool in two lines of code”. My current plan here is to do something much more minimal than what libtool provides.)
  • Automatic support for the standard clean targets.
  • Install and uninstall, including DESTDIR.
  • dist and distcheck support (a bit incomplete; and FWIW this is just ripped right out of Automake).
  • Support for some minor (but standard) targets: TAGS, texinfo stuff, rebuilding configure-generated files.

It also has some features that Automake does not. One long-requested Automake feature is to be able to turn off its verbose output; Quagmire does this dynamically depending on whether -s was passed to make.

Quagmire also has some initial support for build-time function and header checks, and pkg-config support. This is not fully baked, but fun to play with. One way this is nicer than using configure is that if you add a new function to the list, the next invocation of make will run that test only.

There’s some example code in the repository that shows how to use most of the features.

Currently Quagmire does some things differently from Automake. For instance, it does not use directory prefixes for things like PROGRAMS. However, I recently figured out that it could (and in some situations, like _DATA, this is really what you want), so I’m wondering whether I should change all this to follow Automake more closely. Let me know what you think.

Also I’ve been wondering about whether it would be appropriate to post an announcement of Quagmire on the Automake mailing list.

Project Quagmire

This weekend I finally registered project quagmire with Google’s code hosting service. I’ve checked in all my initial .mk files, plus a little bit of documentation.

I’m a little nervous about posting this… Quagmire is still very immature, and I don’t want expectations set too high. There is still a lot to do! But, the good news is, you can easily help set the direction.

I suppose I will set up a Google group for Quagmire eventually. What do you think? Meanwhile, edit the Quagmire wiki with your questions, comments, and ideas.

Emacs Tip: gtk-look.el

One thing I like about Emacs is that when I ask for information, the default case is that it is displayed in a section (which Emacs calls a “window”) of the window (which Emacs calls a “frame”) that I’m already focused on. What’s nice about this is that if I ask for, say, help on a symbol in C mode, the information will show up, but I don’t have to look away, or worry about a new window stealing the focus, or move the mouse around. Also, I can easily dismiss the display without moving my hands from the keyboard.

Do you know about help on symbol in C mode? I didn’t know about this until last week. If you are in a C mode buffer, you can type C-h S, aka info-lookup-symbol, to look up help for a given C symbol. This works by digging through the libc info pages… this isn’t perfect, but it is pretty handy. (And, yes, I’m one of the few people who really loves info and uses it every day. I wish all the documentation on my system were available in this format — plus one or two others, yet to be invented, which would allow better IDE-ish features in Emacs.)

On a related note, a couple days Kevin Ryde sent me the latest version of gtk-look.el for uploading into ELPA. I’ve been meaning to try this for a while, so I finally installed it. gtk-lookup-symbol is similar to info-lookup-symbol, except it uses the devhelp documentation libraries as its source. Great stuff!

Naturally, I could just use devhelp, which is a very nice tool. Unfortunately, it is a very nice tool with one major drawback — it makes a new window, with all the attendant problems. And, by default, gtk-look suffers from this problem as well; the docs show up in firefox.

Luckily, Emacs is absurdly configurable, and this problem is solved by installing the w3m-el package (on Fedora) and configuring browse-url-browser-function to display devhelp files in w3m:



(setq browse-url-browser-function

    '(("file:.*/usr/local/share/gtk-doc/html" . w3m-browse-url)

     ("file:.*/usr/share/gtk-doc/html" . w3m-browse-url)

     ("." . browse-url-firefox)))

With this I can easily view the help for gtk, glib, etc, functions in the Emacs-y way that I prefer. If this sounds good to you, you’re only seconds away from installing it via ELPA.

Now what I need is c-lookup-symbol-dwim, which searches all the databases. This shouldn’t be too hard…

Hunk Boundaries

How do we decide the boundaries of reuse in the incremental compiler?

The one crucial thing when looking at reuse is that the signature that you choose must be stable across compilations. And, it is better if the boundaries are cheap to compute and roughly line up with declaration boundaries. In the prototype implementation, I accomplished these by using file change events (conveniently noted by libcpp) as hunk boundaries. File change boundaries are roughly stable (moreso for system headers), are free (libcpp already computes them), and very frequently align with declaration boundaries. This last criterion is typically only broken by constructs where a file is included in the middle of a declaration — for instance, in gcc itself by including a .def file in the middle of an enum, to populate it. I added a special case to the reuse code to account for these situations.

Aside from the convenience — which is very important for prototypes, but much less so for production — there was a historical reason that I chose file boundaries. When I first started thinking about writing an incremental compiler, I focused primarily on header files, and on having a single model which would encompass both the preprocessor and the compiler. After working through a number of cases, with the help of the prototype, I decided that this was “java thinking”; in C the primary unit of reuse is the declaration — not the file. And, it is much, much simpler to reason about C if you look at it after preprocessing.

In earlier posts I described the dependency computation that is needed for sharing declarations. One consequence of this is that the bigger the unit of reuse, the less often it will actually be reusable. At one extreme is gcc’s current PCH implementation, which requires a single header file to be included first in all compilations. With the file-boundary scheme, it turns out that even relatively ordinary programs fail to reuse in many situations. For instance, some headers in gcc have conditional declarations — so, change the order of includes, and an entire header’s worth of declarations must be re-parsed.

Note that failure to reuse is not a bug. Reuse is, after all, a performance and memory optimization. The worst consequence is that gcc will run as slowly, and will use as much memory, as it does now. Still, reuse is desirable. The more we reuse, and the smaller our units of reuse, the less work we have to do when recompiling.

So, this leads to the second implementation, which tries to guess declaration boundaries without parsing. This fast, if somewhat evil-looking, and it yields good results. This still isn’t perfect, though (it does not handle K&R-style definitions properly), so I will probably rewrite it again to do something closer to a quick parse.

I’m not sure whether this same approach makes sense for C++. I assume I will start here, because I understand it and know that it will work at least minimally. But, C++ brings its own problems. For instance, suppose you have an inline function defined in a class definition. In this case, if you change the body of the function, you’d like to be able to reuse the class definition, if possible — otherwise, many kinds of small changes will yield costly compilations. Tom Fitzsimmons suggested that perhaps the hunk computation can be made recursive in some way, meaning that perhaps it could operate inside of classes and namespaces, and not just at the top level. I’ll definitely be researching this once I start on the C++ front end.

The split between front and back end incremental processing means that each front end is free to implement its own approach to incrementality. I’m only looking at C and C++, but other people are free to hack on the other front ends. The actual changes to the middle-end interface are quite small, as it turns out.

What next? I think future posts will cover scalability, multi-processor support, and perhaps what to do about the C preprocessor. If there’s anything you want to know more about, post a comment and I will address it.