A while ago, I wrote about my work to speed up GDB’s DWARF reader. I thought I’d write again with a few updates.
Sharding
Back then, I wrote: “maybe GDB could trade memory for performance and shard the resulting index and do separate canonicalizations in each worker thread”.
I did end up doing this. Recall that the canonicalization step goes through all the discovered DWARF entries of interest and ensures the names are in a normal form. These entries are basically all the objects in the program that both have a name and are not in a function scope (except in some languages; there is always an exception with DWARF). For Ada, this step includes synthesizing the package hierarchy (something that should probably be done for Go as well, except nobody really works on the Go support in GDB).
As an aside, sometime in the last few years we realized that this canonicalization has to be done for C as well, because in C there are multiple spellings of types like “short”. This is also implemented.
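To make the “multiple spellings” point concrete: in C, “short”, “short int”, “signed short” and “int short signed” all name the same type, so a lookup by name only works if every spelling is reduced to one form. Here is a minimal sketch of that idea in Python. It is not GDB’s code; the function name and the choice of canonical form (the shortest conventional spelling) are assumptions made for illustration.

```python
# Hypothetical sketch: reduce the many spellings of a C integer type to one form.
# The canonical form chosen here is an assumption, not necessarily what GDB emits.

INTEGER_KEYWORDS = {"signed", "unsigned", "short", "long", "int"}


def canonicalize_c_integer(spelling: str) -> str:
    """Return one canonical spelling for a C integer type name.

    Names that are not made up purely of integer keywords (including anything
    with 'char', where plain, signed and unsigned are distinct types) are
    returned with whitespace normalized but otherwise untouched.
    """
    words = spelling.split()
    if not words or any(w not in INTEGER_KEYWORDS for w in words):
        return " ".join(words)

    # Keyword order does not matter in C, so only presence and count are used.
    if "short" in words:
        base = "short"
    elif words.count("long") >= 2:
        base = "long long"
    elif "long" in words:
        base = "long"
    else:
        base = "int"

    return "unsigned " + base if "unsigned" in words else base
```

With this, `canonicalize_c_integer("signed short int")` and `canonicalize_c_integer("int short")` both produce `"short"`, so either spelling finds the same index entry.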
Because GDB already reads DWARF CUs in chunks in separate threads, the sharding idea is that we can speed up canonicalization a bit by having each worker thread canonicalize its own chunk of results. Previously, GDB combined all the results before canonicalizing them. Sharding makes lookups a little more complicated, since a lookup may now have to consult each shard; but it turns out not to be too hard, because the number of shards is typically low (for reasons I haven’t yet investigated, the reader doesn’t scale past 8 threads or so).
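The shape of a sharded index can be sketched in a few lines of Python. This is only an illustration of the structure, not GDB’s implementation: the function names are made up, whitespace normalization stands in for real canonicalization, and Python threads will not show the speedup that native threads give.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shard(chunk):
    """Canonicalize one chunk of (raw_name, entry) pairs into a private index.

    Whitespace normalization is a stand-in for the real canonicalization step.
    """
    shard = {}
    for raw_name, entry in chunk:
        canonical = " ".join(raw_name.split())
        shard.setdefault(canonical, []).append(entry)
    return shard


def build_sharded_index(chunks, max_workers=8):
    """Build one shard per chunk, each in a worker thread; shards are never merged."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input chunks.
        return list(pool.map(build_shard, chunks))


def lookup(shards, name):
    """Look a canonical name up by asking every shard in turn."""
    matches = []
    for shard in shards:
        matches.extend(shard.get(name, ()))
    return matches
```

The cost of skipping the merge is the loop in `lookup`: each query does one probe per shard. With only a handful of shards, that extra work is small compared to canonicalizing everything on a single thread.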
Background Reading
The other major change I made is to do all the DWARF reading in the background. This is a trick to make gdb feel faster to users. The basic idea here is that in many cases, gdb does not immediately need the DWARF from the various files. So, if we push the reading into worker threads, maybe it will be completely read in by the time gdb does need it.
This also somewhat benefits the situation where several shared libraries are loaded at once into the inferior. In this case, gdb already defers breakpoint re-setting until all the DWARF has been read — and with this change, all that work will be done in parallel.
Making this work wasn’t entirely straightforward. The main issue here is that gdb determines the initial language and location for “list” (et al) based on the debug info. The patches arrange to set these things lazily as well. I also had to add some rudimentary thread-safety to BFD.
Now, this can be defeated in a few ways. If you have a .gdbinit that sets a breakpoint, then that will cause the familiar pause, because setting a breakpoint will wait for the workers to complete. Or, if you debug a large executable and type very quickly, you may have to wait for the parsing to finish.
However, when it does work, it feels like gdb starts instantly.
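The background-reading idea described above boils down to “start the work right away, block only when someone asks for the result”. Here is a rough Python sketch of that pattern. The class and function names are hypothetical, and the sleeping reader is a placeholder for real DWARF parsing; GDB’s actual code is C++ and considerably more involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class LazyDebugInfo:
    """Starts reading debug info in a worker thread and waits only on demand."""

    def __init__(self, path, reader, pool):
        self.path = path
        # The read begins immediately, but the constructor does not wait for it.
        self._future = pool.submit(reader, path)

    def ready(self):
        """True once the background read has finished."""
        return self._future.done()

    def get(self):
        """Return the parsed data, blocking only if the reader is still running."""
        return self._future.result()


def slow_reader(path):
    """Placeholder for an expensive DWARF read (hypothetical)."""
    time.sleep(0.05)
    return {"path": path, "symbols": ["main"]}
```

Anything that needs the debug info, such as setting a breakpoint, corresponds to a call to `get()`. If that happens early (say, from a .gdbinit), the caller waits for the reader just as before; if it happens later, the data is already there and the call returns at once.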
One Comment
What does ‘because in C there are multiple spellings of types like “short”’ mean? I’m picturing ‘shawt’ but I assume that it’s not anything like that, that I’m just misinterpreting a technical term.