This past week I spent playing around with making the C front end multi-threaded. What this means in particular is that the compile server makes a new thread for each incoming compilation request; the threads are largely independent but do share some results of parsing.
Parts of this work are, I think, reasonably clean. Making gengtype and the GC understand threads was pretty simple. The code to lock the global hunk map (this is where parsed bits of files are registered) was easy.
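To illustrate the kind of locking involved, here is a minimal sketch of a mutex-guarded registration table. It is not the actual patch: the names hunk_entry, hunk_map_register, MAX_HUNKS and the fixed-size array are all hypothetical, and the real hunk map in the compile server is certainly shaped differently. The sketch only shows the idea that every lookup and insertion happens under one lock, so two compilation threads cannot register the same parsed hunk twice.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_HUNKS 64   /* hypothetical capacity, for illustration only */

/* Hypothetical entry: a key naming a parsed piece of a file, plus an id
   standing in for the shared parse result. */
struct hunk_entry {
    const char *key;   /* caller must keep this string alive */
    int parsed_id;
};

static struct hunk_entry hunk_map[MAX_HUNKS];
static int hunk_count;
static pthread_mutex_t hunk_map_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register KEY with PARSED_ID unless KEY is already present.
   Returns the id stored for KEY, or -1 if the table is full.
   The whole lookup-then-insert sequence runs under the lock. */
int hunk_map_register(const char *key, int parsed_id)
{
    int result = -1;

    pthread_mutex_lock(&hunk_map_lock);
    for (int i = 0; i < hunk_count; i++) {
        if (strcmp(hunk_map[i].key, key) == 0) {
            result = hunk_map[i].parsed_id;   /* reuse the existing entry */
            goto out;
        }
    }
    if (hunk_count < MAX_HUNKS) {
        hunk_map[hunk_count].key = key;
        hunk_map[hunk_count].parsed_id = parsed_id;
        hunk_count++;
        result = parsed_id;
    }
out:
    pthread_mutex_unlock(&hunk_map_lock);
    return result;
}
```

A thread that loses the race simply gets back the id registered by the winner, which is the sharing behavior one would want for parse results.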
What is ugly is dealing with GCC’s many global variables. Cleaning this up is a multi-person, multi-year project, so I took the easy route and marked many variables (831 at last count) with __thread.
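For readers unfamiliar with __thread, the following is a small sketch of what the qualifier does. The variable error_count and the functions compile_job and run_thread_local_demo are hypothetical stand-ins, not names from the patch. Each thread gets its own copy of a __thread variable, so code that used to treat it as a plain global keeps working without seeing another thread's updates.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one of the compiler's per-compilation globals.
   __thread (a GCC extension) gives every thread a private copy. */
static __thread int error_count = 0;

/* Simulated compilation job: bump this thread's own error_count N times,
   then report the thread-private total back through ARG. */
static void *compile_job(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        error_count++;
    *(int *)arg = error_count;
    return NULL;
}

/* Run two jobs concurrently. Returns 0 if each job saw only its own
   increments and the calling thread's copy stayed untouched,
   1 if the counts leaked across threads, -1 if a thread failed to start. */
int run_thread_local_demo(void)
{
    pthread_t a, b;
    int na = 3, nb = 5;

    if (pthread_create(&a, NULL, compile_job, &na) != 0)
        return -1;
    if (pthread_create(&b, NULL, compile_job, &nb) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    return (na == 3 && nb == 5 && error_count == 0) ? 0 : 1;
}
```

The catch is that this only isolates the variable itself. Anything reachable through a pointer that several threads share still needs locking, which is one plausible place for races to hide.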
This works semi-well. There are some bugs, especially with higher -j levels, which means data races somewhere. Since this is not my most pressing project, I’m going to put this patch off to the side for a while.
Also I’ve been wondering whether the other GCC maintainers will really want to see this. For the most part the damage is isolated — the GC, the server, and the pieces of code that explicitly want to share things (for now this just means the C front end) need to know about threads. However, the thread-local variables are pretty ugly… if you’re a GCC maintainer, let me know what you think.
Meanwhile it is back to the grindstone of debugging C front end regressions that I’ve introduced. Also I’m looking at changing the debug output module so that I can defer writing any assembly output until fairly late in compilation. This is one step toward the incremental code generation plan.
3 Comments
I am not a GCC maintainer. But would a list of the “thread-local variables” be a nice idea for people wanting some easy task for getting into GCC? The Linux kernel seems to have a pretty nice newbies/janitors following where people take some easy task and clean something up. Picking a couple of these variables off a list and trying to figure out which data structure they really belong to and/or which functions they should be passed to might be an interesting idea for a GCC newbie/janitor.
Yeah, that would be a good idea. Kaveh does cleanups sometimes, maybe he would pick this up too 🙂
[…] question is how to exploit multiple processors, either multi-core machines, or compile farms. In an earlier post, I discussed making the compile server multi-threaded. However, that interacts poorly with our code […]