Deadtimes in `runner_do_unskip`
Looking at a recent threadpool task plot for a "small" timestep, we have up to 90% deadtime, i.e. time in which no productive work is being done by the threadpool workers.
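To make the deadtime figure concrete, here is a minimal sketch of how such a fraction could be computed from one worker's logged intervals. The `log_entry` struct and `deadtime_fraction` function are hypothetical names for illustration, not the actual threadpool log format, and the sketch assumes that the entries of a single worker do not overlap.

```c
#include <stddef.h>

/* Hypothetical log record: start and end of one mapper call, in ticks. */
struct log_entry {
  long long tic;
  long long toc;
};

/* Fraction of the window [t_start, t_end] in which a single worker has
 * no logged mapper call. Assumes the entries do not overlap and all lie
 * inside the window. */
double deadtime_fraction(const struct log_entry *log, size_t n,
                         long long t_start, long long t_end) {
  long long busy = 0;
  size_t i;

  /* An empty or inverted window has no meaningful deadtime. */
  if (t_end <= t_start) return 0.0;

  /* Sum up the time spent inside logged mapper calls. */
  for (i = 0; i < n; i++) busy += log[i].toc - log[i].tic;

  return 1.0 - (double)busy / (double)(t_end - t_start);
}
```

For example, a worker that logs two intervals of 5 ticks each inside a window of 100 ticks would come out at 90% deadtime.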
Looking at the code, there is nothing blocking in the main loop, i.e. outside of the `tic` and `threadpool_log` lines surrounding the mapper function call. So where are these large gaps coming from?
There are two hypotheses here:
- The loop does not actually work as I think it does, and one of the operations there is blocking for whatever reason,
- The logging/timing does not actually work as I think it does, and the gaps are actually happening somewhere in the mapper itself.
In order to figure out what is going on, I need profiling, but I only want to profile certain events, i.e. what's going on in `threadpool_chomp` when it is called for `runner_do_unskip`. What I propose doing is linking against `libprofiler.so` and adding calls to `ProfilerStart` and `ProfilerStop` around the corresponding call to `threadpool_map`. This should give me line-by-line profiling information for the precise call stack that I want.
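A rough sketch of the bracketing, assuming the gperftools CPU profiler (`ProfilerStart` and `ProfilerStop` from `<gperftools/profiler.h>`). The `WITH_GPERFTOOLS` macro, the `profiled_call` wrapper, the `map_call_fn` type and the `bump` stand-in are all hypothetical names for illustration; in the real code the wrapped call would be the `threadpool_map` call in question. Without the macro the sketch falls back to no-op stubs so that it still compiles on a machine without the library.

```c
#ifdef WITH_GPERFTOOLS
#include <gperftools/profiler.h> /* link with -lprofiler */
#else
/* Assumption: no-op stand-ins so the sketch builds without gperftools. */
static int ProfilerStart(const char *fname) { (void)fname; return 1; }
static void ProfilerStop(void) {}
#endif

/* Hypothetical type for "the one call we want to profile". */
typedef void (*map_call_fn)(void *arg);

/* Run `call(arg)` with the CPU profiler active only for its duration.
 * Returns non-zero if the profiler could be started. */
int profiled_call(const char *profile_path, map_call_fn call, void *arg) {
  const int started = ProfilerStart(profile_path);

  call(arg); /* stand-in for the threadpool_map call of interest */

  /* Only stop (and flush) the profile if it was actually started. */
  if (started) ProfilerStop();
  return started;
}

/* Trivial stand-in workload used to exercise the wrapper. */
void bump(void *arg) { ++*(int *)arg; }
```

Since the wrapped call is made once per step, it is probably safest to pass a different `profile_path` each time (e.g. including the step number), on the assumption that each `ProfilerStart` begins a fresh output file.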
One problem though: I don't have a 16-core machine to run this on. I can set up a branch that does all this and run it on up to four cores on my laptop, but that's about it. Anybody willing to run this on our dedicated machine and send me the results once it's done?
Cheers, Pedro