Refactoring of the time-step communication tasks
Significant refactoring of the way the time-step sizes are exchanged.
Same as !1455 (closed) but without the last batch of changes.
Summary:
- A new top-level task collects the time-step sizes from the super level to the top-level. This was formerly done by making `engine_collect_end_of_step()` recurse.
- The `timestep`, `timestep_limiter`, and `timestep_sync` tasks all unlock that top-level task.
- `engine_collect_end_of_step()` now only loops (via threadpool) over the local top-level cells. No recursion any more.
- For each pair of top-level cells in the proxies we construct a pair of send/recv comm tasks.
- These comm tasks pack up the dt of the whole hierarchy, send it, and unpack the time-step sizes on the receiving side.
- The top-level time-step collection task unlocks the send.
- The individual per-species `tend` communication tasks that used to live at the super level are removed.
- The second call to `engine_launch()` done every step to deal with the timestep limiter effect is removed (as it is now properly dealt with by the top-level task dependency).
This should help speed up the smallest steps by reducing the level of the plateau we usually see in the "main sequence" plots.
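
To illustrate the pack/send/unpack idea from the summary, here is a minimal sketch of flattening the time-step information of a whole cell hierarchy into one contiguous buffer and restoring it on the other side. It is not the code from this merge request: `struct toy_cell`, `toy_pack_dt()` and `toy_unpack_dt()` are hypothetical names, the cell holds only a single integer end-of-step time, and the sketch assumes the receiving rank already holds a tree with the same shape as the sender's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified cell. NOT the real struct cell. */
struct toy_cell {
  long long ti_end_min;        /* Earliest end of time-step in this cell. */
  struct toy_cell *progeny[8]; /* Children; NULL where no child exists. */
};

/* Depth-first walk writing one entry per cell into buf.
 * Returns the number of entries written. The caller must provide a
 * buffer with room for every cell in the hierarchy. */
static size_t toy_pack_dt(const struct toy_cell *c, long long *buf) {
  assert(c != NULL && buf != NULL);
  size_t count = 0;
  buf[count++] = c->ti_end_min;
  for (int k = 0; k < 8; k++)
    if (c->progeny[k] != NULL)
      count += toy_pack_dt(c->progeny[k], &buf[count]);
  return count;
}

/* Same traversal order as toy_pack_dt(), reading instead of writing.
 * Assumes the local tree mirrors the tree that was packed.
 * Returns the number of entries consumed. */
static size_t toy_unpack_dt(struct toy_cell *c, const long long *buf) {
  assert(c != NULL && buf != NULL);
  size_t count = 0;
  c->ti_end_min = buf[count++];
  for (int k = 0; k < 8; k++)
    if (c->progeny[k] != NULL)
      count += toy_unpack_dt(c->progeny[k], &buf[count]);
  return count;
}
```

In such a scheme the buffer filled by the pack routine would be handed to a single send per top-level cell pair, and the matching receive would call the unpack routine, so one message replaces a set of per-level messages.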