Log rank CPU times and memory use to assess the balance.

Peter W. Draper requested to merge repartition-cputime-update into master

Writes the user and system CPU times per rank to a file, rank_cpu_balance.log, together with the current load balance estimate. Also records the resident set size of the process on each rank in a separate file, rank_memory_balance.log. The resident size is essentially the core memory use, but note that it is only sampled when considering whether to repartition, as are the CPU times, so it is indicative rather than a measure of peak use.
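For readers unfamiliar with these quantities, the following is a minimal sketch of how a single process could sample its user and system CPU times and its current resident set size. It is not the code in this merge request: the struct `rank_usage` and the functions `current_resident_kb()` and `sample_rank_usage()` are hypothetical names, and reading `/proc/self/statm` is assumed to be acceptable, which limits the memory part to Linux.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical container for one rank's sample; not a type from the MR. */
struct rank_usage {
  double user_cpu_s;   /* User CPU time in seconds. */
  double system_cpu_s; /* System CPU time in seconds. */
  long resident_kb;    /* Current resident set size in KiB, -1 if unknown. */
};

/* Convert a struct timeval to seconds. */
static double timeval_to_seconds(struct timeval tv) {
  return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
}

/* Current (not peak) resident set size in KiB, taken from the second field
 * of /proc/self/statm. Assumes Linux; returns -1 on any failure. */
static long current_resident_kb(void) {
  FILE *f = fopen("/proc/self/statm", "r");
  if (f == NULL) return -1;
  long size_pages = 0, resident_pages = 0;
  const int nread = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
  fclose(f);
  if (nread != 2) return -1;
  return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/* Fill *out with the CPU times and resident size of the calling process.
 * Returns 0 on success, -1 if getrusage() fails. */
static int sample_rank_usage(struct rank_usage *out) {
  struct rusage ru;
  if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
  out->user_cpu_s = timeval_to_seconds(ru.ru_utime);
  out->system_cpu_s = timeval_to_seconds(ru.ru_stime);
  out->resident_kb = current_resident_kb();
  return 0;
}
```

Note that `getrusage()` also reports `ru_maxrss`, but that is a peak value; the sketch reads the current resident size instead, since the description above refers to the value at the time repartitioning is considered.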

These should be useful when trying to understand how the overall balance per node is working.

As part of this, the CPU times used to estimate the balance are changed to only include time used by the tasks in engine_launch(); previously all the time used in the step was considered. Given that the steps considered are those that interact all particles, this is not likely to make much difference, but it clarifies what we are using.
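The idea of counting only the CPU time spent inside the task launch can be sketched as a difference of two samples taken around the call. This is an illustration under assumptions, not the change made here: `process_cpu_seconds()`, `cpu_seconds_in_launch()` and `busy_work()` are hypothetical names, and `busy_work()` merely stands in for the real call to engine_launch().

```c
#include <sys/resource.h>
#include <sys/time.h>

/* User plus system CPU time used so far by this process, in seconds.
 * Returns -1.0 if getrusage() fails. */
static double process_cpu_seconds(void) {
  struct rusage ru;
  if (getrusage(RUSAGE_SELF, &ru) != 0) return -1.0;
  return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec * 1.0e-6 +
         (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec * 1.0e-6;
}

/* Run launch(arg) and return only the CPU time consumed while it ran,
 * so that work done elsewhere in the step is excluded.
 * Returns -1.0 if either sample could not be taken. */
static double cpu_seconds_in_launch(void (*launch)(void *), void *arg) {
  const double before = process_cpu_seconds();
  launch(arg);
  const double after = process_cpu_seconds();
  if (before < 0.0 || after < 0.0) return -1.0;
  return after - before;
}

/* Hypothetical stand-in for the task launch: burns some CPU time. */
static void busy_work(void *arg) {
  volatile double *sum = (volatile double *)arg;
  for (long i = 0; i < 5000000; i++) *sum += (double)i * 0.5;
}
```

A caller would accumulate the returned value over the relevant steps, for example `total += cpu_seconds_in_launch(busy_work, &sum);`, and use that total rather than the CPU time of the whole step when estimating the balance.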

Edited by Peter W. Draper