Unskip reduce recursion

I have been able to simulate an EAGLE_12 box for 1000 steps.

added 1 commit

6ee29ee6 - Cleanup

unmarked as a Work In Progress

changed the description

assigned to @matthieu and unassigned @lhausammann

I am running the PMill-768 and see that it's significantly faster. I'll wait for the job to complete to see whether anything strange happened.

Nice! Do you have a rough estimate of the speedup?

A couple of percent overall, as the unskip is sub-dominant in these runs.

But to answer your question, engine_marktasks on that run is 5x faster. That's at high-z in a full DMO setup on one node. We'll see how much it improves unskip when we start using that (at high z we rebuild every step).

So far it's a gain of 7% of the total runtime. That is only down to z=35, so there is still some universe evolution ahead.

Down to z=6:

master:

Time spent in the different code sections:
 - 'Engine Launch (Tasks)                   ' (  433 calls, time: 78474.3631s): 82.8392%
 - 'Engine Marktasks                        ' (  258 calls, time: 6601.5886s): 6.9688%
 - 'Engine Unskip                           ' (  176 calls, time: 3133.6703s): 3.3080%
 - 'Space Rebuild                           ' (  258 calls, time: 2667.2456s): 2.8156%
 - 'Gpart Mesh Forces                       ' (  257 calls, time: 1189.0423s): 1.2552%

branch:

Time spent in the different code sections:
 - 'Engine Launch (Tasks)                   ' (  438 calls, time: 78835.1453s): 90.0962%
 - 'Space Rebuild                           ' (  260 calls, time: 2679.0522s): 3.0617%
 - 'Engine Marktasks                        ' (  260 calls, time: 1397.1246s): 1.5967%
 - 'Gpart Mesh Forces                       ' (  258 calls, time: 1193.9383s): 1.3645%
 - 'Engine Unskip                           ' (  179 calls, time: 905.4478s): 1.0348%

So:

marktasks: 25.6s/call --> 5.4s/call
unskip: 17.8s/call --> 5.1s/call
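
The per-call figures come from dividing each timer's total time by its number of calls. A minimal sketch of that arithmetic, using hypothetical helper names that are not part of SWIFT:

```c
#include <assert.h>

/* Average wall-clock time per call for one timer entry
 * (hypothetical helper, for illustration only). */
static double seconds_per_call(double total_seconds, int calls) {
  assert(calls > 0);
  return total_seconds / calls;
}

/* Ratio of the per-call cost on master to the per-call cost on the branch. */
static double speedup_per_call(double master_seconds, int master_calls,
                               double branch_seconds, int branch_calls) {
  return seconds_per_call(master_seconds, master_calls) /
         seconds_per_call(branch_seconds, branch_calls);
}
```

With the numbers above, `seconds_per_call(6601.5886, 258)` is about 25.6 s for marktasks on master, and `speedup_per_call(6601.5886, 258, 1397.1246, 260)` is about 4.8. The same calculation for unskip gives a per-call speedup of about 3.5.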

Good stuff!

added 21 commits

6ee29ee6...c96c2585 - 18 commits from branch master
d14f2d80 - First implementation
bea56270 - Unskip is working
b755fa82 - Cleanup

added 1 commit

b9d341fa - Improve the code comments and simplify a bit the operations in cell_activate_subcell_grav_tasks()

@lhausammann I have made some small changes and added a few comments to explain the logic of what is going on. I might do some more small changes in naming things when time permits.

added 1 commit

1777cfc7 - Change the name of the new flags to reflect a bit better what they do.

Some more style changes.

@lhausammann Two remaining questions from me:

Should we clear the flags in the drift or in the tree walk in the runner? The runner seems more natural to me somehow as it should mimic the unskipping walk.
Should we add a debugging check at the end of the time-step to verify that all the flags have indeed been processed? Would make me more confident that we did not miss something and then forget to unskip something in the next step.
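
To make the second question concrete, here is a rough sketch of what such an end-of-step check could look like. Everything in it is an assumption made for illustration: `struct toy_cell`, `TOY_FLAG_UNSKIP_PENDING` and both functions are hypothetical stand-ins, not the actual SWIFT cell structure or flag names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit, standing in for the new unskip flags. */
#define TOY_FLAG_UNSKIP_PENDING (1u << 0)

/* Hypothetical, stripped-down cell: NOT the real SWIFT struct cell. */
struct toy_cell {
  unsigned int flags;
  struct toy_cell *progeny[8]; /* NULL where there is no child. */
};

/* Recursively count the cells that still carry the pending flag. */
static int toy_count_pending(const struct toy_cell *c) {
  if (c == NULL) return 0;
  int count = (c->flags & TOY_FLAG_UNSKIP_PENDING) ? 1 : 0;
  for (int k = 0; k < 8; k++) count += toy_count_pending(c->progeny[k]);
  return count;
}

/* End-of-step check: abort if any tree below a top-level cell still has
 * an unprocessed flag. */
static void toy_check_flags_cleared(const struct toy_cell *top_cells,
                                    size_t nr_cells) {
  for (size_t i = 0; i < nr_cells; i++)
    assert(toy_count_pending(&top_cells[i]) == 0);
}
```

A check of this kind walks every cell, so it would presumably only be compiled in for debugging runs.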

I am happy with both your suggestions.

Clearing the flags: you mean in runner_main?
Debugging check: you mean looping over all the cells?

Clearing the flags: in the different calls of runner_doiact_grav.c, specifically the tree-walking calls.
Debugging check: yes.

Also, my MPI + debug check run eventually reached a state where not all particles are active and it died with:

[0000] [133352.3] runner_doiact_grav.c:runner_dopair_recursive_grav():2141: cj->grav.multipole not drifted.

I do not have more information yet about the exact cell configuration that led to this.

A non-MPI version has gone much further already, so I wonder whether there is something tricky with foreign cells/multipoles.

Let me confirm this, but one possibility is that since we do not drift foreign parts, we do not clear the flag on foreign cells. Then, on the next step, we abort the unskip early in that cell because the flag is still set.
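
The suspected failure mode can be reproduced in a toy model. This is only an illustration of the hypothesis, not the real code path: `struct toy_mp_cell`, its fields, and the `toy_*` functions are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal cell, NOT the real SWIFT struct cell. */
struct toy_mp_cell {
  bool is_foreign;     /* Cell owned by another MPI rank. */
  bool flag_unskipped; /* Set by the unskip, meant to be cleared in the drift. */
  bool task_active;    /* Whether the cell's task was activated this step. */
};

/* Unskip: return early if the flag says the cell was already handled.
 * Returns true if the task was actually activated. */
static bool toy_unskip(struct toy_mp_cell *c) {
  assert(c != NULL);
  if (c->flag_unskipped) return false; /* Early abort. */
  c->flag_unskipped = true;
  c->task_active = true;
  return true;
}

/* Drift: clears the flag, but foreign cells are never drifted. */
static void toy_drift(struct toy_mp_cell *c) {
  if (c->is_foreign) return;
  c->flag_unskipped = false;
}

/* One full step; returns whether the unskip activated the task. */
static bool toy_run_step(struct toy_mp_cell *c) {
  const bool activated = toy_unskip(c);
  toy_drift(c);
  c->task_active = false; /* Tasks go back to skipped at the end of the step. */
  return activated;
}
```

In this model a local cell is activated on every step, while a foreign cell is activated on the first step only: its flag survives the drift, so the second call to `toy_unskip()` returns early.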
