Unskip reduce recursion
Activity
added performance label
assigned to @matthieu and unassigned @lhausammann
But to answer your question, engine_marktasks on that run is 5x faster. That's at high-z in a full DMO setup on one node. We'll see how much it improves unskip when we start using that (at high z we rebuild every step).
So far it's a gain of 7% of the total runtime. That run has only reached z=35, so there is still some universe evolution ahead.
Down to z=6:
master:
Time spent in the different code sections:
- 'Engine Launch (Tasks)' (433 calls, time: 78474.3631s): 82.8392%
- 'Engine Marktasks' (258 calls, time: 6601.5886s): 6.9688%
- 'Engine Unskip' (176 calls, time: 3133.6703s): 3.3080%
- 'Space Rebuild' (258 calls, time: 2667.2456s): 2.8156%
- 'Gpart Mesh Forces' (257 calls, time: 1189.0423s): 1.2552%
branch:
Time spent in the different code sections:
- 'Engine Launch (Tasks)' (438 calls, time: 78835.1453s): 90.0962%
- 'Space Rebuild' (260 calls, time: 2679.0522s): 3.0617%
- 'Engine Marktasks' (260 calls, time: 1397.1246s): 1.5967%
- 'Gpart Mesh Forces' (258 calls, time: 1193.9383s): 1.3645%
- 'Engine Unskip' (179 calls, time: 905.4478s): 1.0348%
So:
- marktasks: 25.6s/call --> 5.4s/call
- unskip: 17.8s/call --> 5.1s/call
Good stuff!
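For reference, the per-call figures follow from dividing each section's total time by its number of calls. A minimal sketch of that arithmetic in C; the function names are made up for illustration and are not part of the code base:

```c
#include <assert.h>
#include <stdio.h>

/* Average wall-clock time per call of one code section.
 * Hypothetical helper, not a function from the code base. */
double per_call_seconds(double total_seconds, int calls) {
  assert(calls > 0);
  return total_seconds / calls;
}

/* Per-call speed-up of the branch relative to master. */
double per_call_speedup(double master_total, int master_calls,
                        double branch_total, int branch_calls) {
  return per_call_seconds(master_total, master_calls) /
         per_call_seconds(branch_total, branch_calls);
}

/* Print one comparison line in the style used in this thread. */
void print_comparison(const char *name, double master_total, int master_calls,
                      double branch_total, int branch_calls) {
  printf("- %s: %.1fs/call --> %.1fs/call (x%.1f)\n", name,
         per_call_seconds(master_total, master_calls),
         per_call_seconds(branch_total, branch_calls),
         per_call_speedup(master_total, master_calls, branch_total,
                          branch_calls));
}
```

Calling `print_comparison("marktasks", 6601.5886, 258, 1397.1246, 260)` and `print_comparison("unskip", 3133.6703, 176, 905.4478, 179)` with the numbers from the two timing summaries gives per-call speed-ups of roughly 4.8x for marktasks and 3.5x for unskip.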
added 21 commits
- 6ee29ee6...c96c2585 - 18 commits from branch master
- d14f2d80 - First implementation
- bea56270 - Unskip is working
- b755fa82 - Cleanup
added 1 commit
- b9d341fa - Improve the code comments and simplify a bit the operations in cell_activate_subcell_grav_tasks()
@lhausammann I have made some small changes and added a few comments to explain the logic of what is going on. I might do some more small changes in naming things when time permits.
added 1 commit
- 1777cfc7 - Change the name of the new flags to reflect a bit better what they do.
Some more style changes.
@lhausammann Two remaining questions from me:
- Should we clear the flags in the drift or in the tree walk in the runner? The runner seems more natural to me somehow as it should mimic the unskipping walk.
- Should we add a debugging check at the end of the time-step to verify that all the flags have indeed been processed? Would make me more confident that we did not miss something and then forget to unskip something in the next step.
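To make the second question concrete, here is a rough sketch of what such an end-of-step check could look like. It uses a toy cell structure and made-up flag names (`toy_cell`, `toy_flag_unskip_self_grav_done`, `toy_flag_unskip_pair_grav_done`), not the actual cell struct or flags from this branch, and it assumes the flags are supposed to be cleared again by the time the step ends:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the real names in the branch may differ. */
enum toy_cell_flags {
  toy_flag_unskip_self_grav_done = 1u << 0,
  toy_flag_unskip_pair_grav_done = 1u << 1
};

/* Toy stand-in for a tree cell: a flag word and up to 8 children. */
struct toy_cell {
  uint32_t flags;
  struct toy_cell *progeny[8];
};

/* Count the cells in the tree rooted at c that still carry one of
 * the unskip flags, i.e. flags that were set but never cleared. */
int toy_count_uncleared_flags(const struct toy_cell *c) {
  if (c == NULL) return 0;
  const uint32_t mask =
      toy_flag_unskip_self_grav_done | toy_flag_unskip_pair_grav_done;
  int count = (c->flags & mask) ? 1 : 0;
  for (int k = 0; k < 8; k++)
    count += toy_count_uncleared_flags(c->progeny[k]);
  return count;
}

/* Debugging check for the end of a time-step: abort if any cell
 * below the given top-level cell still has a flag set. */
void toy_check_all_flags_cleared(const struct toy_cell *top) {
  assert(toy_count_uncleared_flags(top) == 0);
}
```

In a debugging build, something like this could be called once per top-level cell at the end of the step, so that a missed flag shows up immediately rather than as a forgotten unskip in the next step.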
Also, my MPI + debug check run eventually reached a state where not all particles are active and it died with:
[0000] [133352.3] runner_doiact_grav.c:runner_dopair_recursive_grav():2141: cj->grav.multipole not drifted.
I do not have more information yet about the exact cell configuration that led to this.
A non-MPI run has already gone much further, so I wonder whether there is something tricky with foreign cells/multipoles.