Time-step limiter and time-step synchronization over MPI
Hopefully final batch of changes for the physics required by EAGLE. The main change is to allow users to run with the time-step limiter and synchronization over MPI.
This is done by running the limiter and synchronization tasks as part of the regular task graph. We then run all the hydro and gravity time-step communication tasks to make sure that every cell whose time-step may have changed while it was inactive has communicated that change to all the relevant parties. (I hope to optimize this at a later stage and reduce the number of comms required.)
Other changes include:
- Fix a few of the debugging checks that were incorrect when running over MPI
- Fix the start of time-step of the cells in which a particle is limited (was wrong also in non-MPI)
- Fix the time-bin of a synch'd particle to be not larger than the current max (was wrong also in non-MPI)
- Deactivate the (small) optimization in DOPAIR2() where we saved a bit of memory when all the particles were active since we do not explicitly track that information any more.
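
The two time-step fixes in the list above can be illustrated with a minimal sketch. This is not the code from this MR: the type names, the helper names (step_length, clamp_sync_bin, step_begin) and the power-of-two bin convention are all assumptions made for illustration. It only shows the idea of capping the time-bin of a synchronized particle at the current maximum and of recomputing the start of a step from the current integer time.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed types: a small signed bin index and a 64-bit integer timeline. */
typedef int8_t timebin_t;
typedef int64_t integertime_t;

/* Length of a step in bin `bin` on the integer timeline.
 * Assumed convention: each bin doubles the length, bin <= 0 means no step. */
static integertime_t step_length(timebin_t bin) {
  return (bin <= 0) ? 0 : ((integertime_t)1 << (bin + 1));
}

/* Cap the bin requested for a synchronized particle so that it is never
 * larger than the largest bin currently in use (hypothetical helper). */
static timebin_t clamp_sync_bin(timebin_t requested, timebin_t max_active_bin) {
  assert(max_active_bin > 0);
  return (requested > max_active_bin) ? max_active_bin : requested;
}

/* Start of the step (ti_begin, ti_begin + dti] that contains ti_current,
 * where dti = step_length(bin). Returns 0 when there is no step. */
static integertime_t step_begin(integertime_t ti_current, timebin_t bin) {
  const integertime_t dti = step_length(bin);
  if (dti == 0) return 0;
  return dti * ((ti_current - 1) / dti);
}
```

Under these assumptions, a particle that requests bin 10 while the largest active bin is 6 ends up in bin 6, and a cell holding a limited particle would recompute the start of its step with step_begin() using the particle's new, smaller bin.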
added 5 commits
- e73795f7...9228135f - 4 commits from branch master
- 5ebce90a - Merge branch 'master' into limiter_mpi
added 1 commit
- 65d9b04d - Fix mistake in the last correction of the debugging checks
added 1 commit
- 7ddd4205 - Distinguish the engine_launch for the FOF tasks and the FOF communications for the timing analysis
Yes, the goal is to improve things in a future iteration.
But basically:
- The check in the recv() is not appropriate any more since we now also send particles after their new time-step has been calculated (for the limiter comms). I could reinstate the check for the other comms if you prefer.
- The check in timestep_limiter_end_force() was wrong and already creates problems for other people (it has little to do with this MR apart from the fact that it also triggered for me).
- The ti_hydro_end_max of cells is not gathered correctly any more with this change. However, that quantity is only used in one place, for a memory optimization in DOPAIR2. I am still thinking about whether I want to restore that (and calculate the end_max correctly) or whether we should drop it altogether to save memory and data to communicate.
Edited by Matthieu Schaller
mentioned in commit ab352b59