Tag only supercells
Reduce the number of tags needed by tagging only cells that actually have send/recv tasks attached. This is now the bare minimum number of tags needed.
This involves adding an additional communication step to exchange tags when rebuilding the space, but this is relatively small compared to all the other stuff going on, so it shouldn't hurt.
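To illustrate the idea, here is a minimal sketch of handing out tags only to cells that carry send/recv tasks. The `struct cell` below is a hypothetical, stripped-down stand-in rather than SWIFT's actual cell structure, and `assign_tags` is an illustrative name, not a function from the code base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified cell: NOT the real SWIFT struct cell. */
struct cell {
  int has_send_recv; /* Non-zero if send/recv tasks are attached. */
  int tag;           /* Communication tag, or -1 if none was assigned. */
};

/* Assign consecutive tags only to the cells that take part in
 * communication. Returns the number of tags handed out. */
int assign_tags(struct cell *cells, size_t nr_cells) {
  assert(cells != NULL || nr_cells == 0);
  int next_tag = 0;
  for (size_t k = 0; k < nr_cells; k++) {
    if (cells[k].has_send_recv)
      cells[k].tag = next_tag++;
    else
      cells[k].tag = -1; /* No tasks attached, so no tag is consumed. */
  }
  return next_tag;
}
```

Because each rank numbers its own cells this way, the receiving side cannot derive the tags on its own, which is why the extra tag exchange at rebuild time is needed.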
added 321 commits
- d19f9068...c49a9670 - 320 commits from branch master
- aa6b7c9a - Merge remote-tracking branch 'origin/master' into tag_only_supercells
Yes, we'll need to remove the `engine_task_tags` and associated changes.

In the meanwhile, when I run the test:

mpirun -np 4 ../swift_mpi -a -t 4 -s eagle_6.yml

we die in the `proxy_cells_exchange` function when calling `cell_pack_tags`, with a memory overrun.

Looking at the code it seems to me that the `offset` array needs to be extended to `in` and `out` versions. Otherwise these are mixing up. A quick hack doing this is now working.

BTW, see 5e5a260e for the offsets hack.
added 17 commits
- aa6b7c9a...0c583dda - 14 commits from branch master
- c1d900f4 - Need to split up offset array to handle in/out
- 72d06cba - Merges in changes to use one MPI communicator per subtask type.
- f2319741 - No longer need engine_task_tags so remove
That is actually quite related to the problems I am facing with the split gravity tasks over MPI as well as the crashes I get running out of memory with some runs.
The original plan was to split tasks to reduce the amount of data that needs to be shipped over MPI. But at the moment, I don't think that gain can be realized. We create the proxies including the allocation of the foreign data before the splitting. We would likely have to re-order the way we do things to benefit from lower memory usage. But we may also be reconstructing tasks based on old information which then leads to the crashes. In other words I still need to think about this.
> BTW, see 5e5a260e for the offsets hack.
Wow, yeah, that was dumb of me. Thanks for fixing it!
Thinking about the different tag types, do we really even need them, e.g. for sending the `x` and then later the `rho`? Distinct tags are only useful if the sends/recvs ever overlap, but from the dependency graph, at least the hydro-related ones do not. So could we just use the same tag for sending/recving `x`, `rho`, and `ti`?

I did a few runs with and without the communicators fix, just to make sure we weren't suffering any slowdowns. The upshot of that was that I couldn't see any difference at all, so much so that I ran them again to make sure I hadn't made a mistake. That was EAGLE_50 on 8 nodes of COSMA7. Obviously that was just testing runtime per step, not looking at the task timing plots.