Parallel mesh assignment also in single-node case
Change the strategy used when interpolating the gparts onto the gravity mesh. Each thread (i.e. each top-level cell) constructs a local patch of the mesh and assigns its particles to it. When done, the patch is written to the global mesh using atomics. This is now similar to what is done in the distributed MPI case.
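The two steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of this merge request: `struct particle`, `atomic_add_double` and `assign_cell_via_patch` are hypothetical names, positions are expressed in units of mesh cells and assumed non-negative, and nearest-grid-point assignment is used purely for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical particle: position in units of mesh cells, plus a mass. */
struct particle {
  double x[3];
  double mass;
};

/* Atomically add `value` to `*address` with a C11 compare-and-swap loop. */
static void atomic_add_double(_Atomic double *address, double value) {
  double expected = atomic_load(address);
  while (!atomic_compare_exchange_weak(address, &expected, expected + value)) {
    /* `expected` now holds the current value of *address; try again. */
  }
}

/* Periodic wrap of a mesh index into [0, n). */
static int wrap(int i, int n) { return ((i % n) + n) % n; }

/* Assign the particles of one cell to a periodic n^3 mesh via a private patch.
 * `origin` is the mesh index of the patch corner and `size` its extent in
 * mesh cells. All particles are assumed to fall inside the patch.
 * Returns 0 on success, -1 if the patch could not be allocated. */
static int assign_cell_via_patch(const struct particle *parts, int count,
                                 const int origin[3], const int size[3],
                                 _Atomic double *mesh, int n) {
  const size_t patch_cells = (size_t)size[0] * size[1] * size[2];
  double *patch = calloc(patch_cells, sizeof(double));
  if (patch == NULL) return -1;

  /* Step 1: accumulate into the private patch; no atomics are needed. */
  for (int p = 0; p < count; ++p) {
    int idx[3];
    for (int d = 0; d < 3; ++d) {
      /* Positions are non-negative, so the cast truncates like floor(). */
      idx[d] = (int)parts[p].x[d] - origin[d];
      assert(idx[d] >= 0 && idx[d] < size[d]);
    }
    patch[((size_t)idx[0] * size[1] + idx[1]) * size[2] + idx[2]] +=
        parts[p].mass;
  }

  /* Step 2: write the patch to the shared mesh, one atomic add per
   * non-empty patch cell instead of one per particle. */
  for (int i = 0; i < size[0]; ++i) {
    for (int j = 0; j < size[1]; ++j) {
      for (int k = 0; k < size[2]; ++k) {
        const double v = patch[((size_t)i * size[1] + j) * size[2] + k];
        if (v == 0.0) continue;
        const int gi = wrap(origin[0] + i, n);
        const int gj = wrap(origin[1] + j, n);
        const int gk = wrap(origin[2] + k, n);
        atomic_add_double(&mesh[((size_t)gi * n + gj) * n + gk], v);
      }
    }
  }

  free(patch);
  return 0;
}
```

In this sketch the number of atomic operations scales with the number of non-empty patch cells rather than with the number of particles, which is where a gain would be expected when many particles share a few cells.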
This strategy seems to be beneficial when many particles are concentrated in only a few cells, for instance in a zoom run. It does not seem to be slower in other cases either.
The behaviour can be controlled by a runtime parameter (Gravity:mesh_uses_local_patches, defaults to 1) if one wants to roll back to the "old" per-particle atomic assignment.
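As an illustration, assuming the parameter file uses a YAML layout in which Gravity:mesh_uses_local_patches denotes a key inside the Gravity section, and assuming that a value of 0 selects the old behaviour (inferred from the default of 1, not stated explicitly here), rolling back might look like this:

```yaml
Gravity:
  mesh_uses_local_patches: 0   # assumed: 0 = per-particle atomics, 1 = local patches (default)
```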
Edited by Matthieu Schaller