Split stars density self and ghost tasks to reduce dead time.
This MR makes it possible to split the stars density self and ghost tasks into a configurable number of separate tasks that can run concurrently. The rationale is that these tasks have been found to be the main cause of dead time in many simulation projects. Both tasks consist of a single loop over a cell's particles, and can therefore be trivially split: each split task uses a loop increment equal to the number of split tasks and starts from its own offset.
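As a rough illustration of the splitting idea, the sketch below shows a strided loop in which split task `offset` out of `num_split` handles particles `offset`, `offset + num_split`, and so on. All names here (`run_split_task`, `process_particle`, `all_visited_once`) are hypothetical and are not taken from the code base; the per-particle work is reduced to a visit counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-particle work done inside the loop. */
static void process_particle(int *visits, size_t i) { visits[i] += 1; }

/* Run one of `num_split` split tasks over a cell with `count` particles.
 * The task with index `offset` handles particles
 * offset, offset + num_split, offset + 2 * num_split, ... */
static void run_split_task(int *visits, size_t count, size_t offset,
                           size_t num_split) {
  assert(num_split > 0 && offset < num_split);
  for (size_t i = offset; i < count; i += num_split)
    process_particle(visits, i);
}

/* Check that the split tasks together visit every particle exactly once.
 * Returns 1 on success, 0 otherwise (limited to 64 particles here). */
static int all_visited_once(size_t count, size_t num_split) {
  int visits[64] = {0};
  if (count > 64 || num_split == 0) return 0;

  /* Run the split tasks one after another; in the real scheduler they
   * would be independent tasks that can run concurrently. */
  for (size_t offset = 0; offset < num_split; offset++)
    run_split_task(visits, count, offset, num_split);

  for (size_t i = 0; i < count; i++)
    if (visits[i] != 1) return 0;
  return 1;
}
```

Because the index sets of the different offsets are disjoint and together cover the whole range, no particle is processed twice by the outer loop, regardless of how `count` and `num_split` relate.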
The number of self and ghost tasks can be configured using
One complication with the split self tasks is that running them concurrently requires a change to the locking mechanism. Otherwise, each split self task would lock the cell for its whole duration, and the split tasks would still be unable to run concurrently. To achieve this, a new atomic counter similar to the hold counter is used. A split self task first locks the cell as usual. It then increments the split hold counter and unlocks the cell again. A non-zero split hold counter blocks any pair task from running, but does not prevent another split self task from running. There are some extra complexities when dealing with recursion.
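The locking sequence described above could look roughly like the following sketch, written with C11 atomics. It is an assumption-laden illustration, not the code in this MR: the struct, the field names (`lock`, `split_hold`) and all function names are hypothetical, the cell lock is reduced to a single flag, and recursion through the cell hierarchy is ignored entirely.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical, heavily simplified cell: one exclusive lock plus the
 * new counter of split self tasks currently in flight. */
struct cell_sketch {
  atomic_flag lock;      /* ordinary exclusive cell lock */
  atomic_int split_hold; /* number of running split self tasks */
};

static void cell_sketch_init(struct cell_sketch *c) {
  atomic_flag_clear(&c->lock);
  atomic_init(&c->split_hold, 0);
}

/* Try to take the exclusive lock; returns 1 on success, 0 otherwise. */
static int cell_trylock(struct cell_sketch *c) {
  return !atomic_flag_test_and_set(&c->lock);
}

static void cell_unlock(struct cell_sketch *c) { atomic_flag_clear(&c->lock); }

/* Split self task: lock the cell as usual, register on the split hold
 * counter, then release the lock so that sibling split tasks can start. */
static int split_self_begin(struct cell_sketch *c) {
  if (!cell_trylock(c)) return 0;
  atomic_fetch_add(&c->split_hold, 1);
  cell_unlock(c);
  return 1;
}

static void split_self_end(struct cell_sketch *c) {
  int prev = atomic_fetch_sub(&c->split_hold, 1);
  assert(prev > 0); /* must match an earlier split_self_begin() */
  (void)prev;
}

/* Pair task: needs the exclusive lock and a zero split hold counter.
 * On success the lock stays held until pair_end(). */
static int pair_begin(struct cell_sketch *c) {
  if (!cell_trylock(c)) return 0;
  if (atomic_load(&c->split_hold) != 0) {
    cell_unlock(c); /* split self tasks are still running: back off */
    return 0;
  }
  return 1;
}

static void pair_end(struct cell_sketch *c) { cell_unlock(c); }
```

In this sketch the counter is only incremented while the lock is held, so a pair task that has acquired the lock and sees a zero counter cannot race with a split self task that is just starting. A pair task keeps the lock for its whole duration, which in turn prevents new split self tasks from starting until it finishes.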
These changes have been tested in the COLIBRE fork of the code, but I still need to run some more tests to check that nothing was broken while porting them to the main repo.
The new hold mechanism needs some cleaning up.