Split gravity tasks
Apply a similar splitting scheme for gravity tasks as for hydro
Merge request reports
Activity
@matthieu, can you give this a spin on Cosma and see what the task plots look like? Cheers!
Here we go:
- Old gravity from last week: http://icc.dur.ac.uk/~jlvc76/SWIFT/Gravity_87f203c6/
- Use of sigmoid instead of erf(): http://icc.dur.ac.uk/~jlvc76/SWIFT/Gravity_b41d81e6/
- Same with gravity self splitting: http://icc.dur.ac.uk/~jlvc76/SWIFT/Gravity_f7d059f0/
Ouch... So not really much improvement?
Can you run the sigmoid version through VTune to see if the sigmoid/exp approximation is the new hotspot? Or maybe it is just due to the number of times it's called, and not because its super-expensive?
Another possibility is that we were auto-vectorizing for erf/exp, but now no longer for the sigmoid approximation?
From vTune, the sigmoid is not the hotspot any more. The top hotspot is now the multipole-multipole kernel, as expected from the literature.
The pair interaction fully vectorises automatically but might still benefit from some work to reduce double->float conversions. The self interaction does not auto-vectorise yet as it is written to exploit symmetry, which I now think is not a good idea.
I managed to cut down some bits of the recursion yesterday night and here is the new result: http://icc.dur.ac.uk/~jlvc76/SWIFT/Gravity_886706a/ For instance step 32 is now >20% faster (3220ms vs 2560ms). Small steps also benefited from this.
I still have a rather long list of potential improvements to go through:
- Only drift the required gparts (Similar to what we did for the parts).
- Make the multipole-multipole function symmetric,
- Check whether single precision is good enough for the multipole calculations,
- Compute all the required derivatives in one go to avoid duplicated intermediate terms,
- Write a particle-multipole function for the few cases where it can be used instead of particle-particle.
On top of all that the performance seems to be very dependant on some of the numerical parameters such as the minimal cell size and the opening angle. I don't want to play with these now but at a later stage we could tune these for a fixed accuracy criterion.
OK, cool, those results look much better now :)
What fraction is still used by the sigmoid? Is it still worth it to replace it with a single rational polynomial?
I can have a look at the pair and multipole interaction functions to see if there is anything there that can be done differently, but it's really just a lot of math, so I'm not too hopeful that there's any low-hanging fruit.
Any idea why the
drift_gpart
tasks happen so late in the timestep?What fraction is still used by the sigmoid? Is it still worth it to replace it with a single rational polynomial?
I don't know yet but probably not worth the effort unless it becomes the top hotspot again.
I can have a look at the pair and multipole interaction functions to see if there is anything there that can be done differently, but it's really just a lot of math, so I'm not too hopeful that there's any low-hanging fruit.
Quite a bit of it is done in double precision at the moment. Also, I don't know how much the compiler is able to group common sub-expressions. Investigating this is my next task.
Any idea why the drift_gpart tasks happen so late in the timestep?
It is likely related to the fact that gravity tasks happen only at the end. I still need to link the pair tasks and their gravity drift. At the moment, the pairs call the drift directly as in the old hydro code. This associated with a better cost estimate might help push them forward.