[WIP] Generic cache
Create a set of generic vectorised neighbour finding functions that are independent of the SPH flavour used.
Activity
I have created a set of update caches that can be used for the force and density interactions. I get the same performance when benchmarking. What are the common input variables into the interaction function between SPH flavours? I imagine position, smoothing length and mass are among them? Or maybe not mass when it comes to GIZMO?
added 1 commit
- 112053c5 - Added struct to store the input parameters for the SPH scheme.
added 1 commit
- 970ad4ce - Extract memory addresses from caches in interaction functions. Density.
added 1 commit
- 0adffc19 - Extract memory addresses from caches in interaction functions. Force.
added 1 commit
- 855a2f09 - Pad cache when reading populating it instead of on the fly. Removed unnecessary variables.
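As an illustration of the padding idea in the commit above, here is a minimal sketch that pads a cache array up to a multiple of the vector width while it is being populated, so that the interaction loops never need to handle a remainder. The names `VEC_SIZE`, `padded_count` and `populate_cache`, and the choice of pad value, are assumptions made for this example and are not taken from the branch.

```c
#include <assert.h>

#define VEC_SIZE 8 /* hypothetical vector width, in floats */

/* Round a particle count up to the next multiple of VEC_SIZE. */
static int padded_count(int count) {
  assert(count >= 0);
  const int rem = count % VEC_SIZE;
  return rem == 0 ? count : count + (VEC_SIZE - rem);
}

/* Copy `count` values into the cache and fill the tail with `pad_value`.
 * The cache must have room for padded_count(count) entries.
 * Returns the padded length. */
static int populate_cache(const float *in, int count, float *cache,
                          float pad_value) {
  const int padded = padded_count(count);
  for (int i = 0; i < count; i++) cache[i] = in[i];
  for (int i = count; i < padded; i++) cache[i] = pad_value;
  return padded;
}
```

The pad value would normally be chosen so that padded entries cannot contribute to an interaction, for example a position far outside the cell.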
added 1 commit
- 3555f4f9 - Remove SPH scheme specific code from runner_doiac_vec.c. Left-packing and…
added 1 commit
- f370fc32 - Added vectorised version of Minimal SPH for density interactions.
added 1 commit
- e87d0d9d - Added vectorised version of force interaction for Minimal SPH (unfinished).
added 1 commit
- ec137698 - Fixed bug with AVX-512 and horizontal max of the signal velocity. Store result…
added 1 commit
- e43ec5c7 - Fixed bug with AVX-512 and horizontal max of the signal velocity. Store result…
added 1 commit
- 0f29226e - Pad cache when populating it instead of on the fly. Force.
One comment and two questions.
I'd like to keep the Minimal SPH as it is and not introduce POverRho2 (as you did). I know this is less efficient but it matches published stuff and makes it easier for people to follow what is going on in SWIFT compared to the literature.
Now the questions (which are more philosophical). The new cache file is very long and it seems extra difficult for a newcomer to implement the whole thing for their new model. Would there be a more generic way of doing this?
In the case of Minimal-SPH, would it be possible to not write a vectorized version by hand but actually make the vectorized call use a loop of size VEC_LENGTH calling the scalar interaction code?
added 1 commit
- d85c674e - Call VEC_HMAX correctly when compiling on AVX architecture.
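The suggestion of a loop of size VEC_LENGTH calling the scalar interaction code could look roughly like the sketch below. The kernel here is a toy stand-in, and `VEC_LENGTH`'s value, `scalar_density` and `vec_density` are hypothetical names for illustration only; they are not the Minimal SPH functions in SWIFT.

```c
#include <assert.h>

#define VEC_LENGTH 8 /* hypothetical vector width, in floats */

/* Toy scalar density interaction (NOT the real SPH kernel):
 * weight w = max(0, 1 - r2 / hi^2). */
static void scalar_density(float r2, float hi, float mj, float *rho,
                           float *wcount) {
  assert(hi > 0.f);
  const float q2 = r2 / (hi * hi);
  const float w = q2 < 1.f ? 1.f - q2 : 0.f;
  *rho += mj * w;
  *wcount += w;
}

/* "Vector" interaction: a fixed-length loop over the scalar code.
 * The constant trip count gives the compiler a chance to auto-vectorise. */
static void vec_density(const float *r2, const float *hi, const float *mj,
                        float *rho, float *wcount) {
  for (int k = 0; k < VEC_LENGTH; k++)
    scalar_density(r2[k], hi[k], mj[k], &rho[k], &wcount[k]);
}
```

Whether the compiler actually vectorises such a loop depends on the scalar function being inlined and free of branches it cannot convert, so this trades peak performance for a much smaller amount of scheme-specific code.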
That's fair enough about not introducing POverRho2, I'll remove it.
Regarding the size of hydro_cache.h, I agree, it became longer than I expected. I was thinking about merging the update_cache into a union between density and force along with input_params. I don't know how I would reduce the number of functions though, because they're all parameter specific.
So for the last one you mean to define separate DOPAIR/DOSELF functions for just the minimal scheme?
added 2 commits
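A minimal sketch of the union idea mentioned in the reply: the density and force loops never run at the same time, so their update buffers could share storage. All struct, field and function names below are hypothetical and simplified; the real update caches in the branch hold scheme-specific fields.

```c
#include <assert.h>

#define CACHE_SIZE 64 /* hypothetical number of cached particles */

/* Hypothetical per-particle accumulators for the density loop. */
struct update_cache_density {
  float rho[CACHE_SIZE];
  float wcount[CACHE_SIZE];
};

/* Hypothetical per-particle accumulators for the force loop. */
struct update_cache_force {
  float a_hydro_x[CACHE_SIZE];
  float a_hydro_y[CACHE_SIZE];
  float a_hydro_z[CACHE_SIZE];
  float h_dt[CACHE_SIZE];
};

/* One buffer reused by both loops; only one member is valid at a time. */
union update_cache {
  struct update_cache_density density;
  struct update_cache_force force;
};

/* Zero the density accumulators before a density pass. */
static void update_cache_reset_density(union update_cache *c) {
  assert(c);
  for (int i = 0; i < CACHE_SIZE; i++) {
    c->density.rho[i] = 0.f;
    c->density.wcount[i] = 0.f;
  }
}
```

This saves memory but, as noted in the reply, it does not by itself reduce the number of functions, since each one still reads and writes scheme-specific fields.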
added 128 commits
- 39c93b62...137ab111 - 124 commits from branch master
- 3ddaa38b - Merge branch 'master' into generic_cache
- 61f5fcd7 - Place auto-vectorised scalar function within ifdefs.
- 645f3cbf - Reverted change for storing P_over_rho2 in the particle struct for the Minimal…
- 2868b4aa - Updated dopair_subset_density_vec after merge with master so that the…
added 2 commits
added 1 commit
- 8a5fbc36 - Enclosed Intel specific pragmas in #ifdefs and removed unneeded functions.
    /* Shift the particles positions to a local frame so single precision can be
     * used instead of double precision. */
  + #ifdef __ICC
    #pragma simd
  + #endif
    for (int i = 0; i < ci_count; i++) {
added 1 commit
- ddc59a6e - Added compiler hints and restrict keywords to auto-vectorise.
added 701 commits
- ddc59a6e...8650b65f - 699 commits from branch master
- fdc9f4f9 - Merge branch 'master' into generic_cache
- 5a3ffe30 - Fixed bugs after merge with master.
added 1 commit
- c43c18df - Always read the particle position and smoothing length into the cache. Defined…
added 1 commit
- d029c143 - Added pragmas the vectorise loops. Needed to specify #pragma nounroll on density
added 1 commit
- f20b5cbd - Created a macro to specify simd pragmas depending on whether they exist in the compiler.
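One way a compiler-dependent SIMD macro of the kind described in this commit could be written is sketched below, using the C99 `_Pragma` operator. The macro name `SIMD_HINT` and the helper `shift_to_local_frame` are hypothetical, and the per-compiler hints chosen here are an assumption; note that `GCC ivdep` and `clang loop vectorize(enable)` are related hints and not exact equivalents of Intel's `#pragma simd`.

```c
#include <assert.h>

/* Pick a vectorisation hint the current compiler understands.
 * The Intel check comes first because icc also defines __GNUC__. */
#if defined(__INTEL_COMPILER)
#define SIMD_HINT _Pragma("simd")
#elif defined(__clang__)
#define SIMD_HINT _Pragma("clang loop vectorize(enable)")
#elif defined(__GNUC__)
#define SIMD_HINT _Pragma("GCC ivdep")
#else
#define SIMD_HINT /* no hint available */
#endif

/* Shift positions to a local frame so single precision can be used.
 * `restrict` tells the compiler the two arrays do not overlap. */
static void shift_to_local_frame(const double *restrict x,
                                 float *restrict x_local, double shift,
                                 int count) {
  assert(count >= 0);
  SIMD_HINT
  for (int i = 0; i < count; i++) x_local[i] = (float)(x[i] - shift);
}
```

Keeping the hint in one macro avoids repeating `#ifdef` blocks around every loop.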
added 11 commits
- 7d223fd7 - Created a macro 'loop' to provide alignment information to compiler for vectorisation purposes.
- fd94282d - Updated testInteractions.c to use new vectorisation strategy and removed…
- d2de489d - Created a generic cache to perform caching for all SPH flavours.
- b4fd76c8 - Changed ifdef names.
- dbe04de2 - Created a separate cache.h file to hold cache populating functions.
- f572cfc3 - Include new files in Makefile.
- 9a7b4b67 - Set all field arrays to MAX_NUM_OF_CACHE_FIELDS. Use correct variables in debug checks.
- 5b58a6fd - Removed generic part of caching and moved it to generic_cache.h and cache.h.
- 10a3e472 - Use new generic cache.
- 654f5817 - Enclose in ifdefs for vectorisation.
- a5d87818 - Use the generic cache in the Gadget2 hydro scheme.
added 6 commits
- caf583ad - Typo density -> force for branching functions.
- 63d44196 - Use max number of cache fields.
- 67f9fe6a - Added particle update functions for density and force that include explicit intrinsics.
- 1d6c2f5a - Only need to read particle update fields before the double for loop over…
- 1907faaa - Added particle update functions for density and force that include explicit intrinsics.
- 717c75ae - Renamed particle update functions.
added 240 commits
- 717c75ae...b223b78f - 238 commits from branch master
- 5b862db6 - Merge branch 'master' into generic_cache
- c293c0df - Changed VEC_HMAX to VEC_HADD for u_dt in Minimal SPH hydro scheme.
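To make the distinction in the last commit concrete, the sketch below gives scalar reference versions of the two horizontal reductions: a maximum, as used for the signal velocity in the earlier AVX-512 fix, and a sum, as needed for an accumulated quantity such as u_dt. The function names `hmax_ref` and `hadd_ref` are hypothetical stand-ins written for this example; they are not the VEC_HMAX and VEC_HADD macros themselves.

```c
#include <assert.h>

/* Horizontal max: largest lane of a vector of n floats. */
static float hmax_ref(const float *v, int n) {
  assert(n > 0);
  float m = v[0];
  for (int i = 1; i < n; i++)
    if (v[i] > m) m = v[i];
  return m;
}

/* Horizontal add: sum of all lanes of a vector of n floats. */
static float hadd_ref(const float *v, int n) {
  assert(n > 0);
  float s = 0.f;
  for (int i = 0; i < n; i++) s += v[i];
  return s;
}
```

Using the max where a sum is required would keep only the largest lane contribution and silently drop the rest.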