Gravity splittask (non-MPI)
This implements task splitting for gravity. Changes are:
- Split large gravity tasks (self and pair)
- Replace gravity pair tasks between cells that are far enough apart with a multipole-multipole task.
- Re-order operations in a time-step.
The last item addresses a long-running accuracy issue: we were always running one more time-step after detecting that a rebuild was needed.
The task-splitting is currently not enabled over MPI. I will make this work alongside the periodic gravity over MPI in the next merge request.
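For readers unfamiliar with the idea, the decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this merge request: `struct toy_cell`, `toy_can_use_mm`, `toy_pair_action`, and the parameters `theta`, `max_depth` and `min_count` are all hypothetical names, and the opening-angle test is a generic multipole acceptance criterion swapped in for whatever the actual implementation uses.

```c
#include <assert.h>

/* Hypothetical, simplified cell; this is NOT SWIFT's struct cell. */
struct toy_cell {
  double centre[3]; /* centre of mass of the cell */
  double r_max;     /* radius enclosing all particles in the cell */
  int count;        /* number of gravity particles in the cell */
  int depth;        /* depth in the tree, 0 = top level */
  int split;        /* non-zero if the cell has children */
};

/* What to do with a gravity pair task between two cells. */
enum toy_action { toy_direct, toy_split, toy_multipole };

/* Generic opening-angle test (an assumption, not the MR's criterion):
 * the pair is "far enough" if the summed cell radii are small compared
 * to the separation, i.e. (r_i + r_j) < theta * distance. */
static int toy_can_use_mm(const struct toy_cell *ci,
                          const struct toy_cell *cj, double theta) {
  assert(theta > 0.0);
  double r2 = 0.0;
  for (int k = 0; k < 3; k++) {
    const double dx = ci->centre[k] - cj->centre[k];
    r2 += dx * dx;
  }
  const double size = ci->r_max + cj->r_max;
  /* Compare squared quantities to avoid a square root. */
  return size * size < theta * theta * r2;
}

/* Decide how to treat a pair task: multipole-multipole if the cells are
 * far enough apart, split into sub-tasks if both cells are large and
 * above the depth limit, direct interaction otherwise. */
static enum toy_action toy_pair_action(const struct toy_cell *ci,
                                       const struct toy_cell *cj,
                                       double theta, int max_depth,
                                       int min_count) {
  if (toy_can_use_mm(ci, cj, theta)) return toy_multipole;

  const int can_recurse = ci->split && cj->split &&
                          ci->depth < max_depth && cj->depth < max_depth;
  const int is_large = ci->count > min_count && cj->count > min_count;
  if (can_recurse && is_large) return toy_split;

  return toy_direct;
}
```

In this sketch, `max_depth` plays the role of the user-tunable maximal depth of gravity tasks, and `min_count` stands for a size threshold below which a task is considered too small to be worth splitting.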
Activity
added 1 commit
- 05c71340 - Better naming convention for the function deciding whether a task can be split in the hydro case.
added 1 commit
- 5ad98d9d - Added functions to decide whether a gravity task associated to a given cell can be split.
added 1 commit
- 0e8d069d - Limit the depth of the gravity tasks being split.
added 1 commit
- 5e33f262 - Do not construct hierarchical gravity tasks below the level where there are any tasks.
added 1 commit
- 42f89f51 - Allow the user to tweak the maximal depth of gravity tasks.
added 1 commit
- 733f2de2 - Do not recurse below the level where there are tasks when unskiping
added 1 commit
- 88c04547 - Make the construction of self-gravity task an O(M^3) and not O(M^6) problem any more.
added 1 commit
- dc99b5bf - Use careful locks and unlocks around the activation of the gravity drifts.
I ran the EAGLE_50 test on a node of COSMA7 with the address sanitizer enabled and got this after 2000+ steps:
ASAN:DEADLYSIGNAL
=================================================================
==155620==ERROR: AddressSanitizer: SEGV on unknown address 0x2b53d748c96a (pc 0x0000004e01b8 bp 0x2b49b3acad30 sp 0x2b49b3acac50 T77)
==155620==The signal is caused by a WRITE memory access.
    #0 0x4e01b7 in scheduler_activate /cosma7/data/dp004/pdraper/swiftsim-check/src/scheduler.h:123
    #1 0x4e01b7 in cell_unskip_gravity_tasks /cosma7/data/dp004/pdraper/swiftsim-check/src/cell.c:2327
    #2 0x82f8be in runner_do_unskip_gravity /cosma7/data/dp004/pdraper/swiftsim-check/src/runner.c:981
    #3 0x82f8be in runner_do_unskip_mapper /cosma7/data/dp004/pdraper/swiftsim-check/src/runner.c:1008
    #4 0x5a1a6b in threadpool_chomp /cosma7/data/dp004/pdraper/swiftsim-check/src/threadpool.c:155
    #5 0x5a1a6b in threadpool_runner /cosma7/data/dp004/pdraper/swiftsim-check/src/threadpool.c:182
    #6 0x2b2a4e041e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
    #7 0x2b2a4e34e34c in __clone (/lib64/libc.so.6+0xf834c)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /cosma7/data/dp004/pdraper/swiftsim-check/src/scheduler.h:123 in scheduler_activate
Thread T77 created by T0 here:
    #0 0x2b2a4a559060 in __interceptor_pthread_create ../../../../libsanitizer/asan/asan_interceptors.cc:243
    #1 0x5a262a in threadpool_init /cosma7/data/dp004/pdraper/swiftsim-check/src/threadpool.c:235
    #2 0x514bef in engine_config /cosma7/data/dp004/pdraper/swiftsim-check/src/engine.c:6105
    #3 0x4077d3 in main /cosma7/data/dp004/pdraper/swiftsim-check/examples/main.c:851
    #4 0x2b2a4e277c04 in __libc_start_main (/lib64/libc.so.6+0x21c04)

==155620==ABORTING
I haven't looked closely, but maybe that makes more sense to you...
Excellent, thanks Peter. That's exactly the location where I used to catch it as well. The confusing bit is that this address would have been correct just the line above.
Do you think that re-ordering the tasks within the cell structure could help identify what is going wrong, possibly by hitting a "more invalid" address and hence making the cause easier to see?