Add functions to permanently remove particles from a simulation
Add functions to remove particles. When a particle is removed, it is first flagged as inhibited and is then ignored in all interactions. Inhibited particles are eliminated from memory at the next rebuild.
Also fixes an error reported by the undefined-behaviour sanitizer and fixes #470 (closed).
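In sketch form, the scheme is: flag, skip, compact. A minimal illustration in C (the struct layout, the sentinel value and the helper names below are illustrative, not the actual SWIFT API):

#include <stddef.h>

#define time_bin_inhibited 127 /* sentinel time-bin marking a removed particle */

struct part {
  double x[3];
  float mass;
  signed char time_bin;
};

/* Step 1: removal only flags the particle as inhibited. */
static void part_remove(struct part *p) { p->time_bin = time_bin_inhibited; }

/* Step 2: every interaction loop skips inhibited particles. */
static int part_is_inhibited(const struct part *p) {
  return p->time_bin == time_bin_inhibited;
}

/* Step 3: at the next rebuild, compact the array so that the memory is
 * actually reclaimed. Returns the new particle count. */
static size_t space_compact(struct part *parts, size_t count) {
  size_t kept = 0;
  for (size_t k = 0; k < count; k++)
    if (!part_is_inhibited(&parts[k])) parts[kept++] = parts[k];
  return kept;
}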
Merge request reports
Activity
added 1 commit
- c3480fcb - Added a few more checks that no inhibited particles are going through the calculations.
@jborrow could you check that the hydro test-suite still works on this with the debugging checks on and the minimal hydro scheme? Thanks!
Edited by Matthieu Schaller

@pdraper This is the merge request for the particle re-ordering. The main changes are in space_rebuild and engine_redistribute.
Note that no code currently calls the particle-removing functions, but they can be exercised by uncommenting the temporary block of code in runner_end_force().
There is one corner case that is not yet well-handled when you remove a particle and then dump a snapshot before rebuilding. A fix for that will come along with the functions adding particles.
Hmm, thought I'd give it a spin and uncommented that test, which gave:
==22336==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x2ab5d3d620dc at pc 0x00000062a4f8 bp 0x2ab5e80716a0 sp 0x2ab5e8071690
READ of size 1 at 0x2ab5d3d620dc thread T46
    #0 0x62a4f7 in gpart_is_inhibited /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/active.h:302
    #1 0x62a4f7 in runner_dopair_grav_pp_truncated /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/runner_doiact_grav.h:380
    #2 0x62a4f7 in runner_dopair_grav_pp /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/runner_doiact_grav.h:833
    #3 0x62d48a in runner_dopair_recursive_grav /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/runner_doiact_grav.h:1597
    #4 0x62d3e6 in runner_dopair_recursive_grav /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/runner_doiact_grav.h:1619
    #5 0x7ffc9d in runner_main /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/runner.c:2477
    #6 0x2ab5bc7616b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
    #7 0x2ab5bedba41c in clone (/lib/x86_64-linux-gnu/libc.so.6+0x10741c)

0x2ab5d3d620dc is located 60 bytes to the right of 159463584-byte region [0x2ab5ca54e800,0x2ab5d3d620a0)
allocated by thread T0 here:
    #0 0x2ab5bb886076 in __interceptor_posix_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x99076)
    #1 0x579ec9 in read_ic_single /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/single_io.c:487
    #2 0x4071fd in main /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/examples/main.c:763
    #3 0x2ab5becd382f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

Thread T46 created by T0 here:
    #0 0x2ab5bb823253 in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x36253)
    #1 0x51c8e8 in engine_config /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/engine.c:6966
    #2 0x407bb9 in main /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/examples/main.c:926
    #3 0x2ab5becd382f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: heap-buffer-overflow /loc/pwda/pdraper/scratch/swift-tests/swiftsim-spare/src/active.h:302 gpart_is_inhibited
That is running the non-MPI EAGLE_6 test with:
./configure --with-metis --enable-debugging-checks --enable-debug --enable-sanitizer --enable-undefined-sanitizer --disable-vec
So we are dying in the extra check:
/* Check that the particle we interact with was not inhibited */
if (gpart_is_inhibited(&gparts_j[pjd], e) && mass_j != 0.f)
  error("Inhibited particle used as gravity source.");
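(For context: gpart_is_inhibited presumably just compares the particle's time_bin against the inhibited sentinel, so the out-of-bounds read happens when the check dereferences one of the dummy cache-padding entries lying past the end of the real gparts array. A sketch of a guarded form of the check — gcount_j, gcount_padded_j and ci_cache are assumed names for the real count, the padded count and the gravity cache:)

/* The pair-gravity loop runs up to the padded cache length, so pjd can
 * point at a dummy entry with no backing gpart behind it. */
for (int pjd = 0; pjd < gcount_padded_j; pjd++) {

  const float mass_j = ci_cache->m[pjd]; /* 0.f for padding entries */

#ifdef SWIFT_DEBUG_CHECKS
  /* Only dereference gparts_j[pjd] for real particles. */
  if (pjd < gcount_j && gpart_is_inhibited(&gparts_j[pjd], e) && mass_j != 0.f)
    error("Inhibited particle used as gravity source.");
#endif

  /* ... gravity interaction ... */
}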
added 1 commit
- 865284d3 - Correctly define 1 as an unsigned long long when bit shifting by 63.
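The bug behind this one is the classic shift pitfall: the literal 1 has type int, so 1 << 63 shifts past the width of the type, which is undefined behaviour and exactly what the undefined-behaviour sanitizer flags. The fix pattern in isolation:

#include <stdint.h>

/* Undefined behaviour: the literal 1 is a 32-bit int, so shifting it by 63
 * moves past the width of the type. */
/* uint64_t mask = 1 << 63; */

/* Well-defined: make the literal an unsigned long long before shifting. */
uint64_t mask = 1ULL << 63;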
added 7 commits
- b69a6ce9 - Only check the inhibited status of gparts if they are not dummy cache-padding particles.
- da9bf1d6 - Added star formation task.
- 3219828e - Collect the number of inhibited particles at the end of a time-step in the same way as the updates.
- cca41277 - Make the gpart friend of inhibited particles a stand-alone particle.
- 4413f001 - Move the temporary star formation law to the dedicated task.
- 10f784f7 - Add a task action for the star formation task. Add the new engine policy's name string.
- 89c0e00e - Set the correct task action for the long-range gravity task.
added 1 commit
- 1a60a352 - Make the temporary star formation law convert part into gparts.
added 11 commits
- 9b033be0...7d9a29e3 - 10 commits from branch master
- 468abb5a - Merge branch 'master' into removing_particles
  for (int k = 0; k < gcount; k++) {
    const double m = gparts[k].mass;

+ #ifdef SWIFT_DEBUG_CHECKS
+   if (gparts[k].time_bin == time_bin_inhibited)
+     error("Inhibited particle in P2M. Should have been removed earlier.");
OK, here is an MPI run error:
[0003] [00728.4] part.c:part_verify_links():167: Linked particles are not at the same position! gp->x=[1.457784e-01 3.753363e+00 2.581094e+00] p->x=[3.347836e-01 3.875554e+00 2.642131e+00] diff=[-1.890052e-01 -1.221907e-01 -6.103649e-02]
That was running EAGLE_6 with:
mpirun -np 4 ../swift_mpi -c -s -G -S -t 2 eagle_6.yml
configure:
./configure --with-metis --enable-debugging-checks --enable-debug --enable-sanitizer --enable-undefined-sanitizer --disable-vec
Just after step 128 when some interesting things had all kicked off.
 127  1.278127e-02  0.9104352  0.0983759  1.637011e-07    43    43       1       1     0       19.507     0
[0003] [00688.1] runner_do_star_formation: Removing particle id=6173617147839 rho=1.500975e+07
 128  1.278143e-02  0.9104457  0.0983631  1.637023e-07    43    50  800621 1661079 29847   108372.891     0
[0000] [00727.8] engine_drift_all: Drifting all to a=9.104563e-01
[0000] [00727.9] engine_repartition: repartitioning space
[0003] [00728.4] part.c:part_verify_links():167: Linked particles are not at the same position! gp->x=[1.457784e-01 3.753363e+00 2.581094e+00] p->x=[3.347836e-01 3.875554e+00 2.642131e+00] diff=[-1.890052e-01 -1.221907e-01 -6.103649e-02]
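For reference, part_verify_links() asserts that a gas particle and its linked gravity particle sit at exactly the same position; a minimal sketch of that check, with the struct layout simplified (the real function verifies more than shown):

#include <stdio.h>
#include <stdlib.h>

struct gpart { double x[3]; };
struct part  { double x[3]; struct gpart *gpart; };

/* A gas particle and its gravity companion must agree on the position. */
static void verify_link(const struct part *p) {
  if (p->gpart == NULL) return; /* not linked to a gravity particle */
  for (int i = 0; i < 3; i++) {
    if (p->x[i] != p->gpart->x[i]) {
      fprintf(stderr, "Linked particles are not at the same position!\n");
      abort();
    }
  }
}

The mismatch above suggests one side of the link was moved (here, around the repartition) without the other following.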
I'll start that again to see if it reproduces.
added 32 commits
- 6725ee2d...f241e5ea - 30 commits from branch master
- c22e4447 - Updated doxygen documentation of the MPI collection routines.
- 768937bf - Merge branch 'master' into removing_particles
added 1 commit
- 245e0644 - Correct wrong address access in the macro of engine_redistribute_savelink_mapper_part()
Before 245e0644 this piece of code triggered a seg-fault. I am surprised it never appeared before. The fix seems to do the right thing. What do you think?
Edited by Matthieu Schaller

added 1 commit
- f4daa0f3 - Added debugging function to crash if trying to write an inhibited particle to a snapshot.
added 1 commit
- 33c9cc37 - Collect the number of inhibited particles in each space in engine_collect_end_of_step()
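Presumably the inhibited counts are gathered the same way as the updated-particle counts (as the earlier commit message says); a sketch of that kind of end-of-step reduction, flattened into a single loop over one particle array (all names illustrative):

#include <stddef.h>

#define time_bin_inhibited 127 /* sentinel for removed particles */

struct part { signed char time_bin; /* ... */ };

/* Totals gathered at the end of a step; SWIFT reduces these over cells
 * (and over ranks under MPI), flattened here into one loop. */
struct end_of_step_data {
  long long updated;   /* particles updated this step */
  long long inhibited; /* particles flagged for removal */
};

static void collect_inhibited(const struct part *parts, size_t count,
                              struct end_of_step_data *data) {
  for (size_t k = 0; k < count; k++)
    if (parts[k].time_bin == time_bin_inhibited) data->inhibited++;
}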
added 1 commit
- 05cacf92 - When dumping a snapshot, fish out the particles that have not been inhibited.
added 1 commit
- dd11e16d - Remove the debugging checks for the inhibited particles i/o.
I fixed the i/o routines to fish out only the particles we actually want to dump. This now works for MPI i/o as well.
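A natural way to implement this is to copy the live particles into a scratch buffer and hand only that to the writer, so the i/o layer never sees an inhibited entry. A sketch under that assumption (not the actual SWIFT i/o code; it reuses the illustrative part layout from above):

#include <stdlib.h>

#define time_bin_inhibited 127

struct part { double x[3]; float mass; signed char time_bin; };

/* Returns a freshly allocated array holding only the particles to dump;
 * the caller writes *nout entries and then free()s the buffer. */
static struct part *select_output_parts(const struct part *parts,
                                        size_t count, size_t *nout) {
  struct part *out = malloc(count * sizeof(*out));
  if (out == NULL) return NULL;
  size_t kept = 0;
  for (size_t k = 0; k < count; k++)
    if (parts[k].time_bin != time_bin_inhibited) out[kept++] = parts[k];
  *nout = kept;
  return out;
}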
The sanitizer crash is also in master... So it's not the branch updates.
Running
mpirun -np 4 ../swift_mpi -c -s -G -S -t 4 eagle_12.yml
with
./configure --with-metis --enable-debugging-checks --enable-debug --enable-sanitizer --enable-undefined-sanitizer --disable-vec
and the parallel_hdf5/1.10.3 module crashes. If you use parallel_hdf5/1.8.20 you are safe. Probably worth starting a separate discussion thread on this. I don't know yet whether there were some compilation/optimization differences in the way these two modules were generated or whether it is a genuine hdf5 bug.

I have deferred the discussion of the hdf5 problem to #480 (closed).
added 1 commit
- 4427cce3 - Do not use collective meta-data writes in HDF5 1.10.x. There seems to be an issue with them.
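For reference, collective metadata writes are controlled through H5Pset_coll_metadata_write(), which exists since HDF5 1.10.0; one plausible shape for the workaround (a sketch, not the actual commit):

#include <hdf5.h>
#include <mpi.h>

/* Build a parallel file-access property list, enabling collective
 * metadata writes only on releases newer than the affected 1.10.x. */
hid_t make_fapl(MPI_Comm comm, MPI_Info info) {
  const hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, info);
#if H5_VERSION_GE(1, 12, 0)
  /* Collective metadata writes appear broken in 1.10.x (see #480),
   * so only turn them on from 1.12 onwards. */
  H5Pset_coll_metadata_write(fapl, 1);
#endif
  return fapl;
}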