First version of the multiple time-stepping
Eventually...
That should open up debugging from a larger crowd.
Merge request reports
Activity
Added 5 commits:
- 03c44d0f - Use autoconf affinity policy (again)
- 0fda419f - MPI and affinity control are not compatible
- b56f6179 - Use single swift command instead of the old _fixdt and _mindt
- 41606240 - Don't attempt to write non-existent field dt
- 15dd7891 - Recover missing mpi_type definitions, remove unused variables, remove repeated d…
Toggle commit listI've built this and have gotten it to run and came up with the commits above.
The MPI version was completely broken as it missed the new type definitions. It also ran very slowly on a single machine (haven't tried any multiple machine tests yet), which came down to a clash with the new affinity code. I've ifdef'd that out for MPI, but it is an area we should think about a bit more.
My SodShock test runs for 2063 time steps then crashes.
engine_step: t_end_min=0.003807 t_end_max=0.004215 engine_step: Step: 2063 e->time=0.003807 space_regrid: h_max is 1.017e-02 (cell_max=0.000e+00). space_regrid: set cell dimensions to [ 44 5 5 ]. Segmentation fault (core dumped)
This is down to a particle having
-nan
valued position, which happens at lines:cell_getid(cdim, p->x[0] * ih[0], p->x[1] * ih[1], p->x[2] * ih[2]); cells[ind[k]].count++;
in engine.c. The index
ind[k]
isnan
, since:(gdb) print p->x[0] $1 = -nan(0x8000000000000) (gdb) print p->x[1] $2 = -nan(0x8000000000000) (gdb) print p->x[2] $3 = -nan(0x8000000000000)
One possible cause happens a little earlier:
engine_step: t_end_min=0.003783 t_end_max=0.004185 engine_step: Step: 1947 e->time=0.003783 runner_doghost: Smoothing length failed to converge on 1 particles.
but I haven't attempted to check that.
More generally there is a line:
message("No active particles, skipping....");
that is rather verbose, commented that out in my version. Also we have a lot of commented out sections. Ideally I'd like to not see redundant code being committed to master.
One final point. Can we keep the fixdt version working? Or are we confident that technique will never be useful again?
Added 1 commit:
- ba4e4e6c - Removed nested function in engine_init_particles()
Added 1 commit:
- 4cc942a3 - Create executables with a fixdt policy
Added 1 commit:
- 601c8317 - Print policy when starting the engine.
Added 1 commit:
- e293da45 - More engine clean-up
Added 1 commit:
- 2722f6fc - Reintroduce task-marking to prevent enqueing unnecessary tasks.
Added 56 commits:
-
2722f6fc...a77c1d1b - 55 commits from branch
master
- 2f96fdcd - Merge branch 'master' into newTimeDefinition
-
2722f6fc...a77c1d1b - 55 commits from branch
Added 4 commits:
-
24a05c3d...472bab56 - 3 commits from branch
master
- 89573009 - Import the new plotting scripts from Peter
-
24a05c3d...472bab56 - 3 commits from branch
Added 1 commit:
- 8190b0e1 - Cleaned-up ghost task
Added 1 commit:
- af271cdd - Updated default run scripts with the new policy and parameters
Added 1 commit:
- 77f68508 - Missing file...
Added 1 commit:
- 05dee395 - Print library information in the HDF5 as well
Added 1 commit:
- 321516d4 - Extracted the physics from the tasks into a new set of files. The correct file i…
Added 1 commit:
- 839cbab3 - Moved the interaction functions to the hydro directory
Added 1 commit:
- ee114579 - Same treatment for the gravity core routines
Added 4 commits:
Toggle commit listAdded 1 commit:
- c900b14e - Cleaned-up unnecessary constants
Added 1 commit:
- 17cf7788 - Added timers to the init tasks
Added 1 commit:
- 5a69314f - Compile with clang
Added 1 commit:
- 629bb48c - clang-compatibility
Started to look at this again, and the SodShock test ran without any issues (single node, 6 threads for 8190 time steps), good.
However, the SedovBlast test doesn't get very far
# Step Time time-step CPU Wall-clock time [ms] engine_step: t_end_min=0.000000 t_end_max=0.000000 engine_step: Step: 0 e->time=0.000000 engine_step: t_end_min=0.000031 t_end_max=0.007812 engine_step: Step: 1 e->time=0.000031 engine_step: t_end_min=0.000061 t_end_max=0.007812 engine_step: Step: 2 e->time=0.000061 engine_step: t_end_min=0.000061 t_end_max=0.007812 engine_step: Step: 3 e->time=0.000061 engine_step: t_end_min=0.000061 t_end_max=0.007812 engine_step: Step: 4 e->time=0.000061 space_regrid: h_max is 5.613e-02 (cell_max=0.000e+00). Segmentation fault (core dumped)
This happens at line 343 of space.c:
343 cells[ind[k]].count++;
the value of
ind[k]
is524236
, but we only have15625
cells. Checking the particle data we see:(gdb) print *p $3 = {x = {169.39329781083813, 6.2777476187669521, 3.8292171049782753}, v = {4.32336538e+09, 218439120, 30988146}, a = {4.83560232e+16, 2.4432002e+15, 3.46596546e+14}, h = 0.0556494445, t_begin = 6.10947609e-05, t_end = 6.11543655e-05, u = 0.78253454, rho = 0.994605839, rho_dh = 0.969468594, {density = {div_v = 0.996183693, wcount_dh = 0.515203595, curl_v = {2697.84912, -0.797386408, 272116416}, wcount = 0.932364762}, force = {balsara = 0.996183693, POrho2 = 0.515203595, u_dt = 2697.84912, h_dt = -0.797386408, v_sig = 272116416, c = 0.932364762}}, mass = 0.00011920929, id = 549252, gpart = 0x0}
so the particle is outside of the space and travelling very fast...
BTW, the parameters are:
../swift -t 1 -f sedov.hdf5 -m 0.02 -w 5000 -c 1. -d 1e-7 -e 0.01
from the new
runs.sh
.Edited by Peter W. DraperAdded 1 commit:
- 9817bc32 - Properly collect the number of particles updated
Added 1 commit:
- 2441b85e - Better time-step output
I've also ran the CosmoVolume for a good while without any problems. Here are some animations of velocity for the SodShock and CosmoVolume. To my eye some of the artefacts of the previous SodShock look to have been fixed, and it's good to see any action without gravity in the CosmoVolume (still not a lot...):
Added 11 commits:
- 159c9532 - Now initialise the particles using the hydro functions.
- 990a8e98 - The scheduler now takes a submask argument as well to only run through part of the task graph
- 2d4053cc - Less verbosity in the scheduler
- 626f555a - Correct usage of entropy flag in ICs for Gadget-2 compatibility
- a3596a2e - Rewrote the Gadget-2 hydro implementation. Everything OK until the end of the density loop.
- 92c299ad - Rewrote the Gadget-2 hydro implementation. Everything OK until the end of the density loop.
- 7f50b0b1 - Velocity curl and divergence now correct as well.
- 6bfdc639 - Drift is now Gadget-2 identical
- 83787596 - Gadget-2 force loop now correct
- 7ef2eea8 - Stupid lack of initialisation here....
- 921a4e6e - Accelerations and time-steps now match Gadget-2
Toggle commit list@pdraper, could you give it another go ?
I think the code behaves much better now. Still need some cleaning-up but the physics should be correct.
Doesn't build for me:
serial_io.c: In function 'read_ic_serial': serial_io.c:189:38: error: 'struct part' has no member named 'u' (char*)(&(part[0]).field), importance) ^ serial_io.c:313:7: note: in expansion of macro 'readArray' readArray(h_grp, "InternalEnergy", FLOAT, *N, 1, *parts, N_total, offset, ^ serial_io.c: In function 'write_output_serial': serial_io.c:501:39: error: 'struct part' has no member named 'u' (char*)(&(part[0]).field)) ^ serial_io.c:672:7: note: in expansion of macro 'writeArray' writeArray(h_grp, "InternalEnergy", FLOAT, N, 1, N_total, offset, parts,
Can you try from clean sources.
Added 1 commit:
- 97a04907 - Correct i/o also in MPI mode(s)