Debug check fails when checking whether to write snapshot after final simulation step when output list is used
Porting this issue over from Slack, with instructions for a reproducible example.
Configure with:
./configure --with-hydro-dimension=1 --enable-debug --enable-debugging-checks
Reproducible example:
swiftsim/examples/RadiativeTransferTests/CosmoAdvection_1D
(Yes, it's an RT example, but the debug check failure also occurs without RT.)
Run with:
../../../swift --hydro --cosmo rt_advection1D_medium_redshift.yml
The run crashes after the final step:
[00001.8] engine_print_stats: Saving statistics at a=1.405296e-01
[00001.8] engine_dump_snapshot: Dumping snapshot at a=1.407294e-01
61 9.276920e-01 0.1398616 6.1499254 9.757966e-03 50 50 1000 0 0 0 0 8.487 24 0.101
[00001.8] cosmology.c:cosmology_get_delta_time():1294: ti_end must be >= ti_start
gdb backtrace:
[00001.8] cosmology.c:cosmology_get_delta_time():1294: ti_end must be >= ti_start
Thread 1 "swift" received signal SIGABRT, Aborted.
0x00007ffff4c969fc in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff4c969fc in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff4c42476 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff4c287f3 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x0000000000483bd7 in cosmology_get_delta_time (c=<optimised out>, ti_start=<optimised out>, ti_end=<optimised out>) at cosmology.c:1294
#4 0x000000000044c7d4 in engine_io_check_snapshot_triggers (e=e@entry=0x7ffffff9a638) at engine_io.c:1270
#5 0x0000000000435f0a in engine_step (e=e@entry=0x7ffffff9a638) at engine.c:2465
#6 0x0000000000408283 in main (argc=<optimised out>, argv=<optimised out>) at swift.c:1720
(gdb) frame 3
#3 0x0000000000483bd7 in cosmology_get_delta_time (c=<optimised out>, ti_start=<optimised out>, ti_end=<optimised out>) at cosmology.c:1294
1294 if (ti_end < ti_start) error("ti_end must be >= ti_start");
(gdb) p ti_end
$1 = <optimised out>
(gdb) p ti_start
$2 = <optimised out>
(gdb) frame 4
#4 0x000000000044c7d4 in engine_io_check_snapshot_triggers (e=e@entry=0x7ffffff9a638) at engine_io.c:1270
1270 time_to_next_snap = cosmology_get_delta_time(e->cosmology, e->ti_current,
(gdb) l
1265 const int with_cosmology = (e->policy & engine_policy_cosmology);
1266
1267 /* Time until the next snapshot */
1268 double time_to_next_snap;
1269 if (e->policy & engine_policy_cosmology) {
1270 time_to_next_snap = cosmology_get_delta_time(e->cosmology, e->ti_current,
1271 e->ti_next_snapshot);
1272 } else {
1273 time_to_next_snap = (e->ti_next_snapshot - e->ti_current) * e->time_base;
1274 }
(gdb) p e->ti_current
$3 = 139611588448485376
(gdb) p e->ti_next_snapshot
$4 = -1
Stan's analysis:
What I believe is happening is that when using a snapshot output list, once the last snapshot has been reached, ti_next_snapshot is explicitly set to -1. When running with debugging checks, cosmology_get_delta_time() is then called with ti_end = ti_next_snapshot = -1, which is smaller than ti_start = ti_current, so the "ti_end must be >= ti_start" check throws an error. I don't remember the exact files/lines where it happens, but the -1 is deliberately set in one of the files that checks for snapshot times in a snapshot list.
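
To make the failure mode concrete, here is a minimal standalone sketch in C of one possible guard. It is not the actual SWIFT code or a confirmed fix: `struct engine_sketch`, `delta_time_sketch()` and `time_to_next_snap_sketch()` are hypothetical stand-ins, the `assert` only mimics the debugging check at cosmology.c:1294, and returning `DBL_MAX` for "no further snapshot" is an assumption about what the caller would want.

```c
#include <assert.h>
#include <float.h>

/* Stand-in for SWIFT's integer time type (assumed to be a 64-bit integer). */
typedef long long integertime_t;

/* Hypothetical, stripped-down version of the engine fields seen in the
 * backtrace. This is NOT the real struct engine. */
struct engine_sketch {
  integertime_t ti_current;
  integertime_t ti_next_snapshot; /* -1 once the output list is exhausted */
  double time_base;
};

/* Stand-in for cosmology_get_delta_time(). Only the debugging check is
 * modelled; the real function integrates over the scale factor. */
static double delta_time_sketch(const struct engine_sketch *e,
                                integertime_t ti_start, integertime_t ti_end) {
  assert(ti_end >= ti_start); /* mimics "ti_end must be >= ti_start" */
  return (double)(ti_end - ti_start) * e->time_base;
}

/* One possible guard: skip the delta-time call entirely when there is no
 * upcoming snapshot, i.e. when ti_next_snapshot lies before ti_current. */
static double time_to_next_snap_sketch(const struct engine_sketch *e) {
  if (e->ti_next_snapshot < e->ti_current) {
    return DBL_MAX; /* assumed sentinel meaning "no further snapshot" */
  }
  return delta_time_sketch(e, e->ti_current, e->ti_next_snapshot);
}
```

With the values printed in gdb above (ti_current = 139611588448485376, ti_next_snapshot = -1), the guarded function returns the sentinel instead of reaching the check. Whether the guard belongs in engine_io_check_snapshot_triggers() or where the -1 is set is left open here.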