`g-particle (id=7056, type=Gas) did not interact gravitationally with all other gparts` in RT Example
We stumbled upon a possible problem in !1499 (merged).
Running the examples/RadiativeTransferTests/RandomizedBox_3D
example at that state sometimes threw an error of the form:
2590 5.389404e-02 1.0000000 0.0000000 1.220703e-05 43 43 1 1 0 0 0 3.475 0 1.844
[00067.4] engine_drift_all: Drifting all to t=5.390625e-02
[00067.4] space_rebuild: (re)building space
[00067.6] runner_others.c:runner_do_end_grav_force():807: g-particle (id=7056, type=Gas) did not interact gravitationally with all other gparts gp->num_interacted=31041, total_gparts=31000 (local num_gparts=31000 inhibited_gparts=0)
Here's a copy of TK's output.log.
The error is not thrown on every run, which points towards a race condition somewhere.
I was able to reproduce the issue at the same place, and I get it every time when restarting from step 2500. However, this time the particle is missing 41 interactions (30959 instead of 31000), whereas in the original run it had 41 too many (31041):
2590 5.388184e-02 1.0000000 0.0000000 2.441406e-05 44 44 2 2 0 0 0 6.913 0 1.364
[00001.2] engine_drift_all: Drifting all to t=5.390625e-02
[00001.2] space_rebuild: (re)building space
[00001.5] runner_others.c:runner_do_end_grav_force():801: g-particle (id=7056, type=Gas) did not interact gravitationally with all other gparts gp->num_interacted=30959, total_gparts=31000 (local num_gparts=31000 inhibited_gparts=0)
Here's the full output.log from my restart run.
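To compare the two failures quickly, here is a small helper sketch that pulls the counters out of the error line and prints the mismatch. It is not part of the code base; the function name `interaction_mismatch` and the regular expression are my own, and it only assumes the message format shown in the two log excerpts above.

```python
import re

# Matches the two counters as they are printed in the error message above.
COUNTERS = re.compile(r"num_interacted=(\d+), total_gparts=(\d+)")


def interaction_mismatch(line):
    """Return num_interacted - total_gparts, or None if the line has no counters."""
    match = COUNTERS.search(line)
    if match is None:
        return None
    num_interacted, total_gparts = (int(group) for group in match.groups())
    return num_interacted - total_gparts


# Tails of the two error lines quoted in this issue.
first_run = "gp->num_interacted=31041, total_gparts=31000 (local num_gparts=31000 inhibited_gparts=0)"
restart_run = "gp->num_interacted=30959, total_gparts=31000 (local num_gparts=31000 inhibited_gparts=0)"

print(interaction_mismatch(first_run))    # 41  (too many interactions)
print(interaction_mismatch(restart_run))  # -41 (too few interactions)
```

The offset has the same magnitude in both runs but the opposite sign.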
Matthieu pointed out that the issue is probably due to the nonsensical gravity setup: I used only 12 mesh cells, while the code determined that 11 top-level cells should suffice for the run. Indeed, after increasing the mesh to 32 cells the error no longer reproduces. Matthieu also writes:
Surely having more TLCs than gravity mesh cells is not the expected way of running the code. If that's the problem indeed then the solution is to prevent users from making that choice.
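A rough sketch of what such a guard could look like, written in Python for brevity rather than in the code's own C. Everything here is hypothetical: the function name, the argument names, and especially `min_ratio`, since the exact mesh-to-top-level-cell ratio the gravity solver needs has not been established in this thread. The default of 2 is only chosen so that the two setups reported above (12 mesh cells failing, 32 working, both with 11 top-level cells) land on opposite sides of the check.

```python
def check_mesh_vs_top_level_cells(mesh_side_length, top_level_cells, min_ratio=2):
    """Reject setups whose gravity mesh is too coarse for the top-level grid.

    min_ratio is an assumed threshold for illustration, not a value
    taken from the code.
    """
    required = min_ratio * top_level_cells
    if mesh_side_length < required:
        raise ValueError(
            f"Gravity mesh has {mesh_side_length} cells per side but at least "
            f"{required} are required for {top_level_cells} top-level cells."
        )


# The setup that no longer reproduces the error passes the check.
check_mesh_vs_top_level_cells(32, 11)

# The setup from this issue is rejected up front.
try:
    check_mesh_vs_top_level_cells(12, 11)
except ValueError as err:
    print(err)
```

Failing at start-up like this would turn a sporadic error at step 2590 into an immediate, explicit message about the parameter choice.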