Rounding can leave particles out of cells/domain
When particles are moving slowly relatively to the box size, it is possible for rounding errors to leave a particle with position = boxsize when rounded periodically, e.g.
p->x[2] = 1.5e-19
dim_z = 2.000000000001
dim_z + p->x == dim_z
This means when the periodic boundary condition is activated, the system crashes as it looks for a cell that doesn't exist.
@nnrw56 suggested an easy fix to this:
const double old_pos_x = (p->x[0] + dim_x) - dim_x;
const double old_pos_y = (p->x[1] + dim_y) - dim_y;
const double old_pos_z = (p->x[2] + dim_z) - dim_z;
This works for gcc -O0
and icc. With gcc -Ofast -ffast-math
this doesn't work due to -fassociative-math
.
To see this you can test this code with godbolt:
#include <stdio.h>
#include <stdlib.h>
struct part{
double x[3];
};
void main(){
struct part parts[256];
for(int i = 0; i < 256; i++){
parts[i].x[0] = -((double)rand()/(double)RAND_MAX) * 1.e-19;
parts[i].x[1] = 0.0;
parts[i].x[2] = 0.0;
double dim_x = 2.0;
double dim_y = 1.0;
double dim_z = 1.0;
struct part *p = &parts[i];
const double old_pos_x = (p->x[i] + dim_x) - dim_x;
printf("%e %e\n", old_pos_x, parts[i].x[0]);
}
}
With -std=c99 -Ofast -ffast-math
you will get something beginning -8.401877e-20 -8.401877e-20
With -std=c99 -Ofast -ffast-math -fno-associative-math
you will get the expected result: 0.000000e+00 -8.401877e-19
For intel (using 17.0 on godbolt), with simply -std=c99 -Ofast
you get -8.401877e-20 -8.401877e-20
If you use std=c99 -Ofast -fp-model consistent
(or precise) you get the expected: 0.000000e+00 -8.401877e-20
The alternative solution would be something like if(pos_x == dim_x) pos_x == 0.0;
after wrapping the particles.
So the questions are:
Readability - which is preferred? The second is more obvious but its so deep in the code that may not be important.
If not, optimisation: How much does fno-associative-math
and -fp-model consistent
cost for the standard SWIFT runs. My expectation would be -fno-associative-mathmay be low (as the documentation suggests this is basically all it does), but I have no idea on
-fp-model consistent (nor if it should be precise or consistent, my reading suggests maybe precise is less expensive). Also can we apply this flag to just this file (
space.c`) @pdraper ?