Parallel space_rebuild()
Looking at some of the scaling results, it turns out that the last remaining significant chunk of non-parallel code is space_rebuild(), and the majority of the time in there is spent computing the cell index of the particles.
This can easily be done in parallel, and on EAGLE_25 it shows significant improvements in code speed and scalability. I should say, though, that this comes from running on 16 cores only and is based on the vTune outputs (which usually match the actual tests).
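To make the idea concrete, here is a minimal sketch of why the index computation parallelises so easily: each iteration reads one particle and writes one entry of `ind`, so there are no dependencies between iterations. This is not the code in the branch. The `box_wrap()` and `cell_getid()` bodies below are assumed stand-ins, `compute_cell_indices()` is a hypothetical name, the `struct gpart` is reduced to a position, and OpenMP is used purely for illustration (the branch may well use a different threading mechanism).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the particle type: only the position is needed. */
struct gpart {
  double x[3];
};

/* Assumed behaviour of box_wrap(): map x into [a, b) for a periodic box.
   Only valid if x is at most one box length outside the range. */
static double box_wrap(double x, double a, double b) {
  if (x < a) return x + (b - a);
  if (x >= b) return x - (b - a);
  return x;
}

/* Assumed behaviour of cell_getid(): row-major index of cell (i, j, k). */
static int cell_getid(const int cdim[3], double i, double j, double k) {
  return (int)k + cdim[2] * ((int)j + cdim[1] * (int)i);
}

/* Compute the cell index of every particle (hypothetical helper).
   Iteration k reads gparts[k] and writes only ind[k], so the loop can be
   split across threads without any locking. */
void compute_cell_indices(const struct gpart *gparts, size_t nr_gparts,
                          const double dim[3], const int cdim[3], int *ind) {
  assert(gparts != NULL && ind != NULL);

  /* Inverse cell widths along each axis. */
  const double ih_x = cdim[0] / dim[0];
  const double ih_y = cdim[1] / dim[1];
  const double ih_z = cdim[2] / dim[2];

  /* Without -fopenmp the pragma is ignored and the loop runs serially. */
#pragma omp parallel for
  for (long k = 0; k < (long)nr_gparts; k++) {
    const struct gpart *gp = &gparts[k];

    /* Put the position back into the simulation volume. */
    const double pos_x = box_wrap(gp->x[0], 0.0, dim[0]);
    const double pos_y = box_wrap(gp->x[1], 0.0, dim[1]);
    const double pos_z = box_wrap(gp->x[2], 0.0, dim[2]);

    /* Get the cell index and store it. */
    ind[k] = cell_getid(cdim, pos_x * ih_x, pos_y * ih_y, pos_z * ih_z);
  }
}
```

The sketch only reads the positions. Anything that needs cross-particle state (per-cell counts, sorting by index) would still have to happen after the loop or use a reduction.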
What do you think?
Activity
@jwillis could you do a scaling test of this branch against master in your usual setup? On EAGLE_12, space_rebuild() is not as high but still contributes to the losses in efficiency.

    /* Get the particle */
    struct gpart *restrict gp = &gparts[k];

    const double old_pos_x = gp->x[0];
    const double old_pos_y = gp->x[1];
    const double old_pos_z = gp->x[2];

    /* Put it back into the simulation volume */
    const double pos_x = box_wrap(old_pos_x, 0.0, dim_x);
    const double pos_y = box_wrap(old_pos_y, 0.0, dim_y);
    const double pos_z = box_wrap(old_pos_z, 0.0, dim_z);

    /* Get its cell index */
    const int index =
        cell_getid(cdim, pos_x * ih_x, pos_y * ih_y, pos_z * ih_z);
    ind[k] = index;

Reassigned to @pdraper
mentioned in commit e6bc5754
I missed a problem that should be resolved in f15c30bb. Can you check that? It fixes master for me.