
Speedup synchronization

Matthieu Schaller requested to merge speedup_synchronization into master

This is inspired by what you reported on the Optane. By reversing the way we loop, we gain around 20% on cosma7 in this function call. It's not game-changing, but it is still worth the change.

On EAGLE-25 with one node (28 threads) and the Intel compiler with the usual flags, we go from ~510 ms per call to ~400 ms.

Happy to hear any thoughts you may have on this.

Edited by Matthieu Schaller
