Only activate the tend comms that are needed

Merged Matthieu Schaller requested to merge reduced_dt_comms into master

In large simulations, especially those with small time-steps, we swamp the system with tend communications at the end of each step. The current logic has no good way of deciding which ones to launch (because of the complexities of the sync and limiter tasks), so it activates all of them. That can be N^3 * 125 communications, where N is the number of top-level cells along one side. That's about 4e6 comms (32^3 * 125 = 4,096,000) in a 32^3 setup like the one-before-largest COLIBRE runs!

Here, we improve upon this by doing the following:

  • Construct an array of booleans (stored as char) with one entry per top-level cell.
  • When running at the top level, the timestep_collect, sync, and limiter tasks set a cell's boolean to 'true' if they changed anything related to the time-step in that cell.
  • We then all-reduce the array across all nodes.
  • Each node then activates only the tend comms involving local cells for which the boolean is true.
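
The steps above can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not the code in this merge request: the function names (`flag_cell_updated`, `reduce_flags`, `cells_needing_tend`) are hypothetical, and the all-reduce is emulated by OR-ing per-rank arrays in a loop so that the sketch runs without MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a task running at the top level calls this when it
 * changed time-step data in top-level cell `cell`.
 * A plain store is used for brevity; a later commit in this merge request
 * switches to an atomic update because several threads may write here. */
void flag_cell_updated(char *flags, size_t n_cells, size_t cell) {
  assert(cell < n_cells);
  flags[cell] = 1;
}

/* Stand-in for the all-reduce (assumption: emulated, no MPI involved).
 * `all_flags` holds n_ranks arrays of n_cells flags, stored back to back.
 * The result in `global` is the element-wise logical OR over all ranks. */
void reduce_flags(const char *all_flags, int n_ranks, size_t n_cells,
                  char *global) {
  memset(global, 0, n_cells);
  for (int r = 0; r < n_ranks; r++)
    for (size_t c = 0; c < n_cells; c++)
      if (all_flags[(size_t)r * n_cells + c]) global[c] = 1;
}

/* Collect the local top-level cells whose reduced flag is set. These are
 * the cells whose tend comms would be activated. Returns how many cells
 * were written to `out`, which must have room for n_local entries. */
size_t cells_needing_tend(const char *global, const size_t *local_cells,
                          size_t n_local, size_t *out) {
  size_t count = 0;
  for (size_t i = 0; i < n_local; i++)
    if (global[local_cells[i]]) out[count++] = local_cells[i];
  return count;
}
```

With two emulated ranks flagging cells 3 and 5 of an 8-cell grid, a node owning cells 0, 3, and 5 would activate tend comms for cells 3 and 5 only. In a real MPI code the reduction would presumably be a single collective over the flag array rather than a loop.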

This means we trade a lot of communications for a global reduction bottleneck.

On a COLIBRE L100N1504 run on 20 nodes (80 ranks, 32^3 top-level cells), we see a 5-10% speed-up. In particular, all the steps involving very few particles are significantly faster (from 500+ ms down to 200 ms). The impact will be larger at even higher resolution.

Edited by Matthieu Schaller

Activity

  • added 1 commit

    • 20f51ab2 - Reorder the operations in the time integration to use the same updates as in master

  • added 1 commit

    • 2c8b8468 - Use an atomic operation to update the space-carried array of top-level cell updates

  • added 1 commit

    • f79149ff - Fix documentation of the new function

  • Turns out that using an atomic update is necessary when different threads try to act on the same thing... Who knew?

  • Matthieu Schaller resolved all threads

  • added 1 commit

  • added 9 commits
