Add MPI version of SNII kinetic feedback

Merged Evgenii Chaikin requested to merge kinetic_feedback_with_MPI into master

Example of the task dependency graph from the isolated galaxy example with MPI, which ran for 100 Myr on 2 ranks and didn't crash.

dependency_graph_0

Edited by Matthieu Schaller

Activity

  • added MPI SPH labels

  • Evgenii Chaikin changed the description

  • added 1 commit

  • The simplest example of task dependencies with the new MPI tasks

    dependency_graph_0

  • added 1 commit

    • 82d2030e - Fix extremely annoying mistake in runner_doiact_functions_stars.h

  • The first important test is passed!

    15 stellar particles kicking gas particles in a homogeneous box: MPI and non-MPI versions produce the same result

    A piece of debug output:

    Loop2: Increase the largest stellar id of part 103901 from -1 to 2 (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 103022 from -1 to 12 (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 103983 from -1 to 14 (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 103869 from -1 to 14 (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 102024 from -1 to 8 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 104042 from -1 to 6 (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 102093 from -1 to 15 (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 102591 from -1 to 12 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100153 from -1 to 3 (ray 0, MPI rank 1) 
    Loop3: Increment the switch value of star 2 by 1 because part 103901 wants to be kicked by this star (ray 0, MPI rank 1) 
    Loop3: Increment the switch value of star 12 by 1 because part 103022 wants to be kicked by this star (ray 0, MPI rank 1) 
    Loop3: Increment the switch value of star 14 by 1 because part 103983 wants to be kicked by this star (ray 0, MPI rank 1) 
    Loop3: Increment the switch value of star 14 by 1 because part 103869 wants to be kicked by this star (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 100030 from -1 to 3 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100966 from -1 to 11 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101781 from -1 to 4 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 102523 from -1 to 2 (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 2 by 1 because part 102523 wants to be kicked by this star (ray 0, MPI rank 1) 
    Loop3: Increment the switch value of star 12 by 1 because part 102591 wants to be kicked by this star (ray 0, MPI rank 1) 
    Loop2: Increase the largest stellar id of part 100040 from -1 to 13 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101298 from -1 to 15 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101402 from -1 to 11 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100740 from -1 to 9 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100300 from -1 to 4 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101414 from -1 to 9 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100770 from -1 to 8 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101571 from -1 to 1 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101263 from -1 to 6 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 102003 from -1 to 5 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100796 from -1 to 7 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101319 from -1 to 13 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 102200 from -1 to 10 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101422 from -1 to 10 (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 8 by 1 because part 102024 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 101145 from -1 to 7 (ray 0, MPI rank 0) 
    Loop2: Increase the largest stellar id of part 100768 from -1 to 5 (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 3 by 1 because part 100030 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 11 by 1 because part 100966 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 4 by 1 because part 101781 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 13 by 1 because part 100040 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 15 by 1 because part 101298 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 11 by 1 because part 101402 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 9 by 1 because part 100740 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 4 by 1 because part 100300 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 9 by 1 because part 101414 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop4: Part 103901 is kicked by star 2 (ray 0, MPI rank 1)
    Loop3: Increment the switch value of star 8 by 1 because part 100770 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 6 by 1 because part 101263 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 7 by 1 because part 100796 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 5 by 1 because part 102003 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 7 by 1 because part 101145 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop4: Part 103022 is kicked by star 12 (ray 0, MPI rank 1)
    Loop3: Increment the switch value of star 5 by 1 because part 100768 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 15 by 1 because part 102093 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 1 by 1 because part 101571 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop4: Part 103983 is kicked by star 14 (ray 0, MPI rank 1)
    Loop3: Increment the switch value of star 6 by 1 because part 104042 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop4: Part 103869 is kicked by star 14 (ray 0, MPI rank 1)
    Loop3: Increment the switch value of star 3 by 1 because part 100153 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 13 by 1 because part 101319 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 10 by 1 because part 102200 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop3: Increment the switch value of star 10 by 1 because part 101422 wants to be kicked by this star (ray 0, MPI rank 0) 
    Loop4: Part 104042 is kicked by star 6 (ray 0, MPI rank 1)
    Loop4: Part 102024 is kicked by star 8 (ray 0, MPI rank 0)
    Loop4: Part 102024 is heated by star 1 (ray 0, MPI rank 0)
    Loop4: Part 102591 is kicked by star 12 (ray 0, MPI rank 0)
    Loop4: Part 100153 is kicked by star 3 (ray 0, MPI rank 1)
    Loop4: Part 102093 is kicked by star 15 (ray 0, MPI rank 1)
    Loop4: Part 100030 is kicked by star 3 (ray 0, MPI rank 0)
    Loop4: Part 100966 is kicked by star 11 (ray 0, MPI rank 0)
    Loop4: Part 101781 is kicked by star 4 (ray 0, MPI rank 0)
    Loop4: Part 102523 is kicked by star 2 (ray 0, MPI rank 0)
    Loop4: Part 100040 is kicked by star 13 (ray 0, MPI rank 0)
    Loop4: Part 101298 is kicked by star 15 (ray 0, MPI rank 0)
    Loop4: Part 101402 is kicked by star 11 (ray 0, MPI rank 0)
    Loop4: Part 100740 is kicked by star 9 (ray 0, MPI rank 0)
    Loop4: Part 100300 is kicked by star 4 (ray 0, MPI rank 0)
    Loop4: Part 101414 is kicked by star 9 (ray 0, MPI rank 0)
    Loop4: Part 100770 is kicked by star 8 (ray 0, MPI rank 0)
    Loop4: Part 101263 is kicked by star 6 (ray 0, MPI rank 0)
    Loop4: Part 100796 is kicked by star 7 (ray 0, MPI rank 0)
    Loop4: Part 102003 is kicked by star 5 (ray 0, MPI rank 0)
    Loop4: Part 101145 is kicked by star 7 (ray 0, MPI rank 0)
    Loop4: Part 100768 is kicked by star 5 (ray 0, MPI rank 0)
    Loop4: Part 101571 is heated by star 1 (ray 0, MPI rank 0)
    Loop4: Part 101319 is kicked by star 13 (ray 0, MPI rank 0)
    Loop4: Part 102200 is kicked by star 10 (ray 0, MPI rank 0)
    Loop4: Part 101422 is kicked by star 10 (ray 0, MPI rank 0)
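The log above suggests a three-step arbitration between stars and gas particles. The following is a minimal sketch of that reading, not the code from this branch: the struct names, the `toy_pair` candidate list and `toy_run` are hypothetical, and the rule "the star with the largest id wins" is inferred only from the log messages. Each loop finishes over all candidate pairs before the next one starts, which is the point where an MPI run would need to exchange the updated particle and star data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the SWIFT structs. */
struct toy_part {
  long long id;
  long long largest_star_id; /* -1: no star has proposed a kick yet */
  long long kicked_by;       /* id of the star that kicked it, or -1 */
};

struct toy_star {
  long long id;
  int switch_value; /* number of particles that accepted this star */
};

/* One candidate interaction: indices into the stars and parts arrays. */
struct toy_pair {
  size_t star;
  size_t part;
};

/* Run the three steps seen in the log, each over all candidate pairs. */
static void toy_run(struct toy_part *parts, struct toy_star *stars,
                    const struct toy_pair *pairs, size_t n_pairs) {
  /* "Loop2": every particle remembers the largest star id proposed to it. */
  for (size_t k = 0; k < n_pairs; k++) {
    struct toy_part *p = &parts[pairs[k].part];
    const struct toy_star *s = &stars[pairs[k].star];
    if (s->id > p->largest_star_id) p->largest_star_id = s->id;
  }

  /* "Loop3": a star counts a particle only if that particle chose it. */
  for (size_t k = 0; k < n_pairs; k++) {
    const struct toy_part *p = &parts[pairs[k].part];
    struct toy_star *s = &stars[pairs[k].star];
    if (p->largest_star_id == s->id) s->switch_value++;
  }

  /* "Loop4": only the chosen star applies the kick. */
  for (size_t k = 0; k < n_pairs; k++) {
    struct toy_part *p = &parts[pairs[k].part];
    const struct toy_star *s = &stars[pairs[k].star];
    if (p->largest_star_id == s->id) p->kicked_by = s->id;
  }
}
```

In this toy model a particle reached by several stars is kicked by exactly one of them, independent of the order in which the pairs are visited, which is the property one would want in order for MPI and non-MPI runs to agree.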
  • One thing to look at (before I forget) is to not break the non-kinetic flavour. That means only creating & activating the comms when the code is actually compiled with the extra loops.

  • With this branch, the isolated galaxy example at M6 resolution is running as expected.

    SFR_galaxy

    Edited by Evgenii Chaikin
  • Evgenii Chaikin changed the description

  • added 3 commits

    • 0823810d - Only construct the task_subtype_part_prep1 communications if we are
    • 51817e19 - Apply similar changes to the new star comms
    • 7ed70dc9 - Also create the correct dependencies in...

  • @chaikin I have pushed a few things:

    • Only create the new comm tasks when we actually use a feedback model with 4 loops
    • Only create the new dependencies when we actually use a feedback model with 4 loops
    • Use the correct super-pointer in the send task creation. Might have fixed your issue from above.
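The first two points above amount to guarding the creation of the extra communications behind a "do we have four loops" switch. A minimal sketch of that idea follows; `TOY_EXTRA_STAR_LOOPS`, `enum toy_comm` and `toy_make_comms` are hypothetical names chosen for illustration and are not taken from the SWIFT sources.

```c
#include <assert.h>

/* Hypothetical build flag: define it when the chosen feedback model
 * needs the extra "prep" loops (four loops in total). */
#ifdef TOY_EXTRA_STAR_LOOPS
#define TOY_HAS_EXTRA_LOOPS 1
#else
#define TOY_HAS_EXTRA_LOOPS 0
#endif

/* Hypothetical communication kinds, loosely modelled on the commit titles. */
enum toy_comm {
  toy_comm_baseline,  /* whatever the two-loop scheme already exchanges */
  toy_comm_part_prep, /* extra gas-particle exchange */
  toy_comm_spart_prep /* extra star-particle exchange */
};

/* Fill `out` with the comms to create; returns how many were written.
 * The extra comms are only added when the extra loops exist. */
static int toy_make_comms(int has_extra_loops, enum toy_comm *out,
                          int max_out) {
  int n = 0;
  if (n < max_out) out[n++] = toy_comm_baseline;
  if (has_extra_loops) {
    if (n < max_out) out[n++] = toy_comm_part_prep;
    if (n < max_out) out[n++] = toy_comm_spart_prep;
  }
  return n;
}
```

A caller would pass the build-time constant, for example `toy_make_comms(TOY_HAS_EXTRA_LOOPS, out, 3)`, so that a two-loop build never creates or waits on the extra communications.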
  • added 1 commit

    • bfd48164 - Applied code formatting script

  • added 1 commit

    • dfb9f322 - Give the hydro_prep_ghost task the correct colour on the task dependency graph

  • Matthieu Schaller unmarked as a Work In Progress

  • added 1 commit

    • 669290dd - Only activate the new tasks in engine_marktasks() if we actually have the new loops

    • Resolved by Matthieu Schaller

      There is one remaining issue. In the unskip, we try to activate the send/recv tasks even if we are in the old scenario where we only have two loops.

      We could either play the same trick as in engine_marktasks() or we could move some of the activation into the following block where we loop over the prep tasks. These will only exist if we have the extra loop.
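The second option can be sketched as follows, under the assumption that the prep tasks hang off a linked list that is simply empty in a two-loop build. All names here (`toy_task`, `toy_link`, `toy_unskip_prep`) are hypothetical; the real unskip code is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task: an activity flag plus optional attached comm tasks. */
struct toy_task {
  int active;
  struct toy_task *send;
  struct toy_task *recv;
};

/* Hypothetical linked list of prep tasks attached to a cell. */
struct toy_link {
  struct toy_task *t;
  struct toy_link *next;
};

/* Activate one task; returns 1 if it was newly activated, 0 otherwise. */
static int toy_activate(struct toy_task *t) {
  if (t == NULL || t->active) return 0;
  t->active = 1;
  return 1;
}

/* Activate the prep tasks and, from inside the same loop, their send/recv
 * tasks. With an empty list (two-loop build) nothing is touched. */
static int toy_unskip_prep(struct toy_link *prep_list) {
  int activated = 0;
  for (struct toy_link *l = prep_list; l != NULL; l = l->next) {
    if (l->t == NULL) continue;
    activated += toy_activate(l->t);
    activated += toy_activate(l->t->send);
    activated += toy_activate(l->t->recv);
  }
  return activated;
}
```

Because the activation of the comms lives inside the loop over prep tasks, no separate "do we have the extra loops" check is needed in this part of the code: the absence of prep tasks already implies that there is nothing to send or receive.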
