Pack the timebin for the limiter communications

For the part communication related to the time-step limiter, only send the time-bin (an `int8_t`) rather than the whole particle. The packing is done manually rather than by using an MPI-provided mechanism.
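As a rough illustration of the idea, a manual pack/unpack of only the time-bins might look like the sketch below (the type and function names here are hypothetical stand-ins, not SWIFT's actual `struct part` or task functions):

```c
#include <stdint.h>

/* Hypothetical stand-ins for the particle and time-bin types. */
typedef int8_t timebin_t;
struct part { double x[3]; timebin_t time_bin; };

/* Pack only the time-bins of a cell's particles into a compact
 * buffer, instead of sending the full struct part over MPI. */
void pack_timebins(const struct part *parts, int count, timebin_t *buff) {
  for (int i = 0; i < count; ++i) buff[i] = parts[i].time_bin;
}

/* On the receiving side, scatter the time-bins back into the particles. */
void unpack_timebins(struct part *parts, int count, const timebin_t *buff) {
  for (int i = 0; i < count; ++i) parts[i].time_bin = buff[i];
}
```

The payload per particle drops from `sizeof(struct part)` to a single byte, which is the point of the change.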
added performance label
I am running my standard full-physics eagle-25 box on 2 nodes with 2 ranks / node on c7.
@pdraper Is there another test you'd suggest at this stage? Anything where the limiter was particularly problematic?
To keep you up-to-speed, I will also do the same thing for the `gpart`, but without the need to unpack on the receiving side. That will follow the changes I tried in !1318 (closed), but with the packing also done by hand, in a separate task. These two changes + the numa updates would be nice to have.
added 10 commits
- d3fc8b9e - Add a new task type to do the limiter packing and unpacking
- b7132b1b - Create the new tasks to pack and unpack the limiter and link them
- 615fe200 - Activate the new pack/unpack tasks when activating the corresponding send/recv
- 872ca5ae - Document the new tasks
- 4472f4a4 - Call the correct runner_ function when the unpack task is found in runner_main()
- 3e4f9704 - Use the correct types in the send/recv dependency creation
- 0119d194 - Fix logic in task_pass_buffer()
- f5c1c17f - Also call the new task function when enqueing a task that is done
- aa8c633b - Too many brackets...
- e6e15ca9 - Fix documentation string
added 1 commit
- ff94e221 - Add the new task types to the task plot tools
I guess I could make the pack task's buff pointer point to the send task's buff pointer. Is that cleaner?
I toyed with the idea of fishing out the pointer from the dependency list (of size 1) otherwise. We can't really store the buffer pointer in the cell as their number is variable. A linked list might work there but that's also quite the overhead.
An alternative would be to maintain a dynamic list of buffers where each buffer is an object:

```c
struct buffer {
  struct buffer *next;
  void *buffer;
  enum task_subtype sub_type;
  struct cell *c;
};
```
The engine maintains these. When we pack, we put the buffer pointer in one of these. The send task can then figure out which element it needs to send based on the cell pointer and sub_type. This has to come with a locking mechanism, however.
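A minimal sketch of such an engine-owned, locked buffer list could look like this (all names here are hypothetical; SWIFT's real `enum task_subtype` and `struct cell` are different, and this is just the lookup-by-`(cell, sub_type)` idea with a mutex standing in for the locking mechanism):

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for SWIFT's real types. */
enum task_subtype { task_subtype_limiter, task_subtype_gpart };
struct cell { int id; };

/* One node per packed buffer, keyed by (cell, sub_type). */
struct buffer {
  struct buffer *next;
  void *data;
  enum task_subtype sub_type;
  struct cell *c;
};

struct buffer_list {
  struct buffer *head;
  pthread_mutex_t lock; /* pack and send tasks may run concurrently */
};

/* The pack task registers its buffer here. */
void buffer_list_push(struct buffer_list *l, void *data,
                      enum task_subtype s, struct cell *c) {
  struct buffer *b = malloc(sizeof *b);
  b->data = data; b->sub_type = s; b->c = c;
  pthread_mutex_lock(&l->lock);
  b->next = l->head;
  l->head = b;
  pthread_mutex_unlock(&l->lock);
}

/* The send task looks up (and removes) the buffer matching its
 * cell pointer and sub-type; returns NULL if none is registered. */
void *buffer_list_pop(struct buffer_list *l, enum task_subtype s,
                      struct cell *c) {
  pthread_mutex_lock(&l->lock);
  for (struct buffer **p = &l->head; *p != NULL; p = &(*p)->next) {
    if ((*p)->c == c && (*p)->sub_type == s) {
      struct buffer *b = *p;
      void *data = b->data;
      *p = b->next;
      free(b);
      pthread_mutex_unlock(&l->lock);
      return data;
    }
  }
  pthread_mutex_unlock(&l->lock);
  return NULL;
}
```

The lock is only held for the list walk, not for the MPI send itself, so the contention cost is the linear search; that is the overhead being weighed against the per-cell (lock-free but memory-heavier) variant mentioned below.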
Or we put this linked list in the cell object. Then it's lock-free but more memory-heavy.
I don't like that more than the current solution.