Pack the timebin for the limiter communications

Merged Matthieu Schaller requested to merge pack_limiter into master

For the part communication related to the time-step limiter, only send the time-bin (an int8_t) rather than the whole particle. The packing is done by hand rather than via an MPI-provided mechanism such as MPI_Pack or a derived datatype.
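
In sketch form (illustrative names only, not the actual SWIFT code; the trimmed-down struct part and both helpers are assumptions for the example):

    #include <mpi.h>
    #include <stdint.h>

    typedef int8_t timebin_t;

    /* Heavily trimmed stand-in for the real particle struct, which is much
       larger; that is exactly why sending all of it is wasteful. */
    struct part {
      double x[3];
      timebin_t time_bin;
    };

    /* Copy only the time-bins into a contiguous buffer... */
    void pack_timebins(const struct part *parts, timebin_t *buff,
                       const int count) {
      for (int i = 0; i < count; ++i) buff[i] = parts[i].time_bin;
    }

    /* ...and send count bytes rather than count * sizeof(struct part). */
    void send_timebins(const timebin_t *buff, const int count, const int dest,
                       const int tag, MPI_Request *req) {
      MPI_Isend(buff, count, MPI_INT8_T, dest, tag, MPI_COMM_WORLD, req);
    }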

Edited by Matthieu Schaller

Activity

  • I am running my standard full-physics eagle-25 box on 2 nodes with 2 ranks/node on cosma7.

    @pdraper Is there another test you'd suggest at this stage? Anything where the limiter was particularly problematic?

  • No, that is the test I would run, perhaps over a few more nodes. Stretching the problem usually reveals the MPI issues more easily.

  • As always, I am not sure what to make of the results. At least it's going in the direction we expect.

    [Plot: runtime_a_limiter_comms]

  • One thing I was wondering is whether doing the copy where we do it is indeed the best choice. It currently happens during the enqueue procedure. Would it be better to have a separate task do the copy? When that copy is done, it unlocks the send task, which then does just that: send.
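
    In sketch form, the proposed split would look something like the following (the scheduler calls mirror SWIFT's conventions, but the exact signatures and task-type names here are assumptions):

        /* Illustrative only: create an explicit pack task that fills the send
           buffer, and only unlock the send task once the copy is done, so no
           work happens at enqueue time. */
        void add_limiter_send_with_pack(struct scheduler *sched,
                                        struct cell *ci, struct cell *cj,
                                        const int tag) {
          struct task *t_pack = scheduler_addtask(
              sched, task_type_pack, task_subtype_limiter, 0, 0, ci, NULL);
          struct task *t_send = scheduler_addtask(
              sched, task_type_send, task_subtype_limiter, tag, 0, ci, cj);

          /* The copy must have finished before the MPI send may run. */
          scheduler_addunlock(sched, t_pack, t_send);
        }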

  • Yes, that would be better. We don't really want to do work when queueing. It is much like how I do my split sends using RDMA, which superficially looks better (task-plot wise), but the speed-up wasn't great. I suspect that is down to NUMA effects and the overheads of more tasks!

  • I like the explicit split, so that we can clearly identify in the task plot what the costs are.

    Did you modify the task structure to be able to pass the buffer pointer along?

  • To keep you up to speed: I will also do the same thing for the gparts, but without the need to unpack on the receiving side. That will follow the changes I tried in !1318 (closed), but with the packing also done by hand in a separate task.

    These two changes + the NUMA updates would be nice to have.

  • Yes, I had an additional pointer for the extra memory location (and a lot of other things as well, for RDMA support).

  • Matthieu Schaller added 10 commits

    • d3fc8b9e - Add a new task type to do the limiter packing and unpacking
    • b7132b1b - Create the new tasks to pack and unpack the limiter and link them
    • 615fe200 - Activate the new pack/unpack tasks when activating the corresponding send/recv
    • 872ca5ae - Document the new tasks
    • 4472f4a4 - Call the correct runner_ function when the unpack task is found in runner_main()
    • 3e4f9704 - Use the correct types in the send/recv dependency creation
    • 0119d194 - Fix logic in task_pass_buffer()
    • f5c1c17f - Also call the new task function when enqueing a task that is done
    • aa8c633b - Too many brackets...
    • e6e15ca9 - Fix documentation string


  • What do you think of the solution adopted here? I recycled the task->buff pointer but had to invent a function to pass the information along (a sketch of the shape of that hand-off is below).

    I am not convinced I like it...

    It does run, however.
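
    The body of task_pass_buffer() is not shown in this thread; purely as a guess at the shape of the hand-off:

        /* Hypothetical sketch: the pack task fills a buffer and recycles the
           generic task->buff field to pass the pointer on to the send task.
           This is a guess at the shape, not the actual function. */
        void task_pass_buffer(struct task *from, struct task *to) {
          to->buff = from->buff; /* the send task now owns the packed data */
          from->buff = NULL;     /* avoid two owners of one allocation */
        }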

  • added 1 commit

    • ff94e221 - Add the new task types to the task plot tools


  • I see, yes that does feel unpleasant and a bit extreme to avoid an extra pointer. Could we store a pointer to the send/recv task somewhere and just use the buff from it?

  • I guess I could use the pack task's buff pointer to point to the buff pointer of the send task. Is that cleaner?

    I toyed with the idea of fishing the pointer out of the dependency list (of size 1) otherwise. We can't really store the buffer pointer in the cell, as the number of buffers per cell is variable. A linked list might work there, but that's also quite the overhead.
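
    That dependency-list idea would be a one-liner, assuming the pack task unlocks exactly one task (the field name follows SWIFT's task struct, but treat it as an assumption):

        /* If the pack task's only dependent is its send task, we can fish the
           send task (and hence its buffer) out of the unlock list directly. */
        struct task *t_send = t_pack->unlock_tasks[0];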

  • Here is what this all looks like on 2 and 8 cosma7 nodes prior to this last batch of changes:

    [Plot: runtime_a_limiter_comms]

  • Looking good, if only that orange track hadn't looked the same for the first 18 hours.

  • An alternative would be to maintain a dynamic list of buffers, where each buffer is an object:

        struct buffer {
          struct buffer *next;        /* next element in the list */
          void *buffer;               /* the packed data itself */
          enum task_subtype sub_type; /* which communication this belongs to */
          struct cell *c;             /* the cell the data was packed for */
        };

    The engine maintains these. When we pack, we put the buffer pointer in one of these. The send task can then figure out which element it needs to send based on the cell pointer and sub_type. This has to come with a locking mechanism, however (see the sketch after this comment).

    Or we put this linked list in the cell object. Then it's lock-free but more memory-heavy.

    I don't like that more than the current solution.
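
    For concreteness, a hedged sketch of the lookup the send task would have to do against such an engine-level list (the mutex and all names are illustrative; the engine would presumably use its own lock type):

        #include <pthread.h>
        #include <stddef.h>

        struct cell; /* opaque here; the real definition lives elsewhere */
        enum task_subtype { task_subtype_limiter /* , ... */ };

        struct buffer {
          struct buffer *next;
          void *buffer;
          enum task_subtype sub_type;
          struct cell *c;
        };

        struct buffer_list {
          struct buffer *head;
          pthread_mutex_t lock; /* serialises insertions and lookups */
        };

        /* Send-task side: find the packed data for this cell and sub-type. */
        void *buffer_list_find(struct buffer_list *l, const struct cell *c,
                               const enum task_subtype sub_type) {
          void *ret = NULL;
          pthread_mutex_lock(&l->lock);
          for (const struct buffer *b = l->head; b != NULL; b = b->next) {
            if (b->c == c && b->sub_type == sub_type) {
              ret = b->buffer;
              break;
            }
          }
          pthread_mutex_unlock(&l->lock);
          return ret;
        }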

  • The other option is to recycle buff in the task to point to the next task. Might be a tiny bit cleaner.

  • Isn't there a one-to-one relationship between pack/unpack and send/recv? If so, the pack/unpack buff pointer could point at the send/recv task and, through it, access the send/recv task's buff pointer, which both tasks would share.
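
    A hedged sketch of that variant, reusing the illustrative timebin_t from the first sketch above (everything except the buff field is an assumption, including the cell layout):

        /* With a one-to-one pack -> send pairing, the pack task's generic
           buff field can point at the send task itself; both tasks then share
           the send task's buffer. */
        void runner_do_pack_limiter(const struct cell *c, struct task *t_pack) {
          struct task *t_send = (struct task *)t_pack->buff; /* paired task */
          timebin_t *buff = (timebin_t *)t_send->buff;       /* shared buffer */
          for (int i = 0; i < c->hydro.count; ++i)
            buff[i] = c->hydro.parts[i].time_bin;
        }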

  • added 1 commit

    • f51c9ec9 - Use directly the pointer of the next task

