Skip to content
Snippets Groups Projects

Pack the task types and re-arrange the task structure to go from 80 bytes to 64 bytes.

Merged Matthieu Schaller requested to merge skinny_task into master

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • mentioned in issue #194 (closed)

  • I have limited the number of dependencies per task to 32678 as I am using a short int instead of an int. Similarly, I have limited the number of runners to 32768 for the same reason. Going to char seemed to extreme.

    I have also packed the enums but I haven't yet investigated whether this has knock-on effects elsewhere. Are we doing bit-wise arithmetic with the task types anywhere ?

  • Good stuff!

    We could further pack skip, tight, and implicit into a bitmask on a single char as well.

    I think the packed enums should be fine, we actually want them to be contiguous since the scheduler mask uses their values.

  • In non-MPI mode we are at 63 bytes. Packing these together will bring us to 61. Is it worth it ?

    The flag is used mostly for the sorts. Do we need all 32 bits there ? Could we do with a short int instead ?

  • The flags field is also used by send/recv tasks to store the tag, IIRC, so we probably can't shorten it.

    How large are things with MPI?

  • Indeed, forgot about the use of the flag for MPI.

    With MPI, we get an extra 16 bytes bringing us to 80 bytes. Both the pointer and the Request are 8 bytes. So, if we align on 32 bytes, that would be 96.

    The tic and toc cost us 16 bytes but I don't think there is a way around it. We can't replace the cell pointers by indices as the cells are in a linked list. So, we won't be able to bring this down to 64 anyway.

  • Added 1 commit:

    • b0a3c8d6 - Align the tasks when allocated. Alignment set to 32 bytes.
  • Here is an updated version with aligned allocations.

  • Well, we could replace tic and toc with a tic_toc value which is just replaced by the difference when computed... But then we're still short a byte :/

  • 61 - 8 + 16 would give us 69 in MPI-land. So still 5 off.

  • The master branch currently has sizes 80 and 96 for the task structure in non-MPI/MPI modes. The changes in here would bring this to 64/80 and then with alignment to 64/96. With the insurance that everything is tightly packed and well aligned.

    Does aligning things on 16 still make sense on current architectures ?

  • The main reason for aligning things on 64 (or a multiple thereof) was to make sure that they're aligned with the CPU cache lines. For 96-byte structures, this is a bit broken, but I would still posix_memalign them to a multiple of 64 or 128 so that they span at most two lines.

  • So we would do 64 bytes in non-MPI and 128 bytes in MPI-land ?

  • For reference, we do currently align the xpart and gpart to 32.

  • No, I'd stick with 96 in MPI-land, but posix_memalign to 128 bytes to make sure we only use at most two cache lines.

  • So,

    struct task{/*...*/} __attribute__((aligned(32)));

    and

    posix_memalign(..., 128,...) 

    or

    posix_memalign(..., 64,...) 
  • OK, we should also align those to 128. Cache lines are typically 64 bytes, but we should accommodate for when they go to 128 bytes (they will eventually).

  • Ok. Will do.

  • Yes, the alignment in posix_memalign is only for the first element, so it should not depend on the struct size.

  • Added 1 commit:

    • e25c7c7f - Differentiate the alignment of the task structure in MPI and non-MPI worlds.
  • Are you happy with these changes or should I just align everything on 128 anyway ?

29 29 #include "cell.h"
30 30 #include "cycle.h"
31 31
32 #ifdef WITH_MPI
33 #define task_align 128
  • Reassigned to @pdraper

  • Apart from the one comment, all is good!

  • Added 1 commit:

    • 6d1f3277 - Align on 128 bytes in all cases.
  • Yep, done. Thanks !

  • Matthieu Schaller Title changed from [WIP] Pack the task types and re-arrange the task structure to go from 80 bytes to 64 bytes. to Pack the task types and re-arrange the task structure to go from 80 bytes to 64 bytes.

    Title changed from [WIP] Pack the task types and re-arrange the task structure to go from 80 bytes to 64 bytes. to Pack the task types and re-arrange the task structure to go from 80 bytes to 64 bytes.

  • Looks good to me, accepting.

  • Peter W. Draper Status changed to merged

    Status changed to merged

  • Peter W. Draper mentioned in commit 9ade785e

    mentioned in commit 9ade785e

  • Please register or sign in to reply
    Loading