Pack the task types and re-arrange the task structure to go from 80 bytes to 64 bytes.
Activity
mentioned in issue #194 (closed)
I have limited the number of dependencies per task to 32768 as I am using a short int instead of an int. Similarly, I have limited the number of runners to 32768 for the same reason. Going to char seemed too extreme.
I have also packed the enums but I haven't yet investigated whether this has knock-on effects elsewhere. Are we doing bit-wise arithmetic with the task types anywhere?
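For readers following along, here is a minimal sketch of the kind of change described above. It is not the actual task structure: every name prefixed with `example_` and the fields `wait`, `rid`, `ci` and `cj` are made up for illustration, and `__attribute__((packed))` is a GCC/Clang extension. It only shows that narrowing counters to short int, packing the enum, and ordering members from widest to narrowest removes padding.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical task kinds, not the project's real list. */
enum example_task_type_wide {
  example_wide_none = 0,
  example_wide_sort,
  example_wide_pair
};

/* Same kinds, stored in the smallest integer type that can hold them.
   The packed attribute on an enum is a GCC/Clang extension. */
enum example_task_type_packed {
  example_packed_none = 0,
  example_packed_sort,
  example_packed_pair
} __attribute__((packed));

/* Before: int counters and a default-sized enum, interleaved with
   pointers so the compiler has to insert padding between members. */
struct example_task_wide {
  enum example_task_type_wide type;
  void *ci;
  int wait;
  void *cj;
  int rid;
  long long tic, toc;
};

/* After: widest members first, short int counters, packed enum last. */
struct example_task_packed {
  void *ci, *cj;
  long long tic, toc;
  short int wait; /* dependency counter, limited to SHRT_MAX */
  short int rid;  /* runner id, limited to SHRT_MAX */
  enum example_task_type_packed type;
};

/* A signed short int is guaranteed to hold at least 32767. */
static_assert(SHRT_MAX >= 32767, "short int narrower than expected");
static_assert(sizeof(enum example_task_type_packed) == 1,
              "packed enum should occupy one byte with GCC/Clang");
static_assert(sizeof(struct example_task_packed) <
                  sizeof(struct example_task_wide),
              "re-ordered layout should be smaller");
```

On a typical 64-bit Linux build the two example structures come out at 56 and 40 bytes, but the exact numbers depend on the ABI, which is why the sketch only asserts the relative ordering.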
Indeed, forgot about the use of the flag for MPI.
With MPI, we get an extra 16 bytes bringing us to 80 bytes. Both the pointer and the Request are 8 bytes. So, if we align on 32 bytes, that would be 96.
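The 80 to 96 step is just rounding the structure size up to the next multiple of the alignment. A small sketch of that arithmetic, where the macro name `EXAMPLE_ROUND_UP` is an assumption made for this illustration and not something from the code base:

```c
#include <assert.h>

/* Round size up to the next multiple of align (align must be non-zero). */
#define EXAMPLE_ROUND_UP(size, align) \
  ((((size) + (align) - 1) / (align)) * (align))

/* Sizes quoted in this thread, aligned on 32 bytes. */
static_assert(EXAMPLE_ROUND_UP(64, 32) == 64, "non-MPI size stays at 64");
static_assert(EXAMPLE_ROUND_UP(80, 32) == 96, "MPI size grows to 96");
```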
The tic and toc cost us 16 bytes but I don't think there is a way around it. We can't replace the cell pointers by indices as the cells are in a linked list. So, we won't be able to bring this down to 64 anyway.
Added 1 commit:
- b0a3c8d6 - Align the tasks when allocated. Alignment set to 32 bytes.
The master branch currently has sizes 80 and 96 for the task structure in non-MPI/MPI modes. The changes in here would bring this to 64/80, and then with alignment to 64/96, with the assurance that everything is tightly packed and well aligned.
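As a rough illustration of what "align the tasks when allocated" can look like, here is a sketch using `posix_memalign`. The names `EXAMPLE_TASK_ALIGN`, `struct example_task` and `example_alloc_tasks` are hypothetical, the 64-byte payload merely stands in for the real members, and the `aligned` attribute is a GCC/Clang extension. The actual commit may do this differently.

```c
#define _POSIX_C_SOURCE 200112L /* for posix_memalign */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment, matching the 32 bytes mentioned in the commit. */
#define EXAMPLE_TASK_ALIGN 32

/* Stand-in for the task structure. The aligned attribute pads sizeof()
   to a multiple of the alignment, so every array element stays aligned. */
struct example_task {
  char payload[64];
} __attribute__((aligned(EXAMPLE_TASK_ALIGN)));

/* Allocate count tasks starting on an aligned boundary.
   Returns NULL on failure; release the memory with free(). */
static struct example_task *example_alloc_tasks(size_t count) {
  void *ptr = NULL;
  if (posix_memalign(&ptr, EXAMPLE_TASK_ALIGN,
                     count * sizeof(struct example_task)) != 0)
    return NULL;
  return (struct example_task *)ptr;
}
```

With this layout, both the start of the array and each element fall on a multiple of the chosen alignment.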
Does aligning things on 16 still make sense on current architectures?
Added 1 commit:
- e25c7c7f - Differentiate the alignment of the task structure in MPI and non-MPI worlds.
Reassigned to @pdraper
Added 1 commit:
- 6d1f3277 - Align on 128 bytes in all cases.
mentioned in commit 9ade785e