Draft: NUMA-aware pinning of queues and runners.
Adds the capability to pin queues and runners within NUMA regions.
Adds queue selection for tasks based on the NUMA region that holds the start of each task's main data area.
Adds spreading of swift-allocated memory using larger interleave chunks (the default interleave chunk is 4k).
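A minimal sketch of how the spread and the queue selection could fit together. It is not the code in this branch: `node_of_offset()` and `pick_queue()` are hypothetical names, and it assumes memory is spread round-robin over the NUMA regions in fixed-size chunks starting at region 0, and that queues are pinned to regions in equal contiguous blocks.

```c
#include <assert.h>
#include <stddef.h>

/* NUMA region holding byte `offset` of a buffer, ASSUMING the buffer was
 * spread round-robin in `chunk_bytes` pieces, starting at region 0. */
int node_of_offset(size_t offset, size_t chunk_bytes, int nr_nodes) {
  assert(chunk_bytes > 0 && nr_nodes > 0);
  return (int)((offset / chunk_bytes) % (size_t)nr_nodes);
}

/* Queue for a task whose main data starts in region `node`, ASSUMING the
 * queues are pinned in equal contiguous blocks, one block per region.
 * `hint` (e.g. a task counter) spreads tasks over the queues in a block. */
int pick_queue(int node, int nr_queues, int nr_nodes, unsigned int hint) {
  assert(nr_nodes > 0 && nr_queues >= nr_nodes && nr_queues % nr_nodes == 0);
  assert(node >= 0 && node < nr_nodes);
  const int per_node = nr_queues / nr_nodes;
  return node * per_node + (int)(hint % (unsigned int)per_node);
}
```

With 8 queues over 4 regions and 4k chunks, a task whose data starts 3 chunks into a buffer would land in region 3 and, under these assumptions, on queue 6 or 7.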
On COSMA8 this is faster than the existing master, even when master is run with pinning and interleave.
(Based on !1649, which is now merged, so we also have those improvements.)
Not sure yet how serious these changes are. We need to add an additional argument to all the swift_free() calls so that the memory spread can be undone, and the memory alignment is done on page boundaries (4k). This also requires a one-to-one correspondence between queues and runners, as these are pinned to NUMA regions in pairs.
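To illustrate why the free call needs more information, here is a rough sketch. It assumes the additional argument is the allocation size, which is not stated above. `spread_malloc()`, `spread_free()` and `round_up_to_page()` are hypothetical stand-ins, not the swift functions, and the NUMA placement itself is left as a comment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Round `size` up to a whole number of pages. */
size_t round_up_to_page(size_t size, size_t page) {
  assert(page > 0);
  return ((size + page - 1) / page) * page;
}

/* Hypothetical allocator: page-aligned start, whole pages in length. */
void *spread_malloc(size_t size) {
  const size_t page = (size_t)sysconf(_SC_PAGESIZE);
  void *ptr = NULL;
  if (posix_memalign(&ptr, page, round_up_to_page(size, page)) != 0)
    return NULL;
  /* A real version would apply the NUMA spread to this region here. */
  return ptr;
}

/* Hypothetical free: `size` is needed to recompute the extent of the
 * region whose placement has to be reset. Returns that extent in bytes. */
size_t spread_free(void *ptr, size_t size) {
  const size_t page = (size_t)sysconf(_SC_PAGESIZE);
  const size_t extent = round_up_to_page(size, page);
  /* A real version would undo the spread over [ptr, ptr + extent) here. */
  free(ptr);
  return extent;
}
```

A plain free() only receives the pointer, so without the extra argument the extent of the spread region would have to be stored somewhere else.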