More NUMA / shared memory thoughts

Spent the day thinking over shared memory and NUMA.

Supporting a single segment of shared memory is smooth and graceful. It looks good in the API, and is simple and easy for the user to understand.

Multiple segments are messy. The user needs to provide per-process state, and to register each segment in each process before it can be used. Some of the most significant bits of the offset value have to be given over to indicating which segment the offset is from. When the user passes in a pointer, a lookup has to occur to figure out which segment that pointer is from.
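To make the mess concrete, here is a minimal sketch of what that bookkeeping might look like. Everything in it is made up for illustration: the 8-bit segment tag, `struct segment_table`, and the three function names are assumptions, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the top SEGMENT_BITS of a 64-bit offset name the segment. */
#define SEGMENT_BITS 8u
#define MAX_SEGMENTS (1u << SEGMENT_BITS)
#define OFFSET_BITS  (64u - SEGMENT_BITS)
#define OFFSET_MASK  ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Per-process state: where each registered segment is mapped in this process.
 * Must be zero-initialised before the first registration. */
struct segment_table {
    unsigned char *base[MAX_SEGMENTS];
    size_t         size[MAX_SEGMENTS];
    unsigned       count;
};

/* Register a mapped segment; returns its index, or -1 if the table is full.
 * Every process must register the segments in the same order. */
static int segment_register(struct segment_table *t, void *base, size_t size)
{
    if (t->count >= MAX_SEGMENTS)
        return -1;
    t->base[t->count] = base;
    t->size[t->count] = size;
    return (int)t->count++;
}

/* Tagged offset -> pointer valid in this process. */
static void *offset_to_pointer(const struct segment_table *t, uint64_t tagged)
{
    unsigned seg = (unsigned)(tagged >> OFFSET_BITS);
    uint64_t off = tagged & OFFSET_MASK;
    assert(seg < t->count && off < t->size[seg]);
    return t->base[seg] + off;
}

/* Pointer -> tagged offset. This is the lookup: a linear search over the
 * registered segments. Returns 0 on success, -1 if p is in no segment. */
static int pointer_to_offset(const struct segment_table *t, const void *p,
                             uint64_t *out)
{
    uintptr_t addr = (uintptr_t)p;
    for (unsigned i = 0; i < t->count; i++) {
        uintptr_t start = (uintptr_t)t->base[i];
        if (addr >= start && addr - start < t->size[i]) {
            *out = ((uint64_t)i << OFFSET_BITS) | (uint64_t)(addr - start);
            return 0;
        }
    }
    return -1;
}
```

With a single segment, all of this collapses to one base pointer and an add or subtract, which is why the single-segment API is so much nicer.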

There is a reason to use multiple segments in Linux.

The reason is that memory policy is on a per-process basis, not a per-data-structure basis.

So if I go striped, fine, I can allocate one shared memory block and it’ll be striped on a page basis.

But what if I want striped for one data structure, but something else for another?

There is only one policy, and it is enforced when pages are swapped back in, so you can’t set it, do stuff, and then change it: whatever you have set *now* is what gradually comes to be applied, as pages swap in and out.

In fact this is a problem anyway: if I do have multiple shared memory segments, one per NUMA node, and so am controlling my NUMA directly and striping on a per-entity basis, memory policy will mess it up for me by applying itself to my allocations.

So there is only one memory policy and it applies to everything in your process, like it or not. You’re fucked anyway. Multiple segments will not save you, unless you pin the pages so they can’t swap, which isn’t a reasonable thing to ask.

So on Linux, multiple shared memory segments are not useful, because memory policy stops you from controlling your own NUMA anyway.

On Windows, you do need multiple shared memory segments because the OS does not control NUMA. You do it yourself. So if you want to spread an allocation over multiple NUMA nodes, you need to manually allocate on each of them and then put those elements into the data structure.
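As a rough sketch of that manual approach: allocate one block per node, then thread the elements together so the data structure alternates between nodes. The element type and the `alloc_on_node`, `free_on_node` and `build_spread_list` helpers are hypothetical names of my own. On Windows the sketch calls `VirtualAllocExNuma`, which allocates private memory with a *preferred* node; real shared memory would need a section-based equivalent, which this sketch does not attempt. Elsewhere it falls back to `malloc`, purely so the code builds, with no NUMA placement at all.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical element type for a NUMA-spread linked list. */
struct element {
    struct element *next;
    unsigned        node;   /* node this element's memory was requested on */
    int             value;
};

/* Allocate size bytes, preferring NUMA node `node`. */
static void *alloc_on_node(size_t size, unsigned node)
{
#ifdef _WIN32
    return VirtualAllocExNuma(GetCurrentProcess(), NULL, size,
                              MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE, node);
#else
    (void)node;             /* stand-in only: no NUMA placement here */
    return malloc(size);
#endif
}

static void free_on_node(void *p)
{
#ifdef _WIN32
    VirtualFree(p, 0, MEM_RELEASE);
#else
    free(p);
#endif
}

/* One allocation per node, then link the elements round-robin across nodes.
 * blocks[] (node_count entries) receives the per-node blocks so the caller
 * can free them later. Returns the list head, or NULL on failure. */
static struct element *build_spread_list(struct element **blocks,
                                         unsigned node_count,
                                         unsigned per_node)
{
    struct element *head = NULL, **tail = &head;

    for (unsigned n = 0; n < node_count; n++) {
        blocks[n] = alloc_on_node(per_node * sizeof(struct element), n);
        if (blocks[n] == NULL) {
            while (n > 0)
                free_on_node(blocks[--n]);
            return NULL;
        }
    }

    for (unsigned i = 0; i < per_node; i++) {
        for (unsigned n = 0; n < node_count; n++) {
            struct element *e = &blocks[n][i];
            e->next  = NULL;
            e->node  = n;
            e->value = 0;
            *tail = e;          /* append to the end of the list */
            tail  = &e->next;
        }
    }
    return head;
}
```

One block per node rather than one allocation per element is deliberate: `VirtualAllocExNuma` works at allocation-granularity, so allocating element by element would waste a great deal of address space.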