Who allocates for the allocator…?
Well, you could build up state locally, then, once you know what you need to know, alloc in the correct place and transfer your state there.
I guess I’m kinda thinking of an alloc state, which carries the NUMA/logical-core map.
However, in some situations the only solution is striping, e.g. four NUMA nodes and the user says “ALL” – you cannot have equal performance on all four nodes all the time. You could if he only cared about two of the NUMA nodes; then you’d select a middle-man node between them.
But this now is getting back more or less to what the NUMA API for Linux offers; you basically configure your preferences and then malloc does what you’re asking for. I don’t need to implement this, it exists already.
This is not the case under Windows, which only offers a function whereby you indicate the NUMA node you wish to allocate from.
Something else has just struck me though; large pages really are for very specific tasks in controlled environments, because their allocation, relying as it does on contiguous physical memory blocks, can only really be guaranteed just after boot.
Maybe I can for now not worry about large pages – just make the library for normal arbitrary post-boot usage…
What it basically comes down to is that the user must decide where to store his data (which memory) and where to run his code (which core), when his code will be running on a wide range of systems.
What he really wants of course is locality – but now this brings us not only to controlling where the memory goes, but where the threads go. We would need an allocator, for memory, and a starter, for threads…
…and some generalised (and viable!) method for specifying desired behaviour.
So we would imagine some high-level way of quantifying the relationships between threads in an application – this thread communicates with this thread, etc. But then how much do they communicate? 30% with one thread, 70% with another? It’s hard to imagine this working well – the problem here is that too much of the machine detail (cores and memory) has had to be exposed to developers.
It’s a similar problem with liblfds now. Take SMR for example. A library exists not just to encapsulate functionality but also knowledge. It exists to save users having to *understand* the details of its inner workings. If SMR is, say, removed fully from the library and made the user’s responsibility, the user now has to understand SMR. The library has in that sense failed.