With the run-time cache alignment, there are some state structures where the final element is cache line aligned – those structures also needed padding *after* that final element to ensure it has the cache line to itself. Added that.
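To make the padding concrete, here is a minimal sketch in C of how such a layout could be computed at run time. The names (`round_up`, `struct state_layout`, `compute_layout`) are hypothetical and are not taken from the library; the sketch also assumes the cache line size is a power of two and that the allocation itself starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align.
   Assumption: align is a non-zero power of two. */
static size_t round_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical layout of a state structure: some leading members
   (header_size bytes) followed by one final element which must begin
   on a cache line boundary and share its cache line(s) with nothing. */
struct state_layout {
    size_t final_offset; /* offset of the final element from the base */
    size_t total_size;   /* bytes to allocate, including trailing padding */
};

static struct state_layout compute_layout(size_t header_size,
                                          size_t final_size,
                                          size_t cache_line)
{
    struct state_layout layout;

    /* Padding *before* the final element: start it on a cache line. */
    layout.final_offset = round_up(header_size, cache_line);

    /* Padding *after* the final element: extend it to a whole number of
       cache lines, so whatever follows the structure in memory cannot
       land on the same line. */
    layout.total_size = layout.final_offset + round_up(final_size, cache_line);

    return layout;
}
```

For example, with a 40 byte header, an 8 byte final element and 64 byte cache lines, the final element sits at offset 64 and the structure occupies 128 bytes in total, rather than the 72 bytes it would take without the trailing padding.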
Also added a flag to SMR thread registration where the user can indicate the thread should perform “minimal SMR”. To explain: SMR work consists of two things, the scan to see if the generation counter can be incremented, and the scan to liberate release candidates.
The latter is single-threaded and can only be performed by the thread owning those release candidates, so this SMR work must occur; but the scan to see if the generation counter can be incremented is not mandatory for every thread – and it requires scanning the full active and retired thread lists, and the release candidate lists of all retired (but still in-use) thread SMR states.
With the flag set at registration, a given thread does not perform this generation counter scan and only liberates its own release candidates. I see this as useful for low latency threads, say for audio, where burstiness is bad.
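As a rough illustration of the idea, the following C sketch shows a per-thread mode chosen at registration which gates the generation counter scan. Every name here (`smr_thread_mode`, `smr_thread_state`, `smr_thread_register`, `smr_do_work`) is hypothetical and is not the library's API, and the two scans are represented only by counters standing in for the real work.

```c
#include <stddef.h>

/* Hypothetical registration flag: full SMR work, or "minimal SMR". */
enum smr_thread_mode {
    SMR_MODE_FULL,
    SMR_MODE_MINIMAL
};

/* Hypothetical per-thread SMR state. The counters are placeholders
   which record how often each kind of scan would have run. */
struct smr_thread_state {
    enum smr_thread_mode mode;
    unsigned generation_scans; /* scans to try to advance the generation */
    unsigned release_scans;    /* scans to liberate release candidates */
};

/* Register a thread, recording whether it performs minimal SMR. */
static void smr_thread_register(struct smr_thread_state *ts,
                                enum smr_thread_mode mode)
{
    ts->mode = mode;
    ts->generation_scans = 0;
    ts->release_scans = 0;
}

/* One round of SMR work for the calling thread. */
static void smr_do_work(struct smr_thread_state *ts)
{
    /* Optional work: walking the active and retired thread lists to see
       if the generation counter can be incremented. A minimal SMR thread
       skips this and relies on other threads to do it. */
    if (ts->mode == SMR_MODE_FULL)
        ts->generation_scans++;

    /* Mandatory work: only the owning thread can liberate its own
       release candidates, so every thread always does this. */
    ts->release_scans++;
}
```

In this sketch a latency-sensitive thread would register with `SMR_MODE_MINIMAL`, so each call to `smr_do_work` costs only the scan of its own release candidates. The trade-off is that at least one other thread has to be registered for full SMR, or the generation counter never advances.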