singly linked-list design

I can figure out how to make a practical *ordered* singly-linked list, but not a practical *unordered* singly-linked list. Bit of a surprise!

The key problem is iterating over the list.

Imagine you have a list, and you want to iterate over it.

You have a cursor; it points at the element you’re currently looking at, and prevents it from being deallocated. It does not – cannot – prevent it from being *unlinked*.

Now, when you want to move to the next element, you get the next pointer of your current element (which is to say, you point a hazard pointer at it, and then get hold of the value of the hazard pointer, this means the next element also can’t be deallocated – but it still can be unlinked.

So the problem is this : your cursor is on an element, any element, that’s fine (just assume we got there – it doesn’t change anything – we’re setting up a scenario which illustrates the problem). We get our next pointer, that’s fine. Before we move to the next element, though, someone else unlinks it.

Well, huh. So at this point, the next element we’re about to move to is unlinked – but it still points at what was *its* next element, so we’re still okay… (and we know the element has been unlinked, because its unlink warning bit has been set by the thread which unlinked it).

But what now happens if the element *after* our next element is unlinked *and* deallocated – which it can be, as we have no hazard pointer on it.

Because our next element has been unlinked, its next pointer is *not* updated by the unlink of its next element. So now the cursor is pointing at an element which has a next pointer pointing at deallocated store.

So what we see is that if we see the unlink bit is set, we have to abort the iteration.

Now, if we’re aborted the iteration, well, then what? users will often want to iterate over the whole list and look at each element just once. So we need to restart the iteration, at the point we left off.

But in an unordered list, we can have no idea where that point is, because the entire list could have been deallocated and replaced in the meantime.

We can only know where that point is if the list is *ordered*. Then the key held by the unlinked element which derailed us, which we still hold a hazard point to and so which cannot be deallocated, let’s us know how far we have to fast-forward over the list to get back to the point where we can recommence the iteration.

singly-linked list update

So, it’s all very interesting.

You need a cursor, to iterate over the list – because you need a pair of hazard pointers to iterate. There’s a problem here though I think – when we deregister an SMR hazard pointer per-thread state, there can be allocations still in the list of allocations pending reclamation. An unusued thread state with lingering elements is always checked by the other threads; so either we can’t use it till its clear (which is awkward as hell for the user – remember, we will not look to dynamic allocation as the library performs no allocations) or when we re-use, we then rely upon the now-owning thread to run scans. Problem is, that now-owning thread is expecting to be the sole owner – he has no idea he might have gigabytes of store sitting in his pending reclamation list! so I have to find a nice solution to this.

Another problem I’m currently grappling with is that although the hazard pointers prevent any element you’re using from being deallocated, they don’t stop it from being unlinked.

If you’re unlinked, it could be the element after you is unlinked before you try to get_next with the cursor – problem then is the next pointer in your element is now pointing at a deallocated allocation because of course it’s not updated when that next element was unlinked…!

I’m vaguely thinking there needs to be a “deleted bit” *IN* an element, which is an immediate logical delete, which is followed then by the usual delete bit in the predecessor element, as part of the two-step unlink (which I think is still needed – but not certain).

However, this is because I’ve not got the way it all works clear yet in my head, so it could be complete garbage.

Update

Have implemented the freelist and queue with hazard pointers. Will do the stack once the freelist tests are written and pass.

Now implementing – finally, after all these years! – a lock-free singly-linked list, with support for real delete =-)

Once that’s done, I’ll think about how complex it is to support delete in the unbalanced tree.

English. Trying speaking it. Words have meanings, that sort of thing.

I’m joining – trying to join, as ever you have to struggle and fight to get past all the incompetence and mistakes every company makes in its desperate effort to commit suicide – Revolut, a FinTech.

They ask for four photos for proof of ID – front and back of passport, pic with yourself and passport next to your face, and then a pic of yourself.

I photo the front and the back, open the passport hold next to my face take a pic and then pic of me.

I send this off (after checking their SSL cipher suite is okay).

“When we said front and back we did not not mean the front and back of the passport; we meant the two pages (authors note – on the inside!) which have identity information.”

“Yes, we do have other people sendng us the front and back!”

Using lock-free data structures in a “single-threaded mode”

I have recently implemented hazard pointers.

As with epoch-based safe memory reclamation, each thread maintains a list of allocations which are pending reclamation.

This list must support delete, and where it’s per-thread, it can be single-threaded, so no problems there.

One solution then is to provide in the library a single-threaded list.

However, there is another possible solution, which also provides useful functionality and allows the absence of this singleton single-threaded data structure; to allow the lock-free list to be operated in a single-threaded mode.

The user would need to know, a priori, that at some point the list would only be operated on by a single thread. That thread would enter single-threaded memory with a load barrier, then Do Stuff, and then finish with a store barrier and forced write to memory. Other threads wishing to use the list again in normal multi-threaded mode I think (not yet checked the code) can do so without a special effort, since they get load barriers anyway – but if not, then any thread wishing to again use that list must issue a macro (which turns into a load barrier).

Similar functionality would be useful for all of the data structures, I think; I can imagine times when they are operated on in a single-threaded mode, and then you simply do not want the overhead and costs of the atomic operations – but you DO still want to operate on the SAME data structure you will then use multi-threaded.

There are two efficiency concerns.

The first is that the data structure C-language structures often have alignment constraints on some of their members. This is fine for CAS platforms, but not necessarily for ERG platforms – the worst case on ARM is 2048 bytes, so you end up with fat structures. You would avoid this is you had a single-thread *only* data structure, because then you don’t care about alignment. Still – if we are arguing the use case where we want single-threaded operations on a data structure which we either before or after want to use multi-threaded, then there’s no avoiding it.

The second is to do with the type qualifier volatile. Volatile is necessary to prevent compiler optimization. The compiler can in theory put a value into a register and keep it there forever – load barriers, which could cause the cache line holding that value to change, do not cause the register to be updated, and so the compiler would continue forever to use an old value. This will break most data structures, or at least prevent forward progress 🙂

One common solution is to take use a non-volatile object, take a pointer to it, cast that pointer to be volatile, and access through it.

The Standard does not support this behaviour; by the language of the Standard, the compiler decides what to do with regard to volatile based on the type qualifier of the *object*, not the pointer.

However, GCC has supported this behaviour for a long time. I’m not sure what MSVC does, or other compilers.

What this means is that in single-threaded mode, the lock-free data structures must retain their volatile qualifier, which make them less efficient. We can avoid atomic operations, which is great, and we can perform operations which otherwise we could not (deletes for example) – but we can’t safely avoid volatile.

The problem is that if we did remove volatile, and use casts when we wanted volatile access, *but the compiler does not support it*, it would lead to a fault which would be intermittant and hard to diagnose.

The issue itself is difficult to explain to users – if it were for example exposed via the abstraction layer, by making volatile a define – and it is equally difficult for users to actually find out how their version of their compiler deals with volatile.

Correctness is the first attribute of software. You cannot ever put it at risk; software which does not do as it was intended has failed.

Given that it is problematic to determine whether a compiler will honour the type qualifier of the pointer, it has to be assumed it will not.