Delicate bug

I’ve been working on making the tests pass.

Everything went fine until I came to the penultimate test, ringbuffer writes.

This fails, and it’s a peculiar failure: when we come to get a write element, we find both the freelist and the queue are empty. In this test that should be impossible – there are 100,000 elements per thread and each thread can have at most one element taken out of the ringbuffer, so the balance of the elements must be either in the freelist or in the queue. So what gives?

Turns out I had forgotten how the queue works.

The queue has a dummy element. It’s not a static dummy element, though: the dummy is whichever element the dequeue pointer currently points to, and it has already been emptied of its user data. When you perform a dequeue, you get the user data from dequeue->next, not from dequeue itself.

E.g.

dequeue
|
element1 – element2

element1 is the dummy – his user data has already been returned to the user; when we dequeue, we’ll get the user data from element2.

Now, the problem is that for the ringbuffer, I have in each ringbuffer element a queue_element; and when I init the queue, I pass in an extra element, to be the initial dummy.

So I assumed that when I dequeued, the queue element I obtained in the ringbuffer would be the queue element belonging to that ringbuffer element. Problem is, it isn’t! It’s the queue element *prior* in the queue!
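To make the mismatch concrete, here is a minimal single-threaded sketch of a dummy-element queue. Every name in it (queue_init, queue_enqueue, queue_dequeue, struct ringbuffer_element) is hypothetical and chosen for illustration; it is not the liblfds API and it leaves out all of the lock-free machinery. It only shows which queue element comes back from a dequeue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the liblfds API. */
struct queue_element {
    struct queue_element *next;
    void *user_data;
};

struct queue {
    struct queue_element *enqueue;  /* last element in the queue */
    struct queue_element *dequeue;  /* current dummy element */
};

/* A ringbuffer element with its own embedded queue element. */
struct ringbuffer_element {
    struct queue_element qe;
    int value;
};

/* The caller supplies the extra element that acts as the initial dummy. */
void queue_init(struct queue *q, struct queue_element *dummy)
{
    assert(q != NULL && dummy != NULL);
    dummy->next = NULL;
    dummy->user_data = NULL;
    q->enqueue = dummy;
    q->dequeue = dummy;
}

void queue_enqueue(struct queue *q, struct queue_element *qe, void *user_data)
{
    assert(q != NULL && qe != NULL);
    qe->next = NULL;
    qe->user_data = user_data;
    q->enqueue->next = qe;
    q->enqueue = qe;
}

/* Returns the queue element that is now free for reuse, or NULL if the
   queue is empty. The user data comes from dequeue->next, but the element
   handed back is the OLD dummy, i.e. the element prior in the queue. */
struct queue_element *queue_dequeue(struct queue *q, void **user_data)
{
    assert(q != NULL && user_data != NULL);
    struct queue_element *old_dummy = q->dequeue;
    struct queue_element *next = old_dummy->next;

    if (next == NULL)
        return NULL;

    *user_data = next->user_data;  /* data belongs to 'next' */
    q->dequeue = next;             /* 'next' becomes the new dummy */
    return old_dummy;              /* but this is what the caller gets back */
}
```

With this sketch, if ringbuffer elements a and b are enqueued after the initial dummy, the first dequeue returns a as the user data but hands back the initial dummy as the queue element; the second returns b as the user data and hands back a's embedded queue element. Under these assumptions, the owner of the data has to be found through the user data pointer, never through the returned queue element.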

Good news

Benchmark compiles (again – I did some more work on it).

I’ve now been going through the tests, making them work.

Freelist, queue and stack all pass for DCAS.

Ringbuffer blows a gasket initializing its freelist. That’s next on the list.

Once DCAS is up I may then get benchmark actually working; either that or proceed with sorting out SMR.

Status

Still not dead :-)

Just very little time between work and life.

I’ve just made benchmark compile.

So now I need to make test pass and benchmark run and sort out SMR.

Benchmark

Well, you guessed it – I worked on the benchmark =-)

Quite a lot done. Needs another half day, I think. The new internal structure makes it much easier to add new synchronization primitives for comparative benchmarking.

Current list is;

Linux : pthread mutex, pthread spinlock, hand-crafted spinlock
Windows : critical section, mutex, hand-crafted spinlock

The three data structures being benchmarked currently are the freelist, queue and stack.
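For a sense of what a "hand-crafted spinlock" in the list above might look like, here is a minimal sketch using C11 atomics. This is an assumption on my part for illustration: the names (example_spinlock and friends) are hypothetical, and the spinlock actually used in the benchmark may well be built on compiler intrinsics or platform calls instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical test-and-set spinlock; not taken from the benchmark code. */
struct example_spinlock {
    atomic_flag held;
};

void example_spinlock_init(struct example_spinlock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear(&lock->held);
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
int example_spinlock_trylock(struct example_spinlock *lock)
{
    assert(lock != NULL);
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Spins until the flag is observed clear and is set by this caller. */
void example_spinlock_lock(struct example_spinlock *lock)
{
    while (!example_spinlock_trylock(lock))
        ;  /* busy-wait */
}

void example_spinlock_unlock(struct example_spinlock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

In a comparative benchmark, something like this would wrap each freelist, queue or stack operation in a lock/unlock pair, in the same way as the pthread and Windows primitives.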

Status

Seven hours’ work and liblfds uses const in all the right places and now the tests compile.

I’ve not tried running them yet, mind – that’ll be a long slog, I think.

Time to eat.

Gym too if there’s time.

Tomorrow, British Museum.

Todo list;

1. finish SMR
2. make tests pass

And then see how I feel about benchmark…

Status

So, two more queue tests to go and all tests have been broadly updated. I then need to make them all compile and run properly, then add const to everything which can use it, then make SMR work. I’m thinking of leaving benchmark to 7.1.0, although I think it’s so small a piece of work that I’m almost loath not to do it, because the output is so cool.

Making SMR work is kinda bitchy. The API will have to involve preventing swapping on allocated blocks of memory, well, if I want to be NUMA aware at any rate.

const

Just been reading a review of DOOM 3 code.

One point; use const everywhere you can.

I’ve just done this with ahash, to see what it feels like…

It clutters, the APIs are not so clean, but they are certainly more informative.
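As a rough illustration of the trade-off, here is a small sketch of a lookup function with const applied everywhere it can be. The names (example_table, example_lookup) are hypothetical and are not taken from ahash or liblfds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table type for illustration only. */
struct example_table {
    size_t count;
    int values[8];
};

/* Without const the prototype would read:
 *
 *   int example_lookup(struct example_table *t, size_t index, int *out);
 *
 * With const it is longer, but it tells the caller that the table is only
 * read, and that the function reassigns none of its own parameters.
 */
int example_lookup(const struct example_table *const t,
                   const size_t index,
                   int *const out)
{
    assert(t != NULL && out != NULL);

    if (index >= t->count)
        return 0;            /* not found; *out is left untouched */

    *out = t->values[index]; /* writing through 'out' is still allowed */
    return 1;
}
```

The cluttered part is the repeated const on every parameter; the informative part is that the caller can see from the prototype alone that only *out is ever written.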

I’m going to take the plunge.

Trudge trudge

Freelist tests updated (basically identical to the stack tests, which have already been done and pass).

Done one of the ringbuffer tests, two to go. Then the queue tests.

Takes about an hour per test…

7.0.0 status

Thinking of making a 7.0.0 release without the benchmark programme – it’ll save a fair bit of time.

So currently making the tests work, which is slow work.

Sorted out the freelist tests today, queue and ringbuffer to go.

7.0.0 run

So, four major tasks remain;

1. make tests run
2. make benchmark run
3. write docs
4. build and test on every supported platform

I think it’s about four weeks’ work.

The one outstanding question is whether or not to finish NUMA-aware SMR for this release. I think the answer is yes; I’d have to hack out a fair bit to back-out fully from SMR, and SMR is close to being done.