Update

I am now this moment beginning to work on bringing all the build files up to date.

I think it’s a day of work.

I want to do a thread sanitizer run too, it’s been a while since I have. That’ll take an hour or two, as well (it’s quick enough to run, but it takes a while to think through all the results).

Docs

Okay, docs are good enough for release – well, I mean, I’ve got to spend a day or two now getting all the builds working again and running test, once that’s done I think I’ll give the guide documents another look over, coming to them a bit more freshly than now.

As I appreciated before, when I documented release 6, mediawiki is in its native form wholly unsuitable for documentation. The simple lack of a global search-and-replace is enough for that. Maybe there are scripts which can do this… documentation has taken up an improperly lrge effort, in that documentation necessarily takes up a certain effort but I’ve ended up spending a lot more effort than that because of the limitations of mediawiki.

I need to make one final set of code changes now – the COUNT queries all need to be safe to use on an in-use data structure (currently they’re all singlethreaded).

CODE COMPLETE

*long exhalated breath…*

Okay, so there is actually a certain bit of documentation to complete – second drafts of the hash, queues and lists. They’ve all had their first drafts though and I’m confident enough now that there’s no surprises in there – the APIs as they are will be unchanged after they’re documented.

So I need to do those docs, need to review all the other docs to get them in line with the changes made over the last week or three, and I need to do all the build config work (MSVC solution files, etc).

Then it’s release time.

And THEN it’s back to work – get SMR in, the SMR versions of freelist and stack, then get the benchmark app back into play. THEN it’ll be real new data structures (linked list with real delete) and RPM packaging and a ton of other stuff I can’t remember right now 🙂

Ci20

So, Ci20 is fabulously fantastic. No DWCAS and GCC < 4.7.3. Done a bunch of fixing and now everything compiles. Going to ditch the NOP stuff, and just use a volatile counting loop - right now there needs to be a NOP per platform, so you actually need the assembly code for a platform to be able to compile! Great - Ci20 tests just passed on release. Thankyou Imagination Technologies!

Ci20… earlier GCC and barriers

The Ci20, thank God, came with an earlier version of GCC – 4.6.2 to be precise. This provide the old style atomics, and so no compiler barriers – you have to use __asm__ to get a compiler barrier.

Problem is, __asm__ is a block-level element, i.e. it has to end in a semi-colon. You can’t have then separated by comma operators… …and the entire code base currently is written using inline-type elements for barriers, so they can be used in macros in while() loops and so on.

AFAICT, there’s no native GCC < 4.7.3 inline-style compiler barrier. I guess I'm going to have to implement __asm__ compiler barriers as inline functions... gaaaak. Feels risky. Wonder if the compiler is going to get it right?

Ci20 Creator is here!

The Ci20 has arrived – i.e. a genuine MIPS32 platform for test!

It’s plugged in and ready to roll.

I’m just getting the test clean under valgrind for the new queue free test, so I can run that a hundred times or so and see if it throws an error.

Linux

Linux -> no docs, no error messages. Now make it work

I’ve spent the last couple of hours failing to configure Dovecot TLS.

I’ve just set up Thunderbird on my new laptop, Debian, and I’ve disabled the insecure cipher suites.

Now Thunderbird and Dovecot won’t talk, ’cause there’s no shared suites.

I mean, there ARE shared suites. There are tons of the them.

Oh yeah – and Thunderbird doesn’t show an error. It just leaves the “Connected” message showing. It’s only when you poke around, find the error console and look at it, that you discover what’s going on. Dovecot logs one line telling you no shared suites – doesn’t list anything else, like what it thinks it has and what the client thinks it has.

I’ve configured everything to be on, and it still isn’t working. I’m starting to call bullshit on this, and just say that it’s broken and doesn’t work, flat out.

Next I’m going to have to install wireshark, and inspect the POP3 packets, to see what Thunderbird is sending, because there’s no other bloody way to find out.

SERIOUSLY. NO SOFTWARE SHOULD REQUIRE PACKET SNIFFERS TO MAKE IT WORK. WHAT THE HELL ARE YOU GUYS SMOKING?

Possible typo/bug in the Michael and Scott queue white paper psuedo-code

Okay, so, I’m about to make quite a claim – and I do not make it lightly, and I make it in full knowledge that I am extremely likely to be completely wrong. M&S are experts in the field, and I am not.

This is the dequeue psuedo-code from their white paper.

dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean
 D1:   loop			     // Keep trying until Dequeue is done
 D2:      head = Q->Head	     // Read Head
 D3:      tail = Q->Tail	     // Read Tail
 D4:      next = head.ptr->next      // Read Head.ptr->next
 D5:      if head == Q->Head	     // Are head, tail, and next consistent?
 D6:         if head.ptr == tail.ptr // Is queue empty or Tail falling behind?
 D7:            if next.ptr == NULL  // Is queue empty?
 D8:               return FALSE      // Queue is empty, couldn't dequeue
 D9:            endif
D10:            CAS(&Q->Tail, tail, )     // Tail is falling behind.  Try to advance it
D11:         else		          // No need to deal with Tail
D12:            *pvalue = next.ptr->value // Read value before CAS Otherwise, another dequeue might free the next node
D13:            if CAS(&Q->Head, head, )  // Try to swing Head to the next node
D14:               break             // Dequeue is done.  Exit loop
D15:            endif
D16:         endif
D17:      endif
D18:   endloop
D19:   free(head.ptr)		     // It is safe now to free the old node
D20:   return TRUE                   // Queue was not empty, dequeue succeeded

Now, note the free on D19. The authors are making the point, as they do in the white paper, that once you’ve dequeued, you’re in the clear – you can free the node.

The typo or bug which I think I see is on line D4.

D4:      next = head.ptr->next      // Read Head.ptr->next

The code says “next = head.ptr->next”, which is using “head”, lower-case “h”. The comment says “Head.ptr->next”, which is using “Head”, upper-case “H”, where as we see on D2, “Head” (upper-case) means Q->Head.

I may be utterly wrong, but I think if the code is used, the free on D19 is broken, because it could lead the code (not the comment – the code) on D4 to access a freed node.

The problem is that “head” (lower-case) is a copy of Q->Head. The white paper states the Q->Head will always point to a valid node (as there is a dummy node in the queue) and that’s fine – but we have taken at line D2 a copy of Q->Head, and we can imagine it points a node, where that node could be by another thread dequeued and freed at D19, before our thread gets to D4 and tries to access that node’s next pointer.

The comment however looks right to me – if we read Q->Head.ptr->next, we’d still be fine, as the if() on D5 still works – but it would mean we could now safely call free on D19.

There is a matching issue in the enqueue, on E5, E6 an E7 – but here the comment and code match up.

the M&S design flaw

So, for now at any rate, I’m using the PRNG to generate an initial value for a queue element next pointer counter.

The basic problem is that in the queue, elements have a next pointer, which has a counter – because it is the subject of DWCAS – unlike the stack/freelist, where the elements only have next pointers.

Because it has a counter, when an element leaves the queue, it takes that counter with it – but it is actually only valid for the queue the element just came from – but it CAN end up being mistaken for a valid value in a new queue the element in enqueued to.

Updatez

1. Finally have my head around the M&S queue ABA stuff.
2. Atomic add is now no longer needed – all the data structures get ABA from their CAS operations – so it’s been removed. Performance should improve by maybe a third…!
3. Reversed the arg order for CAS to GCC style (destination, compare, new_destination). Used to be MS style (destination, new_destination, compare) which is nuts.

Now I just need to think about this…

http://polyglotplayground.com/2015/04/30/Problem-with-Michael-Scott-Lock-Free-Queue/

Whoaz…