Performing tuning

I’ve been spending some time poring over the M&S queue.

If I look at the benchmark for the freelist, it goes ballistic on one thread on each of two cores.

The queue just *doesn’t*. It goes at the same speed at the normal locking stuff.

It’s driving me nuts – I need to figure out why, so I can know it’s real and not just me messing up the implementation.

Relating to this, but it didn’t make a difference, I found out today by default on Intel when a cache line is fetched, the following cache lins is also fetched. I tried doubling the atomic isolation aligned to be two times the cache line width, but it made no difference to anything except the btree, which slowed down by about 25%.

One thing I noticed was that I had some store barriers in the dequeue/enqueue code which were absolutely unnecessary – they were being used on counter fields, which would be pushed out anyway by a following DWCAS.

I noticed also I had an inefficient store barrier in the freelist/stack push – I was causing two fields to get pushed by a DWCAS, when in fact one of them was going to be pushed out anyway in the DWCAS. I reordered them, and the freelist/stack sped up by about 10% in the single-thread case (which is the clearest measure of the effect of such a change).

However, the change I made to the queue, where I removed the unnecessary stores, has made it about 50% faster with one thread on each of two physical cores, but noticably SLOWER when one thread on each of four logical cores (i.e. two threads on each of two physical cores) – and this I do *not* understand.

Update

Last test to write on the new queue.

I’ve had a bit of rename. Everything used to be alphabetic, i.e.

list_addonly_ordered_singlylinked

However, thinking of Orwell’s comment “ignore all of above” I’ve decided to be consistent in ordering within each data structure type. I’ve also needed to rename lfds710_queue, because it needs to be clear it’s MPMC.

I had a bit of an insight earlier, in the shower.

I think I can modify the M&S queue to be bounded, by giving it an array of pointers to elements. You can’t use an actual array directly, because it constrains the order in which dequeues must occur for other threads to be able to progress. So it has to be pointers.

However, I’m not going to do that now – I’m moving to release now.

I may do the benchmark for the new queue, but that’s the only extra dev work I’ll do.

In other news, this is my first year with two Easters. Late March in Malta, late April in Athens!

Ringbuffer disappointment

The ringbuffer uses the M&S queue and so needs a dummy element.

This disfigures the API.

I implemented Dmitry Vyukov’s bounded MPMC queue, to use that instead.

However, having implemented and now coming to adjust the ringbuffer, I realise it’s no use.

The problem is that it’s a soft queue and also that it runs from an array. By soft I mean that although all enqueue and dequeue operations occur, they are not necessarily visible to all other cores by the time the enqueue/dequeue function returns.

If we imagine a queue as a ringbuffer, then what it means is that if we try to enqueue and we cannot, because the queue is full, we dequeue, and then enqueue – throwing away one element.

The problem is that when we come to dequeue, someone else may dequeue just before us; our dequeue will succeed but if the other thread is first, then until that other thread’s dequeue becomes visible to us, we can’t continue, because we need the *next element in the array*, the one just dequeued, to become visible to us as dequeued – we can’t enqueue until *it* is available, regardless of the success of our own dequeue.

What’s more, we can’t just say “well, we did a dequeue, so we’ll get something sooner or later” because other threads could be genuinely enqueuing more stuff. So we can’t *know* whether we need to re-issue the dequeue.

The basic issue is that the queue is blocked by any one thread performing a dequeue where that thread is slow (to be clear, though, the queue was never meant to be otherwise – it’s just I’m really appreciating the impact of this now).

It can be solved by forcing a store at the end of the dequeue, so everyone has visibility. This halves the speed of the data structure (currently one CAS for enqueue and one for dequeue). Also, we still are exposed to the problem of a thread performing a dequeue being idled by the OS when it’s halfway – it’s done the first CAS, so the dequeue has occurred, bu it’s not yet made the element usable (adjusting the sequence number). During this period, an idled thread blocks the data structure.

anti-aging news

About seven months ago a woman in the US, a normal woman, not ill, flew to Colombia – to evade the restrictions placed upon her freedom by the FDA – and underwent a gene therapy treatment to help address two of the declines which come with age, loss or muscle mass and telomere shortening.

This news finally hit slashdot, and so I finally read about it – I’m right on the cusp of getting strongly involved in this field, so I’m not yet well read.

The comments on slashdot reflect in part the staggering lack of imagination and abudance of banality which humans can possess.

This technology is on a par with the invention of fire, and there are a non-negligible number of respondants along the lines of “death is death, retards”. I can only hope for timely automobile collisions ;-)

One of my thoughts with regard to a longer or open-ended lifespan is that it will give people time to mature and become wiser. Some people become wise and considerate in twenty years; others need forty, others still sixty or eight. If we can live for hundreds, or thousands, or hundreds of thousands of years, perhaps we might all become wise enough to have a decent planet – and more than one planet – to live on.

Getting to Athens

The trap was to me interesting in that there were a number of small and large fails from various organisations, on the way.

I booked with Aegean. The flight is actually run by SAS. Because of this, when you’re actually booking the flight, you cannot add additional baggage. The option to do so is there, but if you use it, your flight fails to book, with a meaningless error message.

When this happens (and after about my fifth try at getting it to work – I just love filling those long forms out again and again and again and again and again) I called Aegean, they told me it was probably my card. I called the bank (always a trip into hell) and it was a public holiday, so I called the day after. Nothing wrong with my card.

I called Aegean again. This time, I found out what was actually going on. I was told in particular that the only way to take additional baggage was to pay at the airport – the higher rate, then, of course, I think 45 euros. Crazy.

I booked the flight (even with that, it was still far and away the best choice for price and times) and then had a thought – tried to add some additional luggage *after* booking.

This worked fine.

About two hours on the phone with the bank and the airline there, for nothing.

Two days before the flight, Aegean send me an email to check in. I follow the link, and try to check in. It doesn’t work. I called Aegean, again. Because the flight is actually run by SAS, I need to go to their site and check in. The email is completely wrong…

So I took the flight and arrived at Athens.

I discovered a subtle problem with the metro map provided by Athens Public Transport – it’s in Latin, not Greek. This means I cannot easily connect the metro stations on their map with the metro stations on my *actual* real map in my phone, because my actual real map of course shows the names in GREEK.

At the airport, there’s a rail line which runs for some stops and then connects to a metro station. There are two lines in the airport (which was unexecpted). One has a big sign, “METRO”, the other as big sign “SUBURBAN RAILWAY”. Like everyone there for the first time, I opt for metro.

In fact it’s the wrong choice.

What actually happens is that the “METRO” line is running trains which go directly into the blue metro line. However, if you take the suburban railway, you then can change to get on the *green* metro line – which is ACTUALLY what I wanted, since it meant I only had to make that one change.

So “METRO” means “blue line” and “SUBURBAN RAIL” means green line.

(Which reminds me – there are NO MAPS in the airport railway station. None at all. Not one.)

As a result, I had to change twice. I had with me two luggage, about 50kg, and a shoulder bag. In the metro station where I had to change, the first lift was broken.

On the level above, once I got there, the lift worked – but the gap between the lift and the floor is sufficiently great luggage wheels get stuck in the gap.

People are smart and capable – but you have to let them be. If you stick them in an organization with a ton of disencentives, or indeed actually forbid or prevent action on their part, they won’t, and you get a mess.

Wire

I’ve been using Signal now for some time, it’s great for text, but it’s really bad for audio – unusable.

“Wire” came out recently, from the bloke who originally made Skype.

I thought I’d give it a go.

I’ve sideloaded on the phone, and a friend has it installed.

It looks nice, minimalist, etc, there’s just one basic problem.

I cannot find any way in which to add her (or anyone) to my list of contacts in Wire, which currently consists of “Otto The Robot”.

I mean, am I being an idiot? am I missing something obvious? but it’s a really simple UI, there’s hardly anything you *can* do. There’s almost no configuration, and there’s a contact list, and it shows your contacts (I have none) *and there’s no way to add anyone*.

You can search, but I think it’s searching your existing list.

You can *invite* people, but she’s already *made* an account and installed Wire.

I’ve tried logging into the web-site and using the WWW-based version. It had some problems to begin with – the whole screen was dimmed, it had popped up a notifications request, only it *hadn’t*, so I couldn’t accept or decline, so the whole screen was dimmed. I logged out and back in and that sorted it out but – still the same problem. I have absolutely no way, or no clue, how to add people to my list of contacts in Wire.

Christ, finally

All the benchmarks are running again.

Adding NUMA support took SOOO long, OMG I don’t even etc.

(So the benchmarks can now run SMP, NUMA or NUMA unused, where the benchmark admin data is NUMAed but the data which is actually liblfds data is only one the NUMA node which has the largest number of logical cores in the current thread set).

Need to fix up the gnuplots now, add a benchmark duration argument, some other tidying up.

Then I can make the docs, run the tests, then publish 7.1.0 – and then I can have snazzy glittery gnuplots on the home page!

Debian/XFCE Wastebasket just… doesn’t work

More on the theme of everything is broken.

Desktop wastebasket.

The idea is well known. You delete a file, it’s not really deleted – it’s in the wastebasket. If you realise – aggghhh! wrong file!! you can retrieve it.

Only…

I deleted some files. They’re in the wastebasket. When I double click to open, doesn’t open.

When I click on “restore”, says, “Failed to determine the original path”.

So – I mean, right? it’s a wastebasket. The idea is VERY SIMPLE. When I delete, you PUT THE FILE SOMEWHERE. Not hard. Simple. I can then get the file back.

How do you mess this up?

But hey, it’s been achieved!

WTF, you know?

Update

Test is finally back on its feet after rewriting to not use the standard library – all passes on release, valgrind clean on debug.

Unfortunately, it’s not QUITE free of the standard library…

One test uses qsort and bsearch, and a lot of tests use time(), to ensure they run for a reasonable period.

I need to rewrite those tests to run for enough iterations, so time isn’t used.

However, back to the benchmark now.