So, I was working away at SMR, debugging the new implementation, then work became ultra busy for a long time (like, minimum twelve hour days) and I lost my mental state.
What I’ve decided to do for now is strip out the SMR stuff, and the benchmark tool, and just get a release out.
I’ve just done the necessary and the test application has just passed under debug.
I no longer have access to an ARM platform, but I’ve ordered (tried to order - Amazon foiled my best efforts) a quad-core Pi, and I’ve applied to the GCC compile farm, to get access to 64 bit ARM and POWER.
So for now, it’ll be Windows and Linux, 32 bit and 64 bit, user-mode and kernel-mode. The Linux build in theory works for Android too, but not tested - I’ll set up a build platform for that, but not yet.
So now - I need to make everything build and pass its tests on all platforms, polish the code (need a good polish - been so much work everywhere for so long) and then write the docs.
As per earlier post - I tried ordering from Amazon using Tor.
First attempt, password reset, order cancelled.
Second attempt, ditto but also your account is locked and you are NOT told this.
Time for the fix - one to two days.
Turns out of course it ALSO locks your AWS account.
So if you had anything expensive running - tough. You cannot turn it off.
So there’s one hell of a booby-trap here. Order twice using Tor and you can’t access all those expensive by-the-hour VMs you have.
Reinstalled VC2012 on a fresh VM and have just brought all the VC solutions back up to date.
Now making a clone of the VC2012 VM and adding WDK 8.0 to it, for that build.
After that, onto an Amazon Linux VM for Linux user-mode and kernel-mode.
First pass at getting the Windows kernel builds is done, for WDK 7.1 and 8.0.
I’ve just experienced a leap second bug!
mySQL when it passes through a leap second goes to 100% CPU on one core.
You have to manually set the date to fix it.
Just finished the first pass of the API docs for queue_spscb :-) the first full set of docs for an API.
Still wrestling with one last design question - API names.
For example, we have “lfds700_addonly_singlylinked_list”. So, you know what it is from the name, but the name is long. Currently, that name is fully propagated; the directories have that name, the files have that name, the prototypes have that name.
The reason for it is so people coming to the library can immediately and inherently know what they’re looking at.
I’m thinking though (and this was how it was, until I made the full name fully propagate) I could keep the directory names long form, but change the filenames and prototypes to a shorthand, like this;
Much more compact - but now you rely on the user somewhere seeing the long name, because without it, it’s not lear what that data structure is.
So there are data structures and variants upon them, so you can’t just say “queue”. You have to indicate variant.
The current long form versions of the names are;
lfds700_btree_addonly_unbalanced lfds700_freelist lfds700_hash_addonly lfds700_list_addonly_singlylinked lfds700_queue_mutiple_producer_mutiple_consumer lfds700_queue_single_producer_single_consumer_bounded lfds700_ringbuffer_mutiple_producer_mutiple_consumer lfds700_stack
These forms prefix on all prototypes, i.e.
lfds700_queue_single_producer_single_consumer_bounded_init() lfds700_queue_single_producer_single_consumer_bounded_cleanup() lfds700_queue_single_producer_single_consumer_bounded_enqueue() lfds700_queue_single_producer_single_consumer_bounded_dequeue() lfds700_queue_single_producer_single_consumer_bounded_query()
Obviously, so long a name is awkward.
The public headers and the per-data structure directory will retain the long form name, to aid comprehension. Possibly the filenames will retain the long form too. However, the prototype must be shortened.
Question is, to what?
These following seem to be the three options;
lfds700_ahash lfds700_aslist lfds700_aubtree lfds700_mmqueue lfds700_ssbqueue lfds700_mmringbuffer
lfds700_a_hash lfds700_as_list lfds700_au_btree lfds700_mm_queue lfds700_ssb_queue lfds700_mm_ringbuffer
lfds700_hash_a lfds700_list_as lfds700_btree_au lfds700_queue_mm lfds700_queue_ssb lfds700_ringbuffer_mm
The Raspberry Pi 2 Model B has a MICRO SD slot, not as SD slot…
Anyone want a very nice 16GB SD card?
lfds700_btree_au lfds700_freelist_mm lfds700_hash_a lfds700_list_as lfds700_queue_mm lfds700_queue_bss lfds700_ringbuffer_mm lfds700_stack_mm
It’s alphabetic, with least varying parts first.
With lock-free, you basically get three forms of data structures; fully singlethreaded (which isn’t really lock-free, there are no locks), two threaded where one thread writes and the other only reads (say, single producer/consumer queue), and fully multi-threaded (any number of concurrent whatevers).
I’m toying with the idea of adding in single-threaded variants of these data structures.
The outstanding question for me now though is the long form name.
“lfds700_queue_bounded_singleconsumer_singleproducer” needs to be what it is, but do I really want to add “multipleconsumer_multipleproducer” on the end of every fully multi-threaded data structure name?
Directories and filenames are long form, prototypes are short form.
I’ve omitted “multipleconsumer_mutipleproducer” - it’s the default. I might include it in the future, if the addition of other data structure variants would make its use beneficial. Right now it’s not needed, that level of naming complexity does not exist, so it is omitted.
Now I can finally get on with the docs, in their final, proper form.
The Raspberry Pi arrived yesterday.
I ordered the wrong type of SD card - d’oh. I should have ordered a micro SD card.
The local water main broke this morning - it’s just been fixed - so I can now ablute and then head out to buy a card.
Raspberry Pi is up and running and currently executing the test programme.
It’s been flawless. Wrote an image of the Pi port of Debian on an SD card, this has gcc, make, nano, etc, on board. Booted perfectly, logged in, transferred the liblfds source code, compiled it all, first time I’ve build on Linux for a bit so fixed one of two bugs (and as ever always surprised that the MS compiler I’m using doesn’t complain when I used undefined enum types - yes, you read that right!)
Fabulous. Got my own tiny form factor 32-bit quad-core ARM platform, for 70 bucks all told. Couldn’t ask for more.
Release 7.0.0 features exponential backoff for atomic operations.
There was a bug in this code which meant the backoff loop instead of running up to 0/1/2/4/8/16/32/etc times, was running a very large number of times (billions).
This caused the ringbuffer read test to fault :-)
I’ve added now a random delay in all the tests, which is present only in debug builds (i.e. slow builds) and I’m running it through now.
Turned out to be a test bug.
Been writing docs. Getting the filled-in template pages up for each prototype. Done about half.
Docs take foooooooooooooooooooooooooorever.
I know this of course.
I’m just sayin’ :-)
The code is basically complete - I need to update build files for all the platforms, and then run test on them all.
It’s all doc work now.
I’ve just been documenting the list.
I’m finding I’m having to explain a lot about ordered vs unordered usage - I think I’m going to split the list into two, an ordered version and an unordered version.
I’ve split the list into two, one is ordered, the other is unordered.
So far today;
I’ve been looking at the porting documentation.
Right now it’s wrong, because it talks about porting test, but you don’t port test - test has no porting stuff in it. You need to port libcommon; it’s that which has the abstraction layer for test.
It’s called libcommon because although it’s not in this release, there’s also a benchmark programme, and there’s also a libbenchmark, which has another, fatter abstraction layer needed by the benchmark programme.
So it’s a complex mess.
What I’m thinking now is to remove the abstraction layer from liblfds, get rid of libcommon and libbenchmark and have libporting (porting abstraction layer). This will contain all of the abstraction layers - you implement as much as you need to get what you want working, i.e. there’s a minimal layer for liblfds, a bit more on top of that for test, then quite a bit more for benchmark.
libcommon and libbenchmark also contain quite a bit of utility code used by test and benchmark, but I can move that code into the programmes themselves - just to get it out of the way from the user.
Thing is, I really do NOT want to go off into major surgury land at this point. I want to get a release out.
So, I made libpal and moved the liblfds and test porting layers into it. The liblfds porting layer consists of header files only, so libpal does not need to be compiled for liblfds - but it does need to be there and that was odd.
It meant liblfds now dependend upon some header files elsewhere, so you had to make sure to copy not just the library other but this dependency.
This is not going to be expected.
I think in fact the problem was that I added libcommmon and libbenchmark. I like having them, they make it easier to code, but then I need to document well there’s test, and then there’s libcommon, and you need to build this and this for that and so on…
So what I’m going to do is make liblfds, test and benchmark independent (well, test and benchmark both require liblfds of course :-)
They will have their own fully independent porting layers, and I’ve got rid of libcommon/libbenchmark. It will mean duplicating some abstraction and utility code between test and benchmark, but it’s worth it for the simplification of documentation.
liblfds and test compile again, libcommon has been deleted. Will need to update all the build configuration of course. I’ve been building up a list of small things in the code which I need to fix/improve. Will get them all done before updating the build configurations on other platforms.
The build/usage/porting doc layout is now healthy - the layout makes obvious sense (I hope anyway!) to people coming to it for the first time.
Wrote the first pass of the btree docs.
Finish first pass of the usage guides for liblfds and test.
I’ve been unhappy with the ringbuffer API. It’s totally unclear how it’s supposed to be used.
The existing API looks something like this;
Great, huh? obvious - not.
Naively, I think the expectation is for something like this;
read() get the oldest unread element, write() writes to the ringbuffer, overwriting the oldest element if there are no free elements.
The problem though, which led to the existing API, is that the ringbuffer is lock-free and so multi-reader and multi-writer; as such. if we read an element, someone else can completely over-write its contents WHILE WE’RE READING IT.
As such, the existing API acts when reading to detach the element from the ringbuffer, so no other thread can touch it, and the same when writing; we detach an element, write to it, then place it back into the ringbuffer.
In fact I’ve realised this problem/behaviour all turns upon whether or not the ringbuffer is beig used to store transient or permanent data. What I mean by this is that we arrange the ringbuffer either so that each element is pointing to some permanently allocated store for that element (like an audio buffer) or we arrange the ringbufer so that each element is holding a void pointer, which is valid for one read (so we have say a freelist of audio buffers, and we write into the ringbuffer a pointer to a filled audio buffer we’ve obtained from the freelist).
If each element has some store permanent associated with it (this being done typically when the ringbuffer is initialized) then we have the problem earlier described; if we read by just getting a pointer, which is to say obtain a pointer to a filled audio buffer, another thread could perfectly well write into that buffer as we’re still using it. So we need to detach the element from the ringbuffer, which gives us the get_read_element() / put_read_element() style semantics.
OTOH, if we have a freelist of audio buffers, and we take a buffer from there, fill it, then put a void pointer to it into the ringbuffer, then the act of reading is the act of taking that void pointer and the reading thread can then do something with it and when it’s done, put it back on the freelist. With this arrangement, we can have simply read() and write() - write will as in the previous paragraph overwrite the existing void pointer, but that’s fine; that’s not going to interfere with anyone who’s already obtained a copy of that void pointer. (One thing we do need to look out for is when a write to the ringbuffer overwrites an exiting element, where the ringbuffer is full; the writer must get back the void pointer he has overwritten, so that the user can do something with it - such as returning it to a freelist - rather than it being lost).
In the spirit of having one thing perform one task, the ringbuffer shall now have read() / write() semantics, and the user must look after his pointers.
Moved over to new API design.
Made the tests pass.
I’ve updated the ringbuffer API page to the new API design. Will write the per-API docs tomoz.
Well, just done a ton of doc work.
All the API docs need a second pass now to correct mistakes and make them uniform, then a third pass to actually wrote notes and content - right now it’s just prototypes and arguments, basically, the framework of the page.
Got a long list of minor code/doc fixes/improvements to make, which has built up as I’ve been going along. Have to do all those and reflect them in the docs.
Then go over the non-API docs for a second pass.
Then I can bring all the build files up to date and run test on all platforms.
Then I can release - and then immediately start working on the next release, to get the SMR code back in there and out into the field. I’d like to get the benchmark programme out too.
Been working through the small-things to-do list. Done about half.
Home Blog Forum Mailing Lists Documentation GitHub Contactadmin at liblfds dot org