From liblfds.org
Jump to navigation Jump to search

Source Files

    │   └───liblfds700
    │           lfds700_queue.h


enum lfds700_queue_query

Opaque Structures

struct lfds700_queue_element;
struct lfds700_queue_state;
struct lfds700_misc_prng_state;


#define LFDS700_QUEUE_GET_KEY_FROM_ELEMENT( queue_element )
#define LFDS700_QUEUE_SET_KEY_IN_ELEMENT( queue_element, new_key )

#define LFDS700_QUEUE_GET_VALUE_FROM_ELEMENT( queue_element )
#define LFDS700_QUEUE_SET_VALUE_IN_ELEMENT( queue_element, new_value )

#define LFDS700_QUEUE_GET_USER_STATE_FROM_STATE( queue_state )


void lfds700_queue_init_valid_on_current_logical_core( struct lfds700_queue_state *qs,
                                                       struct lfds700_queue_element *qe_dummy,
                                                       struct lfds700_misc_prng_state *ps,
                                                       void *user_state );

void lfds700_queue_cleanup( struct lfds700_queue_state *qs,
                            void (*element_dequeue_callback)(struct lfds700_queue_state *qs, struct lfds700_queue_element *qe, enum lfds700_liblfds_misc_flag dummy_element_flag) );

void lfds700_queue_enqueue( struct lfds700_queue_state *qs,
                            struct lfds700_queue_element *qe,
                            struct lfds700_misc_prng_state *ps );

int lfds700_queue_dequeue( struct lfds700_queue_state *qs,
                           struct lfds700_queue_element **qe,
                           struct lfds700_misc_prng_state *ps );

void lfds700_queue_query( struct lfds700_queue_state *qs,
                          enum lfds700_queue_query query_type,
                          void *query_input,
                          void *query_output );


This data structure implements an unbounded, many-producer, many-consumer queue and internally implements exponential backoff to help deal with high load and so improve scalability.

The implementation performs no allocations. The user is responsible for all allocations (and deallocations), where these allocations are passed into the API functions, which then use them. As such, allocations can be on the stack, on the heap, or as can sometimes be the the case in embedded systems, allocated with fixed addresses at compile time from a fixed global store. Allocations can also be shared memory, but in this case, the virtual addresses used must be the same in all processes.

General usage is that the user calls lfds700_queue_init_valid_on_current_logical_core to initialize a struct lfds700_queue_state, and then calls lfds700_queue_enqueue and lfds700_queue_dequeue to enqueue and dequeue struct lfds700_queue_elements.

The queue requires a single dummy element to function, and this is an argument to the init function. This ummy element will be dequeued by lfds700_queue_dequeue and so it must be possible in the user's code to treat this element in exactly the same was as normal elements.

A queue element provides the ability to store a key and a value, both of which are of type void *. The key is not used in any way by the queue (and of course the value is neither), rather, it is available as a convenience for the user, for situations where data is being transferred between different types of data structures, where some of these data structures do support a meaningful key. The key and value are get and set by macros, such as LFDS700_QUEUE_SET_VALUE_IN_ELEMENT. The SET macros can only be used when an element is outside of the queue. (Things may seem to work even if they are used on elements which are in the queue, but it's a matter of pure chance).

(See the section below, on lock-free specific behaviour, for an explanation of the unusual init function name.)

The state and element structures are both public, present in the lfds700_queue.h header file, so that users can embed them in their own structures (and where necessary pass them to sizeof). Expected use is that user structures which are to enter queues contain within themselves a struct lfds700_queue_element, and this is used when calling lfds700_queue_enqueue, and the value set in the queue element is a pointer to the user structure entering the queue. This approach permits zero run-time allocation of store and also ensures the queue element is normally in the same memory page as the user data it refers to.

Unusually for a lock-free data structure, once a struct lfds700_queue_element has been dequeued, it is safe to pass it to free.

Lock-free Specific Behaviour

The state initialization function, lfds700_queue_init_valid_on_current_logical_core, as the same suggests, initializes the state structure but that initialization is only valid on the current logical core. For the initialization to be valid on other logical cores (i.e. other threads where they are running on other logical cores) those other threads need to call the long-windedly named macro LFDS700_MISC_MAKE_VALID_ON_CURRENT_LOGICAL_CORE_INITS_COMPLETED_BEFORE_NOW_ON_ANY_OTHER_LOGICAL_CORE, which will do that which its name suggests.

Unusually, once a queue element has been dequeued, it can be deallocated.

The struct lfds700_misc_prng_state argument is the state for a single-threaded, fast, high quality random number generator, required by the exponential backoff code. Each thread should allocate and initialize one of these structures, and then it can be used for all API calls which take this argument.

The SET macros (for setting key and value, in stack elements) can only be correctly used on elements which are outside of a queue, and the GET macros, if called by a thread on a logical core other than the logical core of the thread which called the SET macros, can only be correctly used on a queue element which has been enqueued to the queue and then dequeued.

By correctly is it meant to say that the GET macros will actually read the data written by the SET macros, and not some other data.

The queue should be regarded as a safe communication channel between threads. Any queue element which has a key and/or value set, and then is enqueued to a queue, will allow any thread which dequeues that element to correctly read the key and/or value (and by this is it meant to say not just the void pointer of the key and value, but also whatever they point to). This is the only guarantee. Any reads or writes of key and/or value, or what they point to, which occur outside of this enqueuing and dequeuing pair are not guaranteed to be correct; the data written may never be seen by other threads.

White Paper

This data structure impliments a modified version of Michael and Scott's queue.

Note however the word modified. There are two modifications.

The first relates to an issue identified by Bryan Kerr, which can be found in this blog post. In short, where the counters are uninitialized, and where elements carry a counter with them, it is possible when those uninitialized values happen to have the same starting values and when moving elements between multiple queues to end up when enqueuing an element to Queue A, to actually end up enqueing to Queue B - essentially, you come to enqueue the element to Queue A, and then just before that occurs, everything is dequeued from Queue A and placed into Queue B. To the atomic operation about to perform the enqueue, everything looks as it should!

The solution adopted has been to use the (fast - three operations on a 32 bit or 64 bit value) random number generator in liblfds to initialize element next pointer counters when elements are enqueued.

The second is more questionable. It is an issue which appears to me to exist - I am practically certain I am wrong, but cannot for the life of me see that it does not exist, and so until someone can explain to me what has failed to be understood, I have modified the code to address the issue.

The issue is discussed in this blog post, on the liblfds blog.

Essentially, it seems there may be a typo or a bug in the psudo-code in the paper, the upshot of which is a race condition which makes the call to free in the dequeue unsafe. It is easily fixed - so easily, and given the seeming typo in the enqueue code vs enqueue comment on line E4, that is seems to me that this may be a typographic error rather than an actual design flaw. However, this does not explain the matching bug in the dequeue code, which suffers no such typo.

The fix is very, very simple and does not invalidate or significantly change the code or the function of the code - see the blog post for a description.


Coming soon. No, really! (Written 29th Dec 2015).

See Also