Varon-T Documentation Release 0.1.1

RedJack, LLC

February 25, 2013



This is the documentation for Varon-T 0.1.1, last updated February 25, 2013.


CHAPTER ONE

CONTENTS

1.1 Introduction

Message passing is currently a popular approach for implementing concurrent data processing applications. In this model, you decompose a large processing task into separate steps that execute concurrently and communicate solely by passing messages or data items between one another. This concurrency model is an intuitive way to structure a large processing task to exploit parallelism in a shared memory environment without incurring the complexity and overhead costs associated with multi-threaded applications.

In order to use a message passing model, you need an efficient mechanism for passing messages between the processing elements of your application. A common approach is to use queues for storing and retrieving messages. Varon-T is a C library that implements a disruptor queue (originally implemented in the Disruptor Java library), which is a particularly efficient FIFO queue implementation.

Disruptor queues achieve their efficiency through a number of related techniques:

• Objects are stored in a ring buffer, which uses a fixed amount of memory regardless of the number of data records processed.
• Objects are stored directly inline in the slots of the ring buffer, and their life cycle is controlled by the disruptor queue, not the application. This eliminates any per-record memory allocation overhead.
• The ring buffer’s storage is implemented as a regular C array, so the data instances are all adjacent to each other in memory. This allows us to take advantage of cache striding and locality.
• In most cases, the producers and consumers of a queue are coordinated without needing any costly locks, or even atomic CAS operations. Instead, they only require relatively cheap memory barriers.
• Multiple consumers are able to drain a single queue, even when there’s a temporal constraint on the consumers, i.e., where a “downstream” consumer must wait to process a particular record until an “upstream” consumer has finished with it. By sharing a single queue, you eliminate additional memory allocations and copies.
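Several of these techniques can be illustrated in a few lines of C11. The following is a minimal single-producer, single-consumer sketch, not Varon-T’s implementation: the names spsc_ring, ring_put, and ring_get and the constant RING_SIZE are hypothetical, and C11 acquire/release atomics are assumed as a stand-in for the memory barriers mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* hypothetical size; must be a power of 2 */
#define RING_MASK (RING_SIZE - 1u)
static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of 2");

struct record { int64_t payload; };  /* stored inline; no per-record malloc */

struct spsc_ring {
    struct record slots[RING_SIZE];  /* fixed, contiguous C array */
    _Atomic uint32_t published;      /* number of records published so far */
    _Atomic uint32_t consumed;       /* number of records consumed so far */
};

/* Producer side. Returns false when every slot is still in use. */
static bool
ring_put(struct spsc_ring *r, int64_t payload)
{
    uint32_t head = atomic_load_explicit(&r->published, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->consumed, memory_order_acquire);
    if (head - tail == RING_SIZE) {
        return false;                /* full: a real client would yield here */
    }
    r->slots[head & RING_MASK].payload = payload;
    /* Release: the slot contents become visible before the new count. */
    atomic_store_explicit(&r->published, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false when nothing new has been published. */
static bool
ring_get(struct spsc_ring *r, int64_t *out)
{
    uint32_t tail = atomic_load_explicit(&r->consumed, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->published, memory_order_acquire);
    if (head == tail) {
        return false;                /* empty */
    }
    *out = r->slots[tail & RING_MASK].payload;
    /* Release: the slot may be reused only after we have read it. */
    atomic_store_explicit(&r->consumed, tail + 1, memory_order_release);
    return true;
}
```

A zero-initialized ring (for example, one with static storage duration) starts out empty. Neither function takes a lock or performs a CAS; each side writes only its own counter and reads the other’s.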

1.2 Disruptor queue management

A disruptor queue in Varon-T is a high-performance ring buffer with memory barriers and yielding strategies to coordinate producers and consumers of value instances. A distinguishing feature of disruptor queues is that value instances are pre-allocated by the library, in a quantity that is a power of 2. As a result, disruptor queues do little to no memory allocation during execution; instead, your application requests access to value instances from the library. A Varon-T disruptor queue is defined by the following interface:

struct vrt_queue


    const char *name
        An alphanumeric name for the queue.

    struct vrt_value** values
        The array of values managed by the queue (see value objects for a discussion of custom values).

    unsigned int value_mask
        A mask that is always equal to |queue| - 1, where |queue| is the queue size. It is used to compute x % value_count efficiently as x & value_mask. This works because the actual queue size is always a power of 2.

    const struct vrt_value_type* value_type
        The type of values managed by the queue (see value objects for a discussion of custom value types).

    vrt_producer_array producers
    vrt_consumer_array consumers
        Arrays of the producers and consumers feeding this queue.

    vrt_value_id last_consumed_id
        The ID of the last value instance guaranteed to have been processed by every consumer.

    struct vrt_padded_int last_claimed_id
        The ID of the last value instance claimed by a producer. This is only updated for disruptor queues with multiple producers. Single-producer disruptor queues are expected to track this value internally in the producer instance.

    struct vrt_padded_int cursor
        The next value instance ID that can be written into the queue.
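The role of value_mask can be illustrated with a short sketch. The helper names below (is_power_of_two, mask_for, slot_index) are hypothetical and are not part of the Varon-T API; the sketch only shows why a bitwise AND with size - 1 matches the modulo operation when the size is a power of 2.

```c
#include <assert.h>
#include <stdbool.h>

/* True when n has exactly one bit set. */
static bool
is_power_of_two(unsigned int n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Derive the mask for a ring with value_count slots. */
static unsigned int
mask_for(unsigned int value_count)
{
    assert(is_power_of_two(value_count));
    return value_count - 1;
}

/* Map an ever-increasing value ID onto a slot in the ring.
 * Equivalent to id % value_count, but without a division. */
static unsigned int
slot_index(unsigned int id, unsigned int value_mask)
{
    return id & value_mask;
}
```

For a queue of 64 values the mask is 63, so ID 130 maps to slot 2, the same result as 130 % 64.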

1.2.1 Built-in operations

We support several built-in operations for queue management.

struct vrt_queue* vrt_queue_new(const char *name, const struct vrt_value_type *value_type, unsigned int value_count)
    Construct a new disruptor queue called name that stores value_count objects of type value_type in a ring buffer.

void vrt_queue_free(struct vrt_queue *q)
    Free the memory associated with q.

static inline vrt_value_id vrt_queue_get_cursor(struct vrt_queue *q)
    Return the ID of the value instance that was most recently published into the queue. Since this function involves a memory barrier, it should be used sparingly.

#define vrt_queue_size(q)
    Return the number of values managed by the queue.

#define vrt_queue_get(q, id)
    Return the value instance with the given ID.

1.2.2 Built-in result codes

The following result codes are used to indicate various disruptor queue states.

VRT_QUEUE_EOF
    Signifies that no more data will be sent through the queue.

VRT_QUEUE_FLUSH
    Signifies that an upstream producer has requested a flush operation.


1.3 Value objects

Each Varon-T disruptor queue manages a list of values, which are allocated and controlled by the disruptor queue. This increases the performance of the queue and the application by eliminating the need to perform allocations on a per-object basis. Another way to think of the disruptor queue is as a pool of value objects available to your application. The value interface is simple: it serves as the superclass of every value managed by a Varon-T disruptor queue.

struct vrt_value
    An opaque type that serves as a superclass for ring buffer values.

Each value type in an application must implement the following interface:

struct vrt_value_type

    cork_hash type_id
        A type identifier for this value type. The cork-hash utility in libcork will generate a sufficient hash value for this field given a string identifier.

    struct vrt_value* (*new_value)(const struct vrt_value_type *type)
        Allocates, initializes, and returns an instance of type.

    void (*free_value)(const struct vrt_value_type *type, struct vrt_value *value)
        Frees any resources used by value, which must be an instance of type.

1.3.1 Example: Integer values

The following is a simple implementation of a new value type for storing integer values in a Varon-T disruptor queue. #include

    /* An integer value "subclasses" vrt_value by embedding it. */
    struct vrt_integer_value {
        struct vrt_value  parent;
        int64_t  value;
    };

    /* Allocate a new integer value and return its embedded parent. */
    static struct vrt_value *
    vrt_integer__new_value(const struct vrt_value_type *type)
    {
        struct vrt_integer_value  *self = cork_new(struct vrt_integer_value);
        return &self->parent;
    }

    /* Recover the containing integer value from its parent and free it. */
    static void
    vrt_integer__free_value(const struct vrt_value_type *type,
                            struct vrt_value *value)
    {
        struct vrt_integer_value  *self =
            cork_container_of(value, struct vrt_integer_value, parent);
        free(self);
    }

    /* The following hash value is produced by the cork-hash utility function */
    #define VRT_INTEGER_TYPE  0xcd6e0682

    static struct vrt_value_type  _vrt_integer_type = {
        .type_id = VRT_INTEGER_TYPE,
        .new_value = vrt_integer__new_value,
        .free_value = vrt_integer__free_value
    };

    const struct vrt_value_type *
    vrt_integer_type(void)
    {
        return &_vrt_integer_type;
    }

The implementation is straightforward and depends on the libcork library. A few details about this implementation are worth mentioning:

• The implementation uses embedded C structs to contain or “subclass” the vrt_value type within vrt_integer_value. The disruptor queue library can then operate efficiently on pointers to the contained or “superclass” struct. However, your application will need a pointer to the container struct when given a pointer to the contained struct in order to perform application-specific computations. That is the purpose of cork_container_of().
• The _vrt_integer_type does not require additional fields beyond the vrt_value_type interface. Therefore, it is a static value type instance and is accessible through vrt_integer_type().
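The container-from-member technique can be sketched without libcork. The macro and types below (container_of, base_value, int_value) are hypothetical stand-ins for cork_container_of(), vrt_value, and vrt_integer_value; the macro is assumed to work the same way in spirit, by subtracting the member’s offset from the member’s address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given a pointer to a member, recover a pointer to the enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the library's "superclass" struct. */
struct base_value {
    int  unused;
};

/* Hypothetical "subclass". The parent is deliberately not the first
 * member, to show that the offset arithmetic handles any position. */
struct int_value {
    int64_t  value;
    struct base_value  parent;
};

/* What the application hands to the library: the embedded parent. */
static struct base_value *
int_value_as_base(struct int_value *iv)
{
    return &iv->parent;
}

/* What the application does when the library hands the parent back. */
static int64_t
read_int(struct base_value *v)
{
    assert(v != NULL);
    struct int_value  *self = container_of(v, struct int_value, parent);
    return self->value;
}
```

The library side only ever sees struct base_value pointers, while the application recovers its own type whenever it needs the payload.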

1.4 Producers

A producer is a queue client that feeds values into a disruptor queue. Because the queue manages the memory for its value instances, the producer does not allocate values itself; instead, it “claims” the next free value instance in the queue. Once claimed, the producer copies into or fills in the value instance and “publishes” it, making it available to consumer clients. Upon publishing, the producer relinquishes its claim to the value instance. The value instance is then considered active and available until all consumer clients inform the disruptor queue that they are finished with it. At this point, the disruptor queue makes the value instance’s slot in the queue array available for reuse and claim by a producer. A Varon-T producer client must implement the following interface:

struct vrt_producer

    struct vrt_queue* queue
        A pointer to the disruptor queue fed by this producer client.

    unsigned int index
        The index of this producer client within queue.

    vrt_value_id last_produced_id
        The ID of the last value instance returned by the producer.

    vrt_value_id last_claimed_id
        The ID of the last value instance in the disruptor queue currently claimed by the producer.

    int (*claim)(struct vrt_queue *q, struct vrt_producer *self)
        This is the function the producer will call to claim a value instance in the disruptor queue. The function blocks until there is a value to return to the producer. If the queue is currently full, then this function will call the producer’s yield method to give other producer and consumer clients execution time.

    int (*publish)(struct vrt_queue *q, struct vrt_producer *self, vrt_value_id last_published_id)
        This is the function the producer will use to publish a value instance ID to the disruptor queue.

    unsigned int batch_size
        The number of value instances in a disruptor queue the producer will claim in a single call to claim.


    struct vrt_yield_strategy* yield
        A pointer to the producer’s yield strategy.

    const char *name
        A name for the producer.

    unsigned int batch_count
        The number of batches of value instances the producer has fed. Note that VRT_QUEUE_STATS must be defined as true.

    unsigned int yield_count
        The number of times the producer has yielded whilst waiting to claim a value instance. Note that VRT_QUEUE_STATS must be defined as true.

1.4.1 Built-in producer operations

We provide the following built-in producer operations:

struct vrt_producer* vrt_producer_new(const char *name, unsigned int batch_size, struct vrt_queue *q)
    Allocate a new producer instance to feed the given queue q and initialize it to claim batch_size values at a time. If batch_size is 0, then a reasonable default batch size is calculated.

void vrt_producer_free(struct vrt_producer *p)
    Free a producer instance and any associated resources.

int vrt_producer_claim(struct vrt_producer *p, struct vrt_value **value)
    Claim the next available value instance managed by the producer’s queue. If this function returns no error (0), then a value instance is loaded into value and the caller has complete control over its contents.

int vrt_producer_publish(struct vrt_producer *p)
    Publish the most recently claimed value. This function will not return until the value is successfully published to the queue’s consumers. Immediately upon return, the producer relinquishes all rights to the claimed value, including the right to read it. The queue has complete control and can overwrite the value’s contents at any time.

int vrt_producer_skip(struct vrt_producer *p)
    Skip over the most recently claimed value.

int vrt_producer_eof(struct vrt_producer *p)
    Signal that this producer will no longer produce any new values for its queue.

int vrt_producer_flush(struct vrt_producer *p)
    Signal that this producer is flushing any claimed values back to the queue.

void vrt_producer_report(struct vrt_producer *p)
    Print statistics about the producer’s batches and yields to standard output.
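The interplay between batch_size, last_claimed_id, and last_produced_id can be sketched as follows. This is a single-threaded illustration under stated assumptions, not Varon-T’s internals: the types mini_queue and mini_producer and the functions mini_claim and mini_publish are hypothetical, IDs are assumed to start at 0, and a full queue is reported with -1 where a real producer would yield and retry.

```c
#include <assert.h>

/* Hypothetical, simplified queue state (no threads, no barriers). */
struct mini_queue {
    int  cursor;             /* last published ID, -1 when empty */
    int  last_consumed_id;   /* last ID every consumer is done with */
    int  size;               /* number of slots in the ring */
};

struct mini_producer {
    struct mini_queue  *queue;
    int  last_produced_id;   /* last ID handed to the caller */
    int  last_claimed_id;    /* last ID reserved in the current batch */
    int  batch_size;
    unsigned int  batch_count;
};

/* Return the next ID to fill in, or -1 if the ring is full. */
static int
mini_claim(struct mini_producer *p)
{
    assert(p->batch_size > 0);
    if (p->last_produced_id == p->last_claimed_id) {
        /* Current batch exhausted: reserve another one, but never
         * reach a slot that some consumer has not finished with. */
        int  limit = p->queue->last_consumed_id + p->queue->size;
        int  want = p->last_claimed_id + p->batch_size;
        if (p->last_claimed_id >= limit) {
            return -1;
        }
        p->last_claimed_id = (want < limit) ? want : limit;
        p->batch_count++;
    }
    return ++p->last_produced_id;
}

/* Make everything produced so far visible to consumers. */
static void
mini_publish(struct mini_producer *p)
{
    p->queue->cursor = p->last_produced_id;
}
```

With a queue of 8 slots and a batch size of 4, eight claims touch the shared queue state only twice; the other six are satisfied from the producer’s private batch.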

1.5 Consumers

A consumer is a disruptor queue client that processes or “drains” values from the queue. A disruptor queue may have more than one consumer in a typical application. Each consumer must check the queue’s cursor to determine the ID of the most recently published value instance, and each must maintain the ID of the last value instance that it extracted. This is the mechanism consumers use to ensure safe processing of value instances in the queue.

We protect against wrapping around the queue’s ring buffer by giving producers the ability to peek at each consumer’s cursor. This implies that access to a consumer’s cursor must be thread-safe. Consumers never access their cursors directly. They must always use vrt_consumer_get_cursor() and vrt_consumer_set_cursor(), and only sparingly.
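A thread-safe cursor and the wrap-around check can be sketched with C11 atomics. Everything here is an assumption made for illustration: padded_cursor, cursor_get, cursor_set, min_cursor, and producer_may_claim are hypothetical names, the 64-byte cache line is a guess about the target machine, and Varon-T’s own accessors may be implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* A cursor padded so that two cursors never share a cache line. */
struct padded_cursor {
    _Atomic int  value;
    char  pad[CACHE_LINE - sizeof(_Atomic int)];
};
static_assert(sizeof(struct padded_cursor) >= CACHE_LINE,
              "cursor should fill a cache line");

static int
cursor_get(struct padded_cursor *c)
{
    return atomic_load_explicit(&c->value, memory_order_acquire);
}

static void
cursor_set(struct padded_cursor *c, int id)
{
    atomic_store_explicit(&c->value, id, memory_order_release);
}

/* The slowest consumer decides how far producers may advance. */
static int
min_cursor(struct padded_cursor *cursors, size_t n)
{
    assert(n > 0);
    int  lowest = cursor_get(&cursors[0]);
    for (size_t i = 1; i < n; i++) {
        int  v = cursor_get(&cursors[i]);
        if (v < lowest) {
            lowest = v;
        }
    }
    return lowest;
}

/* ID `id` reuses the slot of `id - size`, so that older value must
 * already have been consumed by every consumer. */
static bool
producer_may_claim(int id, int min_consumed, int size)
{
    return id - min_consumed <= size;
}
```

Each accessor involves a memory barrier, which is one reason to call such functions once per batch rather than once per value.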


Varon-T disruptor queue consumers adhere to the following interface:

struct vrt_consumer

    const char *name
        An identifying name for this consumer.

    struct vrt_queue* queue
        The disruptor queue that feeds this consumer.

    unsigned int index
        The index of this consumer within the queue.

    struct vrt_padded_int cursor
        The last value publicly acknowledged as consumed by the consumer. Note that this field is never directly accessed by the consumer.

    vrt_value_id last_available_id
        The ID of the last value instance guaranteed available for processing. This field is not thread-safe and allows the consumer to process a group of value instances without yielding.

    vrt_value_id current_id
        The ID of the value instance currently being consumed.

    unsigned int eof_count
        The number of EOFs seen by this consumer.

    vrt_consumer_array dependencies
        A list of consumers upon which this consumer depends. A consumer may not process a value instance until all dependency consumers have processed it.

    struct vrt_yield_strategy* yield
        The yield strategy used by this consumer during a blocking operation.

    unsigned int batch_count
        The number of batches of values to process. Used only if VRT_QUEUE_STATS is true.

    unsigned int yield_count
        The number of times the consumer has yielded whilst waiting for a value instance. Used only if VRT_QUEUE_STATS is true.
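The effect of the dependencies field can be sketched as a simple minimum. The type mini_consumer and the function last_available_id below are hypothetical, use plain integers, and ignore threading; the sketch assumes that a downstream consumer may advance only as far as both the queue’s cursor and its slowest upstream dependency.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4   /* hypothetical fixed limit, for brevity */

struct mini_consumer {
    int  cursor;                            /* last ID this consumer finished */
    struct mini_consumer  *deps[MAX_DEPS];  /* upstream consumers */
    size_t  dep_count;
};

/* Highest ID this consumer may process right now. */
static int
last_available_id(const struct mini_consumer *c, int queue_cursor)
{
    assert(c != NULL && c->dep_count <= MAX_DEPS);
    int  limit = queue_cursor;
    for (size_t i = 0; i < c->dep_count; i++) {
        if (c->deps[i]->cursor < limit) {
            limit = c->deps[i]->cursor;
        }
    }
    return limit;
}
```

A consumer with no dependencies is bounded only by the queue’s cursor. A consumer that depends on another can process a whole range of IDs, up to the returned limit, without checking again.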

1.5.1 Built-in consumer operations

We provide the following built-in consumer operations:

struct vrt_consumer* vrt_consumer_new(const char *name, struct vrt_queue *q)
    Allocate a new consumer to process value instances from a queue.

void vrt_consumer_free(struct vrt_consumer *c)
    Free a consumer.

#define vrt_consumer_add_dependency(c1, c2)
    Add a consumer dependency c2 to c1.

int vrt_consumer_next(struct vrt_consumer *c, struct vrt_value **value)
    Retrieve the next value from the consumer’s queue. If this function returns successfully, then value will be filled in with the next value in the queue. The caller then has full read access to the contents of that value. The value instance will only be valid until the next call to vrt_consumer_next. At that point, the queue is free to overwrite the contents of the value at will. Clients cannot save the value pointer to be used later on, since it will almost certainly be overwritten later on by a different value. The consumer’s client is responsible for extracting the desired contents and stashing them in another storage location before retrieving the next value.

void vrt_report_consumer(struct vrt_consumer *c)
    Print statistics about the consumer’s batches and yields to standard output.

1.6 Yield strategies

Each producer and consumer must yield to other disruptor queue clients when an operation will not immediately succeed. This prevents overwriting of values and gives slower queue clients an opportunity to catch up. Custom yield strategies must implement the following interface:

struct vrt_yield_strategy

    int (*yield)(struct vrt_yield_strategy *self, bool first, const char *queue_name, const char *name)
        Yields control to other producer and consumer clients of the disruptor queue.

    void (*free)(struct vrt_yield_strategy *self)
        Frees allocated resources associated with this yield strategy.

Varon-T has three built-in yielding strategies:

struct vrt_yield_strategy* vrt_yield_strategy_spin_wait(void)
    This simple yield strategy spins in a loop until the queue operation can succeed. This yield strategy requires that each producer and consumer client execute in a separate thread.

struct vrt_yield_strategy* vrt_yield_strategy_threaded(void)
    This yield strategy uses a short spin-loop before yielding to other threads. It also requires that each queue client execute in a separate thread.

struct vrt_yield_strategy* vrt_yield_strategy_hybrid(void)
    This strategy yields to other coroutines in the same thread for an initial number of wait cycles. It then uses progressively more intense yield loops.
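The general shape of a “spin briefly, then yield the thread” strategy can be sketched as follows. This is not the implementation of any built-in strategy: the struct sketch_yield, the function spin_then_yield, and the spin_limit field are hypothetical, the interface is simplified to two arguments, and POSIX sched_yield() is assumed to be available.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical, simplified yield strategy. */
struct sketch_yield {
    int  (*yield)(struct sketch_yield *self, bool first);
    unsigned int  spins_left;   /* busy-wait attempts remaining */
    unsigned int  spin_limit;   /* busy-wait attempts per blocked operation */
    unsigned int  os_yields;    /* how often we gave up the CPU */
};

/* Called each time a queue operation cannot proceed yet. `first` is
 * true on the first call for a given blocked operation. */
static int
spin_then_yield(struct sketch_yield *self, bool first)
{
    assert(self != NULL);
    if (first) {
        self->spins_left = self->spin_limit;
    }
    if (self->spins_left > 0) {
        /* Cheap path: return at once so the caller retries immediately. */
        self->spins_left--;
        return 0;
    }
    /* Expensive path: let the scheduler run another thread. */
    self->os_yields++;
    sched_yield();
    return 0;
}
```

The caller retries its queue operation after every call. Spinning keeps latency low when the wait is short, and yielding avoids burning a CPU core when the wait is long.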

1.7 Example: Summing integers

This example uses a Varon-T disruptor queue to produce one million integers and sum them up. The queue uses a single producer and a single consumer, each using the vrt_yield_strategy_threaded() described in yield strategies. Recall that Varon-T depends on the libcork library, which is included in the following block. #include #include #include

#include #include #include

The first coding task is to define the integer value and value type based on vrt_value and vrt_value_type. The following code demonstrates “subclassing” vrt_value as an embedded C struct. The value type, however, is provided as a static instance of vrt_value_type, which holds pointers to the two functions responsible for allocating and deallocating value instances. vrt_integer_value_type() is a helper function for accessing the static value type _vrt_integer_value_type.


    /* -----------------------------------------------------------------------
     * Integer value and type
     */

    struct vrt_integer_value {
        struct vrt_value  parent;
        int32_t  value;
    };

    /* Allocate a new integer value and return its embedded parent. */
    static struct vrt_value *
    vrt_integer_value_new(const struct vrt_value_type *type)
    {
        struct vrt_integer_value  *self = cork_new(struct vrt_integer_value);
        return &self->parent;
    }

    /* Recover the containing integer value from its parent and free it. */
    static void
    vrt_integer_value_free(const struct vrt_value_type *type,
                           struct vrt_value *vself)
    {
        struct vrt_integer_value  *iself =
            cork_container_of(vself, struct vrt_integer_value, parent);
        free(iself);
    }

    /* Designated initializers leave type_id zeroed in this example. */
    static struct vrt_value_type  _vrt_integer_value_type = {
        .new_value = vrt_integer_value_new,
        .free_value = vrt_integer_value_free
    };

    static struct vrt_value_type *
    vrt_integer_value_type(void)
    {
        return &_vrt_integer_value_type;
    }

An integer generator is straightforward to implement: it holds a pointer to a producer and a field that stores the number of integers to generate. The for loop simply iterates over the number of integers to produce, claims a value instance from the queue, populates the value field of the value instance with the current integer, and publishes the value instance back to the queue. After all integers are published, the generator pushes an EOF signal to the queue to indicate that it has finished.

    /* -----------------------------------------------------------------------
     * Integer producer
     */

    struct integer_generator {
        struct vrt_producer  *p;
        int64_t  count;
    };

    void *
    generate_integers(void *ud)
    {
        struct integer_generator  *c = ud;
        int32_t  i;
        for (i = 0; i < c->count; i++) {
            struct vrt_value  *vvalue;
            struct vrt_integer_value  *ivalue;
            /* Claim a slot, fill it in, and hand it to the consumers. */
            rpi_check(vrt_producer_claim(c->p, &vvalue));
            ivalue = cork_container_of
                (vvalue, struct vrt_integer_value, parent);
            ivalue->value = i;
            rpi_check(vrt_producer_publish(c->p));
        }
        /* Tell the consumers that no more values will arrive. */
        rpi_check(vrt_producer_eof(c->p));
        return NULL;
    }

A summing consumer is just as straightforward to implement as the generator. A “summer” is composed of a consumer client and a field for tracking the sum. The consumer iterates over the available value instances in the queue until an EOF is encountered. The value from each value instance is added to the current sum.

    /* -----------------------------------------------------------------------
     * Integer consumer
     */

    struct integer_summer {
        struct vrt_consumer  *c;
        int64_t  *sum;
    };

    void *
    sum_integers(void *ud)
    {
        int  rc;
        struct integer_summer  *c = ud;
        struct vrt_value  *vvalue;
        int64_t  sum = 0;
        /* Drain values until the producer signals EOF. */
        while ((rc = vrt_consumer_next(c->c, &vvalue)) != VRT_QUEUE_EOF) {
            if (rc == 0) {
                struct vrt_integer_value  *ivalue =
                    cork_container_of(vvalue, struct vrt_integer_value, parent);
                sum += ivalue->value;
            }
        }
        if (rc == VRT_QUEUE_EOF) {
            *c->sum = sum;
        }
        return NULL;
    }

In this example, each queue client (producer and consumer) executes in a separate thread. The vrt_queue_client structure wraps a queue client so that vrt_queue_threaded() can run any client generically; it is shown here as a design pattern. The critical steps are thread management (create and join) and configuration of the appropriate yield strategies for producers and consumers.

    /* -----------------------------------------------------------------------
     * Threaded queue
     */

    struct vrt_queue_client {
        void *(*run)(void *);
        void  *ud;
    };

    int
    vrt_queue_threaded(struct vrt_queue *q, struct vrt_queue_client *clients)
    {
        size_t  i;
        size_t  client_count = 0;
        struct vrt_queue_client  *client;
        /* The client list is terminated by an entry whose run is NULL. */
        for (client = clients; client->run != NULL; client++) {
            client_count++;
        }

        pthread_t  *tids;
        tids = cork_calloc(client_count, sizeof(pthread_t));

        /* Choose a yield strategy */
        for (i = 0; i < cork_array_size(&q->producers); i++) {
            struct vrt_producer  *p = cork_array_at(&q->producers, i);
            p->yield = vrt_yield_strategy_threaded();
        }

        for (i = 0; i < cork_array_size(&q->consumers); i++) {
            struct vrt_consumer  *c = cork_array_at(&q->consumers, i);
            c->yield = vrt_yield_strategy_threaded();
        }

        /* Create the client threads */
        for (i = 0; i < client_count; i++) {
            pthread_create(&tids[i], NULL, clients[i].run, clients[i].ud);
        }

        /* Wait for every client to finish */
        for (i = 0; i < client_count; i++) {
            pthread_join(tids[i], NULL);
        }

        free(tids);
        return 0;
    }

The main function drives the disruptor queue. After successful allocation of the queue, producer, and consumer, the generator and summer are configured and added to the queue as its clients. Recall that each application client (generator and summer) has an embedded queue-specific client (producer and consumer, respectively). The disruptor queue is invoked through the call to vrt_queue_threaded(). Note that result corresponds to sum in integer_summer.

    int
    main(int argc, const char **argv)
    {
        struct vrt_queue  *q;
        struct vrt_producer  *p;
        struct vrt_consumer  *c;
        int64_t  result;
        size_t  QUEUE_SIZE = 64;

        /* Note that the parameter for queue size is a power of 2. */
        rip_check(q = vrt_queue_new("queue_sum", vrt_integer_value_type(),
                                    QUEUE_SIZE));
        rip_check(p = vrt_producer_new("generator", 4, q));
        rip_check(c = vrt_consumer_new("summer", q));

        struct integer_generator  integer_generator = {
            p, 1000000
        };

        struct integer_summer  integer_summer = {
            c, &result
        };

        /* The client list is terminated by a NULL entry. */
        struct vrt_queue_client  clients[] = {
            { generate_integers, &integer_generator },
            { sum_integers, &integer_summer },
            { NULL, NULL }
        };

        rii_check(vrt_queue_threaded(q, clients));

        fprintf(stdout, "Result: %" PRId64 "\n", result);
        vrt_queue_free(q);
        return 0;
    }

