Piconet

Shared state and LMDB

I’ve been playing with LMDB for an internal project to handle shared state. These are my notes.

Our use-case

We have a large process with numerous worker threads handling inbound and outbound connections. This process offers no general-use shared memory space in which we can store information on one connection for reference by another. Practically, each worker thread is its own little universe. Like everything in life, this is a problem of sharing state.

Writes are the majority of transactions (in fact, most data written will never be read), and reads are the minority. Both happen inside a blocking process. Because of these two points, reads must fail quickly and writes must fail immediately.

Previously, we’ve used solutions like etcd to provide a distributed, fault-tolerant session store. I’ve discounted etcd from the beginning, as the service we’re enhancing doesn’t support JSON or protobufs, so we’d be spending a lot of time spawning curl, and then jq. That would be fine for reads, but not writes.

On the back of that, I’d like any solution to be either embeddable or stateless. I don’t want another running process that I need to check is behaving. Start, write, exit. Start, read, exit. And if that process dies, the only side effect is that the data it was trying to write is lost. No table checking à la InnoDB/Aria.

Embeddable or stateless
Fast-starting
Fast-acting
Fast-failing
Plaintext keys and values
Automatic record expiration

Why not MySQL

Like most services, the service in question does support MySQL. I decided against using it in this instance for three reasons:

In-memory tables are limited to max_heap_table_size. This is configurable globally, not per-table.
While MySQL would otherwise work here, something far lighter would be usable in many more use-cases.

The solution at hand

lmdb-store is a collection of three shoddy applications written in C. All configuration is done in common.h, including the location of the datastore on disk. Using a disk is optional: in production we use a small tmpfs mount that is slightly larger than the specified LMDB store size. If no datastore exists at the configured location, one will be created.

The applications are as follows:

lmset writes to the datastore. If the store doesn’t exist, it creates it. When writing a KVP to the store, it also writes a second key with the prefix “ttl-” and an epoch timestamp as its value; this is used by lmsweep. If a duplicate key is added to the store, the old value and ttl are replaced. If the store is full, the write fails. If the store is locked by another process, the write blocks (enforce a timeout externally). If the write fails, lmset exits.
lmget fetches from the datastore by key. If the key doesn’t exist, or the datastore doesn’t exist, it exits. If the store is locked by another process, it blocks. Again, enforce a timeout on the process if this is an issue.
lmsweep takes a timestamp as an argument (in epoch form) and deletes from the store all records (and their matching ttl- keys) whose ttl- value is prior to that stamp. As per LMDB’s design, deleted records are only removed from disk when overwritten. For example, to delete all records older than an hour using GNU date:

./lmsweep $(date -d "now-1 hour" +%s)
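The ttl- bookkeeping that lmset and lmsweep share can be sketched in plain C, without touching LMDB itself. This is only an illustration of the convention described above, not the actual lmdb-store code: the function names make_ttl_key, format_epoch and is_expired are hypothetical, and it assumes the epoch is stored as a NUL-terminated decimal string (LMDB values carry an explicit length, so a real implementation would need to copy and terminate the value first).

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define TTL_PREFIX "ttl-" /* prefix named in the text above */

/* Build the companion key "ttl-<key>" into out.
 * Returns 0 on success, -1 if out is too small. (Hypothetical helper.) */
int make_ttl_key(const char *key, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "%s%s", TTL_PREFIX, key);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Render an epoch timestamp as plaintext decimal, matching the
 * "plaintext keys and values" requirement. (Hypothetical helper.) */
int format_epoch(long long epoch, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "%lld", epoch);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Decide whether a stored ttl- value is strictly before the cutoff
 * passed to the sweep. Returns 1 if expired, 0 if not, and -1 if the
 * value is not a clean decimal number. (Hypothetical helper; assumes
 * ttl_value is NUL-terminated.) */
int is_expired(const char *ttl_value, long long cutoff)
{
    char *end = NULL;
    errno = 0;
    long long stamp = strtoll(ttl_value, &end, 10);
    if (errno != 0 || end == ttl_value || *end != '\0')
        return -1;
    return stamp < cutoff ? 1 : 0;
}
```

In a sweep, the cutoff would be the timestamp given on the command line, and each ttl- value read from the store would be passed through is_expired; treating unparseable values as an error (rather than as expired) is a design choice made for this sketch.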