Learn Zig Series (#40) - Key-Value Store: In-Memory Store

Project B: Key-Value Store (1/4)

What will I learn

You will learn designing the key-value store API: get, put, delete, list;
You will learn using std.StringHashMap for the storage backend;
You will learn handling value ownership: who owns the allocated key and value bytes;
You will learn TTL (time-to-live) support with expiration timestamps;
You will learn bulk operations: multi-get, multi-put;
You will learn memory usage tracking: counting bytes stored;
You will learn snapshot: serializing the entire store to a byte buffer;
You will learn comprehensive testing with the testing allocator for leak detection.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
An installed Zig 0.14+ distribution (download from ziglang.org);
The ambition to learn Zig programming.

Difficulty

Intermediate

Curriculum (of the `Learn Zig Series`):

Learn Zig Series (#40) - Key-Value Store: In-Memory Store

New project time! ;-)

We just wrapped up the Markdown-to-HTML converter in episodes 37, 38, and 39 -- tokenizer, parser, renderer, a full pipeline from raw text to HTML. Good stuff, but it was fundamentally a text processing project. Now we're going to build something very different: a key-value store. Think Redis, think Memcached, think LevelDB. A system where you shove data in with a key and get it back later by that same key. Simple concept, deep implementation.

This is Project B, and we're splitting it across four episodes. This first episode (the one you're reading) builds the in-memory store -- the core data structure that holds everything. Next time we'll add a write-ahead log so data survives crashes. After that, a TCP server so other programs can talk to our store over the network. And finaly, a client library with benchmarks to measure how fast this thing actually is.

Why a key-value store? Because it's one of those projects where you genuinely need almost everything we've covered in this series. Hash maps (episode 22), memory management (episode 7), testing (episode 12), file I/O (episode 10), networking (episode 21) -- it all comes together here. And the problem domain is dead simple to understand (store bytes, retrieve bytes) which means we can focus on the implementation challenges rather than wrestling with domain complexity.

Designing the store API

Before writing any code, let's think about what operations our store needs to support. At minimum:

put(key, value) -- store a value under a key (overwrite if it already exists)
get(key) -- retrieve the value for a key (or null if it doesn't exist)
delete(key) -- remove a key and its value
count() -- how many entries are stored
contains(key) -- check if a key exists without fetching the value

That's the basic CRUD-minus-update (put handles updates by overwriting). But we want a few more things to make it interesting:

TTL support -- keys can expire after a specified number of seconds
Multi-get and multi-put -- batch operations for efficiency
Memory tracking -- know how many bytes we're using
Snapshot -- serialize the entire store to a byte buffer

The API design is deliberately simple. Every production KV store starts here and then adds complexity -- transactions, range queries, secondary indexes, replication. We're not building Postgres. We're building the foundation that everything else sits on.

Here's the public interface as a Zig struct:

const std = @import("std");

pub const Entry = struct {
    value: []u8,
    expires_at: ?i64,  // null means no expiration
    created_at: i64,
};

pub const KvStore = struct {
    allocator: std.mem.Allocator,
    map: std.StringHashMap(Entry),
    total_bytes: usize,

    pub fn init(allocator: std.mem.Allocator) KvStore {
        return .{
            .allocator = allocator,
            .map = std.StringHashMap(Entry).init(allocator),
            .total_bytes = 0,
        };
    }

    pub fn deinit(self: *KvStore) void {
        var iter = self.map.iterator();
        while (iter.next()) |entry| {
            self.allocator.free(entry.key_ptr.*);
            self.allocator.free(entry.value_ptr.*.value);
        }
        self.map.deinit();
    }

    pub fn count(self: *const KvStore) usize {
        return self.map.count();
    }

    pub fn bytesUsed(self: *const KvStore) usize {
        return self.total_bytes;
    }
};

A couple of things to notice here. The Entry struct holds the raw value bytes, an optional expiration timestamp, and a creation timestamp. We store ?i64 for expires_at -- null means the key lives forever, a value means it expires at that Unix timestamp. The created_at field is useful for diagnostics and for the snapshot format we'll build later.

The KvStore itself holds an allocator, the hash map, and a running byte counter. That byte counter is important -- in a real system you'd use it to enforce memory limits ("don't let the store grow past 512 MB") and to report usage metrics. We track it incrementally rather than computing it on demand, because scanning every entry to sum their sizes would be O(n) every time someone asks.

Using StringHashMap for the storage backend

We talked about hash maps back in episode 22, and std.StringHashMap is the right pick here. It's a hash map where keys are []const u8 slices -- which is exactly what string keys are in Zig. The type parameter is the value type, so StringHashMap(Entry) gives us string keys mapping to Entry structs.

Why not std.AutoHashMap or a custom hash function? StringHashMap already uses Zig's built-in hashString function under the hood (which is wyhash -- fast, good distribution, not cryptographic). For a KV store where keys are arbitrary byte strings, this is exactly what we want. If we needed integer keys or struct keys, we'd go with AutoHashMap, but string keys are the standard for this kind of store.

The hash map handles resizing automatically. When the load factor gets too high, it allocates a bigger backing array and rehashes everything. We don't need to manage that -- the allocator we pass to init handles the actual memory allocation, and StringHashMap handles the growth strategy. This is one of the nice things about Zig's standard library: the data structure and the allocation strategy are cleanly separated. You can use a general purpose allocator, an arena, a fixed-buffer allocator -- the hash map doesn't care.

One thing to be aware of: StringHashMap does NOT copy keys or values. It stores the slices you give it. If you pass in a key that points to stack memory and then the stack frame returns... you've got a dangling pointer. That's why we need to think carefully about ownership, which is our next topic.

Handling value ownership

This is the part where Zig forces you to think clearly about something that most languages sweep under the rug. Who owns the memory for keys and values?

In our store, the answer is: the store owns everything. When you put a key-value pair, the store makes its own copies of both the key and the value. When you delete a key, the store frees both copies. When you deinit the store, it frees everything that's left. The caller never has to worry about lifetime -- hand your data to the store, and it's the store's problem from that point on.

This is the simplest ownership model and the one that causes the fewest surprises. The alternative -- borrowing, where the store holds references to caller-owned data -- is faster (no copies) but requires the caller to guarantee that the data outlives the store. For an in-memory cache that's fine, but for a persistent store that might outlive any individual request? Owned copies are safer.

Here's the put implementation:

pub fn put(self: *KvStore, key: []const u8, value: []const u8) !void {
    return self.putWithTtl(key, value, null);
}

pub fn putWithTtl(self: *KvStore, key: []const u8, value: []const u8, ttl_seconds: ?u32) !void {
    const now = std.time.timestamp();

    const expires_at: ?i64 = if (ttl_seconds) |ttl| now + @as(i64, @intCast(ttl)) else null;

    // Check if key already exists -- if so, free the old value
    if (self.map.getPtr(key)) |existing| {
        self.total_bytes -= existing.value.len;
        self.allocator.free(existing.value);

        // Copy new value
        const value_copy = try self.allocator.dupe(u8, value);
        existing.* = .{
            .value = value_copy,
            .expires_at = expires_at,
            .created_at = now,
        };
        self.total_bytes += value_copy.len;
        return;
    }

    // New key -- copy both key and value
    const key_copy = try self.allocator.dupe(u8, key);
    errdefer self.allocator.free(key_copy);

    const value_copy = try self.allocator.dupe(u8, value);
    errdefer self.allocator.free(value_copy);

    try self.map.put(key_copy, .{
        .value = value_copy,
        .expires_at = expires_at,
        .created_at = now,
    });

    self.total_bytes += key_copy.len + value_copy.len;
}

Let me walk through the logic. First we grab the current timestamp and compute the expiration time (if a TTL was given). Then we check if the key already exists with getPtr -- this returns a pointer to the existing entry if found, or null. If the key exists, we only need to replace the value (the key is already stored in the map, no need to copy it again). We free the old value bytes, copy the new value, update the entry in place, and adjust the byte counter.

If the key is new, we copy both the key and value. Notice the errdefer on key_copy -- if allocator.dupe fails for the value, or if map.put fails (hash map resize OOM), we need to free the key copy we already made. This is classic Zig error handling: errdefer runs only when an error propagates out of the current scope. If everything succeeds, the errdefer is cancelled and the store now owns both copies.

Having said that, the update path is the one most people get wrong. The natural instinct is to remove the old entry and put the new one, but that's wasteful -- you'd free the key just to immediately allocate an identical copy. Using getPtr to modify in-place avoids that overhead. It's a small optimization, but in a hot loop doing millions of updates per second, it adds up.

The get and delete operations

Get is straightforward -- look up the key, check if the entry has expired, return the value or null:

pub fn get(self: *KvStore, key: []const u8) ?[]const u8 {
    const entry = self.map.get(key) orelse return null;

    // Check expiration
    if (entry.expires_at) |expires| {
        if (std.time.timestamp() >= expires) {
            // Expired -- lazy delete
            self.removeEntry(key);
            return null;
        }
    }

    return entry.value;
}

pub fn contains(self: *KvStore, key: []const u8) bool {
    return self.get(key) != null;
}

We're using lazy expiration here. When someone asks for a key, we check its TTL right then. If it's expired, we delete it and pretend it never existed. The alternative is active expiration -- running a background sweep that periodically cleans up expired keys. Active expiration is better for memory usage (dead keys get cleaned up even if nobody asks for them), but lazy expiration is simpler and works fine when you have a steady read pattern. Redis actually uses both: lazy expiration on reads plus a randomized background sweep. We'll keep it simple with lazy-only for now.

Delete also needs some care:

fn removeEntry(self: *KvStore, key: []const u8) void {
    const kv = self.map.fetchRemove(key) orelse return;
    self.total_bytes -= kv.key.len + kv.value.value.len;
    self.allocator.free(kv.key);
    self.allocator.free(kv.value.value);
}

pub fn delete(self: *KvStore, key: []const u8) bool {
    if (self.map.contains(key)) {
        self.removeEntry(key);
        return true;
    }
    return false;
}

The fetchRemove method on StringHashMap is the key here -- it removes the entry AND returns the key-value pair so we can free both allocations. If we used plain remove, we'd lose the key pointer and leak that memory. fetchRemove gives us ownership of the removed key slice so we can hand it back to the allocator.

Notice that removeEntry is a private helper (no pub). The public API is delete, which returns a bool indicating whether the key actually existed. The private removeEntry is also called from get (for lazy expiration cleanup) and from deinit (to clean up everything).

TTL support with expiration timestamps

We already saw the TTL plumbing in putWithTtl and get. Let's look at the helper for listing keys that also respects expiration:

pub fn listKeys(self: *KvStore, buffer: [][]const u8) usize {
    const now = std.time.timestamp();
    var i: usize = 0;
    var iter = self.map.iterator();

    while (iter.next()) |entry| {
        if (i >= buffer.len) break;

        // Skip expired entries
        if (entry.value_ptr.*.expires_at) |expires| {
            if (now >= expires) continue;
        }

        buffer[i] = entry.key_ptr.*;
        i += 1;
    }
    return i;
}

The caller passes in a buffer (a slice of slice pointers) and we fill it with keys that haven't expired. We return the actual count of keys written. This design avoids allocation in the listing function itself -- the caller decides how big a buffer to provide. If the store has 1000 keys and the caller provides a buffer of 50, they get the first 50 non-expired keys. If they want all of them, they can use count() to size the buffer first.

Why not return an ArrayList of keys? Because that would allocate, and in a high-performance store you want to minimize allocations in the hot path. The buffer-based approach is a common pattern in C and Zig APIs: the caller controls the memory, the function just fills it in. You might recognize this pattern from read() system calls -- you pass a buffer, the kernel fills it.

One subtletly: we don't actually delete expired keys during listing. We just skip them. A full implementation might want to clean them up too, but modifying the hash map while iterating it is tricky -- you'd need to collect the expired keys first and then remove them in a second pass. For our purposes, the lazy cleanup in get is good enough. Expired keys will get cleaned up next time someone tries to read them.

Bulk operations: multi-get, multi-put

When your store is behind a network layer (which it will be in a couple of episodes), round-trip latency dominates. Sending 100 individual get requests over TCP takes 100 round trips. Sending one multi-get with 100 keys takes one round trip. That's why every KV store worth its salt supports batch operations.

pub const GetResult = struct {
    key: []const u8,
    value: ?[]const u8,
};

pub fn multiGet(self: *KvStore, keys: []const []const u8, results: []GetResult) usize {
    const len = @min(keys.len, results.len);
    for (0..len) |i| {
        results[i] = .{
            .key = keys[i],
            .value = self.get(keys[i]),
        };
    }
    return len;
}

pub fn multiPut(self: *KvStore, keys: []const []const u8, values: []const []const u8) !usize {
    const len = @min(keys.len, values.len);
    var succeeded: usize = 0;

    for (0..len) |i| {
        self.put(keys[i], values[i]) catch |err| {
            // If a put fails (OOM), stop here and return how many succeeded
            if (err == error.OutOfMemory) return succeeded;
            return err;
        };
        succeeded += 1;
    }
    return succeeded;
}

multiGet is trivial -- just loop through keys and call get for each one. The results go into a caller-provided buffer, same pattern as listKeys.

multiPut is more interesting because of error handling. If one of the puts fails (most likely due to OOM during the value copy or hash map resize), we need to decide: do we roll back everything, or do we keep what succeeded? We keep what succeeded. This is the "best effort" approach -- partial success is better than total failure when you're inserting independent key-value pairs. The caller gets back the count of how many pairs actually went in, so they know exactly what happened.

A transactional store would do it differently -- either all inserts succeed or none of them do. But transactions add enormous complexity (you need rollback logic, temporary buffers, two-phase commit for distributed stores...) and we're keeping things practical here. Redis MSET also works this way: all-or-nothing at the connection level, but individual key failures within a pipeline are reported individually.

Memory usage tracking

We've been tracking total_bytes all along -- incrementing on puts, decrementing on deletes. But there's a subtlety: what exactly are we counting? Just the key and value bytes? Or also the overhead from the hash map itself?

pub fn bytesUsed(self: *const KvStore) usize {
    return self.total_bytes;
}

pub fn statsReport(self: *const KvStore) Stats {
    return .{
        .entry_count = self.map.count(),
        .data_bytes = self.total_bytes,
        .capacity = self.map.capacity(),
        .load_factor = if (self.map.capacity() > 0)
            @as(f64, @floatFromInt(self.map.count())) / @as(f64, @floatFromInt(self.map.capacity()))
        else
            0.0,
    };
}

pub const Stats = struct {
    entry_count: usize,
    data_bytes: usize,
    capacity: usize,
    load_factor: f64,
};

We track total_bytes as the sum of all key lengths and all value lengths. That's the "useful data" size. The hash map's internal overhead (the bucket array, the metadata bytes per slot) is NOT included -- that's implementation detail that changes as the map resizes. But we expose capacity and load_factor in the stats report so you can see how much overhead the map is adding.

In production, you'd want both numbers. The "data bytes" tells you how much actual content is stored. The "total memory" (data + map overhead) tells you how much RAM you're actually consuming. The hash map overhead is roughly capacity * (sizeof(key_ptr) + sizeof(value) + 1 byte metadata) -- for our entry type that's maybe 40-50 bytes per slot. With a load factor around 0.8, you're looking at 20-25% overhead for the map structure itself.

Snapshot: serializing the store to a byte buffer

A snapshot takes the current state of the store and writes it out as a contiguous byte sequence. This is useful for persistence (write the snapshot to disk), replication (send it to another node), and debugging (dump the store contents).

Our format is deliberately simple. No compression, no versioning, no framing beyond what's needed:

pub fn snapshot(self: *KvStore, allocator: std.mem.Allocator) ![]u8 {
    const now = std.time.timestamp();
    var buffer = std.ArrayList(u8).init(allocator);
    errdefer buffer.deinit();

    // Header: magic bytes + entry count
    try buffer.appendSlice("ZIGKV001");
    try appendU32(&buffer, @intCast(self.map.count()));

    var iter = self.map.iterator();
    while (iter.next()) |entry| {
        // Skip expired entries in the snapshot
        if (entry.value_ptr.*.expires_at) |expires| {
            if (now >= expires) continue;
        }

        const key = entry.key_ptr.*;
        const val = entry.value_ptr.*;

        // Key: length (4 bytes) + bytes
        try appendU32(&buffer, @intCast(key.len));
        try buffer.appendSlice(key);

        // Value: length (4 bytes) + bytes
        try appendU32(&buffer, @intCast(val.value.len));
        try buffer.appendSlice(val.value);

        // Metadata: created_at (8 bytes) + has_expiry (1 byte) + expires_at (8 bytes if present)
        try appendI64(&buffer, val.created_at);
        if (val.expires_at) |exp| {
            try buffer.append(1);
            try appendI64(&buffer, exp);
        } else {
            try buffer.append(0);
        }
    }

    return try buffer.toOwnedSlice();
}

fn appendU32(buffer: *std.ArrayList(u8), val: u32) !void {
    var bytes: [4]u8 = undefined;
    std.mem.writeInt(u32, &bytes, val, .little);
    try buffer.appendSlice(&bytes);
}

fn appendI64(buffer: *std.ArrayList(u8), val: i64) !void {
    var bytes: [8]u8 = undefined;
    std.mem.writeInt(i64, &bytes, val, .little);
    try buffer.appendSlice(&bytes);
}

The format starts with a magic string ZIGKV001 (so we can identify the format and version), followed by the entry count as a little-endian u32. Then each entry is: key length + key bytes + value length + value bytes + timestamps. Expired entries are filtered out during snapshot -- no point persisting dead data.

We use std.mem.writeInt for encoding integers in little-endian format. This is important for portability -- if you just cast an integer pointer to a byte pointer and copy the bytes, you get whatever endianness your CPU uses. x86 is little-endian so it'd work by accident on most machines, but explicitly specifying the byte order means our snapshots are portable across architectures. We touched on this kind of binary encoding back when we discussed memory layout in episode 8.

Now the counterpart -- restoring a store from a snapshot:

pub fn loadSnapshot(allocator: std.mem.Allocator, data: []const u8) !KvStore {
    var store = KvStore.init(allocator);
    errdefer store.deinit();

    if (data.len < 12) return error.InvalidSnapshot;

    // Verify magic
    if (!std.mem.eql(u8, data[0..8], "ZIGKV001")) return error.InvalidSnapshot;

    const entry_count = std.mem.readInt(u32, data[8..12], .little);
    var pos: usize = 12;

    for (0..entry_count) |_| {
        // Read key
        if (pos + 4 > data.len) return error.InvalidSnapshot;
        const key_len = std.mem.readInt(u32, data[pos..][0..4], .little);
        pos += 4;
        if (pos + key_len > data.len) return error.InvalidSnapshot;
        const key = data[pos..][0..key_len];
        pos += key_len;

        // Read value
        if (pos + 4 > data.len) return error.InvalidSnapshot;
        const val_len = std.mem.readInt(u32, data[pos..][0..4], .little);
        pos += 4;
        if (pos + val_len > data.len) return error.InvalidSnapshot;
        const value = data[pos..][0..val_len];
        pos += val_len;

        // Read metadata
        if (pos + 8 > data.len) return error.InvalidSnapshot;
        const created_at = std.mem.readInt(i64, data[pos..][0..8], .little);
        pos += 8;

        if (pos + 1 > data.len) return error.InvalidSnapshot;
        const has_expiry = data[pos];
        pos += 1;

        var expires_at: ?i64 = null;
        if (has_expiry == 1) {
            if (pos + 8 > data.len) return error.InvalidSnapshot;
            expires_at = std.mem.readInt(i64, data[pos..][0..8], .little);
            pos += 8;
        }

        // Insert with owned copies
        const key_copy = try allocator.dupe(u8, key);
        errdefer allocator.free(key_copy);

        const val_copy = try allocator.dupe(u8, value);
        errdefer allocator.free(val_copy);

        try store.map.put(key_copy, .{
            .value = val_copy,
            .expires_at = expires_at,
            .created_at = created_at,
        });
        store.total_bytes += key_copy.len + val_copy.len;
    }

    return store;
}

The load function is essentially the reverse of snapshot. Read the magic bytes, read the entry count, then loop through entries decoding the same length-prefixed format. Every field gets bounds-checked before we read it -- if the data is truncated or corrupted, we return error.InvalidSnapshot rather than reading garbage. This is defensive parsing, and it's important for any serialization format. You never know what might happen to the data between being written and being read (disk corruption, partial transfer, someone editing the file by hand...).

The errdefer chain in the insert section guarantees cleanup if we fail partway through loading. If the 500th entry fails to allocate, all 499 previously loaded entries are still owned by the store, and the errdefer store.deinit() at the top will clean them all up. This is one of those cases where Zig's error handling model really shines -- you get correctness without writing explicit rollback code.

Comprehensive testing

Testing a KV store is satiesfying because the operations are simple enough that you can exhaustively verify them. Here's our test suite:

const std = @import("std");
const KvStore = @import("kv_store.zig").KvStore;

test "put and get basic" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    try store.put("hello", "world");
    try store.put("foo", "bar");

    try std.testing.expectEqualStrings("world", store.get("hello").?);
    try std.testing.expectEqualStrings("bar", store.get("foo").?);
    try std.testing.expect(store.get("missing") == null);
    try std.testing.expectEqual(@as(usize, 2), store.count());
}

test "put overwrites existing key" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    try store.put("key", "value1");
    try std.testing.expectEqualStrings("value1", store.get("key").?);

    try store.put("key", "value2");
    try std.testing.expectEqualStrings("value2", store.get("key").?);

    // Count should still be 1 -- we updated, not added
    try std.testing.expectEqual(@as(usize, 1), store.count());
}

test "delete removes entry" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    try store.put("key", "value");
    try std.testing.expect(store.delete("key"));
    try std.testing.expect(store.get("key") == null);
    try std.testing.expectEqual(@as(usize, 0), store.count());

    // Deleting non-existent key returns false
    try std.testing.expect(!store.delete("nope"));
}

test "byte tracking is accurate" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    try store.put("abc", "12345");  // 3 + 5 = 8 bytes
    try std.testing.expectEqual(@as(usize, 8), store.bytesUsed());

    try store.put("abc", "1");  // overwrite: 3 + 1 = 4 bytes
    try std.testing.expectEqual(@as(usize, 4), store.bytesUsed());

    try store.put("xyz", "99");  // add: 3 + 2 = 5 more
    try std.testing.expectEqual(@as(usize, 9), store.bytesUsed());

    _ = store.delete("abc");  // remove 4
    try std.testing.expectEqual(@as(usize, 5), store.bytesUsed());
}

The std.testing.allocator is our best friend here. It tracks every allocation and every free, and if any memory is leaked when the test ends, it reports the leak as a test failure. This means we don't need separate "leak detection" tests -- every single test is automatically a leak detection test. If our deinit function misses an allocation, or if put has an error path that leaks, the testing allocator catches it.

Let me add TTL tests:

test "ttl expiration" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    // Put with a TTL of 0 seconds -- should expire immediately
    try store.putWithTtl("temp", "data", 0);

    // The timestamp check uses >= so a TTL of 0 means
    // "expires at the current second" which is already past
    std.time.sleep(1 * std.time.ns_per_s);

    try std.testing.expect(store.get("temp") == null);
    // Lazy delete should have cleaned it up
    try std.testing.expectEqual(@as(usize, 0), store.count());
}

test "non-expired keys are returned" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    // TTL of 3600 seconds (1 hour) -- should NOT be expired
    try store.putWithTtl("persist", "data", 3600);
    try std.testing.expectEqualStrings("data", store.get("persist").?);
}

And the snapshot round-trip test:

test "snapshot and restore" {
    // Create a store with some data
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    try store.put("name", "scipio");
    try store.put("lang", "zig");
    try store.put("version", "0.14");

    // Take snapshot
    const snap = try store.snapshot(std.testing.allocator);
    defer std.testing.allocator.free(snap);

    // Restore into a new store
    var restored = try KvStore.loadSnapshot(std.testing.allocator, snap);
    defer restored.deinit();

    // Verify all data came through
    try std.testing.expectEqual(@as(usize, 3), restored.count());
    try std.testing.expectEqualStrings("scipio", restored.get("name").?);
    try std.testing.expectEqualStrings("zig", restored.get("lang").?);
    try std.testing.expectEqualStrings("0.14", restored.get("version").?);
}

test "invalid snapshot rejected" {
    const result = KvStore.loadSnapshot(std.testing.allocator, "garbage");
    try std.testing.expectError(error.InvalidSnapshot, result);
}

test "empty snapshot round-trip" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    const snap = try store.snapshot(std.testing.allocator);
    defer std.testing.allocator.free(snap);

    var restored = try KvStore.loadSnapshot(std.testing.allocator, snap);
    defer restored.deinit();

    try std.testing.expectEqual(@as(usize, 0), restored.count());
}

The snapshot round-trip test is the most valuable one in the suite. It exercises the entire serialization and deserialization pipeline end-to-end. If the snapshot format has a bug -- writing a field in the wrong order, encoding length wrong, missing a null byte -- this test catches it. It's the same philosophy we used for the Markdown converter's end-to-end tests: test the whole pipeline, not just individual functions.

And the multi-operation tests:

test "multi-get returns correct results" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    try store.put("a", "1");
    try store.put("b", "2");
    try store.put("c", "3");

    const keys = [_][]const u8{ "a", "missing", "c" };
    var results: [3]KvStore.GetResult = undefined;
    const got = store.multiGet(&keys, &results);

    try std.testing.expectEqual(@as(usize, 3), got);
    try std.testing.expectEqualStrings("1", results[0].value.?);
    try std.testing.expect(results[1].value == null);
    try std.testing.expectEqualStrings("3", results[2].value.?);
}

test "multi-put inserts all pairs" {
    var store = KvStore.init(std.testing.allocator);
    defer store.deinit();

    const keys = [_][]const u8{ "x", "y", "z" };
    const vals = [_][]const u8{ "10", "20", "30" };
    const inserted = try store.multiPut(&keys, &vals);

    try std.testing.expectEqual(@as(usize, 3), inserted);
    try std.testing.expectEqualStrings("10", store.get("x").?);
    try std.testing.expectEqualStrings("20", store.get("y").?);
    try std.testing.expectEqualStrings("30", store.get("z").?);
}

This test suite covers the core operations, edge cases (overwrite, delete non-existent, expired TTL), memory tracking accuracy, serialization round-trips, batch operations, and invalid input rejection. And every test runs with the testing allocator, so every test is also a memory safety test. That's a LOT of coverage for not that many lines of test code.

Putting it all together

Let's look at the final project structure:

kv-store/
  src/
    kv_store.zig     -- the KvStore struct, put/get/delete, TTL, snapshot
    kv_store_test.zig -- all tests
    main.zig          -- simple REPL for manual testing
  build.zig

And a quick build.zig:

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = "kv-store",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });
    b.installArtifact(exe);

    const run_cmd = b.addRunArtifact(exe);
    run_cmd.step.dependOn(b.getInstallStep());
    const run_step = b.step("run", "Run the KV store REPL");
    run_step.dependOn(&run_cmd.step);

    const tests = b.addTest(.{
        .root_source_file = b.path("src/kv_store_test.zig"),
        .target = target,
        .optimize = optimize,
    });
    const test_step = b.step("test", "Run all tests");
    test_step.dependOn(&b.addRunArtifact(tests).step);
}

And a tiny main.zig that serves as a REPL for manual testing:

const std = @import("std");
const KvStore = @import("kv_store.zig").KvStore;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var store = KvStore.init(allocator);
    defer store.deinit();

    const stdin = std.io.getStdIn().reader();
    const stdout = std.io.getStdOut().writer();

    try stdout.writeAll("zigkv> ");

    var line_buf: [1024]u8 = undefined;
    while (stdin.readUntilDelimiter(&line_buf, '\n')) |line| {
        const trimmed = std.mem.trim(u8, line, " \t\r");

        if (std.mem.eql(u8, trimmed, "quit") or std.mem.eql(u8, trimmed, "exit")) break;

        if (std.mem.startsWith(u8, trimmed, "put ")) {
            const rest = trimmed[4..];
            if (std.mem.indexOf(u8, rest, " ")) |sep| {
                const key = rest[0..sep];
                const value = rest[sep + 1 ..];
                store.put(key, value) catch |err| {
                    try stdout.print("error: {}\n", .{err});
                    continue;
                };
                try stdout.writeAll("OK\n");
            } else {
                try stdout.writeAll("usage: put  \n");
            }
        } else if (std.mem.startsWith(u8, trimmed, "get ")) {
            const key = trimmed[4..];
            if (store.get(key)) |val| {
                try stdout.print("{s}\n", .{val});
            } else {
                try stdout.writeAll("(nil)\n");
            }
        } else if (std.mem.startsWith(u8, trimmed, "del ")) {
            const key = trimmed[4..];
            if (store.delete(key)) {
                try stdout.writeAll("(1)\n");
            } else {
                try stdout.writeAll("(0)\n");
            }
        } else if (std.mem.eql(u8, trimmed, "count")) {
            try stdout.print("{d}\n", .{store.count()});
        } else if (std.mem.eql(u8, trimmed, "stats")) {
            const stats = store.statsReport();
            try stdout.print("entries: {d}\ndata bytes: {d}\ncapacity: {d}\nload factor: {d:.2}\n", .{
                stats.entry_count,
                stats.data_bytes,
                stats.capacity,
                stats.load_factor,
            });
        } else {
            try stdout.writeAll("commands: put   | get  | del  | count | stats | quit\n");
        }

        try stdout.writeAll("zigkv> ");
    } else |_| {}

    try stdout.writeAll("bye!\n");
}

Nothing fancy -- just a command loop that reads lines from stdin and dispatches to store operations. It's useful for playing around with the store manually, and it'll serve as a quick sanity check before we add the TCP server in a future episode. The command names (put, get, del) are deliberately similar to Redis -- no reason to invent a new vocabulary when a perfectly good one exists.

Wat we geleerd hebben

Designing a key-value store API with the fundamental operations (put, get, delete, count, contains) and useful extras (TTL, bulk ops, snapshots, memory tracking)
Using std.StringHashMap as the storage backend and understanding why it's the right choice for string-keyed stores -- wyhash, automatic resizing, clean separation from the allocator
Handling value ownership with a "store owns everything" model where both keys and values are duplicated on insert and freed on removal, with errdefer ensuring no leaks on error paths
TTL support via lazy expiration -- checking timestamps on read and silently deleting expired entries, the same approach Redis uses as its primary expiration mechanism
Bulk operations (multi-get, multi-put) that work on caller-provided buffers to avoid allocating in the hot path, with best-effort semantics for partial failures
Memory tracking by maintaining a running byte counter that's updated incrementally on every put, overwrite, and delete -- O(1) instead of scanning the whole map
Snapshot serialization using a simple length-prefixed binary format with explicit endianness (std.mem.writeInt with .little), magic bytes for format identification, and bounds checking on every read during deserialization
Comprehensive testing with std.testing.allocator catching both correctness bugs and memory leaks, including snapshot round-trip tests that exercise the entire serialization pipeline

This is a solid foundation. The in-memory store works, it's tested, and it handles the tricky ownership and lifetime questions that Zig makes you answer upfront. Having said that, right now all our data lives in RAM -- if the process crashes, everything is gone. That's where the write-ahead log comes in, and that's what we'll build next. The concept is simple: before every write operation, append a record to a file on disk. If the process crashes, replay the log to reconstruct the store. It's the same idea that databases have used for decades, and it's a great exercise in file I/O and crash recovery.

Bedankt en tot de volgende keer!

Hive account@scipio

Learn Zig Series (#40) - Key-Value Store: In-Memory Store

Learn Zig Series (#40) - Key-Value Store: In-Memory Store

What will I learn

Requirements

Difficulty

Curriculum (of the Learn Zig Series):

Learn Zig Series (#40) - Key-Value Store: In-Memory Store

Designing the store API

Using StringHashMap for the storage backend

Handling value ownership

The get and delete operations

TTL support with expiration timestamps

Bulk operations: multi-get, multi-put

Memory usage tracking

Snapshot: serializing the store to a byte buffer

Comprehensive testing

Putting it all together

Wat we geleerd hebben

Curriculum (of the `Learn Zig Series`):