Learn Zig Series (#44) - Image Tool: Reading and Writing PPM/BMP

Project C: Image Manipulation Tool (1/3)

What will I learn

You will learn the PPM image format: dead simple, human-readable, perfect for learning;
You will learn reading PPM files: parsing the header (P6, width, height, max value);
You will learn storing pixel data in a flat []u8 buffer (RGB triplets);
You will learn writing PPM files from a pixel buffer;
You will learn the BMP format: binary header, pixel rows, padding for alignment;
You will learn reading BMP files: handling the DIB header variants;
You will learn an Image struct that abstracts over both formats;
You will learn testing: create a small image programmatically, write it, read it back, compare pixels.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
An installed Zig 0.14+ distribution (download from ziglang.org);
The ambition to learn Zig programming.

Difficulty

Intermediate

New project time! After four episodes of key-value stores, TCP protocols, and network benchmarks, we're switching gears completely. Project C is an image manipulation tool -- a command-line program that reads image files, transforms pixels, and writes results back to disk. Over three episodes we'll build it from the ground up: reading and writing image formats (this episode), pixel operations like brightness, contrast, grayscale and blur (next), and finally a CLI pipeline that chains operations together.

Why images? Because image data is a near-perfect playground for systems programming in Zig. You're working with raw bytes in memory, you need precise control over byte ordering and alignment, and performance matters (a 4K image at 3840 x 2160 is about 8.3 million pixels -- roughly 25 MB of RGB data). Everything we've learned about slices (episode 5), memory management (episode 7), packed structs (episode 17), and file I/O (episode 10) comes together here.

We'll support two formats: PPM (stupidly simple, great for learning) and BMP (the classic Windows bitmap -- more complex, but the format you'll actually encounter in the real world). Let's start with PPM because it's the easiest image format I've ever seen. And I've seen quite a few formats over the years ;-)

PPM: the simplest image format that could possibly work

PPM stands for Portable Pixmap. It was created in the late 1980s as part of the Netpbm toolkit, and its entire design philosophy is "make it so simple that any programmer can implement a reader in 20 minutes." Mission accomplished -- the format is literally a text header followed by raw RGB bytes.

A PPM file (binary variant, P6) looks like this:

P6
640 480
255
<raw RGB bytes: 640 * 480 * 3 = 921,600 bytes>

That's it. P6 means "binary PPM" (there's also P3 which is ASCII PPM where each pixel value is written as a decimal number, but nobody uses that for real work because the files are enormous). The width and height are decimal numbers separated by whitespace. 255 is the maximum value per channel (almost always 255 for 8-bit color). After the newline following the max value, the rest of the file is raw pixel data -- red, green, blue, red, green, blue, repeating width * height times.

No compression, no alpha channel, no color profiles, no metadata. Just pixels. This is why graphics researchers love PPM -- you can debug image processing algorithms by hexdumping the output file and looking at individual bytes. Try doing that with a PNG ;-)
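To make the layout concrete, here is a minimal sketch of a complete 2x2 P6 file assembled as a byte array. The names header, pixels and tiny_ppm are made up for this illustration, and the colors are arbitrary:

```zig
const std = @import("std");

// Hypothetical example data: a complete 2x2 binary PPM, spelled out in full.
const header = "P6\n2 2\n255\n"; // magic, width and height, max value
const pixels = [_]u8{
    255, 0, 0, 0, 255, 0, // top row: red, green
    0, 0, 255, 255, 255, 255, // bottom row: blue, white
};
const tiny_ppm = header ++ pixels; // 11 header bytes + 2 * 2 * 3 pixel bytes

test "tiny ppm layout" {
    try std.testing.expectEqual(@as(usize, 23), tiny_ppm.len);
    try std.testing.expectEqualSlices(u8, "P6", tiny_ppm[0..2]);
}
```

Written to disk as-is, those 23 bytes should open in any viewer that understands P6.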

The Image struct

Before we write any format-specific code, let's define the struct that will represent an image in memory. Regardless of whether the file was PPM or BMP, once it's loaded we want the same in-memory representation:

const std = @import("std");

pub const Image = struct {
    width: u32,
    height: u32,
    pixels: []u8,
    allocator: std.mem.Allocator,

    pub fn init(allocator: std.mem.Allocator, width: u32, height: u32) !Image {
        const size = @as(usize, width) * @as(usize, height) * 3;
        const pixels = try allocator.alloc(u8, size);
        @memset(pixels, 0);

        return .{
            .width = width,
            .height = height,
            .pixels = pixels,
            .allocator = allocator,
        };
    }

    pub fn deinit(self: *Image) void {
        self.allocator.free(self.pixels);
    }

    pub fn getPixel(self: *const Image, x: u32, y: u32) struct { r: u8, g: u8, b: u8 } {
        const idx = (@as(usize, y) * @as(usize, self.width) + @as(usize, x)) * 3;
        return .{
            .r = self.pixels[idx],
            .g = self.pixels[idx + 1],
            .b = self.pixels[idx + 2],
        };
    }

    pub fn setPixel(self: *Image, x: u32, y: u32, r: u8, g: u8, b: u8) void {
        const idx = (@as(usize, y) * @as(usize, self.width) + @as(usize, x)) * 3;
        self.pixels[idx] = r;
        self.pixels[idx + 1] = g;
        self.pixels[idx + 2] = b;
    }

    pub fn pixelCount(self: *const Image) usize {
        return @as(usize, self.width) * @as(usize, self.height);
    }
};

Pixels are stored as a flat []u8 buffer in row-major order: the first width * 3 bytes are the top row (left to right), the next width * 3 bytes are the second row, and so on. Each pixel is three consecutive bytes: red, green, blue. So pixel (x, y) lives at index (y * width + x) * 3 in the buffer.
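As a quick worked example of that formula, here is a small sketch using a free function (pixelIndex is a hypothetical name, not part of the Image struct) with a few offsets for a 4-pixel-wide image:

```zig
const std = @import("std");

/// Byte offset of the red channel of pixel (x, y) in a row-major RGB buffer.
/// Hypothetical free-function version of the arithmetic inside getPixel/setPixel.
fn pixelIndex(width: u32, x: u32, y: u32) usize {
    return (@as(usize, y) * @as(usize, width) + @as(usize, x)) * 3;
}

test "pixel index in a 4-pixel-wide image" {
    try std.testing.expectEqual(@as(usize, 0), pixelIndex(4, 0, 0)); // first pixel
    try std.testing.expectEqual(@as(usize, 9), pixelIndex(4, 3, 0)); // last pixel of row 0
    try std.testing.expectEqual(@as(usize, 12), pixelIndex(4, 0, 1)); // first pixel of row 1
    try std.testing.expectEqual(@as(usize, 18), pixelIndex(4, 2, 1));
}
```

Green and blue sit at the returned offset plus 1 and plus 2.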

Why a flat slice instead of a 2D array like [height][width][3]u8? Two reasons. First, Zig's arrays need compile-time known sizes, and image dimensions are runtime values -- you'd need an ArrayList or heap-allocated slice anyway. Second, image format I/O works on raw bytes. When we read a PPM file, the pixel data is already a contiguous stream of RGB bytes. We can read it straight into our buffer with a single readAll call. If we used a 2D structure we'd need to copy row by row (or worse, pixel by pixel).

The getPixel and setPixel methods do the index arithmetic so calling code doesn't have to think about the flat layout. We'll use these everywhere in the next episode when implementing pixel operations. The @as(usize, ...) casts are there so that the whole index calculation happens in usize. Zig will widen a u32 to usize implicitly, but y * self.width on two u32 values is computed in u32, and for a large image that multiplication (or the final * 3) could overflow before the result is ever widened. We covered integer widths way back in episode 2.
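To see why the width of the arithmetic matters, here is a hedged sketch. The helper rgbBufferSize is hypothetical (Image.init above does not do this check), and the usize half of the test assumes a 64-bit target:

```zig
const std = @import("std");

/// Hypothetical helper: RGB buffer size with explicit overflow checks.
fn rgbBufferSize(width: u32, height: u32) error{Overflow}!usize {
    const pixel_count = try std.math.mul(usize, width, height);
    return std.math.mul(usize, pixel_count, 3);
}

test "u32 arithmetic overflows where 64-bit usize does not" {
    // 40_000 * 40_000 * 3 = 4_800_000_000, which does not fit in a u32
    try std.testing.expectError(error.Overflow, std.math.mul(u32, 40_000 * 40_000, 3));
    if (@sizeOf(usize) >= 8) {
        try std.testing.expectEqual(@as(usize, 4_800_000_000), try rgbBufferSize(40_000, 40_000));
    }
}
```

With plain operators instead of std.math.mul, the u32 case would be an overflow panic in safe build modes rather than a recoverable error.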

Reading PPM files

Now let's parse a PPM file. The header parsing is the trickiest part -- the rest is just reading raw bytes:

pub const PpmReader = struct {
    pub fn read(allocator: std.mem.Allocator, path: []const u8) !Image {
        const file = try std.fs.cwd().openFile(path, .{});
        defer file.close();

        const reader = file.reader();

        // Read and validate magic number
        var magic: [2]u8 = undefined;
        _ = try reader.readAll(&magic);
        if (magic[0] != 'P' or magic[1] != '6') {
            return error.InvalidFormat;
        }

        // Skip whitespace after magic
        try skipWhitespaceAndComments(reader);

        // Parse width
        const width = try readDecimalU32(reader);
        try skipWhitespaceAndComments(reader);

        // Parse height
        const height = try readDecimalU32(reader);
        try skipWhitespaceAndComments(reader);

        // Parse max value
        const max_val = try readDecimalU32(reader);
        if (max_val != 255) {
            return error.UnsupportedMaxValue;
        }

        // One whitespace character after max value, then raw pixel data
        _ = try reader.readByte();

        // Allocate image and read pixel data
        var img = try Image.init(allocator, width, height);
        errdefer img.deinit();

        const bytes_read = try reader.readAll(img.pixels);
        if (bytes_read != img.pixels.len) {
            return error.UnexpectedEof;
        }

        return img;
    }

    fn skipWhitespaceAndComments(reader: anytype) !void {
        while (true) {
            const byte = reader.readByte() catch |err| switch (err) {
                error.EndOfStream => return,
                else => return err,
            };

            if (byte == '#') {
                // Comment line -- skip until newline
                while (true) {
                    const c = reader.readByte() catch return;
                    if (c == '\n') break;
                }
            } else if (byte != ' ' and byte != '\t' and byte != '\n' and byte != '\r') {
                // Non-whitespace, non-comment: this byte belongs to the next
                // header value and should be put back, but this reader has no
                // unread. Draft limitation: give up here (the refactored
                // reader further down avoids the problem).
                return error.InvalidFormat;
            }
        }
    }

    fn readDecimalU32(reader: anytype) !u32 {
        var value: u32 = 0;
        var found_digit = false;

        while (true) {
            const byte = reader.readByte() catch |err| switch (err) {
                error.EndOfStream => {
                    if (found_digit) return value;
                    return error.InvalidFormat;
                },
                else => return err,
            };

            if (byte >= '0' and byte <= '9') {
                found_digit = true;
                value = value * 10 + @as(u32, byte - '0');
            } else if (found_digit) {
                // Hit non-digit after reading digits -- number is complete
                // The non-digit character is the separator, consumed
                return value;
            } else if (byte == ' ' or byte == '\t' or byte == '\n' or byte == '\r') {
                // Leading whitespace, skip
                continue;
            } else if (byte == '#') {
                // Comment in header
                while (true) {
                    const c = reader.readByte() catch return error.InvalidFormat;
                    if (c == '\n') break;
                }
            } else {
                return error.InvalidFormat;
            }
        }
    }
};

The PPM header parsing looks more involved than you might expect for such a "simple" format. That's because PPM allows comments (lines starting with #) anywhere in the header, and uses arbitrary whitespace between values. A conforming PPM file could look like:

P6
# Generated by my cool tool
640    480
# 8 bit color
255

Our readDecimalU32 function handles this by skipping whitespace and comment lines while accumulating decimal digits. Once it finds a digit, it keeps reading digits until it hits a non-digit (which it consumes as the separator). This is the same kind of hand-rolled parser we built in the markdown tokenizer back in episode 37 -- byte-at-a-time, no backtracking, no regex.

Having said that, this first draft has two real problems. skipWhitespaceAndComments can't "unread" the first non-whitespace byte it sees, so it bails out with error.InvalidFormat as soon as it reaches the first digit of the width. And because readDecimalU32 already consumes the separator after each number, the extra readByte() after the max value would swallow the first pixel byte. Zig's generic reader has no unread method. You could wrap the stream in std.io.PeekStream, which offers putBackByte, but readDecimalU32 already skips leading whitespace and comments on its own, so the simpler fix is to drop the separate skip function entirely.

Let me refactor to be cleaner. Here's the version that actually works properly without needing unread:

pub const PpmReader = struct {
    pub fn read(allocator: std.mem.Allocator, path: []const u8) !Image {
        const file = try std.fs.cwd().openFile(path, .{});
        defer file.close();

        var buf_reader = std.io.bufferedReader(file.reader());
        const reader = buf_reader.reader();

        // Read and validate magic number
        var magic: [2]u8 = undefined;
        _ = try reader.readAll(&magic);
        if (magic[0] != 'P' or magic[1] != '6') {
            return error.InvalidFormat;
        }

        // Parse header values (width, height, max value)
        const width = try readHeaderValue(reader);
        const height = try readHeaderValue(reader);
        const max_val = try readHeaderValue(reader);

        if (max_val != 255) {
            return error.UnsupportedMaxValue;
        }

        // Sanity check dimensions
        if (width == 0 or height == 0) return error.InvalidFormat;
        if (width > 65535 or height > 65535) return error.ImageTooLarge;

        // Allocate image and read raw pixel data
        var img = try Image.init(allocator, width, height);
        errdefer img.deinit();

        const bytes_read = try reader.readAll(img.pixels);
        if (bytes_read != img.pixels.len) {
            return error.UnexpectedEof;
        }

        return img;
    }

    fn readHeaderValue(reader: anytype) !u32 {
        // Skip whitespace and comments until we find a digit
        var value: u32 = 0;
        var in_comment = false;
        var found_digit = false;

        while (true) {
            const byte = reader.readByte() catch |err| switch (err) {
                error.EndOfStream => {
                    if (found_digit) return value;
                    return error.InvalidFormat;
                },
                else => return err,
            };

            if (in_comment) {
                if (byte == '\n') in_comment = false;
                continue;
            }

            if (byte == '#') {
                in_comment = true;
                continue;
            }

            if (byte >= '0' and byte <= '9') {
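                found_digit = true;
                // Checked arithmetic: an overlong number becomes a format error, not an overflow panic
                value = std.math.mul(u32, value, 10) catch return error.InvalidFormat;
                value = std.math.add(u32, value, byte - '0') catch return error.InvalidFormat;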
            } else if (found_digit) {
                // Non-digit after digits means the number is done
                return value;
            }
            // Otherwise it's whitespace before the number -- skip
        }
    }
};

Much cleaner. The readHeaderValue function skips whitespace and comments, accumulates digits, and returns as soon as it hits a non-digit after at least one digit. The non-digit (which is the whitespace/newline separator) gets consumed, and that's fine -- the next call to readHeaderValue handles its own leading whitespace.
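You can exercise this parser without touching the file system by feeding it bytes from memory. The sketch below uses a trimmed copy of readHeaderValue (simplified error handling, so it is not identical to the version above) and runs it over the commented header shown earlier, minus the P6 magic:

```zig
const std = @import("std");

// Trimmed copy of readHeaderValue so this example compiles on its own.
// Simplification: any read failure is treated as end of input.
fn readHeaderValue(reader: anytype) !u32 {
    var value: u32 = 0;
    var in_comment = false;
    var found_digit = false;
    while (true) {
        const byte = reader.readByte() catch {
            return if (found_digit) value else error.InvalidFormat;
        };
        if (in_comment) {
            if (byte == '\n') in_comment = false;
        } else if (byte == '#') {
            in_comment = true;
        } else if (byte >= '0' and byte <= '9') {
            found_digit = true;
            value = value * 10 + (byte - '0');
        } else if (found_digit) {
            return value; // separator consumed, number complete
        }
    }
}

test "header with comments and odd whitespace" {
    var stream = std.io.fixedBufferStream("\n# Generated by my cool tool\n640    480\n# 8 bit color\n255\n");
    const reader = stream.reader();
    try std.testing.expectEqual(@as(u32, 640), try readHeaderValue(reader));
    try std.testing.expectEqual(@as(u32, 480), try readHeaderValue(reader));
    try std.testing.expectEqual(@as(u32, 255), try readHeaderValue(reader));
}
```

Note that the digits inside "# 8 bit color" are ignored because the parser is in comment mode when it sees them.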

I wrapped the file reader in a std.io.bufferedReader here. Without buffering, every readByte() call would be a separate read(2) syscall to the kernel -- and our header parser calls readByte dozens of times. With buffering, the first call reads 4096 bytes (or whatever the buffer size is) and subsequent calls just pull from the in-memory buffer. For the pixel data read we use readAll which does larger reads anyway, but the header parsing benefit is significant.

Writing PPM files

Writing is the easy direction. We already have the pixel data in the right format (contiguous RGB bytes), so we just need to prepend the header:

pub const PpmWriter = struct {
    pub fn write(img: *const Image, path: []const u8) !void {
        const file = try std.fs.cwd().createFile(path, .{});
        defer file.close();

        const writer = file.writer();

        // Write header
        try writer.print("P6\n{d} {d}\n255\n", .{ img.width, img.height });

        // Write pixel data
        try writer.writeAll(img.pixels);
    }
};

That's the entire PPM writer. Eight lines of actual code. The header is literally P6\n{width} {height}\n255\n and then we dump the raw pixel buffer. This is why PPM is perfect for learning -- no compression, no checksums, no chunks, no padding. You can verify your output by opening it in GIMP, Photoshop, or any Netpbm-compatible viewer.

One thing to note: writer.print uses Zig's formatting system (same one we explored in episode 24) to format the width and height as decimal strings. The format string gets validated at comptime, so if you mess up the argument types the compiler catches it. No sprintf buffer overflows here.
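If you want to see exactly which bytes that format string produces, here is a small sketch that formats the header into a stack buffer with std.fmt.bufPrint instead of a file. The helper name formatPpmHeader is hypothetical:

```zig
const std = @import("std");

/// Hypothetical helper: render the P6 header into a caller-provided buffer.
fn formatPpmHeader(buf: []u8, width: u32, height: u32) ![]u8 {
    return std.fmt.bufPrint(buf, "P6\n{d} {d}\n255\n", .{ width, height });
}

test "PPM header bytes" {
    var buf: [32]u8 = undefined;
    const header = try formatPpmHeader(&buf, 640, 480);
    try std.testing.expectEqualStrings("P6\n640 480\n255\n", header);
}
```

bufPrint returns an error if the buffer is too small, so nothing gets written past the end of buf.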

BMP: the format nobody loves but everybody supports

BMP (Windows Bitmap) is the opposite of PPM in terms of elegance. It has a 14-byte file header, a variable-length DIB (Device Independent Bitmap) header that can be 12, 40, 56, 108, or 124 bytes depending on the version, optional color tables, pixel rows stored bottom-to-top (yes, really), and row padding to ensure each row's byte count is a multiple of 4. It's a bit of a mess, historically speaking.

But it's also universally supported. Every image editor, every operating system, every browser can open a BMP file. And unlike PNG or JPEG, there's no compression to implement -- BMP stores raw pixel data (in the uncompressed variant we'll handle). So it's a good next step after PPM: more complex header parsing, byte ordering concerns, and alignment padding, but still no algorithmic complexity.

Here's the BMP file structure for a 24-bit uncompressed bitmap:

Offset  Size  Field
------  ----  -----
 0       2    Signature: "BM"
 2       4    File size in bytes
 6       4    Reserved (zero)
10       4    Offset to pixel data
14       4    DIB header size (40 for BITMAPINFOHEADER)
18       4    Image width
22       4    Image height (positive = bottom-up)
26       2    Color planes (always 1)
28       2    Bits per pixel (24 for RGB)
30       4    Compression (0 = uncompressed)
34       4    Image data size (can be 0 for uncompressed)
38       4    Horizontal resolution (pixels/meter)
42       4    Vertical resolution (pixels/meter)
46       4    Colors in palette (0 = default)
50       4    Important colors (0 = all)
54       ...  Pixel data (bottom-up, padded rows)

That's 54 bytes of header before we even get to the pixel data. And notice the row padding requirement: each row must be a multiple of 4 bytes. For a 24-bit image with width w, each row is w * 3 bytes of pixel data, padded to the next multiple of 4. So a 5-pixel wide row is 5 * 3 = 15 bytes, padded to 16 with one byte of zero padding.
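The padding rule is easy to get subtly wrong, so here it is as a tiny function with a few worked values. paddedRowSize is a hypothetical helper name; the arithmetic is the same one the reader and writer use later on:

```zig
const std = @import("std");

/// Bytes per stored row in a 24-bit BMP: 3 bytes per pixel, rounded up to a multiple of 4.
fn paddedRowSize(width: u32) u32 {
    const row_bytes = width * 3;
    const padding = (4 - (row_bytes % 4)) % 4; // 0..3 extra bytes
    return row_bytes + padding;
}

test "row sizes for a few widths" {
    try std.testing.expectEqual(@as(u32, 4), paddedRowSize(1)); // 3 + 1
    try std.testing.expectEqual(@as(u32, 8), paddedRowSize(2)); // 6 + 2
    try std.testing.expectEqual(@as(u32, 12), paddedRowSize(4)); // already aligned
    try std.testing.expectEqual(@as(u32, 16), paddedRowSize(5)); // 15 + 1
}
```

Only widths that are a multiple of 4 need no padding at all.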

BMP header struct

Let's define the BMP headers as structs whose memory layout matches the file byte for byte. This is where the layout control from episode 17 comes in handy -- with an exact byte layout we can read the entire header in one operation:

pub const BmpFileHeader = extern struct {
    signature: [2]u8,
    file_size: u32 align(1),
    reserved: u32 align(1),
    pixel_offset: u32 align(1),
};

pub const BmpInfoHeader = extern struct {
    header_size: u32 align(1),
    width: i32 align(1),
    height: i32 align(1),
    planes: u16 align(1),
    bpp: u16 align(1),
    compression: u32 align(1),
    image_size: u32 align(1),
    x_ppm: u32 align(1),
    y_ppm: u32 align(1),
    colors_used: u32 align(1),
    colors_important: u32 align(1),
};

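I'm using extern struct with align(1) instead of packed structs here. The reason: the fields aren't bit-packed -- they're just tightly packed regular integers. extern struct gives us defined field layout (no reordering by the compiler) and align(1) removes padding between fields. This is the same idea as __attribute__((packed)) in C. One caveat: BMP headers are little-endian, so reading the struct's bytes straight from the file only yields correct field values on a little-endian host (which covers x86-64 and typical ARM setups). A big-endian target would need to byte-swap each field after reading.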
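A cheap way to make sure the struct really matches the on-disk table is to assert its size and offsets at compile time. The sketch below repeats the file header definition so it compiles on its own; pixelOffset is a hypothetical helper showing an endian-independent way to decode a single field:

```zig
const std = @import("std");

// Same file header definition as above, repeated so this compiles on its own.
const BmpFileHeader = extern struct {
    signature: [2]u8,
    file_size: u32 align(1),
    reserved: u32 align(1),
    pixel_offset: u32 align(1),
};

comptime {
    // The in-memory layout must match the offset table exactly.
    std.debug.assert(@sizeOf(BmpFileHeader) == 14);
    std.debug.assert(@offsetOf(BmpFileHeader, "file_size") == 2);
    std.debug.assert(@offsetOf(BmpFileHeader, "pixel_offset") == 10);
}

/// Hypothetical helper: decode the pixel offset explicitly as little-endian,
/// which gives the same answer on any host byte order.
fn pixelOffset(header_bytes: *const [14]u8) u32 {
    return std.mem.readInt(u32, header_bytes[10..14], .little);
}

test "pixel offset decodes as little-endian" {
    const bytes = [14]u8{ 'B', 'M', 118, 0, 0, 0, 0, 0, 0, 0, 54, 0, 0, 0 };
    try std.testing.expectEqual(@as(u32, 54), pixelOffset(&bytes));
}
```

If someone later drops an align(1) or reorders a field, the build fails instead of silently misreading headers. The same checks would apply to the 40-byte info header.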

Notice height is i32, not u32. That's because BMP uses a negative height to indicate a top-down image (pixels stored from top row to bottom row, like PPM). A positive height means bottom-up (the historical default, usually attributed to OS/2 Presentation Manager, whose coordinate system puts the origin in the bottom-left corner). We need to handle both cases.
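Here is one way to split that signed height into a row count and a direction flag. This is a sketch with hypothetical names (RowOrder, decodeHeight), not the code the reader below uses:

```zig
const std = @import("std");

const RowOrder = struct { rows: u32, top_down: bool };

/// Hypothetical helper: split a signed BMP height into row count and direction.
fn decodeHeight(height: i32) RowOrder {
    // @abs returns the unsigned magnitude, so even the most negative i32 is safe
    // (negating it as an i32 would overflow).
    return .{ .rows = @abs(height), .top_down = height < 0 };
}

test "signed height to row order" {
    const bottom_up = decodeHeight(480);
    try std.testing.expectEqual(@as(u32, 480), bottom_up.rows);
    try std.testing.expect(!bottom_up.top_down);

    const top_down = decodeHeight(-480);
    try std.testing.expectEqual(@as(u32, 480), top_down.rows);
    try std.testing.expect(top_down.top_down);
}
```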

Reading BMP files

pub const BmpReader = struct {
    pub fn read(allocator: std.mem.Allocator, path: []const u8) !Image {
        const file = try std.fs.cwd().openFile(path, .{});
        defer file.close();

        const reader = file.reader();

        // Read file header
        var file_hdr: BmpFileHeader = undefined;
        const fh_bytes = std.mem.asBytes(&file_hdr);
        const fh_read = try reader.readAll(fh_bytes);
        if (fh_read != @sizeOf(BmpFileHeader)) return error.UnexpectedEof;

        if (file_hdr.signature[0] != 'B' or file_hdr.signature[1] != 'M') {
            return error.InvalidFormat;
        }

        // Read info header
        var info_hdr: BmpInfoHeader = undefined;
        const ih_bytes = std.mem.asBytes(&info_hdr);
        const ih_read = try reader.readAll(ih_bytes);
        if (ih_read != @sizeOf(BmpInfoHeader)) return error.UnexpectedEof;

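        // We only support BITMAPINFOHEADER-style headers (40 bytes or more),
        // 24-bit, uncompressed
        if (info_hdr.header_size < @sizeOf(BmpInfoHeader)) return error.UnsupportedHeader;
        if (info_hdr.bpp != 24) return error.UnsupportedBpp;
        if (info_hdr.compression != 0) return error.UnsupportedCompression;

        // Validate dimensions before casting: @intCast on a negative value
        // is a safety panic, not a recoverable error
        if (info_hdr.width <= 0 or info_hdr.height == 0) return error.InvalidFormat;
        if (info_hdr.width > 65535) return error.ImageTooLarge;
        if (info_hdr.height > 65535 or info_hdr.height < -65535) return error.ImageTooLarge;

        // Handle negative height (top-down vs bottom-up)
        const top_down = info_hdr.height < 0;
        const width: u32 = @intCast(info_hdr.width);
        const height: u32 = if (top_down)
            @intCast(-info_hdr.height)
        else
            @intCast(info_hdr.height);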

        // Skip to pixel data (there might be extra header bytes or color table)
        const current_pos = @sizeOf(BmpFileHeader) + @sizeOf(BmpInfoHeader);
        if (file_hdr.pixel_offset > current_pos) {
            const skip = file_hdr.pixel_offset - current_pos;
            try reader.skipBytes(skip, .{});
        }

        // Calculate row size with padding
        const row_bytes = width * 3;
        const row_padding = (4 - (row_bytes % 4)) % 4;
        const padded_row = row_bytes + row_padding;

        // Allocate image
        var img = try Image.init(allocator, width, height);
        errdefer img.deinit();

        // Read pixel data row by row
        const row_buf = try allocator.alloc(u8, padded_row);
        defer allocator.free(row_buf);

        for (0..height) |i| {
            const row_read = try reader.readAll(row_buf);
            if (row_read != padded_row) return error.UnexpectedEof;

            // Map file row i to image row y (bottom-up files store the last row first)
            const y: u32 = if (top_down)
                @intCast(i)
            else
                @intCast(height - 1 - i);

            // BMP stores each pixel as B, G, R; our buffer wants R, G, B
            const dest_offset = @as(usize, y) * @as(usize, width) * 3;
            for (0..width) |x| {
                const src = x * 3;
                const dst = dest_offset + x * 3;
                img.pixels[dst] = row_buf[src + 2];     // R (third byte in the file)
                img.pixels[dst + 1] = row_buf[src + 1]; // G
                img.pixels[dst + 2] = row_buf[src];     // B (first byte in the file)
            }
        }

        return img;
    }
};

There are several things happening here that are worth highlighting.

First, the BGR to RGB swap. BMP stores pixel components in blue-green-red order, not red-green-blue. This is a historical artifact from how Windows GDI worked internally. Every BMP reader has to do this swap, and forgetting it is probably the single most common BMP parsing bug. Your image loads fine but all the reds are blue and all the blues are red. Ask me how I know ;-)

Second, the bottom-up row ordering. For a standard BMP (positive height), the first row in the file is the bottom row of the image. So we read rows from the file sequentially (row 0 in the file = bottom of image) and place them into our buffer in reverse order. If the height is negative (top-down BMP), the file order matches the image order and we just copy directly.

Third, the row padding. Each row must be padded to a 4-byte boundary. We calculate row_padding = (4 - (row_bytes % 4)) % 4. The double modulo handles the case where row_bytes is already a multiple of 4 (padding = 0). We read the full padded row into a temporary buffer, then copy only the pixel data (ignoring the padding bytes) into our Image.

Fourth, the pixel_offset skip. We read the two standard headers (14 + 40 = 54 bytes), but some BMP files have extra header data or color tables between the headers and the pixel data. The pixel_offset field in the file header tells us exactly where the pixel data starts. If it's greater than 54, we skip ahead.
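The first three points are easier to follow on a tiny in-memory example. The sketch below decodes the pixel rows of a 2x2 bottom-up BMP (no headers, no file I/O) into a top-down RGB buffer. decodeBottomUpBgr is a hypothetical stand-in for the inner loop of BmpReader.read, not a replacement for it:

```zig
const std = @import("std");

/// Decode bottom-up, BGR, 4-byte-padded rows into a top-down RGB buffer.
/// `out` must hold width * height * 3 bytes.
fn decodeBottomUpBgr(src: []const u8, width: usize, height: usize, out: []u8) void {
    const row_bytes = width * 3;
    const padded_row = row_bytes + (4 - (row_bytes % 4)) % 4;
    for (0..height) |file_row| {
        const y = height - 1 - file_row; // first file row is the bottom image row
        const row = src[file_row * padded_row ..][0..row_bytes]; // padding dropped
        for (0..width) |x| {
            const dst = (y * width + x) * 3;
            out[dst] = row[x * 3 + 2]; // R is stored last
            out[dst + 1] = row[x * 3 + 1]; // G
            out[dst + 2] = row[x * 3]; // B is stored first
        }
    }
}

test "decode a 2x2 bottom-up image" {
    const file_rows = [_]u8{
        255, 0, 0, 255, 255, 255, 0, 0, // bottom row: blue, white, 2 padding bytes
        0, 0, 255, 0, 255, 0, 0, 0, // top row: red, green, 2 padding bytes
    };
    var out: [12]u8 = undefined;
    decodeBottomUpBgr(&file_rows, 2, 2, &out);
    const expected = [_]u8{ 255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255 };
    try std.testing.expectEqualSlices(u8, &expected, &out);
}
```

The expected buffer reads red, green, blue, white from top-left to bottom-right, which is the same layout a PPM file would have.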

Writing BMP files

pub const BmpWriter = struct {
    pub fn write(img: *const Image, path: []const u8) !void {
        const file = try std.fs.cwd().createFile(path, .{});
        defer file.close();

        const writer = file.writer();

        const row_bytes = img.width * 3;
        const row_padding = (4 - (row_bytes % 4)) % 4;
        const padded_row = row_bytes + row_padding;
        const pixel_data_size = padded_row * img.height;
        const file_size = @sizeOf(BmpFileHeader) + @sizeOf(BmpInfoHeader) + pixel_data_size;

        // Write file header
        const file_hdr = BmpFileHeader{
            .signature = .{ 'B', 'M' },
            .file_size = @intCast(file_size),
            .reserved = 0,
            .pixel_offset = @sizeOf(BmpFileHeader) + @sizeOf(BmpInfoHeader),
        };
        try writer.writeAll(std.mem.asBytes(&file_hdr));

        // Write info header
        const info_hdr = BmpInfoHeader{
            .header_size = @sizeOf(BmpInfoHeader),
            .width = @intCast(img.width),
            .height = @intCast(img.height),
            .planes = 1,
            .bpp = 24,
            .compression = 0,
            .image_size = @intCast(pixel_data_size),
            .x_ppm = 2835, // 72 DPI
            .y_ppm = 2835,
            .colors_used = 0,
            .colors_important = 0,
        };
        try writer.writeAll(std.mem.asBytes(&info_hdr));

        // Write pixel data bottom-up with BGR ordering
        const padding_bytes = [_]u8{ 0, 0, 0 };

        var y: u32 = img.height;
        while (y > 0) {
            y -= 1;
            const row_offset = @as(usize, y) * @as(usize, img.width) * 3;

            for (0..img.width) |x| {
                const src = row_offset + x * 3;
                // Write BGR (swap R and B)
                const bgr = [3]u8{
                    img.pixels[src + 2], // B
                    img.pixels[src + 1], // G
                    img.pixels[src],     // R
                };
                try writer.writeAll(&bgr);
            }

            // Write padding
            if (row_padding > 0) {
                try writer.writeAll(padding_bytes[0..row_padding]);
            }
        }
    }
};

The writer does everything in reverse: we write the headers first (calculating the total file size upfront), then write the pixel data bottom-up with the RGB-to-BGR swap and row padding. The resolution values 2835 pixels per meter correspond to 72 DPI (dots per inch), which is a standard default -- 72 * 39.3701 inches/meter ≈ 2835.
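If you'd rather compute the resolution field than hard-code it, the conversion is just DPI divided by 0.0254 meters per inch, rounded. A minimal sketch, with dpiToPixelsPerMeter as a hypothetical helper name:

```zig
const std = @import("std");

/// Convert dots per inch to pixels per meter (1 inch = 0.0254 m), rounded to nearest.
fn dpiToPixelsPerMeter(dpi: f64) u32 {
    return @intFromFloat(@round(dpi / 0.0254));
}

test "common DPI values" {
    try std.testing.expectEqual(@as(u32, 2835), dpiToPixelsPerMeter(72));
    try std.testing.expectEqual(@as(u32, 3780), dpiToPixelsPerMeter(96));
    try std.testing.expectEqual(@as(u32, 11811), dpiToPixelsPerMeter(300));
}
```

Most viewers ignore these fields anyway, so the exact value rarely matters for display.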

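Notice we iterate y from height - 1 down to 0 to get the bottom-up ordering. Each row writes individual pixels with the channel swap (our buffer is RGB, the file needs BGR), then appends zero-padding bytes to fill the row to a 4-byte boundary. One caveat: file.writer() is unbuffered, so every 3-byte writeAll is its own write syscall. For anything beyond small test images you'd want to wrap it in std.io.bufferedWriter and flush before closing, for the same reason we buffered the PPM header reads.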
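Since the writer above issues one small write per pixel, buffering is worth considering for larger images. The sketch below shows the std.io.bufferedWriter pattern against an in-memory sink (a fixedBufferStream standing in for the file, which is my substitution so the example runs without touching disk). The important detail is the explicit flush:

```zig
const std = @import("std");

test "buffered writer needs an explicit flush" {
    var storage: [16]u8 = undefined;
    var sink = std.io.fixedBufferStream(&storage); // stands in for the output file
    var buffered = std.io.bufferedWriter(sink.writer());
    const writer = buffered.writer();

    try writer.writeAll(&[_]u8{ 255, 0, 0 }); // one BGR pixel
    // Nothing has reached the sink yet: the bytes sit in the writer's buffer
    try std.testing.expectEqual(@as(usize, 0), sink.getWritten().len);

    try buffered.flush();
    try std.testing.expectEqual(@as(usize, 3), sink.getWritten().len);
}
```

In BmpWriter.write the same pattern would mean creating the buffered writer from file.writer() and calling flush() after the last row. Forgetting the flush would leave the end of the file missing.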

A unified read/write interface

Now let's tie PPM and BMP together behind a format-agnostic interface. We'll detect the format from the file extension and dispatch to the right reader/writer:

pub const ImageFormat = enum {
    ppm,
    bmp,
};

pub fn detectFormat(path: []const u8) !ImageFormat {
    if (std.mem.endsWith(u8, path, ".ppm")) return .ppm;
    if (std.mem.endsWith(u8, path, ".bmp")) return .bmp;
    return error.UnknownFormat;
}

pub fn readImage(allocator: std.mem.Allocator, path: []const u8) !Image {
    const format = try detectFormat(path);
    return switch (format) {
        .ppm => PpmReader.read(allocator, path),
        .bmp => BmpReader.read(allocator, path),
    };
}

pub fn writeImage(img: *const Image, path: []const u8) !void {
    const format = try detectFormat(path);
    return switch (format) {
        .ppm => PpmWriter.write(img, path),
        .bmp => BmpWriter.write(img, path),
    };
}

This gives us a clean public API: readImage and writeImage take a path, figure out the format from the extension, and handle it. The calling code doesn't need to know or care which format the file is in -- it gets an Image struct either way.

You could make this more robust by also checking the file's magic bytes (first 2 bytes: P6 for PPM, BM for BMP) instead of relying on the extension. That's what libraries like libmagic and the file command do. For our tool, extension-based detection is good enough.
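For reference, magic-byte detection is only a few lines. This is a sketch with a hypothetical function name (detectFormatFromMagic), and it assumes the caller has already read the first bytes of the file into a slice:

```zig
const std = @import("std");

const ImageFormat = enum { ppm, bmp };

/// Hypothetical alternative to extension-based detection: inspect the first bytes.
/// Returns null when the signature is not recognized.
fn detectFormatFromMagic(bytes: []const u8) ?ImageFormat {
    if (std.mem.startsWith(u8, bytes, "P6")) return .ppm;
    if (std.mem.startsWith(u8, bytes, "BM")) return .bmp;
    return null;
}

test "detect format from magic bytes" {
    try std.testing.expectEqual(@as(?ImageFormat, .ppm), detectFormatFromMagic("P6\n2 2\n255\n"));
    try std.testing.expectEqual(@as(?ImageFormat, .bmp), detectFormatFromMagic("BM\x76\x00\x00\x00"));
    try std.testing.expect(detectFormatFromMagic("GIF89a") == null);
}
```

A combined approach could try the magic bytes first and fall back to the extension when the file is too short to tell.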

Testing: round-trip verification

The best way to test an image reader and writer is a round-trip test: create an image programmatically, write it to a file, read it back, and verify every pixel matches. If the round trip succeeds for both formats, we know our read and write paths are consistent:

const testing = std.testing;

test "ppm round trip" {
    const allocator = testing.allocator;

    // Create a small test image with known pixel values
    var img = try Image.init(allocator, 4, 3);
    defer img.deinit();

    // Set pixels to a gradient pattern
    for (0..3) |y| {
        for (0..4) |x| {
            const r: u8 = @intCast(x * 60);
            const g: u8 = @intCast(y * 80);
            const b: u8 = @intCast((x + y) * 30);
            img.setPixel(@intCast(x), @intCast(y), r, g, b);
        }
    }

    // Write to PPM
    PpmWriter.write(&img, "/tmp/test_roundtrip.ppm") catch |err| {
        std.debug.print("Failed to write PPM: {}\n", .{err});
        return err;
    };

    // Read it back
    var loaded = try PpmReader.read(allocator, "/tmp/test_roundtrip.ppm");
    defer loaded.deinit();

    // Verify dimensions
    try testing.expectEqual(img.width, loaded.width);
    try testing.expectEqual(img.height, loaded.height);

    // Verify every pixel
    for (0..3) |y| {
        for (0..4) |x| {
            const orig = img.getPixel(@intCast(x), @intCast(y));
            const read = loaded.getPixel(@intCast(x), @intCast(y));
            try testing.expectEqual(orig.r, read.r);
            try testing.expectEqual(orig.g, read.g);
            try testing.expectEqual(orig.b, read.b);
        }
    }
}

test "bmp round trip" {
    const allocator = testing.allocator;

    var img = try Image.init(allocator, 5, 4);
    defer img.deinit();

    // Use width 5 to test row padding (5 * 3 = 15 bytes, padded to 16)
    for (0..4) |y| {
        for (0..5) |x| {
            const r: u8 = @intCast(x * 50);
            const g: u8 = @intCast(y * 60);
            const b: u8 = 128;
            img.setPixel(@intCast(x), @intCast(y), r, g, b);
        }
    }

    BmpWriter.write(&img, "/tmp/test_roundtrip.bmp") catch |err| {
        std.debug.print("Failed to write BMP: {}\n", .{err});
        return err;
    };

    var loaded = try BmpReader.read(allocator, "/tmp/test_roundtrip.bmp");
    defer loaded.deinit();

    try testing.expectEqual(img.width, loaded.width);
    try testing.expectEqual(img.height, loaded.height);

    for (0..4) |y| {
        for (0..5) |x| {
            const orig = img.getPixel(@intCast(x), @intCast(y));
            const read = loaded.getPixel(@intCast(x), @intCast(y));
            try testing.expectEqual(orig.r, read.r);
            try testing.expectEqual(orig.g, read.g);
            try testing.expectEqual(orig.b, read.b);
        }
    }
}

test "format conversion ppm to bmp" {
    const allocator = testing.allocator;

    // Create image, write as PPM, read back, write as BMP, read back
    var img = try Image.init(allocator, 3, 3);
    defer img.deinit();

    img.setPixel(0, 0, 255, 0, 0);     // red
    img.setPixel(1, 0, 0, 255, 0);     // green
    img.setPixel(2, 0, 0, 0, 255);     // blue
    img.setPixel(0, 1, 255, 255, 0);   // yellow
    img.setPixel(1, 1, 255, 0, 255);   // magenta
    img.setPixel(2, 1, 0, 255, 255);   // cyan
    img.setPixel(0, 2, 128, 128, 128); // gray
    img.setPixel(1, 2, 0, 0, 0);       // black
    img.setPixel(2, 2, 255, 255, 255); // white

    try PpmWriter.write(&img, "/tmp/test_convert.ppm");

    var from_ppm = try PpmReader.read(allocator, "/tmp/test_convert.ppm");
    defer from_ppm.deinit();

    try BmpWriter.write(&from_ppm, "/tmp/test_convert.bmp");

    var from_bmp = try BmpReader.read(allocator, "/tmp/test_convert.bmp");
    defer from_bmp.deinit();

    // Verify the round trip through both formats preserves all pixels
    for (0..3) |y| {
        for (0..3) |x| {
            const orig = img.getPixel(@intCast(x), @intCast(y));
            const conv = from_bmp.getPixel(@intCast(x), @intCast(y));
            try testing.expectEqual(orig.r, conv.r);
            try testing.expectEqual(orig.g, conv.g);
            try testing.expectEqual(orig.b, conv.b);
        }
    }
}

The BMP round-trip test intentionally uses width 5 to trigger the row padding logic (5 pixels * 3 bytes = 15 bytes, padded to 16). If our padding calculation or our skip-padding-on-read were wrong, this test would catch it. The format conversion test writes the in-memory image as PPM, reads it back, writes that result as BMP, reads it back again, and compares the final pixels against the original. This checks that the BGR/RGB swap and the bottom-up/top-down handling produce identical results across formats.
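As a quick illustration of the padding arithmetic described above, here is a minimal sketch. The helper name paddedRowSize is hypothetical (it is not necessarily what bmp.zig calls it), and it assumes 24-bit pixels, i.e. 3 bytes per pixel:

```zig
const std = @import("std");

/// Bytes per stored BMP row for 24-bit pixels:
/// width * 3, rounded up to the next multiple of 4.
/// NOTE: hypothetical helper for illustration only.
fn paddedRowSize(width: u32) u32 {
    const raw = width * 3;
    // Adding 3 and clearing the two lowest bits rounds up to a multiple of 4.
    return (raw + 3) & ~@as(u32, 3);
}

test "row padding for 24-bit BMP" {
    try std.testing.expectEqual(@as(u32, 16), paddedRowSize(5)); // 15 -> 16
    try std.testing.expectEqual(@as(u32, 768), paddedRowSize(256)); // already aligned
    try std.testing.expectEqual(@as(u32, 4), paddedRowSize(1)); // 3 -> 4
}
```

Widths that are multiples of 4 never exercise the padding branch, which is why a width like 5 makes for a better test case than 4 or 8.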

These tests use testing.allocator, which is Zig's test allocator that tracks all allocations and reports leaks after the test finishes. If our Image.deinit fails to free the pixel buffer, or we forget to free the row buffer in the BMP reader, the test fails with a leak report. We covered this allocator back in episode 12.
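To see the leak detection in isolation, here is a small sketch that is independent of our Image struct. It simply allocates a buffer the size of a 5x4 RGB image and frees it again:

```zig
const std = @import("std");
const testing = std.testing;

test "pixel buffer is freed" {
    const allocator = testing.allocator;

    // 5 * 4 pixels, 3 bytes each (same size as the round-trip test image)
    const pixels = try allocator.alloc(u8, 5 * 4 * 3);
    // Remove this defer and `zig test` reports a leak for this test.
    defer allocator.free(pixels);

    @memset(pixels, 0);
    try testing.expectEqual(@as(usize, 60), pixels.len);
}
```

Commenting out the defer line is an easy way to see what a leak report looks like before one shows up in real code.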

Generating a test image programmatically

Let's write a quick main function that creates a colorful gradient image and saves it in both formats, so you can verify the output visually:

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer {
        const check = gpa.deinit();
        if (check == .leak) std.debug.print("WARNING: memory leak detected\n", .{});
    }
    const allocator = gpa.allocator();

    const width: u32 = 256;
    const height: u32 = 256;

    var img = try Image.init(allocator, width, height);
    defer img.deinit();

    // Generate a gradient: red increases left-to-right, green increases top-to-bottom
    for (0..height) |y| {
        for (0..width) |x| {
            const r: u8 = @intCast(x);
            const g: u8 = @intCast(y);
            const b: u8 = 128;
            img.setPixel(@intCast(x), @intCast(y), r, g, b);
        }
    }

    std.debug.print("Writing gradient_256x256.ppm...\n", .{});
    try writeImage(&img, "gradient_256x256.ppm");

    std.debug.print("Writing gradient_256x256.bmp...\n", .{});
    try writeImage(&img, "gradient_256x256.bmp");

    // Verify by reading both back
    std.debug.print("Verifying PPM...\n", .{});
    var ppm_img = try readImage(allocator, "gradient_256x256.ppm");
    defer ppm_img.deinit();

    std.debug.print("Verifying BMP...\n", .{});
    var bmp_img = try readImage(allocator, "gradient_256x256.bmp");
    defer bmp_img.deinit();

    // Spot-check a few pixels
    const check_points = [_][2]u32{
        .{ 0, 0 },
        .{ 127, 127 },
        .{ 255, 255 },
        .{ 0, 255 },
        .{ 255, 0 },
    };

    for (check_points) |pt| {
        const x = pt[0];
        const y = pt[1];
        const orig = img.getPixel(x, y);
        const ppm_px = ppm_img.getPixel(x, y);
        const bmp_px = bmp_img.getPixel(x, y);

        std.debug.print("  ({d},{d}): orig=({d},{d},{d}) ppm=({d},{d},{d}) bmp=({d},{d},{d})\n", .{
            x, y,
            orig.r, orig.g, orig.b,
            ppm_px.r, ppm_px.g, ppm_px.b,
            bmp_px.r, bmp_px.g, bmp_px.b,
        });
    }

    std.debug.print("Done. Open the files in an image viewer to see the gradient.\n", .{});
}

Run this and you'll get two files -- one PPM and one BMP -- both showing the same red/green gradient with a constant blue channel. The PPM file holds 256 * 256 * 3 = 196,608 bytes of pixel data plus the short text header (15 bytes if the writer emits P6, the dimensions, and 255, each followed by a single newline), for a total of 196,623 bytes with no compression. The BMP file will be 54 + 256 * 256 * 3 = 196,662 bytes (slightly larger because of the headers, but with no row padding, since 256 * 3 = 768 is already a multiple of 4).
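The size arithmetic above can be double-checked with a short sketch. The function names ppmFileSize and bmpFileSize are hypothetical, and the PPM header layout ("P6\n{width} {height}\n255\n") is an assumption about how the writer formats it:

```zig
const std = @import("std");

/// Expected P6 file size: header text plus 3 bytes per pixel.
/// ASSUMPTION: the header is written as "P6\n{width} {height}\n255\n".
fn ppmFileSize(width: u32, height: u32) !usize {
    var buf: [64]u8 = undefined;
    const header = try std.fmt.bufPrint(&buf, "P6\n{d} {d}\n255\n", .{ width, height });
    return header.len + @as(usize, width) * height * 3;
}

/// Expected 24-bit BMP size: 14-byte file header,
/// 40-byte BITMAPINFOHEADER, then rows padded to 4 bytes.
fn bmpFileSize(width: u32, height: u32) usize {
    const row = (@as(usize, width) * 3 + 3) & ~@as(usize, 3);
    return 14 + 40 + row * height;
}

test "gradient file sizes" {
    try std.testing.expectEqual(@as(usize, 196_623), try ppmFileSize(256, 256));
    try std.testing.expectEqual(@as(usize, 196_662), bmpFileSize(256, 256));
}
```

If the files on disk differ from these numbers, the first things to check are extra whitespace or comments in the PPM header and the row padding in the BMP writer.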

Project file structure so far

img-tool/
  src/
    image.zig         -- Image struct, getPixel, setPixel (this episode)
    ppm.zig           -- PPM reader and writer (this episode)
    bmp.zig           -- BMP reader and writer (this episode)
    format.zig        -- detectFormat, readImage, writeImage (this episode)
    main.zig          -- gradient generator demo (this episode)
    image_test.zig    -- round-trip tests (this episode)
  build.zig

Six source files for episode one. Next episode we'll add operations.zig for pixel transformations (brightness, contrast, grayscale, blur, sharpen) and pipeline.zig for chaining them together. The final episode will wrap everything in a proper CLI with argument parsing so you can do things like img-tool input.bmp --grayscale --blur 3 --brightness 1.2 output.ppm.

What we learned

The PPM format (P6 binary variant): a text header with magic number, dimensions, and max value, followed by raw RGB pixel data with no compression and no padding
The Image struct that stores pixels as a flat []u8 buffer in row-major RGB order, with getPixel/setPixel helpers for coordinate-based access
PPM reading: byte-at-a-time header parsing that handles comments and arbitrary whitespace, followed by a single bulk read for the pixel data
PPM writing: format the header with writer.print, dump the raw pixel buffer -- the simplest image writer you'll ever write
BMP file structure: the 14-byte file header, the 40-byte BITMAPINFOHEADER, BGR pixel ordering, bottom-up row storage, and row padding to 4-byte boundaries
BMP reading with extern struct and align(1) for binary header deserialization, BGR-to-RGB channel swapping, and bottom-up to top-down row reordering
Round-trip testing: create image in memory, write to file, read back, verify every pixel matches -- the gold standard for format parser verification
Using std.io.bufferedReader to avoid per-byte syscall overhead when parsing headers byte by byte

Thanks, and until next time!

Hive account: @scipio


Curriculum (of the `Learn Zig Series`):