Learn Zig Series (#53) - HTTP Server: Static Files and MIME

Project E: HTTP Server from Scratch (3/4)

What will I learn

You will learn how to serve static files from a directory by mapping URL paths to filesystem paths;
You will learn MIME type detection: mapping file extensions to Content-Type headers;
You will learn directory traversal prevention by sanitizing request paths against .. attacks;
You will learn caching headers: Last-Modified and If-Modified-Since for conditional responses;
You will learn range requests for serving partial content with 206 Partial Content;
You will learn directory listing by generating HTML index pages on the fly;
You will learn how to integrate the static file handler with the router from episode 52;
You will learn how to test static file serving with temp directories, real files, and header verification.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
An installed Zig 0.14+ distribution (download from ziglang.org);
The ambition to learn Zig programming.

Difficulty

Advanced

In episode 52 we built a router and response system for our HTTP server -- path matching, parameter extraction, convenience response constructors, the whole thing. We ended with a working REST API that could serve JSON endpoints. Useful, but there's a pretty obvious gap: what if someone asks your server for an actual file? An HTML page, a CSS stylesheet, a JavaScript bundle, an image? Right now the server would just return 404 for anything that isn't a registered route.

Today we're filling that gap. We're adding a static file handler that serves files from a directory on disk, complete with MIME type detection (so browsers know what they're receiving), directory traversal prevention (so attackers can't read /etc/passwd through your web server), caching headers (so browsers don't re-download files they already have), range requests (so large downloads can be resumed), and directory listings (so visiting a folder shows its contents). This is the part where our API server becomes a proper web server. Here we go!

Mapping URL paths to filesystem paths

The basic idea is simple: take the URL path from the request, combine it with a root directory on disk, and serve the file at that location. If someone requests /style.css, and our root directory is ./public, we serve ./public/style.css. If they request /images/logo.png, we serve ./public/images/logo.png.

Let's start with the StaticFileHandler struct:

const std = @import("std");
const fs = std.fs;

const StaticFileHandler = struct {
    root_dir: []const u8,
    allocator: std.mem.Allocator,

    fn init(allocator: std.mem.Allocator, root_dir: []const u8) StaticFileHandler {
        return .{
            .root_dir = root_dir,
            .allocator = allocator,
        };
    }

    fn resolvePath(
        self: *const StaticFileHandler,
        url_path: []const u8,
    ) ![]const u8 {
        // Strip leading slash from URL path
        const relative = if (url_path.len > 0 and url_path[0] == '/')
            url_path[1..]
        else
            url_path;

        // If the path is empty, serve index.html
        const target = if (relative.len == 0) "index.html" else relative;

        // Build the full filesystem path
        // root_dir + "/" + target
        const full_path = try std.fmt.allocPrint(
            self.allocator,
            "{s}/{s}",
            .{ self.root_dir, target },
        );

        return full_path;
    }
};

The resolvePath function is responsible for converting a URL path like /css/main.css into a filesystem path like ./public/css/main.css. We strip the leading slash (URLs use /css/main.css, but we need the relative path css/main.css to join with the root), and if someone requests just / we default to index.html -- the same convention every web server in existence follows.

Notice that we're using std.fmt.allocPrint to build the path string. This allocates, which means the caller needs to free it. We could use a stack buffer with bufPrint instead (the way we did in episode 52 for response serialization), but paths can be arbitrarily long, and with a fixed buffer bufPrint fails with error.NoSpaceLeft as soon as someone has a deeply nested directory structure. For a per-request allocation, the cost is negligible.
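To make the mapping concrete, here is a minimal sketch of the same logic as a free function, with a test that pins down the expected paths. The name joinUrlPath is hypothetical and only exists for this example; the handler method above is what the server actually uses.

```zig
const std = @import("std");

// Hypothetical free-function version of resolvePath, for illustration only.
fn joinUrlPath(
    allocator: std.mem.Allocator,
    root_dir: []const u8,
    url_path: []const u8,
) ![]u8 {
    // Strip the leading slash so the path can be joined onto the root
    const relative = if (url_path.len > 0 and url_path[0] == '/')
        url_path[1..]
    else
        url_path;

    // An empty path means the root was requested
    const target = if (relative.len == 0) "index.html" else relative;

    return std.fmt.allocPrint(allocator, "{s}/{s}", .{ root_dir, target });
}

test "URL paths map onto the root directory" {
    const allocator = std.testing.allocator;

    const css = try joinUrlPath(allocator, "./public", "/css/main.css");
    defer allocator.free(css);
    try std.testing.expectEqualStrings("./public/css/main.css", css);

    const index = try joinUrlPath(allocator, "./public", "/");
    defer allocator.free(index);
    try std.testing.expectEqualStrings("./public/index.html", index);

    // A fixed buffer that is too small reports an error
    var small: [8]u8 = undefined;
    try std.testing.expectError(
        error.NoSpaceLeft,
        std.fmt.bufPrint(&small, "{s}/{s}", .{ "./public", "css/main.css" }),
    );
}
```

The std.testing.allocator fails the test on a leak, which is a cheap way to check that every caller frees the path it gets back.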

MIME type detection from file extensions

When a browser requests a file, it needs to know what kind of file it is. Is it HTML? CSS? A PNG image? The server tells the browser via the Content-Type header. Getting this wrong means broken pages -- if you serve a CSS file as text/plain, the browser won't apply the styles. If you serve JavaScript as text/html, it won't execute.

The standard approach is to map file extensions to MIME types:

fn getMimeType(path: []const u8) []const u8 {
    const ext = getExtension(path);

    // Common web file types
    if (eqlExt(ext, ".html") or eqlExt(ext, ".htm")) return "text/html; charset=utf-8";
    if (eqlExt(ext, ".css")) return "text/css; charset=utf-8";
    if (eqlExt(ext, ".js")) return "text/javascript; charset=utf-8";
    if (eqlExt(ext, ".json")) return "application/json";
    if (eqlExt(ext, ".xml")) return "application/xml";
    if (eqlExt(ext, ".txt")) return "text/plain; charset=utf-8";
    if (eqlExt(ext, ".md")) return "text/markdown; charset=utf-8";
    if (eqlExt(ext, ".csv")) return "text/csv";

    // Images
    if (eqlExt(ext, ".png")) return "image/png";
    if (eqlExt(ext, ".jpg") or eqlExt(ext, ".jpeg")) return "image/jpeg";
    if (eqlExt(ext, ".gif")) return "image/gif";
    if (eqlExt(ext, ".svg")) return "image/svg+xml";
    if (eqlExt(ext, ".ico")) return "image/x-icon";
    if (eqlExt(ext, ".webp")) return "image/webp";
    if (eqlExt(ext, ".avif")) return "image/avif";

    // Fonts
    if (eqlExt(ext, ".woff")) return "font/woff";
    if (eqlExt(ext, ".woff2")) return "font/woff2";
    if (eqlExt(ext, ".ttf")) return "font/ttf";

    // Application types
    if (eqlExt(ext, ".pdf")) return "application/pdf";
    if (eqlExt(ext, ".zip")) return "application/zip";
    if (eqlExt(ext, ".gz")) return "application/gzip";
    if (eqlExt(ext, ".wasm")) return "application/wasm";

    // Default: binary data
    return "application/octet-stream";
}

fn getExtension(path: []const u8) []const u8 {
    // Find the last '.' in the path
    var i = path.len;
    while (i > 0) {
        i -= 1;
        if (path[i] == '.') return path[i..];
        if (path[i] == '/') break;
    }
    return "";
}

fn eqlExt(a: []const u8, b: []const u8) bool {
    if (a.len != b.len) return false;
    for (a, b) |ca, cb| {
        // Case-insensitive comparison for extensions
        const la = if (ca >= 'A' and ca <= 'Z') ca + 32 else ca;
        const lb = if (cb >= 'A' and cb <= 'Z') cb + 32 else cb;
        if (la != lb) return false;
    }
    return true;
}

The if-chain approach might look inelegant compared to using a std.StaticStringMap (which we explored in episode 49 for shell built-ins), but there's a reason I went this way: the extensions need case-insensitive matching (.PNG and .png are the same thing), and StaticStringMap does exact matching. We'd have to lowercase the extension first, which adds a step. The if-chain with eqlExt handles case-insensitivity directly and is completely readable.

The getExtension function walks backward from the end of the path looking for a dot. If it finds a slash first, there's no extension (it was a dot in a directory name, not a file extension). This is more robust than just splitting on . -- a path like /downloads/archive.v2/README has no extension even though it contains dots.

The default MIME type application/octet-stream is intentional. If we don't recognize the extension, we tell the browser "here's some binary data, figure it out yourself." This is safer than guessing -- browsers will typically offer to download unknown types rather than trying to render them, which prevents accidental execution of weird content.
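For comparison, this is roughly what the StaticStringMap route could look like, including the extra lowercasing step. It is a sketch under assumptions: lookupMime and mime_map are hypothetical names, the table holds only a handful of the entries from getMimeType, and extensions longer than 16 bytes are simply treated as unknown.

```zig
const std = @import("std");

// Hypothetical lookup table; keys must already be lowercase.
const mime_map = std.StaticStringMap([]const u8).initComptime(.{
    .{ ".html", "text/html; charset=utf-8" },
    .{ ".css", "text/css; charset=utf-8" },
    .{ ".png", "image/png" },
    .{ ".jpg", "image/jpeg" },
});

fn lookupMime(ext: []const u8) []const u8 {
    const fallback = "application/octet-stream";

    // StaticStringMap matches exactly, so lowercase into a scratch buffer first
    var buf: [16]u8 = undefined;
    if (ext.len > buf.len) return fallback;
    const lower = std.ascii.lowerString(&buf, ext);

    return mime_map.get(lower) orelse fallback;
}
```

The scratch buffer is only used as the lookup key; the returned slice points at a string literal, so it stays valid after the function returns.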

Directory traversal prevention: sanitizing paths

This is the single most important security feature in a static file server. Without it, a malicious request for /../../../etc/passwd would let an attacker read any file on your system. The .. component means "go up one directory", so an unsanitized path could escape the root directory entirely.

fn sanitizePath(self: *const StaticFileHandler, url_path: []const u8) !?[]const u8 {
    // Strip leading slash
    const relative = if (url_path.len > 0 and url_path[0] == '/')
        url_path[1..]
    else
        url_path;

    const target = if (relative.len == 0) "index.html" else relative;

    // Check each component for traversal attempts
    var iter = std.mem.splitScalar(u8, target, '/');
    var clean_parts = std.ArrayList([]const u8).init(self.allocator);
    defer clean_parts.deinit();

    while (iter.next()) |part| {
        // Skip empty components (double or trailing slashes)
        if (part.len == 0) continue;

        // Reject parent directory references
        if (std.mem.eql(u8, part, "..")) return null;

        // Skip current directory references (harmless, so just drop them)
        if (std.mem.eql(u8, part, ".")) continue;

        // Reject components starting with a dot (hidden files)
        if (part[0] == '.') return null;

        // Reject null bytes (C string truncation attacks)
        if (std.mem.indexOfScalar(u8, part, 0) != null) return null;

        try clean_parts.append(part);
    }

    if (clean_parts.items.len == 0) {
        return try std.fmt.allocPrint(
            self.allocator,
            "{s}/index.html",
            .{self.root_dir},
        );
    }

    // Rebuild the path from clean components
    var result = std.ArrayList(u8).init(self.allocator);
    errdefer result.deinit();

    try result.appendSlice(self.root_dir);
    for (clean_parts.items) |part| {
        try result.append('/');
        try result.appendSlice(part);
    }

    return try result.toOwnedSlice();
}

We go through each path component and reject anything suspicious. The .. check is the critical one -- that's the traversal attack. But we also reject hidden files (starting with .) and null bytes (which can truncate C strings in the underlying OS calls and lead to path confusion), and we silently skip empty components (double slashes) and single-dot components. The result is a clean, rebuilt path whose components all stay below the root directory.

Some web servers try to "normalize" the path (resolve .. by removing the previous component) instead of rejecting it outright. I argue that's a bad idea for a learning project. It's more complex, more error-prone, and if a client is sending .. in a URL path, they're either attacking you or broken. Either way, a clean 403 is the right response. A legitimate browser request will never contain .. in the URL path -- the browser resolves those before sending the request.
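The rejection rules can also be written as a small allocation-free check, which is handy for unit tests. This is a sketch rather than a replacement for sanitizePath: isSafeUrlPath is a hypothetical name, and it only answers yes or no instead of rebuilding the path.

```zig
const std = @import("std");

// Hypothetical yes/no version of the sanitizer rules.
fn isSafeUrlPath(url_path: []const u8) bool {
    var it = std.mem.splitScalar(u8, url_path, '/');
    while (it.next()) |part| {
        // Empty and "." components are harmless and get skipped
        if (part.len == 0 or std.mem.eql(u8, part, ".")) continue;

        // A leading dot covers both ".." and hidden files
        if (part[0] == '.') return false;

        // Null bytes can truncate the path in C-level OS calls
        if (std.mem.indexOfScalar(u8, part, 0) != null) return false;
    }
    return true;
}
```

Because every component that starts with a dot is refused, the separate .. comparison is not strictly needed here; sanitizePath keeps it explicit because it is the check that matters most.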

Serving files: reading from disk and sending the response

Now let's put it all together. The serveFile function takes a sanitized path, opens the file, reads it, and sends it back with the right headers:

fn serveFile(
    self: *const StaticFileHandler,
    request: *const Request,
    allocator: std.mem.Allocator,
) !Response {
    // Drop the query string so /style.css?v=2 still maps to style.css
    const url_path = splitPathAndQuery(request.path).path;

    // Sanitize the path
    const file_path = try self.sanitizePath(url_path) orelse
        return Response.text(allocator, StatusCode.forbidden, "403 Forbidden\n");

    // sanitizePath allocates with self.allocator, so free with the same one
    defer self.allocator.free(file_path);

    // Check for a directory first. On POSIX systems openFile succeeds on a
    // directory, so a FileNotFound fallback would never reach the listing.
    // openDir fails with error.NotDir when the path is a regular file.
    if (fs.cwd().openDir(file_path, .{ .iterate = true })) |opened| {
        var dir = opened;
        defer dir.close();
        return self.serveDirectoryListing(url_path, dir, allocator);
    } else |err| switch (err) {
        error.NotDir => {}, // a regular file, handled below
        error.FileNotFound => return Response.notFound(allocator),
        error.AccessDenied => return Response.text(
            allocator,
            StatusCode.forbidden,
            "403 Forbidden\n",
        ),
        else => return Response.internalError(allocator),
    }

    // Try to open the file
    const file = fs.cwd().openFile(file_path, .{}) catch |err| switch (err) {
        error.FileNotFound => return Response.notFound(allocator),
        error.AccessDenied => return Response.text(
            allocator,
            StatusCode.forbidden,
            "403 Forbidden\n",
        ),
        else => return Response.internalError(allocator),
    };
    defer file.close();

    // Get file metadata for headers
    const stat = try file.stat();
    const file_size = stat.size;

    // Check If-Modified-Since for caching
    if (request.getHeader("If-Modified-Since")) |_| {
        // Simplified on purpose: we answer 304 whenever the header is
        // present. A full implementation would parse the date string and
        // compare it against stat.mtime.
        var resp = Response.init(allocator);
        resp.setStatusFromCode(StatusCode.not_modified);
        return resp;
    }

    // Determine MIME type
    const mime = getMimeType(file_path);

    // Check for range request
    if (request.getHeader("Range")) |range_header| {
        return self.serveRangeRequest(file, file_size, range_header, mime, allocator);
    }

    // Read the entire file
    const max_file = 10 * 1024 * 1024; // 10 MB max for full reads
    if (file_size > max_file) {
        return Response.text(
            allocator,
            StatusCode.payload_too_large,
            "File too large for direct serving\n",
        );
    }

    const body = try file.readToEndAlloc(allocator, max_file);
    // Release the body if building the response fails below
    errdefer allocator.free(body);

    // Build the response
    var resp = Response.init(allocator);
    resp.setStatusFromCode(StatusCode.ok);
    try resp.addHeader("Content-Type", mime);

    // Add Last-Modified header.
    // NOTE: date_buf lives on this stack frame, so this assumes that
    // addHeader copies the value instead of keeping the slice.
    var date_buf: [64]u8 = undefined;
    const last_modified = formatHttpDate(stat.mtime, &date_buf);
    if (last_modified) |date_str| {
        try resp.addHeader("Last-Modified", date_str);
    }

    resp.body = body;
    return resp;
}

The flow is: sanitize the path, serve a directory listing if it names a directory, serve the file if it names a file, and return 404 if it's neither. If the file opens successfully, we check for caching headers and range requests before doing the full read.

The 10 MB limit on readToEndAlloc is a practical safety valve. Reading a 2 GB file into memory would be a terrible idea -- for large files, streaming would be the right approach (read a chunk, send a chunk, repeat). But for a learning project, most static assets (HTML, CSS, JS, images) are well under 10 MB, so this works fine. The file.readToEndAlloc function we used back in episode 10 handles all the buffering and reallocation internally.

One thing worth noting: the If-Modified-Since handling here is deliberately simplified. A proper implementation would parse the HTTP date from the header, compare it against the file's modification time, and return 304 only if the file hasn't changed. Parsing HTTP dates (which come in multiple formats -- RFC 7231 defines the preferred one but clients can send older formats too) is quite some work for a learning project. The simplified version always returns 304 when the header is present, which is technically wrong but demonstrates the concept. You'd want to fix this for production.
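A middle ground that avoids date parsing is to compare strings: answer 304 only when the client's header is byte-for-byte the Last-Modified value we would send right now. The sketch below assumes that simplification; isNotModified is a hypothetical helper, and it is stricter than the spec, which compares dates rather than strings.

```zig
const std = @import("std");

// Hypothetical helper: true when the client's If-Modified-Since value
// matches the Last-Modified string the server would send for this file.
fn isNotModified(if_modified_since: ?[]const u8, last_modified: []const u8) bool {
    // No header means the client has nothing cached
    const raw = if_modified_since orelse return false;

    // Tolerate optional whitespace around the header value
    const value = std.mem.trim(u8, raw, " \t");

    return std.mem.eql(u8, value, last_modified);
}
```

The worst case with this approach is an unnecessary full response when a client sends the date in one of the older formats, which costs bandwidth but never serves stale content.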

Caching headers: Last-Modified and date formatting

HTTP caching works through a two-step dance. On the first request, the server sends a Last-Modified header with the file's modification time. The browser caches the file. On subsequent requests, the browser sends If-Modified-Since with that same timestamp. If the file hasn't changed, the server replies with 304 Not Modified (no body), and the browser uses its cached copy. This saves bandwidth and makes pages load faster.

fn formatHttpDate(timestamp: i128, buf: []u8) ?[]const u8 {
    // Timestamps before 1970 cannot be represented by EpochSeconds
    if (timestamp < 0) return null;

    // Convert nanosecond timestamp to seconds
    const secs: u64 = @intCast(@divTrunc(timestamp, std.time.ns_per_s));

    const epoch = std.time.epoch.EpochSeconds{ .secs = secs };
    const day_info = epoch.getDaySeconds();
    const year_day = epoch.getEpochDay().calculateYearDay();
    const month_day = year_day.calculateMonthDay();

    const day_names = [_][]const u8{
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat",
    };
    const month_names = [_][]const u8{
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
    };

    // Day 0 of the epoch (1 January 1970) was a Thursday, which is
    // index 4 in day_names, so shift by 4 before taking the remainder.
    const day_of_week: usize = @intCast((epoch.getEpochDay().day + 4) % 7);

    // The Month enum is 1-based (jan = 1), so subtract 1 for the array index
    const month_index = @as(usize, @intFromEnum(month_day.month)) - 1;

    const result = std.fmt.bufPrint(buf,
        "{s}, {d:0>2} {s} {d} {d:0>2}:{d:0>2}:{d:0>2} GMT",
        .{
            day_names[day_of_week],
            month_day.day_index + 1,
            month_names[month_index],
            year_day.year,
            day_info.getHoursIntoDay(),
            day_info.getMinutesIntoHour(),
            day_info.getSecondsIntoMinute(),
        },
    ) catch return null;

    return result;
}

HTTP dates follow a specific format defined in RFC 7231: Sun, 06 Nov 1994 08:49:37 GMT. Yes, it's ugly. Yes, everybody hates it. Yes, we still have to generate it because that's what the spec says and browsers expect it. The std.time.epoch.EpochSeconds helper from Zig's standard library does most of the heavy lifting here -- we just need to format the output correctly.

The stat.mtime from the file metadata gives us a timestamp in nanoseconds since the Unix epoch. We divide by ns_per_s to get seconds, then use the epoch utilities to break it down into year, month, day, hours, minutes, seconds. Zig's epoch utilities are surprisingly complete for a systems language -- this is one of those cases where the standard library saves you from writing a calendar implementation from scratch. Holy Macaroni, imagine doing leap year calculations by hand.
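The one piece the epoch helpers don't hand us is the day of the week. It follows from a single fact: 1 January 1970 was a Thursday. A minimal sketch, with weekdayName as a hypothetical helper name:

```zig
const std = @import("std");

const day_names = [_][]const u8{
    "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat",
};

// Hypothetical helper: weekday name for a Unix timestamp in seconds.
fn weekdayName(unix_seconds: u64) []const u8 {
    // Whole days since 1 January 1970
    const epoch_day = unix_seconds / std.time.s_per_day;

    // Epoch day 0 was a Thursday (index 4), so shift before wrapping
    const index: usize = @intCast((epoch_day + 4) % 7);
    return day_names[index];
}
```

The spec's own example date works as a check: Sun, 06 Nov 1994 08:49:37 GMT is 784111777 seconds after the epoch, which is epoch day 9075, and (9075 + 4) % 7 is 0, so the lookup lands on "Sun".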

Range requests: serving partial content

Range requests let a client ask for only part of a file. This is essential for resuming interrupted downloads and for video/audio seeking (the player requests only the bytes it needs right now, not the entire file). The client sends a Range: bytes=1000-1999 header, and the server responds with 206 Partial Content containing just those bytes.

fn serveRangeRequest(
    self: *const StaticFileHandler,
    file: fs.File,
    file_size: u64,
    range_header: []const u8,
    mime: []const u8,
    allocator: std.mem.Allocator,
) !Response {
    _ = self;

    // Parse "bytes=START-END" format
    if (!std.mem.startsWith(u8, range_header, "bytes=")) {
        return Response.text(
            allocator,
            StatusCode.bad_request,
            "Invalid range format\n",
        );
    }

    const range_spec = range_header[6..]; // skip "bytes="
    const dash_pos = std.mem.indexOf(u8, range_spec, "-") orelse
        return Response.text(allocator, StatusCode.bad_request, "Invalid range\n");

    const start_str = range_spec[0..dash_pos];
    const end_str = range_spec[dash_pos + 1 ..];

    var start: u64 = 0;
    // Guard the subtraction: file_size - 1 underflows for an empty file
    var end: u64 = if (file_size > 0) file_size - 1 else 0;

    if (start_str.len > 0) {
        start = std.fmt.parseInt(u64, start_str, 10) catch
            return Response.text(allocator, StatusCode.bad_request, "Invalid range start\n");
    }

    if (end_str.len > 0) {
        end = std.fmt.parseInt(u64, end_str, 10) catch
            return Response.text(allocator, StatusCode.bad_request, "Invalid range end\n");
    }

    // An end position past the last byte is clamped rather than rejected
    if (file_size > 0 and end >= file_size) end = file_size - 1;

    // Validate range
    if (file_size == 0 or start >= file_size or start > end) {
        var resp = Response.init(allocator);
        resp.setStatus(416, "Range Not Satisfiable");

        var cr_buf: [64]u8 = undefined;
        const content_range = std.fmt.bufPrint(&cr_buf, "bytes */{d}", .{file_size}) catch
            return Response.internalError(allocator);
        try resp.addHeader("Content-Range", content_range);

        return resp;
    }

    const length = end - start + 1;

    // Apply the same 10 MB cap as full reads, so a client cannot force
    // an arbitrarily large allocation with a wide range
    const max_range = 10 * 1024 * 1024;
    if (length > max_range) {
        return Response.text(
            allocator,
            StatusCode.payload_too_large,
            "Range too large for direct serving\n",
        );
    }

    // Seek to start position and read the range
    file.seekTo(start) catch
        return Response.internalError(allocator);

    const body = try allocator.alloc(u8, @intCast(length));
    // Release the buffer if anything below fails
    errdefer allocator.free(body);
    const bytes_read = try file.readAll(body);

    // The file shrank between stat and read; nothing sensible to send
    if (bytes_read == 0) return error.UnexpectedEndOfFile;

    // Build 206 response
    var resp = Response.init(allocator);
    resp.setStatus(206, "Partial Content");
    try resp.addHeader("Content-Type", mime);
    try resp.addHeader("Accept-Ranges", "bytes");

    var cr_buf: [128]u8 = undefined;
    const content_range = std.fmt.bufPrint(
        &cr_buf,
        "bytes {d}-{d}/{d}",
        .{ start, start + bytes_read - 1, file_size },
    ) catch return Response.internalError(allocator);
    try resp.addHeader("Content-Range", content_range);

    resp.body = body[0..bytes_read];
    return resp;
}

The range parsing is straightforward: strip the bytes= prefix, split on -, parse the start and end positions. Three cases exist in the spec: bytes=0-99 (first 100 bytes), bytes=500- (everything from byte 500 onward), and bytes=-100 (last 100 bytes). Our implementation handles the first two. The "suffix range" (last N bytes) is less common and would need special handling where an empty start means start = file_size - N; as written, bytes=-100 is read as bytes 0 through 100.

If the range is invalid (start past end of file, start > end), we return 416 Range Not Satisfiable with a Content-Range header that tells the client the actual file size. This lets the client retry with a valid range.

The file.seekTo(start) call is where the actual optimization happens. Instead of reading the entire file and then slicing, we seek to the start position and read only the bytes we need. For a 500 MB video where the client requests bytes 100MB-101MB, this is the difference between reading 500 MB and reading 1 MB. Kind of a big deal ;-)
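For reference, here is a sketch of a parser that covers all three forms, including the suffix range. It makes a few assumptions: parseRange and ByteRange are hypothetical names, only a single range is supported, and malformed and unsatisfiable headers both come back as null, so the caller could not tell a 400 from a 416.

```zig
const std = @import("std");

// Inclusive byte range, hypothetical type for this example.
const ByteRange = struct { start: u64, end: u64 };

fn parseRange(header: []const u8, file_size: u64) ?ByteRange {
    const prefix = "bytes=";
    if (!std.mem.startsWith(u8, header, prefix)) return null;
    if (file_size == 0) return null; // nothing to serve from an empty file

    const spec = header[prefix.len..];
    const dash = std.mem.indexOfScalar(u8, spec, '-') orelse return null;
    const first = spec[0..dash];
    const last = spec[dash + 1 ..];

    if (first.len == 0) {
        // Suffix form "bytes=-N": the last N bytes of the file
        const n = std.fmt.parseInt(u64, last, 10) catch return null;
        if (n == 0) return null;
        const len = @min(n, file_size);
        return ByteRange{ .start = file_size - len, .end = file_size - 1 };
    }

    const start = std.fmt.parseInt(u64, first, 10) catch return null;
    if (start >= file_size) return null;

    // Open-ended form "bytes=N-" runs to the last byte
    var end = file_size - 1;
    if (last.len > 0) {
        const parsed = std.fmt.parseInt(u64, last, 10) catch return null;
        if (parsed < start) return null;
        // An end past the last byte is clamped to the file size
        end = @min(parsed, end);
    }
    return ByteRange{ .start = start, .end = end };
}
```

For a 1000-byte file, bytes=-100 comes out as 900 through 999, and bytes=900-5000 is clamped to the same range.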

Directory listing: generating HTML index pages

When someone requests a URL that maps to a directory (not a file), we have two options: serve an index.html if it exists, or generate a listing of the directory's contents. So far we only default to index.html for the root path / -- for every other directory, let's build the listing:

fn serveDirectoryListing(
    self: *const StaticFileHandler,
    url_path: []const u8,
    dir: fs.Dir,
    allocator: std.mem.Allocator,
) !Response {
    _ = self;

    var html = std.ArrayList(u8).init(allocator);
    errdefer html.deinit();

    const writer = html.writer();

    // HTML header
    try writer.print(
        \\<!DOCTYPE html>
        \\<html><head>
        \\<meta charset="utf-8">
        \\<title>Index of {s}</title>
        \\<style>
        \\body {{ font-family: monospace; margin: 2em; }}
        \\a {{ text-decoration: none; }}
        \\a:hover {{ text-decoration: underline; }}
        \\table {{ border-collapse: collapse; }}
        \\td {{ padding: 0.2em 1em; }}
        \\</style>
        \\</head><body>
        \\<h1>Index of {s}</h1>
        \\<table>
        \\<tr><td><a href="../">..</a></td><td></td></tr>
    , .{ url_path, url_path });

    // List directory entries
    var iter = dir.iterate();
    while (try iter.next()) |entry| {
        const name = entry.name;
        const trailing = if (entry.kind == .directory) "/" else "";

        try writer.print(
            \\<tr><td><a href="{s}{s}">{s}{s}</a></td>
        , .{ name, trailing, name, trailing });

        // Second column: entry type, then close the row
        switch (entry.kind) {
            .directory => try writer.writeAll("<td>[dir]</td></tr>\n"),
            .file => try writer.writeAll("<td>[file]</td></tr>\n"),
            else => try writer.writeAll("<td>[other]</td></tr>\n"),
        }
    }

    // Close the table and the document
    try writer.writeAll("</table></body></html>\n");

    const body = try html.toOwnedSlice();
    return Response.html(allocator, StatusCode.ok, body);
}

The generated HTML is deliberately minimal -- a monospace listing with clickable filenames, a parent directory link (..), and type indicators. No JavaScript, no fancy CSS, no sorting. If you've ever seen nginx's autoindex or Apache's directory listing, this is the same idea. Just enough to navigate the filesystem through a browser.

We use dir.iterate() to walk the directory entries. Each entry gives us the name and kind (file, directory, symlink, etc.). Directories get a trailing / in their links so the browser knows to treat them as directories. The \\ multiline string syntax keeps the HTML readable in the Zig source -- same approach we used for JSON in episode 52.
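One thing the listing does not do is escape file names before putting them into the HTML, so a file called <b>.txt would end up as markup. A minimal sketch of an escaping helper follows; escapeHtml is a hypothetical name, and it writes into a caller-supplied buffer to stay allocation-free.

```zig
const std = @import("std");

// Hypothetical helper: copy input into out, replacing HTML-special bytes.
fn escapeHtml(out: []u8, input: []const u8) ![]const u8 {
    var n: usize = 0;
    for (input) |c| {
        // Empty replacement means "copy the byte unchanged"
        const rep: []const u8 = switch (c) {
            '&' => "&amp;",
            '<' => "&lt;",
            '>' => "&gt;",
            '"' => "&quot;",
            '\'' => "&#39;",
            else => "",
        };

        if (rep.len == 0) {
            if (n + 1 > out.len) return error.NoSpaceLeft;
            out[n] = c;
            n += 1;
        } else {
            if (n + rep.len > out.len) return error.NoSpaceLeft;
            @memcpy(out[n..][0..rep.len], rep);
            n += rep.len;
        }
    }
    return out[0..n];
}
```

In the listing, both the link text and the url_path in the title would go through a helper like this; the href additionally needs percent-encoding, which is a separate concern.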

Integrating with the router

Now we need to connect the static file handler to our server. The cleanest approach is to register it as a fallback -- try the API routes first, and if nothing matches, fall back to static file serving:

fn processRequest(self: *Server, stream: net.Stream) !void {
    // ... (request parsing from episode 51, unchanged) ...

    const pq = splitPathAndQuery(request.path);
    const clean_path = pq.path;

    // Try API routes first
    var response = if (self.router.match(request.method, clean_path)) |route_match| blk: {
        break :blk route_match.handler(
            &request,
            &route_match.params,
            self.allocator,
        ) catch {
            break :blk Response.internalError(self.allocator) catch
                return error.OutOfMemory;
        };
    } else blk: {
        // No API route matched -- try static files
        if (request.method == .GET or request.method == .HEAD) {
            break :blk self.static_handler.serveFile(
                &request,
                self.allocator,
            ) catch {
                break :blk Response.internalError(self.allocator) catch
                    return error.OutOfMemory;
            };
        } else {
            break :blk Response.methodNotAllowed(self.allocator) catch
                return error.OutOfMemory;
        }
    };
    defer response.deinit();

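    // The whole response is serialized into this buffer, so anything
    // larger than 64 KiB fails here. Bigger files need a larger buffer
    // or a body that is written separately from the headers.
    var resp_buf: [65536]u8 = undefined;
    const serialized = response.serialize(&resp_buf) catch
        return error.MalformedRequest;

    // writeAll loops until every byte is sent; write may send only part
    try stream.writeAll(serialized);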
}

The key design decision: static files are only served for GET and HEAD requests. A POST to /style.css makes no sense -- you'd need an upload endpoint for that, which is an API concern, not a static file concern. This is how nginx and Apache work too: static file serving is read-only by definition.

The splitPathAndQuery call (from episode 52) strips the query string before matching. This means /style.css?v=2 correctly resolves to the file style.css -- the ?v=2 is a cache-busting parameter that browsers and build tools add, and we should ignore it for file lookup purposes.
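If you don't have episode 52 at hand, the path half of that split boils down to the following. This is only a sketch: stripQuery is a hypothetical stand-in, and the real splitPathAndQuery also returns the query part.

```zig
const std = @import("std");

// Hypothetical stand-in for the path half of splitPathAndQuery.
fn stripQuery(target: []const u8) []const u8 {
    // Everything from the first '?' onward is the query string
    const q = std.mem.indexOfScalar(u8, target, '?') orelse return target;
    return target[0..q];
}
```

The result is a slice into the original request target, so nothing is allocated or copied.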

Setting it up in main

Here's how you configure the server with both API routes and static file serving:

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer {
        const check = gpa.deinit();
        if (check == .leak) std.debug.print("WARNING: memory leak detected\n", .{});
    }
    const allocator = gpa.allocator();

    // Set up router for API endpoints
    var router = Router.init(allocator);
    defer router.deinit();

    try router.addRoute(.GET, "/api/health", handleHealth);
    try router.addRoute(.GET, "/api/users", handleListUsers);
    try router.addRoute(.GET, "/api/users/:id", handleGetUser);
    try router.addRoute(.POST, "/api/users", handleCreateUser);

    // Set up static file handler
    const static = StaticFileHandler.init(allocator, "./public");

    // Start server with both router and static handler
    var server = try Server.init(allocator, 8080, &router, &static);
    defer server.deinit();

    const stdout = std.io.getStdOut().writer();
    try stdout.print("Serving API routes and static files from ./public\n", .{});
    try stdout.print("http://localhost:8080/\n", .{});

    server.run() catch |err| {
        std.log.err("Server failed: {}", .{err});
        return err;
    };
}

Create a public directory, drop an index.html in it, and you've got a working web server. API calls go to /api/* routes, everything else tries to serve a file from ./public/. This is the exact same architecture that frameworks like Express.js use (with express.static), except we built every layer ourselves from raw TCP sockets.

Testing static file serving

For testing, we create temporary files and directories, then verify the handler returns the right content and headers:

test "MIME type detection" {
    try std.testing.expectEqualStrings(
        "text/html; charset=utf-8",
        getMimeType("/page.html"),
    );
    try std.testing.expectEqualStrings(
        "text/css; charset=utf-8",
        getMimeType("/styles/main.css"),
    );
    try std.testing.expectEqualStrings(
        "image/png",
        getMimeType("/img/logo.png"),
    );
    try std.testing.expectEqualStrings(
        "application/json",
        getMimeType("/data.json"),
    );
    try std.testing.expectEqualStrings(
        "application/octet-stream",
        getMimeType("/file.xyz"),
    );
    // Case insensitive
    try std.testing.expectEqualStrings(
        "image/jpeg",
        getMimeType("/photo.JPG"),
    );
}

test "getExtension" {
    try std.testing.expectEqualStrings(".html", getExtension("/page.html"));
    try std.testing.expectEqualStrings(".css", getExtension("/a/b/c.css"));
    try std.testing.expectEqualStrings("", getExtension("/no-extension"));
    try std.testing.expectEqualStrings(".gz", getExtension("/file.tar.gz"));
    try std.testing.expectEqualStrings("", getExtension("/dotdir.d/noext"));
}

test "path sanitization rejects traversal" {
    const allocator = std.testing.allocator;
    // const, because the handler is never mutated (Zig rejects an unmutated var)
    const handler = StaticFileHandler.init(allocator, "/tmp/www");

    // Normal paths should work
    const p1 = try handler.sanitizePath("/index.html");
    try std.testing.expect(p1 != null);
    allocator.free(p1.?);

    const p2 = try handler.sanitizePath("/css/style.css");
    try std.testing.expect(p2 != null);
    allocator.free(p2.?);

    // Traversal must be rejected
    const p3 = try handler.sanitizePath("/../../../etc/passwd");
    try std.testing.expect(p3 == null);

    // We don't URL-decode, so %2f stays literal and the first component
    // is "..%2f..%2fetc". It starts with a dot, so the hidden-file rule
    // rejects it.
    const p4 = try handler.sanitizePath("/..%2f..%2fetc/passwd");
    try std.testing.expect(p4 == null);

    // Hidden files rejected
    const p5 = try handler.sanitizePath("/.htaccess");
    try std.testing.expect(p5 == null);

    const p6 = try handler.sanitizePath("/config/.env");
    try std.testing.expect(p6 == null);
}

test "range parsing" {
    // Test the range header parsing logic
    const header = "bytes=100-199";
    try std.testing.expect(std.mem.startsWith(u8, header, "bytes="));

    const spec = header[6..];
    const dash = std.mem.indexOf(u8, spec, "-").?;
    const start = try std.fmt.parseInt(u64, spec[0..dash], 10);
    const end_part = spec[dash + 1 ..];
    const end_val = try std.fmt.parseInt(u64, end_part, 10);

    try std.testing.expect(start == 100);
    try std.testing.expect(end_val == 199);
}
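None of the tests above touch the filesystem yet. Here is a minimal sketch of how a temp-directory test could look, using std.testing.tmpDir. The readRange function is a hypothetical helper written for this example, not part of the handler; it mirrors the seek-then-read step from serveRangeRequest.

```zig
const std = @import("std");

// Hypothetical helper: read the inclusive byte range [start, end] of an
// open file into buf and return the bytes that were actually read.
fn readRange(file: std.fs.File, start: u64, end: u64, buf: []u8) ![]u8 {
    const len: usize = @intCast(end - start + 1);
    if (len > buf.len) return error.NoSpaceLeft;

    try file.seekTo(start);
    const n = try file.readAll(buf[0..len]);
    return buf[0..n];
}

test "range read from a file in a temp directory" {
    // tmpDir creates a unique directory that cleanup() removes again
    var tmp = std.testing.tmpDir(.{});
    defer tmp.cleanup();

    {
        const f = try tmp.dir.createFile("test.txt", .{});
        defer f.close();
        try f.writeAll("test data\n");
    }

    const file = try tmp.dir.openFile("test.txt", .{});
    defer file.close();

    // The size is what Content-Length would be based on
    const stat = try file.stat();
    try std.testing.expectEqual(@as(u64, 10), stat.size);

    // Same bytes as a "Range: bytes=0-3" request
    var buf: [16]u8 = undefined;
    const got = try readRange(file, 0, 3, &buf);
    try std.testing.expectEqualStrings("test", got);
}
```

Pointing a StaticFileHandler at the temp directory would be the next step; that needs the Request and Response types from the earlier episodes, so it is left out here.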

And for an integration test, create a test directory and hit the server with curl:

$ mkdir -p public/css public/images
$ echo "<h1>Hello!</h1>" > public/index.html
$ echo "body { color: red; }" > public/css/style.css
$ echo "test data" > public/test.txt

$ zig build-exe http_server.zig && ./http_server &
Serving API routes and static files from ./public

$ curl -s http://localhost:8080/
<h1>Hello!</h1>

$ curl -sI http://localhost:8080/css/style.css
HTTP/1.1 200 OK
Content-Type: text/css; charset=utf-8
Content-Length: 21
Last-Modified: Tue, 19 May 2026 10:30:00 GMT
Connection: close

$ curl -s -H "Range: bytes=0-3" http://localhost:8080/test.txt
test

$ curl -sI --path-as-is http://localhost:8080/../../../etc/passwd
HTTP/1.1 403 Forbidden
Content-Type: text/plain; charset=utf-8
Content-Length: 14
Connection: close

$ curl -s http://localhost:8080/api/health
{"status": "healthy"}

API routes and static files coexisting peacefully. The traversal attempt gets a clean 403. The range request returns exactly the requested bytes. The CSS file gets the right Content-Type. Everything works.

Design decisions and what remains

A few things I deliberately left out that a production static file server would have:

ETag headers. An alternative to Last-Modified for caching. ETags are typically a hash of the file content, and they're more reliable than timestamps (some filesystems have coarse timestamps, and copying files can preserve content but change mtime). Adding ETags would mean hashing every file on every request, or caching the hash -- both add complexity without much benefit for a learning project.

Compression. Sending gzipped content to browsers that support it (indicated by Accept-Encoding: gzip). This can reduce transfer sizes by 70-80% for text content. Zig's standard library has std.compress.gzip which we could use, but the integration adds enough code to be its own episode. For now, our files are served uncompressed.

Symlink handling. Our current implementation follows symlinks transparently. A symlink inside the public directory that points outside of it would bypass our traversal check. Production servers either resolve symlinks and check the real path, or refuse to follow them entirely. Something to be aware of.

Content negotiation. Some servers serve different content based on Accept headers (e.g. index.html vs index.json). We don't do that -- the URL determines the file, period.
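On the symlink point, the usual check is a prefix comparison between the resolved root and the resolved file path. The sketch below shows only that comparison and assumes both inputs were already resolved (for example with realpathAlloc), use / as the separator, and that the root has no trailing slash; isInsideRoot is a hypothetical name.

```zig
const std = @import("std");

// Hypothetical check: is real_path equal to real_root or located below it?
// Both arguments are assumed to be absolute, symlink-free paths.
fn isInsideRoot(real_root: []const u8, real_path: []const u8) bool {
    if (real_root.len == 0) return false;
    if (!std.mem.startsWith(u8, real_path, real_root)) return false;

    // The root directory itself counts as inside
    if (real_path.len == real_root.len) return true;

    // Require a separator right after the root, so "/srv/www-private"
    // is not accepted for the root "/srv/www"
    return real_path[real_root.len] == '/';
}
```

The separator check is the part that is easy to forget: a plain startsWith would happily accept a sibling directory whose name merely begins with the root's name.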

In the next and final episode of this project, we'll add middleware (logging, CORS headers, request timing) and look at how to structure the server for cleaner extensibility. That will complete our HTTP server from scratch -- four episodes, zero dependencies, a genuinely useable web server.

What we learned

Mapping URL paths to filesystem paths by stripping the leading slash and joining with a root directory
MIME type detection via file extension matching, with case-insensitive comparison and a sensible application/octet-stream default
Directory traversal prevention by rejecting .. components, hidden files, and null bytes in path segments -- returning 403 instead of trying to normalize
The Last-Modified / If-Modified-Since caching dance: server sends the file's modification time, client sends it back on subsequent requests, server returns 304 (no body) if unchanged
HTTP date formatting per RFC 7231 using std.time.epoch.EpochSeconds to convert Unix timestamps into Sun, 06 Nov 1994 08:49:37 GMT format
Range requests with bytes=START-END parsing, file.seekTo for efficient partial reads, and 206 Partial Content responses with Content-Range headers
Directory listing by iterating fs.Dir entries and generating minimal HTML with clickable links
Integrating static file serving as a fallback after API route matching -- API routes take priority, static files catch everything else
Testing file serving with temporary directories and verifying both content and headers

Thanks for reading!

Hive account: @scipio
