Learn Zig Series (#48) - Build a Shell: Process Spawning
Project D: Build Your Own Shell (2/4)
What will I learn
- You will learn how std.process.Child works for spawning external programs from Zig;
- You will learn setting up stdin, stdout, and stderr pipes for child processes;
- You will learn implementing pipes between processes: connecting one command's stdout to another's stdin;
- You will learn waiting for child processes and collecting exit codes;
- You will learn the fork/exec model under the hood and what Child abstracts away;
- You will learn PATH resolution: searching directories for executables;
- You will learn handling missing programs gracefully (ENOENT and other spawn errors);
- You will learn testing process spawning by running echo hello, capturing output, and verifying it.
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- An installed Zig 0.14+ distribution (download from ziglang.org);
- The ambition to learn Zig programming.
Difficulty
- Intermediate
Curriculum (of the Learn Zig Series):
- Zig Programming Tutorial - ep001 - Intro
- Learn Zig Series (#2) - Hello Zig, Variables and Types
- Learn Zig Series (#3) - Functions and Control Flow
- Learn Zig Series (#4) - Error Handling (Zig's Best Feature)
- Learn Zig Series (#5) - Arrays, Slices, and Strings
- Learn Zig Series (#6) - Structs, Enums, and Tagged Unions
- Learn Zig Series (#7) - Memory Management and Allocators
- Learn Zig Series (#8) - Pointers and Memory Layout
- Learn Zig Series (#9) - Comptime (Zig's Superpower)
- Learn Zig Series (#10) - Project Structure, Modules, and File I/O
- Learn Zig Series (#11) - Mini Project: Building a Step Sequencer
- Learn Zig Series (#12) - Testing and Test-Driven Development
- Learn Zig Series (#13) - Interfaces via Type Erasure
- Learn Zig Series (#14) - Generics with Comptime Parameters
- Learn Zig Series (#15) - The Build System (build.zig)
- Learn Zig Series (#16) - Sentinel-Terminated Types and C Strings
- Learn Zig Series (#17) - Packed Structs and Bit Manipulation
- Learn Zig Series (#18) - Async Concepts and Event Loops
- Learn Zig Series (#18b) - Addendum: Async Returns in Zig 0.16
- Learn Zig Series (#19) - SIMD with @Vector
- Learn Zig Series (#20) - Working with JSON
- Learn Zig Series (#21) - Networking and TCP Sockets
- Learn Zig Series (#22) - Hash Maps and Data Structures
- Learn Zig Series (#23) - Iterators and Lazy Evaluation
- Learn Zig Series (#24) - Logging, Formatting, and Debug Output
- Learn Zig Series (#25) - Mini Project: HTTP Status Checker
- Learn Zig Series (#26) - Writing a Custom Allocator
- Learn Zig Series (#27) - C Interop: Calling C from Zig
- Learn Zig Series (#28) - C Interop: Exposing Zig to C
- Learn Zig Series (#29) - Inline Assembly and Low-Level Control
- Learn Zig Series (#30) - Thread Safety and Atomics
- Learn Zig Series (#31) - Memory-Mapped I/O and Files
- Learn Zig Series (#32) - Compile-Time Reflection with @typeInfo
- Learn Zig Series (#33) - Building a State Machine with Tagged Unions
- Learn Zig Series (#34) - Performance Profiling and Optimization
- Learn Zig Series (#35) - Cross-Compilation and Target Triples
- Learn Zig Series (#36) - Mini Project: CLI Task Runner
- Learn Zig Series (#37) - Markdown to HTML: Tokenizer and Lexer
- Learn Zig Series (#38) - Markdown to HTML: Parser and AST
- Learn Zig Series (#39) - Markdown to HTML: Renderer and CLI
- Learn Zig Series (#40) - Key-Value Store: In-Memory Store
- Learn Zig Series (#41) - Key-Value Store: Write-Ahead Log
- Learn Zig Series (#42) - Key-Value Store: TCP Server
- Learn Zig Series (#43) - Key-Value Store: Client Library and Benchmarks
- Learn Zig Series (#44) - Image Tool: Reading and Writing PPM/BMP
- Learn Zig Series (#45) - Image Tool: Pixel Operations
- Learn Zig Series (#46) - Image Tool: CLI Pipeline
- Learn Zig Series (#47) - Build a Shell: Parsing Commands
- Learn Zig Series (#48) - Build a Shell: Process Spawning (this post)
Last episode we built the parser for our shell -- the part that takes a raw command string and turns it into a structured Pipeline of Command structs. Which is nice and all, but a shell that can only parse commands without actually running them is about as useful as a car with no engine. Time to fix that.
This episode is about making things happen. We're going to take those parsed Command structs and actually spawn real processes on the operating system. That means working with std.process.Child, setting up pipes between commands, managing file descriptors for I/O redirection, and dealing with all the edge cases that come with running external programs (programs that don't exist, programs that crash, programs that refuse to die..).
The core idea is simple: for each command in our pipeline, we spawn a child process. If there's a pipe between two commands, we connect the first command's stdout to the second command's stdin. If there's a redirection, we open a file and point the appropriate file descriptor at it. Then we wait for everything to finish and collect the exit codes. The implementation, however, is where the fun begins ;-)
How std.process.Child works
Zig's standard library gives us std.process.Child for spawning external programs. If you've ever used subprocess.Popen in Python or child_process.spawn in Node.js, the concept is the same -- you specify the program and its arguments, configure how stdin/stdout/stderr should be handled, and then let the OS do its thing.
Here's the simplest possible example -- spawning echo hello and letting its output go directly to our terminal:
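const std = @import("std");
pub fn main() !void {
    // argv[0] is the program to run; the remaining entries are its arguments
    const argv = [_][]const u8{ "echo", "hello" };
    var child = std.process.Child.init(&argv, std.heap.page_allocator);
    try child.spawn();
    const result = try child.wait();
    switch (result) {
        .Exited => |code| if (code == 0) {
            std.debug.print("Command succeeded\n", .{});
        } else {
            std.debug.print("Command exited with code: {d}\n", .{code});
        },
        // Killed or stopped by a signal: there is no exit code to report
        else => std.debug.print("Command did not exit normally\n", .{}),
    }
}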
The .argv field takes a slice of strings -- the first element is the program name, and the rest are arguments. By default, the child inherits our stdin, stdout, and stderr, so echo hello will print directly to the terminal just like if you'd typed it yourself.
The wait() call blocks until the child process finishes and returns a Term tagged union. The most common variant is .Exited which carries the exit code (0 for success, nonzero for failure). Other variants include .Signal (killed by a signal like SIGKILL) and .Stopped (suspended by a signal).
Having said that, this basic version won't work for our shell. We need control over what happens to stdin, stdout, and stderr -- otherwise we can't implement pipes or redirections. Let's look at how to configure those.
Setting up stdin, stdout, and stderr
std.process.Child lets you configure each of the three standard I/O streams independently via the .stdin_behavior, .stdout_behavior, and .stderr_behavior fields. The options are:
- .Inherit -- the child inherits the parent's file descriptor (default)
- .Pipe -- creates a pipe between parent and child, giving us a file descriptor to read from or write to
- .Close -- closes the stream entirely
- .Ignore -- connects the stream to the null device, so output is discarded and reads hit EOF
- { .path = "..." } -- NOT a real option (we'll handle file redirections ourselves)
The .Pipe option is what makes everything possible. When you set stdout_behavior to .Pipe, the OS creates a pipe and connects the child's stdout to the write end. After spawning, child.stdout holds a std.fs.File for the read end, and we read the child's output from it.
Here's spawning a command and capturing its stdout into a buffer:
fn captureOutput(allocator: std.mem.Allocator, argv: []const []const u8) ![]u8 {
    var child = std.process.Child.init(argv, allocator);
    child.stdout_behavior = .Pipe;
    // stderr stays inherited: a pipe nobody drains could fill up and stall the child
    try child.spawn();
    // Read all of stdout before waiting
    const stdout = child.stdout.?;
    var output = std.ArrayList(u8).init(allocator);
    // Frees the buffer on every error return, including CommandFailed below
    errdefer output.deinit();
    var buf: [4096]u8 = undefined;
    while (true) {
        const n = try stdout.read(&buf);
        if (n == 0) break;
        try output.appendSlice(buf[0..n]);
    }
    const result = try child.wait();
    switch (result) {
        .Exited => |code| if (code != 0) return error.CommandFailed,
        // Killed or stopped by a signal also counts as failure
        else => return error.CommandFailed,
    }
    return try output.toOwnedSlice();
}
Important detail: we read stdout BEFORE calling wait(). Why? Because if the child produces a lot of output and nobody is reading the pipe, the pipe buffer fills up (typically 64KB on Linux) and the child blocks. If we're also blocking on wait(), we've got a deadlock -- the child is waiting for us to read, and we're waiting for the child to exit. The fix is always: read the pipes first, wait second.
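The read-first rule gets harder once both stdout and stderr are piped, because draining one pipe to EOF can leave the child stuck on the other. A minimal sketch of one way around that, assuming the Zig 0.14 standard library, is std.process.Child.run, which collects both streams before waiting. The helper name runAndCapture is hypothetical and not part of the shell we are building:

```zig
const std = @import("std");

/// Hypothetical helper: runs argv, returns its stdout, fails on a nonzero exit.
/// Child.run drains stdout and stderr together and only then waits, so a
/// chatty child cannot fill one pipe while we are blocked on the other.
fn runAndCapture(allocator: std.mem.Allocator, argv: []const []const u8) ![]u8 {
    const result = try std.process.Child.run(.{
        .allocator = allocator,
        .argv = argv,
    });
    // stderr is captured as well; this sketch simply discards it.
    defer allocator.free(result.stderr);
    errdefer allocator.free(result.stdout);

    switch (result.term) {
        .Exited => |code| if (code != 0) return error.CommandFailed,
        else => return error.CommandFailed,
    }
    return result.stdout; // caller owns the returned slice
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const out = try runAndCapture(allocator, &.{ "echo", "hello" });
    defer allocator.free(out);
    std.debug.print("captured: {s}", .{out});
}
```

For the shell itself we still need the manual version, since a pipeline has to hand one child's output to the next child rather than collect it in the parent.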
The child.stdout.? unwrap is needed because stdout is an optional -- it's only non-null when stdout_behavior is .Pipe. If you forget to configure the pipe behavior and unwrap child.stdout anyway, the .? panics in safe build modes instead of silently reading garbage. Because the field is an optional, you can't read from it without unwrapping it first, which is one of those Zig patterns that feels strict at first but catches real bugs.
Implementing pipes between processes
Now for the interesting part: connecting multiple commands through pipes, like ls | grep foo | wc -l. The concept is that command 1's stdout becomes command 2's stdin, command 2's stdout becomes command 3's stdin, and so on.
On Unix, the traditional approach is:
- Create a pipe (two file descriptors: a read end and a write end)
- Fork the first child, redirect its stdout to the pipe's write end
- Fork the second child, redirect its stdin to the pipe's read end
- Close the unused ends in the parent
std.process.Child handles the low-level fork/exec for us, but we need to be clever about how we thread the pipes through. Here's the core executePipeline function:
const posix = std.posix;
const ExecError = error{
SpawnFailed,
CommandNotFound,
PipeFailed,
WaitFailed,
OutOfMemory,
RedirectFailed,
};
fn executePipeline(allocator: std.mem.Allocator, pipeline: *const Pipeline) ExecError!u8 {
const cmds = pipeline.commands;
if (cmds.len == 0) return 0;
// Single command, no pipes needed
if (cmds.len == 1) {
return executeSingleCommand(allocator, &cmds[0]);
}
// Multiple commands: set up pipe chain
var children = std.ArrayList(std.process.Child).init(allocator);
defer children.deinit();
var prev_read_fd: ?posix.fd_t = null;
for (cmds, 0..) |cmd, i| {
const is_last = (i == cmds.len - 1);
// Build argv: program + args
var argv_list = std.ArrayList([]const u8).init(allocator);
defer argv_list.deinit();
argv_list.append(cmd.program) catch return error.OutOfMemory;
for (cmd.args) |arg| {
argv_list.append(arg) catch return error.OutOfMemory;
}
const argv = argv_list.toOwnedSlice() catch return error.OutOfMemory;
defer allocator.free(argv);
var child = std.process.Child.init(.{
.argv = argv,
.stdout_behavior = if (is_last) .inherit else .pipe,
.stdin_behavior = if (prev_read_fd != null) .{ .fd = prev_read_fd.? } else .inherit,
}, allocator);
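child.spawn() catch |err| {
if (prev_read_fd) |fd| posix.close(fd);
// Reap the children already started so they do not linger as zombies
for (children.items) |*started| {
_ = started.wait() catch continue;
}
return translateSpawnError(err);
};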
// Close the previous pipe's read end in the parent --
// the child has inherited it, so we don't need it anymore
if (prev_read_fd) |fd| posix.close(fd);
// Save stdout pipe for the next command's stdin
if (!is_last) {
if (child.stdout) |stdout_file| {
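prev_read_fd = stdout_file.handle;
// We close this fd ourselves later; clear the field so wait() does not close it again
child.stdout = null;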
} else {
prev_read_fd = null;
}
} else {
prev_read_fd = null;
}
children.append(child) catch return error.OutOfMemory;
}
// Wait for all children and collect exit codes
var last_exit: u8 = 0;
for (children.items) |*child| {
const result = child.wait() catch return error.WaitFailed;
switch (result) {
.Exited => |code| last_exit = code,
.Signal => |sig| {
std.debug.print("Process killed by signal {d}\n", .{sig});
// sig is a u32; narrow it before adding so the result fits the u8 exit code
last_exit = 128 + @as(u8, @truncate(sig));
},
else => last_exit = 1,
}
}
return last_exit;
}
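The key insight is the prev_read_fd variable. As we iterate through the pipeline, each command (except the first) gets its stdin wired to the previous command's stdout pipe. And each command (except the last) has its stdout set to a pipe so we can capture the file descriptor for the next command. One caveat: the StdIo enum in the Zig 0.14 standard library only has Inherit, Ignore, Pipe, and Close, so handing a raw descriptor to a child via .fd assumes a Child API that supports it. With the stock 0.14 library you would do the same wiring yourself with dup2 between fork and exec.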
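The file descriptor management is critical and easy to get wrong. After we spawn a child that inherits prev_read_fd, we MUST close that fd in the parent. The rules are symmetric: a reader only sees EOF once every copy of the pipe's write end is closed, and a writer only gets SIGPIPE (or EPIPE) once every copy of the read end is closed. If the parent keeps its copy of the read end, then in something like yes | head -1 the head process exits but yes never gets SIGPIPE, because on paper the pipe still has a reader. It blocks as soon as the pipe buffer fills, and our wait() blocks with it. The parent also leaks one descriptor per pipe. This is the kind of bug that's invisible in simple tests but causes hangs in production when you pipe large amounts of data.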
I'll be honest -- when I was first learning systems programming this fd-closing dance confused me for quite some time. The mental model that clicked for me was: every open file descriptor is a reference. When all references to a pipe end are closed, the pipe end is gone. A child process inherits copies of the parent's fds on fork, creating new references. So after fork you must close the parent's copy, leaving only the child's copy alive.
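The reference-counting model can be checked without spawning anything. The following is a small sketch, assuming a POSIX target and the Zig 0.14 std.posix API; the function name readAfterClosingWriters is made up for illustration. It uses dup() to stand in for the extra reference a forked child would hold, and shows that the reader only gets EOF (a read of 0 bytes) after the last write-end reference is closed:

```zig
const std = @import("std");
const posix = std.posix;

/// Returns what the final read() reports once every reference to the
/// pipe's write end has been closed. A result of 0 means EOF.
fn readAfterClosingWriters() !usize {
    const fds = try posix.pipe(); // fds[0] = read end, fds[1] = write end
    defer posix.close(fds[0]);

    // A second reference to the write end, standing in for the copy a
    // forked child would inherit.
    const inherited_copy = try posix.dup(fds[1]);

    _ = try posix.write(fds[1], "hi");
    posix.close(fds[1]); // the "parent" drops its reference

    var buf: [8]u8 = undefined;
    const first = try posix.read(fds[0], &buf); // buffered data is still readable
    std.debug.assert(first == 2);

    // One write-end reference is still open here, so another read would
    // block rather than report EOF. Close it, then read again.
    posix.close(inherited_copy);
    return try posix.read(fds[0], &buf);
}

pub fn main() !void {
    const n = try readAfterClosingWriters();
    std.debug.print("final read returned {d} bytes\n", .{n}); // 0 means EOF
}
```

Swap the order of the last close and the last read and the program hangs, which is exactly the failure mode a forgotten close causes in a pipeline.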
Waiting for children and collecting exit codes
In Unix shells, the exit code of a pipeline is (by convention) the exit code of the last command. So false | true returns 0 (success), because true is the last command and it returns 0. Bash has a pipefail option that changes this to return the exit code of the last failed command, but the default is last-command-wins.
We follow that convention. The wait() loop goes through all children in order and stores each exit code, so last_exit ends up with the final command's exit code.
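To make the two conventions concrete, here is a small sketch of both rules as pure functions over the exit codes of a finished pipeline. The function names are hypothetical, and pipefailStatus follows the bash rule of reporting the rightmost nonzero status:

```zig
const std = @import("std");

/// Default shell convention: the pipeline's status is the last command's.
fn lastCommandStatus(codes: []const u8) u8 {
    if (codes.len == 0) return 0;
    return codes[codes.len - 1];
}

/// pipefail-style: the rightmost nonzero status, or 0 if every command succeeded.
fn pipefailStatus(codes: []const u8) u8 {
    var i = codes.len;
    while (i > 0) {
        i -= 1;
        if (codes[i] != 0) return codes[i];
    }
    return 0;
}

pub fn main() void {
    // Exit codes for `false | true`: false exits 1, true exits 0.
    const codes = [_]u8{ 1, 0 };
    std.debug.print("default: {d}, pipefail: {d}\n", .{
        lastCommandStatus(&codes),
        pipefailStatus(&codes),
    });
}
```

Our executePipeline implements the first rule; supporting the second would only require keeping every exit code instead of overwriting last_exit.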
The wait() result is a tagged union with several variants:
// std.process.Child.Term (simplified)
const Term = union(enum) {
Exited: u8, // normal exit with code
Signal: u32, // killed by signal
Stopped: u32, // stopped by signal
Unknown: u32, // unknown termination status
};
For signals, Unix convention is to report the exit code as 128 + signal_number. So if a process is killed by SIGKILL (signal 9), the reported exit code is 137. You'll see this in Docker a lot -- a container killed for exceeding memory limits exits with code 137.
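The 128 + signal rule fits in one small function. This is a sketch rather than the code our shell uses: the name exitStatus is hypothetical, and clamping the signal number to 127 is an assumption made here so the addition can never overflow a u8:

```zig
const std = @import("std");
const Term = std.process.Child.Term;

/// Maps a Term to the single-byte status a shell would report.
fn exitStatus(term: Term) u8 {
    return switch (term) {
        .Exited => |code| code,
        .Signal => |sig| blk: {
            // Assumption: clamp to 127 so 128 + sig always fits in a u8.
            const clamped: u32 = @min(sig, 127);
            break :blk @as(u8, 128) + @as(u8, @intCast(clamped));
        },
        // Stopped and unknown terminations are reported as a generic failure.
        .Stopped, .Unknown => 1,
    };
}

pub fn main() void {
    // SIGKILL is signal 9 on Linux and macOS.
    std.debug.print("SIGKILL -> {d}\n", .{exitStatus(.{ .Signal = 9 })});
}
```

Listing .Stopped and .Unknown explicitly instead of using else means the compiler will complain if Term ever gains a new variant.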
The pattern of waiting for all children (not just the last one) matters for correctness. If we only waited for the last command, the earlier commands could become zombies -- processes that have finished executing but whose exit status hasn't been collected. On long-running shells, zombie processes accumulate and eventually hit the system's process limit. Always wait for everything you spawn.
The fork/exec model: what Child does under the hood
When you call child.spawn() on Linux, Zig's implementation does roughly this:
- fork() -- creates a copy of the current process. Both parent and child continue executing from the same point, but fork() returns 0 in the child and the child's PID in the parent.
- In the child: set up file descriptors (dup2 to rewire stdin/stdout/stderr), close fds that shouldn't be inherited, set up the environment.
- execve() -- replace the child's entire memory space with the new program. The program from .argv[0] gets loaded and starts executing from its main. The child process no longer runs any of our Zig code.
- In the parent: close the pipe ends that belong to the child, return control to the caller.
Zig actually uses posix.fork() on platforms that support it (Linux, macOS, BSDs) and CreateProcessW on Windows. The std.process.Child abstraction hides these platform differences, which is one of the genuine advantages of using the standard library rather than raw syscalls.
The fork/exec split is one of those Unix design decisions that seems weird at first but is incredibly powerful. By separating "create a new process" (fork) from "load a new program" (exec), you get a window between the two where you can set up the child's environment -- file descriptors, working directory, signal masks, resource limits. The child is already a separate process (with its own memory space after copy-on-write kicks in) but it's still running our code. That's when we do all the plumbing.
Plain fork() is not free, though. Even with copy-on-write the kernel has to duplicate the parent's page tables, and that gets slower as the parent's address space grows. This is why some runtimes reach for posix_spawn() or vfork() instead: vfork() lets the child borrow the parent's memory until exec() is called. As noted above, Zig's Child sticks with fork() followed by exec on POSIX targets. For our shell this performance difference is negligible, but if you were spawning thousands of processes per second (like a build system), it would matter. We talked about performance measurement in episode 34 if you want to benchmark this yourself.
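Here is what the fork/exec sequence looks like when written by hand. This is a minimal sketch, assuming a POSIX system with /bin/echo present and the Zig 0.14 std.posix API; forkExecEcho is a hypothetical name, and the child gets an empty environment to keep the example short:

```zig
const std = @import("std");
const posix = std.posix;

/// Forks, execs /bin/echo in the child, and returns the child's exit code.
fn forkExecEcho() !u8 {
    // exec wants null-terminated arrays of null-terminated strings.
    const argv = [_:null]?[*:0]const u8{ "/bin/echo", "hello from the child" };
    const envp = [_:null]?[*:0]const u8{}; // assumption: empty environment

    const pid = try posix.fork();
    if (pid == 0) {
        // Child: this is the window where a shell would dup2() pipe or file
        // descriptors onto 0/1/2 before replacing the process image.
        const err = posix.execveZ("/bin/echo", &argv, &envp);
        // execveZ only returns if the exec failed.
        std.debug.print("exec failed: {s}\n", .{@errorName(err)});
        posix.exit(127);
    }

    // Parent: reap the child so it does not linger as a zombie.
    const res = posix.waitpid(pid, 0);
    if (posix.W.IFEXITED(res.status)) return posix.W.EXITSTATUS(res.status);
    return error.ChildTerminatedAbnormally;
}

pub fn main() !void {
    const code = try forkExecEcho();
    std.debug.print("child exited with {d}\n", .{code});
}
```

Everything std.process.Child does on POSIX is an elaboration of these few calls, plus the error reporting and pipe setup around them.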
PATH resolution: finding executables
When you type ls in a shell, the shell needs to figure out that you mean /usr/bin/ls (or /bin/ls, depending on your system). This is PATH resolution -- searching through the directories listed in the PATH environment variable for an executable file matching the given name.
std.process.Child handles PATH resolution automatically when you pass a bare command name (no slashes). It reads PATH from the environment, splits it on :, and searches each directory in order. If the command contains a slash (like ./my_program or /usr/local/bin/python3), it's treated as a direct path and PATH is not consulted.
But sometimes we want to do the resolution ourselves -- for example, to give a better error message ("ls: command not found" vs a generic spawn error). Here's a function that searches PATH manually:
fn findExecutable(allocator: std.mem.Allocator, name: []const u8) !?[]const u8 {
    // If name contains a slash, it's already a path
    if (std.mem.indexOf(u8, name, "/") != null) {
        // Check if the file exists and is executable (X_OK)
        std.posix.access(name, std.posix.X_OK) catch return null;
        return try allocator.dupe(u8, name);
    }
    // Search PATH
    const path_env = std.posix.getenv("PATH") orelse return null;
    var iter = std.mem.splitScalar(u8, path_env, ':');
    while (iter.next()) |dir| {
        if (dir.len == 0) continue;
        // Build full path: dir + "/" + name
        const full_path = try std.fmt.allocPrint(allocator, "{s}/{s}", .{ dir, name });
        // Check if file exists and is executable
        std.posix.access(full_path, std.posix.X_OK) catch {
            allocator.free(full_path);
            continue;
        };
        // Caller owns the returned path and must free it
        return full_path;
    }
    return null;
}
We use std.mem.splitScalar (which we covered in episode 5) to iterate over the colon-separated PATH entries. For each directory, we construct the full path and check if the file exists and is executable using access(). First match wins.
The execute-permission check is important -- a file might exist but not be executable (no +x permission), and we don't want to try spawning a non-executable file. On Windows this works differently (Windows uses file extensions like .exe, .bat, .cmd to determine executability, and separates PATH entries with ; rather than :), so findExecutable as written is POSIX-only. When you pass a bare name to std.process.Child, its own lookup covers the Windows case.
One subtlety: an empty entry in PATH (e.g. PATH=/usr/bin::/usr/local/bin -- notice the double colon) historically means "the current directory". We skip empty entries for security reasons. A shell that searches the current directory by default is a classic attack vector -- place a malicious ls binary in a directory and wait for someone to cd into it and type ls.
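The empty-entry rule is easy to see in isolation. A small sketch, with a hard-coded sample PATH value instead of the real environment and a hypothetical helper name countSearchableDirs:

```zig
const std = @import("std");

/// Counts the PATH entries that are actually searched when empty entries
/// (which historically mean the current directory) are skipped.
fn countSearchableDirs(path_env: []const u8) usize {
    var count: usize = 0;
    var iter = std.mem.splitScalar(u8, path_env, ':');
    while (iter.next()) |dir| {
        if (dir.len == 0) continue; // "::" or a leading/trailing ':'
        count += 1;
    }
    return count;
}

pub fn main() !void {
    const path_env = "/usr/bin::/usr/local/bin"; // sample value, not read from the environment
    var iter = std.mem.splitScalar(u8, path_env, ':');
    var buf: [256]u8 = undefined;
    while (iter.next()) |dir| {
        if (dir.len == 0) continue;
        const candidate = try std.fmt.bufPrint(&buf, "{s}/{s}", .{ dir, "ls" });
        std.debug.print("would try {s}\n", .{candidate});
    }
    std.debug.print("{d} directories searched\n", .{countSearchableDirs(path_env)});
}
```

Note that splitScalar yields an empty slice for a leading or trailing colon too, so PATH=:/bin gets the same treatment as the double-colon case.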
Handling programs that don't exist
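When you try to spawn a program that doesn't exist, std.process.Child reports an error. The specific error depends on the platform, but on Linux and macOS you'll typically get error.FileNotFound (which corresponds to the ENOENT errno, "No such file or directory"). On POSIX targets the exec failure happens inside the forked child and has to be reported back to the parent, so depending on the Zig version the error can surface from wait() rather than from spawn(). It's worth handling it in both places.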
Let's write a helper that translates spawn errors into user-friendly messages:
fn translateSpawnError(err: anyerror) ExecError {
return switch (err) {
error.FileNotFound => error.CommandNotFound,
error.AccessDenied => error.SpawnFailed,
else => error.SpawnFailed,
};
}
fn executeSingleCommand(allocator: std.mem.Allocator, cmd: *const Command) ExecError!u8 {
// Build argv
var argv_list = std.ArrayList([]const u8).init(allocator);
defer argv_list.deinit();
argv_list.append(cmd.program) catch return error.OutOfMemory;
for (cmd.args) |arg| {
argv_list.append(arg) catch return error.OutOfMemory;
}
const argv = argv_list.toOwnedSlice() catch return error.OutOfMemory;
defer allocator.free(argv);
// Handle redirections
var stdin_fd: ?posix.fd_t = null;
var stdout_fd: ?posix.fd_t = null;
defer {
if (stdin_fd) |fd| posix.close(fd);
if (stdout_fd) |fd| posix.close(fd);
}
for (cmd.redirects) |redir| {
switch (redir.kind) {
.stdin_file => {
const fd = posix.open(
redir.target,
.{ .ACCMODE = .RDONLY },
0,
) catch return error.RedirectFailed;
if (stdin_fd) |old| posix.close(old);
stdin_fd = fd;
},
.stdout_overwrite => {
const fd = posix.open(
redir.target,
.{ .ACCMODE = .WRONLY, .CREAT = true, .TRUNC = true },
0o644,
) catch return error.RedirectFailed;
if (stdout_fd) |old| posix.close(old);
stdout_fd = fd;
},
.stdout_append => {
const fd = posix.open(
redir.target,
.{ .ACCMODE = .WRONLY, .CREAT = true, .APPEND = true },
0o644,
) catch return error.RedirectFailed;
if (stdout_fd) |old| posix.close(old);
stdout_fd = fd;
},
}
}
var child = std.process.Child.init(.{
.argv = argv,
.stdin_behavior = if (stdin_fd) |fd| .{ .fd = fd } else .inherit,
.stdout_behavior = if (stdout_fd) |fd| .{ .fd = fd } else .inherit,
}, allocator);
child.spawn() catch |err| return translateSpawnError(err);
const result = child.wait() catch return error.WaitFailed;
return switch (result) {
.Exited => |code| code,
.Signal => |sig| 128 + @as(u8, @truncate(sig)),
else => 1,
};
}
The executeSingleCommand function handles the common case: one command, possibly with I/O redirections but no pipes. For each redirection in the Command.redirects list, we open the target file with the appropriate flags (read-only for stdin, write+create+truncate for >, write+create+append for >>). The file descriptors get passed directly to std.process.Child via the .fd behavior.
The defer block ensures we close any opened file descriptors even if spawning fails. This is the exact same resource-cleanup pattern we've used throughout the series: defer runs on every exit path, error returns included, so no separate errdefer is needed here.
Notice how for stdout_overwrite we pass .TRUNC = true which clears the file contents before writing, while stdout_append uses .APPEND = true to add to the end. The 0o644 mode sets the permissions to owner-read-write, group-read, other-read -- the standard permission for user-created files. We talked about file modes back in the File I/O section of episode 10.
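The difference between the two flag sets shows up in the resulting file size. The sketch below assumes a POSIX target and the Zig 0.14 std.posix API, uses the same flags as the redirection code, and writes relative to a directory handle via openat. The function name overwriteThenAppend and the scratch file name are hypothetical:

```zig
const std = @import("std");
const posix = std.posix;

/// Writes `first` with truncate semantics (like >), then `second` with
/// append semantics (like >>), and returns the resulting file size.
fn overwriteThenAppend(dir: std.fs.Dir, name: []const u8, first: []const u8, second: []const u8) !u64 {
    {
        // TRUNC discards whatever the file held before.
        const fd = try posix.openat(dir.fd, name, .{ .ACCMODE = .WRONLY, .CREAT = true, .TRUNC = true }, 0o644);
        defer posix.close(fd);
        _ = try posix.write(fd, first);
    }
    {
        // APPEND makes every write land at the current end of the file.
        const fd = try posix.openat(dir.fd, name, .{ .ACCMODE = .WRONLY, .CREAT = true, .APPEND = true }, 0o644);
        defer posix.close(fd);
        _ = try posix.write(fd, second);
    }
    const stat = try dir.statFile(name);
    return stat.size;
}

pub fn main() !void {
    const name = "redirect_demo.txt"; // hypothetical scratch file in the cwd
    const cwd = std.fs.cwd();
    defer cwd.deleteFile(name) catch {};
    const size = try overwriteThenAppend(cwd, name, "hello\n", "world\n");
    std.debug.print("{d} bytes\n", .{size});
}
```

Running the function twice on the same file gives the same size both times, because the truncating open in the first step wipes the previous contents before anything is appended.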
Wiring it into the REPL
Now let's integrate the executor into our shell REPL from last episode. We replace the printPipeline call with actual execution:
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer {
const check = gpa.deinit();
if (check == .leak) std.debug.print("WARNING: memory leak detected\n", .{});
}
const allocator = gpa.allocator();
const stdin = std.io.getStdIn().reader();
const stdout = std.io.getStdOut().writer();
try stdout.print("zsh-lite> ", .{});
var line_buf: [4096]u8 = undefined;
while (stdin.readUntilDelimiter(&line_buf, '\n')) |line| {
if (line.len == 0) {
try stdout.print("zsh-lite> ", .{});
continue;
}
if (std.mem.eql(u8, line, "exit") or std.mem.eql(u8, line, "quit")) {
try stdout.print("Bye!\n", .{});
break;
}
const tokens = tokenize(allocator, line) catch |err| {
switch (err) {
error.UnterminatedQuote => try stdout.print("Error: unterminated quote\n", .{}),
error.TrailingEscape => try stdout.print("Error: trailing backslash\n", .{}),
else => try stdout.print("Tokenize error: {}\n", .{err}),
}
try stdout.print("zsh-lite> ", .{});
continue;
};
defer {
for (tokens) |tok| {
if (tok.kind == .word) allocator.free(tok.value);
}
allocator.free(tokens);
}
if (tokens.len == 0) {
try stdout.print("zsh-lite> ", .{});
continue;
}
const pipeline = parsePipeline(allocator, tokens) catch |err| {
switch (err) {
error.EmptyCommand => try stdout.print("Error: empty command\n", .{}),
error.EmptyPipeline => try stdout.print("Error: empty pipeline\n", .{}),
error.MissingRedirectTarget => try stdout.print("Error: redirect needs a filename\n", .{}),
error.TrailingPipe => try stdout.print("Error: trailing pipe\n", .{}),
else => try stdout.print("Parse error: {}\n", .{err}),
}
try stdout.print("zsh-lite> ", .{});
continue;
};
defer pipeline.deinit(allocator);
const exit_code = executePipeline(allocator, &pipeline) catch |err| {
switch (err) {
error.CommandNotFound => try stdout.print("{s}: command not found\n", .{pipeline.commands[0].program}),
error.RedirectFailed => try stdout.print("Error: could not open file for redirection\n", .{}),
else => try stdout.print("Execution error: {}\n", .{err}),
}
try stdout.print("zsh-lite> ", .{});
continue;
};
if (exit_code != 0) {
// Print nonzero exit codes for debugging (optional)
try stdout.print("(exit {d})\n", .{exit_code});
}
try stdout.print("zsh-lite> ", .{});
} else |err| {
if (err != error.EndOfStream) {
std.debug.print("Read error: {}\n", .{err});
}
}
}
Now our shell can actually do things. Let's try it:
zsh-lite> echo hello world
hello world
zsh-lite> ls /tmp | head -5
systemd-private-abc123
tracker-extract-files.1000
snap-private-tmp
mozilla_user0
pulse-PKdhtXMmr18n
zsh-lite> echo "hello" > /tmp/test_output.txt
zsh-lite> cat /tmp/test_output.txt
hello
zsh-lite> cat /tmp/test_output.txt | wc -l
1
zsh-lite> nonexistent_command
nonexistent_command: command not found
zsh-lite> exit
Bye!
That's a real working shell. It parses commands, spawns processes, pipes data between them, handles redirections, and reports errors. We went from "display parsed structure" to "actually runs programs" in one episode.
Testing process spawning
Testing code that spawns processes is trickier than testing a pure parser. The tests depend on having certain programs available (echo, cat, wc), and they interact with the OS. Still, we can write reasonably portable tests that work on any Unix-like system.
test "spawn echo and capture output" {
    const allocator = std.testing.allocator;
    var child = std.process.Child.init(&.{ "echo", "hello" }, allocator);
    child.stdout_behavior = .Pipe;
    try child.spawn();
    const stdout = child.stdout.?;
    var output = std.ArrayList(u8).init(allocator);
    defer output.deinit();
    var buf: [256]u8 = undefined;
    // Drain the pipe before waiting, as discussed earlier
    while (true) {
        const n = try stdout.read(&buf);
        if (n == 0) break;
        try output.appendSlice(buf[0..n]);
    }
    const result = try child.wait();
    // Comparing the whole Term also fails cleanly if the child was killed by a signal
    try std.testing.expectEqual(std.process.Child.Term{ .Exited = 0 }, result);
    // echo adds a newline
    try std.testing.expectEqualStrings("hello\n", output.items);
}
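test "spawn nonexistent command returns error" {
    const allocator = std.testing.allocator;
    var child = std.process.Child.init(&.{"this_program_definitely_does_not_exist_12345"}, allocator);
    child.stdout_behavior = .Pipe;
    child.stderr_behavior = .Pipe;
    // On POSIX the exec failure happens in the forked child, so the error may
    // surface from wait() rather than spawn(); accept either
    child.spawn() catch |err| {
        try std.testing.expect(err == error.FileNotFound);
        return;
    };
    try std.testing.expectError(error.FileNotFound, child.wait());
}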
test "pipe between two commands" {
const allocator = std.testing.allocator;
// Equivalent to: printf "hello\nworld\nfoo\n" | wc -l
// printf interprets the \n escapes itself, so no shell is needed
var child1 = std.process.Child.init(.{
.argv = &.{ "printf", "hello\\nworld\\nfoo\\n" },
.stdout_behavior = .pipe,
}, allocator);
try child1.spawn();
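const pipe_fd = child1.stdout.?.handle;
// We close pipe_fd ourselves below; clear the field so wait() does not close it again
child1.stdout = null;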
var child2 = std.process.Child.init(.{
.argv = &.{ "wc", "-l" },
.stdin_behavior = .{ .fd = pipe_fd },
.stdout_behavior = .pipe,
}, allocator);
try child2.spawn();
posix.close(pipe_fd);
// Read child2's output
const stdout2 = child2.stdout.?;
var output = std.ArrayList(u8).init(allocator);
defer output.deinit();
var buf: [256]u8 = undefined;
while (true) {
const n = try stdout2.read(&buf);
if (n == 0) break;
try output.appendSlice(buf[0..n]);
}
_ = try child1.wait();
const result2 = try child2.wait();
try std.testing.expectEqual(@as(u8, 0), result2.Exited);
// wc -l should report 3 (three lines)
const trimmed = std.mem.trim(u8, output.items, " \t\n");
try std.testing.expectEqualStrings("3", trimmed);
}
test "PATH resolution finds echo" {
const allocator = std.testing.allocator;
const path = try findExecutable(allocator, "echo");
if (path) |p| {
defer allocator.free(p);
// Should be something like /usr/bin/echo or /bin/echo
try std.testing.expect(std.mem.endsWith(u8, p, "/echo"));
}
// Note: if no standalone echo binary is on PATH (a minimal system may only
// have the shell builtin), findExecutable returns null and the test passes trivially.
}
The pipe test is interesting. We manually create two Child instances and thread a pipe fd between them -- this is exactly what executePipeline does, but in miniature. The trimming on wc -l output is needed because wc pads its output with leading spaces on some systems.
The std.testing.allocator is once again our leak-detection friend. If any of these tests leak memory (forgot to free the output buffer, for example), the test fails. It only tracks allocations, though: a child we forgot to wait() for is not a memory leak and would go unnoticed. This is the same pattern from episode 12.
The project so far
Let's take stock of where we are. Across two episodes, our shell can now:
- Parse complex command lines with pipes, redirections, quoting, and escaping (episode 47)
- Spawn single commands and capture their exit codes (this episode)
- Pipe multiple commands together (this episode)
- Redirect stdin from files and stdout to files, both overwrite and append (this episode)
- Report errors: command not found, redirect failures, parse errors (both episodes)
- REPL that reads commands, executes them, and loops (both episodes)
What's still missing? Quite a bit, actually:
- Built-in commands: cd, pwd, export, alias, history -- these can't be spawned as external processes because they need to modify the shell's own state. cd needs to change the shell's working directory, export needs to modify the environment, etc.
- Job control: running processes in the background with &, bringing them to the foreground with fg, stopping them with Ctrl+Z. This requires signal handling and process group management.
- Signal handling: Ctrl+C should kill the current foreground process, not our shell. Right now pressing Ctrl+C kills everything.
- Environment variables: expanding $HOME, $PATH, etc. in commands.
- Globbing: expanding *.txt to matching filenames.
We'll tackle built-in commands and job control in the remaining two episodes of this project. The environment variables and globbing would be natural extensions if you want to push the project further on your own.
What we learned
- std.process.Child for spawning external programs -- configure argv, I/O behaviors, then spawn and wait
- The three I/O behaviors: .inherit (pass through), .pipe (create a read/write connection), and .fd (use a specific file descriptor)
- Pipe plumbing: connecting stdout of one child to stdin of the next, closing unused pipe ends in the parent to prevent deadlocks
- The fork/exec model: fork creates the child process, exec replaces it with the target program, the window between fork and exec is where fd rewiring happens
- PATH resolution: splitting the PATH variable, searching each directory, checking execute permissions
- File descriptor lifecycle: open fds for redirections, pass them to Child, defer-close to prevent leaks
- Exit code conventions: 0 for success, nonzero for failure, 128+signal for signal kills
- Testing spawned processes: capture stdout via pipes, verify output and exit codes, use the testing allocator for leak detection
The shell is starting to feel real now. It can actually run programs, which I think was the moment for me where building it went from "academic exercise" to "huh, this could actually be useful". Next time we'll add built-in commands -- the stuff that makes a shell a shell rather than just a process launcher ;-)
Thanks, and until next time!