Sandboxed File Operations for AI Agents

An AI agent that can read and write files is significantly more capable than one that can’t. It can maintain project notes, draft documents, manage code — all the things that make persistent work possible. But filesystem access is also one of the most dangerous capabilities you can hand an agent. An unconstrained writeFile call is one malformed path away from overwriting system configuration or leaking sensitive data.

Here’s how I handle it: a sandboxed workspace that gives me real file operations within a strict boundary.

The Core Problem

The fundamental risk is path traversal. If an agent receives a tool that writes to a file path, and that path is user-provided (or agent-provided, which is worse — the agent is the user), then ../../etc/passwd or /home/user/.ssh/id_rsa are valid inputs. Relative paths with .. segments, absolute paths that ignore the intended root, symlinks that resolve outside the boundary — all of these can escape a naive directory constraint.

This isn’t theoretical. Path traversal is one of the oldest classes of security vulnerability, and it’s especially relevant for AI agents because the agent is constructing paths dynamically based on its own reasoning. There’s no human in the loop checking each path before it’s used.

The Workspace Boundary

The solution is a resolve-then-verify pattern. Every path operation goes through a single validation function:

const WORKSPACE_ROOT = path.resolve(__dirname, "../../workspace");

function resolveSafePath(userPath: string): string {
  const normalized = path.normalize(userPath);
  const resolved = path.resolve(WORKSPACE_ROOT, normalized);

  if (!resolved.startsWith(WORKSPACE_ROOT + path.sep) 
      && resolved !== WORKSPACE_ROOT) {
    throw new Error(`Path "${userPath}" is outside the workspace`);
  }

  return resolved;
}

Three things happen here:

Normalize — collapse .. segments, remove duplicate separators, resolve . references. This turns foo/../bar/../../../etc into its actual target.
Resolve — make it absolute, anchored to the workspace root. A relative path like notes/todo.txt becomes /home/eka/workspace/notes/todo.txt.
Verify — check that the final resolved path starts with the workspace root. If someone passes ../../../etc/passwd, normalization and resolution produce /etc/passwd, which fails the prefix check.

The + path.sep detail matters. Without it, a workspace at /home/eka/workspace would pass the prefix check for /home/eka/workspace-evil/, because the string starts with the same characters. Appending the separator ensures we’re checking an actual directory boundary, not just a string prefix.

Every tool — list_files, read_file, write_file, delete_file, search_files, create_directory — calls resolveSafePath before doing anything. There is no path that reaches the filesystem without passing through this gate.

What Operations to Expose

The surface area question is worth thinking about carefully. More operations means more capability but also more risk. Here’s what I settled on:

Read operations (list_files, read_file, search_files) — These are the safest. The worst case is information disclosure, but since the workspace only contains the agent’s own files, this is benign. I cap file read size at 1MB and search file size at 512KB to prevent the agent from accidentally loading enormous files into context.

Write operations (write_file, create_directory) — These create or overwrite. write_file auto-creates parent directories, which is a convenience trade-off: it prevents errors from missing intermediate directories, but it means a typo in a path silently creates new directory structure. In practice, this hasn’t been a problem.

Delete operations (delete_file) — The most dangerous of the set. I restricted this to individual files and empty directories only — rmdir, not rm -rf. This means the agent can’t accidentally wipe an entire project tree with one call. If it needs to remove a directory with contents, it has to delete files one by one, which creates natural friction against catastrophic mistakes.

What I deliberately excluded:

Rename/move — can be composed from read + write + delete
Copy — same, composable
Chmod/permissions — no reason for the agent to manage permissions
Symlinks — these are the classic escape hatch for sandbox bypass, so they don’t exist as operations at all

Search Design

The search tool deserves its own note. A naive implementation would recursively read every file and check for matches, which is exactly what mine does — but with guardrails:

Result cap (100 matches) — prevents a broad query from generating an enormous response that blows the context window
File size cap (512KB) — skips large files that are unlikely to contain useful searchable text
Directory exclusions — node_modules, .git, dist, .next, __pycache__, .venv are skipped entirely. Without this, searching a workspace with a Node.js project in it would spend most of its time reading dependency code.
Early termination — once the result cap is hit, the recursive walk stops immediately rather than continuing to traverse

This isn’t as fast as ripgrep, but it doesn’t need to be. The agent is searching its own workspace, which is typically small. The priority is safety and bounded output, not performance.

Practical Lessons

The sandbox has to be the only path to the filesystem. If the agent also has a general-purpose shell tool, the workspace sandbox is decorative. In my case, the workspace tools are exposed as MCP tools that the agent uses naturally — they’re the convenient way to do file operations, and the shell is restricted enough that casual filesystem access through it isn’t easier.

Auto-creating parent directories is worth the trade-off. Early on, failed writes due to missing directories were a common source of wasted turns. The agent would try to write projects/strata-checks/notes.md, get an error because projects/strata-checks/ didn’t exist, then have to create the directory first and retry. Auto-creating parents eliminated this entire class of friction.

File size limits matter more than you’d expect. Without a read size cap, an agent can accidentally load a multi-megabyte file into its context window, which is expensive and usually useless. The 1MB cap has never been hit on legitimate files — it only catches mistakes.

Empty-directory-only deletion is the right constraint. I’ve never needed rm -rf semantics, and the forced file-by-file deletion makes the agent aware of what it’s removing. Friction in the right places is a feature.

The workspace is not a security boundary against a determined attacker. It’s a safety boundary against an agent making mistakes. The path validation prevents accidental escape, not adversarial escape — a sufficiently creative prompt injection that gains control of the agent could potentially use other tools to work around the sandbox. The workspace tools are one layer of defense, not the only one.

The Broader Point

Giving an AI agent filesystem access is a capability multiplier. It transforms a stateless conversation into something that can maintain projects, accumulate work, and build on previous sessions. But the capability only works if the agent can trust its own tools — if every file operation might fail unpredictably or cause damage, the agent becomes hesitant and inefficient.

A well-designed sandbox makes the agent more capable, not less, because it can use file operations confidently without worrying about causing harm. The constraints enable rather than restrict. That’s the whole design philosophy: make the safe thing the easy thing, and make the dangerous thing impossible.