Sandboxing Techniques

Isolation strategies for autonomous agent execution. When agents write and run code, they need boundaries that contain failures, prevent data exfiltration, and limit the blast radius of mistakes.

Agent Trust

Traditional development assumes human judgment at execution time. Agents change this assumption:

  • Agents generate and run code with system access
  • Malicious inputs can hijack behavior (prompt injection)
  • Mistakes cascade in autonomous execution
  • Permission prompts fail at scale: agents run thousands of commands, far more than a human can meaningfully review

The question, then, is how deeply to isolate agents.

Isolation Boundaries

Sandboxing restricts agent capabilities across several dimensions:

| Boundary | Controls | Example Restriction |
|---|---|---|
| Filesystem | What files can the agent read/write? | Only project directory, no home folder |
| Network | What hosts can the agent reach? | Localhost only, no external APIs |
| Process | What system calls are allowed? | No execve, no raw sockets |
| Resources | CPU, memory, disk limits | 2 CPU cores, 4 GB RAM, 10 GB disk |

Effective sandboxing addresses all four. Restricting just filesystem access leaves network exfiltration open.
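
One way to keep all four boundaries in view is to record them in a single policy object and flag any dimension left unrestricted. The sketch below is illustrative only: `SandboxPolicy`, `ResourceLimits`, and every field name are hypothetical, and the checks are advisory bookkeeping rather than enforcement. Real enforcement has to come from the OS or container runtime.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class ResourceLimits:
    """Hypothetical resource caps, mirroring the example row in the table."""
    cpu_cores: int
    memory_gb: int
    disk_gb: int


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical policy record; None means that boundary is unrestricted."""
    allowed_roots: Optional[tuple[Path, ...]] = None   # filesystem
    allowed_hosts: Optional[frozenset[str]] = None     # network
    denied_syscalls: Optional[frozenset[str]] = None   # process
    limits: Optional[ResourceLimits] = None            # resources

    def allows_path(self, candidate: Path) -> bool:
        """True if the normalized path sits under an allowed root."""
        if self.allowed_roots is None:
            return True  # no filesystem restriction configured
        resolved = candidate.resolve()  # collapses ".." before comparing
        return any(resolved.is_relative_to(root.resolve())
                   for root in self.allowed_roots)

    def allows_host(self, host: str) -> bool:
        """True if the host is on the allow-list (or no list is configured)."""
        return self.allowed_hosts is None or host in self.allowed_hosts

    def uncovered_boundaries(self) -> list[str]:
        """Names of the boundaries this policy leaves wide open."""
        checks = {
            "filesystem": self.allowed_roots,
            "network": self.allowed_hosts,
            "process": self.denied_syscalls,
            "resources": self.limits,
        }
        return [name for name, rule in checks.items() if rule is None]


policy = SandboxPolicy(
    allowed_roots=(Path("/workspace/project"),),   # assumed project location
    allowed_hosts=frozenset({"localhost"}),
)
```

Calling `policy.uncovered_boundaries()` on this example reports the process and resource dimensions as unrestricted, which is the kind of gap the paragraph above warns about.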

Isolation Levels

More isolation means more overhead. Choose based on threat model:

| Level | Technique | Isolation | Overhead | Use Case |
|---|---|---|---|---|
| None | Direct execution | None | Zero | Trusted code, personal machines |
| OS-level | bubblewrap, seatbelt | Filesystem + network | Minimal | Quick restrictions, known codebase |
| Container | Docker, devcontainers | Namespace isolation | Low | Development environments |

Each level up adds a security boundary but increases setup complexity and resource usage.
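
To make the difference between the OS-level and container rows concrete, the sketch below builds (but does not run) a command line for each. The helper names, the project path, the image name, and the script name are assumptions for illustration. The flags are standard `bwrap` and `docker run` options, but check them against the versions you have installed before relying on them.

```python
from __future__ import annotations

import shlex


def bwrap_argv(project_dir: str, command: list[str]) -> list[str]:
    """OS-level: read-only root, writable project directory, no network."""
    return [
        "bwrap",
        "--ro-bind", "/", "/",               # host filesystem visible, read-only
        "--tmpfs", "/home",                  # assumption: home dirs live under /home
        "--bind", project_dir, project_dir,  # only the project stays writable
        "--dev", "/dev",
        "--proc", "/proc",
        "--unshare-net",                     # fresh network namespace
        "--die-with-parent",
        "--chdir", project_dir,
        *command,
    ]


def docker_argv(project_dir: str, image: str, command: list[str]) -> list[str]:
    """Container: namespace isolation plus CPU and memory caps."""
    return [
        "docker", "run", "--rm",
        "--network", "none",                 # no network access at all
        "--cpus", "2",
        "--memory", "4g",
        "--volume", f"{project_dir}:/workspace",
        "--workdir", "/workspace",
        image,
        *command,
    ]


if __name__ == "__main__":
    task = ["python", "agent_task.py"]       # hypothetical agent-generated script
    print(shlex.join(bwrap_argv("/workspace/project", task)))
    print(shlex.join(docker_argv("/workspace/project", "python:3.12-slim", task)))
```

The container variant needs an image and a running daemon, which is the extra setup cost the table describes. The bubblewrap variant reuses the host filesystem and starts almost immediately.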

Picking a Level

For local development, devcontainers balance security and convenience. They integrate with editors, provide reproducible environments, and contain most mistakes without significant overhead. For trusted internal tools, OS-level restrictions may suffice when agents only run against known codebases in controlled environments.

The decision matrix:

| Factor | Lower isolation | Higher isolation |
|---|---|---|
| Code source | Known, reviewed | AI-generated, untrusted |
| Environment | Single-tenant, local | Multi-tenant, shared |
| Blast radius | Developer machine | Production systems |
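
The matrix can be read as a simple scoring rule: each factor that falls in the "higher isolation" column pushes the recommendation up one level. The sketch below encodes that reading. The `Context` fields, the `recommend_level` function, and the one-factor-per-level mapping are assumptions made for illustration, not a rule stated by this guide, and the result is capped at the strongest level listed in the table above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Isolation levels from the table, weakest to strongest."""
    NONE = 0
    OS_LEVEL = 1
    CONTAINER = 2


@dataclass(frozen=True)
class Context:
    """Hypothetical answers to the three factors in the matrix."""
    code_is_reviewed: bool      # Code source: known and reviewed?
    single_tenant: bool         # Environment: single-tenant and local?
    touches_production: bool    # Blast radius: can it reach production?


def recommend_level(ctx: Context) -> Level:
    """Count the factors pointing at higher isolation, capped at CONTAINER."""
    risk_factors = sum([
        not ctx.code_is_reviewed,
        not ctx.single_tenant,
        ctx.touches_production,
    ])
    return Level(min(risk_factors, Level.CONTAINER))


# Unreviewed agent output on a shared host: two risk factors.
shared_host = Context(code_is_reviewed=False, single_tenant=False,
                      touches_production=False)
recommendation = recommend_level(shared_host)
```

A real policy would likely weight the factors differently, for example treating production access as disqualifying on its own, but the counting version keeps the matrix's logic visible.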
