ConTree - Sandboxes That Branch Like Git

Safe code execution. Branch from any checkpoint, explore paths in parallel, pick the winner. Built for agents that think ahead.

Branch execution state
Instant rollback
Safe code execution
Branch from any state
Parallel path exploration
Backtrack on failure
Safe LLM code execution
Pay per execution, not idle

Why Agents Need Branching

Think MCTS for code. Explore, evaluate, backtrack, expand.

🌳

Tree Search Execution

Agent explores a solution tree. Each node is a code state. Need to branch, evaluate multiple children, backtrack to promising nodes.

Branch N times from any checkpoint. Run in parallel. Score results. Expand best paths.
🔄

Instant Backtrack

Path failed. Tests red. Environment broken. Sequential agents restart from scratch.

Jump to any previous node. One API call. Continue from known-good state.

Parallel Evaluation

Need to try 10 different approaches from the same starting point. Sequential = 10x slower.

Fork once, run 10 branches simultaneously. Compare. Merge the winner.

Why Not Just Docker?

Containers can't branch state. ConTree can.

Capability            | Docker                     | Kubernetes                | ConTree
Isolation level       | Namespace (kernel shared)  | Namespace (kernel shared) | VM (hardware boundary)
State branching       | Manual commit required     | Not supported             | Automatic per execution
Instant rollback      | Recreate container         | Redeploy pod              | Switch image reference
Filesystem inspection | Requires running container | Requires running pod      | Via API, without execution
Execution history     | External logging           | External logging          | Built-in, with resource usage
Scalability           | Manual orchestration       | Minutes to scale pods     | Thousands concurrent instantly

Use Cases

What people build with ConTree.

🤖

AI Agents

Coding agents, deep research agents, data analysis agents. Any agent that needs to execute code, run tools, or explore solutions.

Branch per attempt. Parallel exploration. Rollback broken states. Full Linux environment.
🐞

Security Sandboxing

Run untrusted code safely. Analyze third-party scripts. Test without risk to your infrastructure.

Hardware isolation. No escape. Inspect post-execution state.
🎓

Educational Platforms

Students run untrusted code. Need isolation, instant reset, and execution history.

Safe execution. One-click reset. Review all attempts.

How It Works

Four endpoints. Import, upload, execute, retrieve.

1. Import Image: POST /images/import

Pull OCI image rootfs from any registry. Returns a UUID for the base snapshot.

2. Upload Files: POST /files

Upload input files (scripts, data). Content-addressed by SHA256 for deduplication.

3. Spawn Instance: POST /instances

Execute command with image, files, env vars, stdin. Returns operation ID immediately.

4. Get Results: GET /operations/{id}

Poll for stdout/stderr, exit code, resource usage. Result includes new image UUID for branching.
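
The import, execute, and retrieve steps can be chained in a small client. What follows is a minimal sketch, not the official SDK: only the endpoint paths, the `timeout` field, and the operation states come from this page. The base URL, the JSON keys (`reference`, `image`, `command`, `id`, `state`, `result`), and the assumption that an import is polled like any other operation are all hypothetical. The transport is passed in so the flow can be exercised without a network.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "https://contree.example"  # hypothetical; substitute your deployment

# A transport takes (method, path, json_body) and returns the decoded JSON reply.
Transport = Callable[[str, str, Optional[dict]], dict]

TERMINAL_STATES = {"done", "failed", "cancelled"}  # states listed under Concepts


def http_transport(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Real transport using urllib. Authentication is omitted (not described here)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for(send: Transport, op_id: str, interval: float = 0.5, max_polls: int = 120) -> dict:
    """Step 4: poll GET /operations/{id} until the operation leaves pending/running."""
    for _ in range(max_polls):
        op = send("GET", f"/operations/{op_id}", None)
        if op["state"] in TERMINAL_STATES:  # the "state" key is an assumption
            return op
        time.sleep(interval)
    raise TimeoutError(f"operation {op_id} still running after {max_polls} polls")


def import_image(send: Transport, reference: str) -> str:
    """Step 1: POST /images/import, then wait for the base image UUID."""
    op = send("POST", "/images/import", {"reference": reference})  # assumed body
    return wait_for(send, op["id"])["result"]["image"]


def execute(send: Transport, image: str, command: str, timeout: int = 60) -> dict:
    """Step 3: POST /instances, then poll until the result is available."""
    body = {"image": image, "command": command, "timeout": timeout}  # assumed keys
    op = send("POST", "/instances", body)
    return wait_for(send, op["id"])
```

With a real deployment this would be called as `execute(http_transport, image_uuid, "pytest -q")`. Passing the image UUID from one result into the next `execute` call is what creates a branch. Step 2 (file upload) is left out because the upload encoding is not specified here.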

⚠ OCI Import: What Is & Isn't Imported

Imported
  • ✓ Files and directories
  • ✓ Permissions and ownership
  • ✓ Symlinks
  • ✓ ENV variables
  • ✓ WORKDIR
Not Imported (config)
  • ✕ ENTRYPOINT / CMD
  • ✕ EXPOSE / LABEL

Safe to Run LLM-Generated Code

Your agent writes code you haven't reviewed. That's fine.

🛡 What Can't Happen

  • Agent code can't escape to host
  • Can't affect other executions
  • Can't mine crypto or run botnets
  • Can't exhaust your resources
  • Nothing persists unless you save it

🛠 How It Works

  • MicroVM per execution (not containers)
  • Dedicated kernel, hardware boundary
  • No inbound network access
  • Timeout kills runaway code
  • Output capped to prevent log bombs

💡 Best Practices

  • Set reasonable timeouts
  • Pass secrets via env, not files
  • Review agent output before trusting
  • Use disposable mode for untrusted tasks
  • Tag known-good checkpoints

Agent Patterns

How tree search agents use ConTree.

🌳 Beam Search Coding

Generate 5 candidate solutions from checkpoint. Run tests on each branch in parallel. Keep top 2. Expand further. Repeat until solved.

✓ Explore breadth without losing depth
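
The loop described above can be sketched independently of any API. In this illustration `propose` and `run` are hypothetical callables you would supply: `propose` asks the model for candidates, and `run` executes one candidate on a checkpoint and returns a score, the resulting image, and whether the tests passed. Nothing here is a ConTree function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Tuple, TypeVar

Image = TypeVar("Image")
Candidate = TypeVar("Candidate")


def beam_search(
    root: Image,
    propose: Callable[[Image, int], List[Candidate]],
    run: Callable[[Image, Candidate], Tuple[float, Image, bool]],
    width: int = 5,
    keep: int = 2,
    max_depth: int = 10,
) -> Optional[Image]:
    """Return the first image whose run reports solved=True, or None."""
    beam = [root]
    for _ in range(max_depth):
        # Branch: every checkpoint in the beam gets `width` candidates.
        jobs = [(image, cand) for image in beam for cand in propose(image, width)]
        if not jobs:
            return None
        # Evaluate all branches in parallel.
        with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
            results = list(pool.map(lambda job: run(*job), jobs))
        for _score, image, solved in results:
            if solved:
                return image
        # Keep the best `keep` branches and expand them in the next round.
        results.sort(key=lambda r: r[0], reverse=True)
        beam = [image for _score, image, _solved in results[:keep]]
    return None
```

Threads are enough here because each `run` would spend its time waiting on a remote execution rather than computing locally.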

📈 Best-of-N Sampling

Same prompt, N completions. Fork from identical state. Run all in parallel. Score outputs. Return the best.

✓ Maximize quality, minimize latency
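
As a rough sketch, best-of-N reduces to a parallel map followed by a max. The `run` and `score` callables are placeholders for your own execution and scoring code, and are not part of any ConTree client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence, TypeVar

Result = TypeVar("Result")


def best_of_n(
    checkpoint: str,
    completions: Sequence[str],
    run: Callable[[str, str], Result],
    score: Callable[[Result], float],
) -> Result:
    """Run every completion from the same checkpoint and return the top-scoring result."""
    if not completions:
        raise ValueError("need at least one completion")
    with ThreadPoolExecutor(max_workers=len(completions)) as pool:
        # Every branch starts from the identical checkpoint, so results are comparable.
        results = list(pool.map(lambda c: run(checkpoint, c), completions))
    return max(results, key=score)
```

Total latency is roughly that of the slowest branch rather than the sum of all branches.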

🔄 Speculative Execution

Uncertain which approach works? Run both from same checkpoint. First success wins. Discard the other.

✓ Hedge bets, save time
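
One way to express "first success wins" is with `concurrent.futures.as_completed`. This is a sketch under assumptions: `run` and `succeeded` are hypothetical callables, and a branch that raises is treated as a failed branch. Threads that are already running cannot be interrupted from Python, so a real client would also cancel the losing operation on the server side.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional, Sequence, Tuple, TypeVar

Result = TypeVar("Result")


def first_success(
    checkpoint: str,
    approaches: Sequence[str],
    run: Callable[[str, str], Result],
    succeeded: Callable[[Result], bool],
) -> Optional[Tuple[str, Result]]:
    """Start every approach from the same checkpoint; return the first that succeeds."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(approaches)))
    try:
        futures = {pool.submit(run, checkpoint, a): a for a in approaches}
        for future in as_completed(futures):
            try:
                result = future.result()
            except Exception:
                continue  # a crashed branch counts as a failure
            if succeeded(result):
                return futures[future], result
        return None
    finally:
        # Do not wait for the losers; drop any that have not started yet (Python 3.9+).
        pool.shutdown(wait=False, cancel_futures=True)
```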

Concepts

Think Git, but for execution environments.

Image
An immutable filesystem snapshot. Base images are imported from OCI registries; new images are created after each execution.
Instance
A single command execution request. Runs in an isolated microVM, produces stdout/stderr and optionally a new image.
Operation
An async task (instance execution or image import). Has states: pending, running, done, failed, cancelled.
Disposable
Execution mode where no new image is created. Use for stateless commands where you only need output.
Tag
A human-readable alias for an image UUID. Like Git tags, but for filesystem snapshots.
Inspect
Browse an image's filesystem via API without executing anything. List directories, download files.
Branch
Start a new execution from any existing image. The result is a new image, creating a tree of states.
Rollback
Switch to a previous image. Instantly restore any prior state by referencing its UUID or tag.
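
To make the Git analogy concrete, here is a small local model of how images, tags, branches, and rollbacks relate. It is bookkeeping an agent might keep on its own side, written as an illustration. The class and method names are invented for this sketch and are not ConTree API objects.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Image:
    """An immutable snapshot: its UUID, its parent image, and the command that produced it."""
    uuid: str
    parent: Optional[str] = None
    command: Optional[str] = None


class ImageTree:
    """Local record of which image came from which."""

    def __init__(self, base_uuid: str) -> None:
        self.images: Dict[str, Image] = {base_uuid: Image(base_uuid)}
        self.tags: Dict[str, str] = {}
        self.head = base_uuid

    def resolve(self, ref: str) -> str:
        """Accept either a tag or a UUID and return the UUID."""
        uuid = self.tags.get(ref, ref)
        if uuid not in self.images:
            raise KeyError(f"unknown image or tag: {ref}")
        return uuid

    def branch(self, parent_ref: str, command: str, new_uuid: str) -> Image:
        """Record that running `command` on `parent_ref` produced `new_uuid`."""
        image = Image(new_uuid, self.resolve(parent_ref), command)
        self.images[new_uuid] = image
        self.head = new_uuid
        return image

    def tag(self, name: str, ref: str) -> None:
        """Give an image a human-readable alias."""
        self.tags[name] = self.resolve(ref)

    def rollback(self, ref: str) -> str:
        """Point head at an earlier image; nothing is copied or deleted."""
        self.head = self.resolve(ref)
        return self.head

    def lineage(self, ref: str) -> List[str]:
        """Commands that lead from the base image to `ref`, oldest first."""
        commands: List[str] = []
        node = self.images[self.resolve(ref)]
        while node.parent is not None:
            commands.append(node.command or "")
            node = self.images[node.parent]
        return commands[::-1]
```

Because images are immutable, a rollback is only a change of reference, and two branches from the same tag never interfere with each other.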

FAQ

Why not Docker with seccomp/AppArmor?

Container security relies on kernel namespaces and syscall filtering. A kernel vulnerability can bypass all of it. ConTree uses microVMs with separate kernels per execution, providing hardware-level isolation that survives kernel exploits.

Can I run servers or open ports?

No. ConTree is designed for batch execution, not hosting services. Executions have no inbound network access. Outbound access may be restricted or allowed depending on deployment configuration.

What exactly is versioned/snapshotted?

The filesystem. After each execution, modified files are captured into a new image. Process memory, running services, and network state are not preserved. Think of it as committing your working directory after each command.

How do I retrieve files/artifacts from execution?

Use the Inspect API. After execution completes, the result contains a new image UUID. Call GET /inspect/{uuid}/list?path=/output to list files, then GET /inspect/{uuid}/download?path=/output/result.json to retrieve them.
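
The two Inspect URLs can be built with the standard library. The paths below are the ones quoted in the answer above; the base URL is a placeholder, and how the response is shaped is not covered here.

```python
from urllib.parse import quote, urlencode


def inspect_url(base_url: str, image_uuid: str, action: str, path: str) -> str:
    """Build /inspect/{uuid}/list or /inspect/{uuid}/download with an encoded path."""
    if action not in ("list", "download"):
        raise ValueError("action must be 'list' or 'download'")
    # Keep "/" readable in the query string; other special characters are escaped.
    query = urlencode({"path": path}, safe="/")
    return f"{base_url}/inspect/{quote(image_uuid, safe='')}/{action}?{query}"
```

For example, `inspect_url(base, uuid, "list", "/output")` followed by `inspect_url(base, uuid, "download", "/output/result.json")` mirrors the list-then-download flow.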

What's the maximum execution time?

Configurable per request via the timeout field. If not specified, a system default applies. Executions that exceed the timeout are terminated and marked with timed_out: true in the result.
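
A caller will usually want to tell a timeout apart from an ordinary failure. The sketch below uses the `timeout` and `timed_out` names mentioned above; the other keys (`image`, `command`, `exit_code`) and the assumption that the timeout is given in seconds are guesses for illustration.

```python
from typing import Optional


def instance_request(image: str, command: str, timeout: Optional[int] = None) -> dict:
    """Build a request body; omit `timeout` to fall back to the system default."""
    body = {"image": image, "command": command}  # assumed key names
    if timeout is not None:
        body["timeout"] = timeout  # unit assumed to be seconds
    return body


def classify(result: dict) -> str:
    """Map an execution result to 'timed_out', 'ok', or 'failed'."""
    if result.get("timed_out"):
        return "timed_out"
    return "ok" if result.get("exit_code") == 0 else "failed"
```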

How is stdout/stderr handled?

Captured and returned in the operation result. Output is base64-encoded if it contains binary data; otherwise it is returned as ASCII. A configurable truncation limit (default 1 MB, max 10 MB) prevents memory issues with verbose programs.
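
Decoding the captured output might look like the following. The shape of the stream object, a dict with `data` and `encoding` keys, is an assumption made for this sketch; check the actual result schema before relying on it.

```python
import base64


def decode_stream(stream: dict) -> bytes:
    """Return the raw bytes of a captured stdout/stderr stream."""
    data = stream["data"]
    if stream.get("encoding") == "base64":  # hypothetical marker for binary output
        return base64.b64decode(data)
    return data.encode("ascii")
```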

Can I cancel a running execution?

Yes. Call DELETE /operations/{id} to cancel a pending or running operation. The operation state will change to cancelled.
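
A cancellation request can be prepared as below. Only the `DELETE /operations/{id}` path and the pending/running condition come from this page; the base URL and the `id` and `state` keys on the operation dict are illustrative assumptions. The function builds the request without sending it.

```python
import urllib.request
from typing import Optional

CANCELLABLE_STATES = {"pending", "running"}


def cancel_request(base_url: str, operation: dict) -> Optional[urllib.request.Request]:
    """Return a DELETE request for a cancellable operation, or None if it already finished."""
    if operation["state"] not in CANCELLABLE_STATES:
        return None
    return urllib.request.Request(f"{base_url}/operations/{operation['id']}", method="DELETE")
```

Sending it is then a matter of passing the returned object to `urllib.request.urlopen`, plus whatever authentication your deployment requires.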

What resource metrics are available?

Each completed operation includes: user/system CPU time, max RSS (memory), block I/O, page faults, context switches, wall-clock elapsed time, and exit code/signal.
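
These metrics are convenient to load into a typed record for scoring branches. The key names and units below (`resources`, `user_cpu_s`, `system_cpu_s`, `wall_clock_s`, `max_rss_kb`) are invented for this sketch, since the exact result schema is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceUsage:
    user_cpu_s: float
    system_cpu_s: float
    wall_clock_s: float
    max_rss_kb: int

    @classmethod
    def from_result(cls, result: dict) -> "ResourceUsage":
        usage = result["resources"]  # hypothetical key
        return cls(
            user_cpu_s=float(usage["user_cpu_s"]),
            system_cpu_s=float(usage["system_cpu_s"]),
            wall_clock_s=float(usage["wall_clock_s"]),
            max_rss_kb=int(usage["max_rss_kb"]),
        )

    @property
    def cpu_utilization(self) -> float:
        """CPU seconds per wall-clock second; above 1.0 means more than one core was busy."""
        if self.wall_clock_s <= 0:
            return 0.0
        return (self.user_cpu_s + self.system_cpu_s) / self.wall_clock_s
```

A low utilization on a slow run suggests the command was waiting on I/O rather than computing, which can be a useful signal when comparing branches.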