Why Agents Need Branching

Think MCTS for code. Explore, evaluate, backtrack, expand.

🌳

Tree Search Execution

Agent explores a solution tree. Each node is a code state. Need to branch, evaluate multiple children, backtrack to promising nodes.

Branch N times from any checkpoint. Run in parallel. Score results. Expand best paths.

🔄

Instant Backtrack

Path failed. Tests red. Environment broken. Sequential agents restart from scratch.

Jump to any previous node. One API call. Continue from known-good state.

⚖

Parallel Evaluation

Need to try 10 different approaches from the same starting point. Sequential = 10x slower.

Fork once, run 10 branches simultaneously. Compare. Merge the winner.

Why Not Just Docker?

Containers can't branch state. ConTree can.

Capability	Docker	Kubernetes	ConTree
Isolation level	Namespace (kernel shared)	Namespace (kernel shared)	VM (hardware boundary)
State branching	Manual commit required	Not supported	Automatic per execution
Instant rollback	Recreate container	Redeploy pod	Switch image reference
Filesystem inspection	Requires running container	Requires running pod	API without execution
Execution history	External logging	External logging	Built-in with resources
Scalability	Manual orchestration	Minutes to scale pods	Thousands concurrent instantly

Use Cases

What people build with ConTree.

🤖

AI Agents

Coding agents, deep research agents, data analysis agents. Any agent that needs to execute code, run tools, or explore solutions.

Branch per attempt. Parallel exploration. Rollback broken states. Full Linux environment.

🐞

Security Sandboxing

Run untrusted code safely. Analyze third-party scripts. Test without risk to your infrastructure.

Hardware isolation. No escape. Inspect post-execution state.

🎓

Educational Platforms

Students run untrusted code. Need isolation, instant reset, and execution history.

Safe execution. One-click reset. Review all attempts.

How It Works

Four endpoints. Import, upload, execute, retrieve.

Import Image `POST /images/import`

Pull OCI image rootfs from any registry. Returns a UUID for the base snapshot.

Upload Files `POST /files`

Upload input files (scripts, data). Content-addressed by SHA256 for deduplication.

Spawn Instance `POST /instances`

Execute command with image, files, env vars, stdin. Returns operation ID immediately.

Get Results `GET /operations/{id}`

Poll for stdout/stderr, exit code, resource usage. Result includes new image UUID for branching.
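The spawn-then-poll half of this flow can be sketched as follows. Only the endpoint paths and the operation states come from this page; the JSON field names (`image`, `command`, `operation_id`, `state`) and the `send` transport callable are assumptions for illustration, so check them against the actual API reference.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: (method, path, json_body) -> decoded JSON response.
# Plug in whatever HTTP client and authentication your deployment uses.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]

# Operation states that will not change any further.
TERMINAL_STATES = {"done", "failed", "cancelled"}


def wait_for(send: Send, operation_id: str, poll_seconds: float = 0.5) -> Dict[str, Any]:
    """Poll GET /operations/{id} until the operation reaches a terminal state."""
    while True:
        op = send("GET", f"/operations/{operation_id}", None)
        if op["state"] in TERMINAL_STATES:  # "state" is an assumed field name
            return op
        time.sleep(poll_seconds)


def run_command(send: Send, image_uuid: str, command: str) -> Dict[str, Any]:
    """Spawn an instance from an image and block until its result is available."""
    # Request and response field names are assumptions, not the documented schema.
    started = send("POST", "/instances", {"image": image_uuid, "command": command})
    return wait_for(send, started["operation_id"])
```

Keeping the transport injectable means the same two functions work against a real HTTP client or a stub in tests.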

⚠ OCI Import: What Is & Isn't Imported

Imported

✓ Files and directories
✓ Permissions and ownership
✓ Symlinks
✓ ENV variables
✓ WORKDIR

Not Imported (config)

✕ ENTRYPOINT / CMD
✕ EXPOSE / LABEL

Safe to Run LLM-Generated Code

Your agent writes code you haven't reviewed. That's fine.

🛡 What Can't Happen

Agent code can't escape to host
Can't affect other executions
Can't mine crypto or run botnets
Can't exhaust your resources
Nothing persists unless you save it

🛠 How It Works

MicroVM per execution (not containers)
Dedicated kernel, hardware boundary
No inbound network access
Timeout kills runaway code
Output capped to prevent log bombs

💡 Best Practices

Set reasonable timeouts
Pass secrets via env, not files
Review agent output before trusting
Use disposable mode for untrusted tasks
Tag known-good checkpoints

Agent Patterns

How tree search agents use ConTree.

🌳 Beam Search Coding

Generate 5 candidate solutions from checkpoint. Run tests on each branch in parallel. Keep top 2. Expand further. Repeat until solved.

✓ Explore breadth without losing depth
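A minimal sketch of that loop, independent of any particular client. The `propose` and `execute` hooks are hypothetical: `propose` would ask the model for candidate changes, and `execute` would run one candidate as a branch from the given image and return the resulting image plus a score such as the fraction of tests passing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# Hypothetical hooks supplied by the agent (not part of any ConTree client):
#   propose(image, k) -> k candidate changes to try from `image`
#   execute(image, c) -> (new_image, score) after running candidate `c`
Propose = Callable[[str, int], List[str]]
Execute = Callable[[str, str], Tuple[str, float]]


def beam_search(root: str, propose: Propose, execute: Execute,
                width: int = 5, keep: int = 2, max_depth: int = 4,
                solved_at: float = 1.0) -> Tuple[str, float]:
    """Expand `width` children per kept node, keep the `keep` best, repeat."""
    beam: List[Tuple[str, float]] = [(root, 0.0)]
    best = beam[0]
    with ThreadPoolExecutor(max_workers=width * keep) as pool:
        for _ in range(max_depth):
            # Every kept image is branched `width` times.
            jobs = [(img, cand) for img, _ in beam for cand in propose(img, width)]
            if not jobs:
                break
            # All branches of this level run concurrently.
            results = list(pool.map(lambda job: execute(*job), jobs))
            results.sort(key=lambda r: r[1], reverse=True)
            beam = results[:keep]
            if beam[0][1] > best[1]:
                best = beam[0]
            if best[1] >= solved_at:
                break
    return best
```

With `width=5` and `keep=2`, each level after the first runs ten branches at once, which matches the "generate 5, keep top 2" description above.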

📈 Best-of-N Sampling

Same prompt, N completions. Fork from identical state. Run all in parallel. Score outputs. Return the best.

✓ Maximize quality, minimize latency

🔄 Speculative Execution

Uncertain which approach works? Run both from same checkpoint. First success wins. Discard the other.

✓ Hedge bets, save time
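"First success wins" can be sketched with standard-library futures. The `execute` hook is hypothetical and stands in for "run this approach as a branch from the checkpoint and return its result"; the `exit_code` key is an assumed field name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, List, Optional


def first_success(checkpoint: str, approaches: List[str],
                  execute: Callable[[str, str], Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Run all approaches from one checkpoint; return the first with exit code 0."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(approaches)))
    futures = [pool.submit(execute, checkpoint, a) for a in approaches]
    try:
        for fut in as_completed(futures):
            result = fut.result()
            if result.get("exit_code") == 0:  # assumed field name
                return result
        return None  # every approach failed
    finally:
        # Do not wait for the losers; drop any that have not started yet.
        pool.shutdown(wait=False, cancel_futures=True)
```

Threads that are already running are not interrupted by `shutdown`, so a real agent would also cancel the losing operations on the server side.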

Concepts

Think Git, but for execution environments.

Image: An immutable filesystem snapshot. Base images are imported from OCI registries; new images are created after each execution.
Instance: A single command execution request. Runs in an isolated microVM, produces stdout/stderr and optionally a new image.
Operation: An async task (instance execution or image import). Has states: pending, running, done, failed, cancelled.
Disposable: Execution mode where no new image is created. Use for stateless commands where you only need output.
Tag: A human-readable alias for an image UUID. Like Git tags, but for filesystem snapshots.
Inspect: Browse an image's filesystem via API without executing anything. List directories, download files.
Branch: Start a new execution from any existing image. The result is a new image, creating a tree of states.
Rollback: Switch to a previous image. Instantly restore any prior state by referencing its UUID or tag.
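How these concepts fit together can be shown with a toy in-memory model. This is an illustration of the vocabulary only, not the service's implementation or client API; the class and method names are made up for the example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Image:
    """Immutable snapshot; parent is None for an imported base image."""
    id: str
    parent: Optional[str]


@dataclass
class ImageTree:
    """Toy model of images, tags, branching, and rollback."""
    images: Dict[str, Image] = field(default_factory=dict)
    tags: Dict[str, str] = field(default_factory=dict)

    def _add(self, parent: Optional[str]) -> str:
        image = Image(id=uuid.uuid4().hex, parent=parent)
        self.images[image.id] = image
        return image.id

    def import_base(self) -> str:
        return self._add(None)

    def resolve(self, ref: str) -> str:
        """Accept a tag or a UUID; rollback is just resolving an older one."""
        image_id = self.tags.get(ref, ref)
        if image_id not in self.images:
            raise KeyError(f"unknown image or tag: {ref}")
        return image_id

    def tag(self, name: str, ref: str) -> None:
        self.tags[name] = self.resolve(ref)

    def execute(self, ref: str, disposable: bool = False) -> Optional[str]:
        """Branch from any image; disposable runs create no new image."""
        parent = self.resolve(ref)
        return None if disposable else self._add(parent)
```

Two calls to `execute("known-good")` produce two sibling images with the same parent, which is all a branch is; nothing is ever overwritten, so rolling back needs no undo step.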

FAQ

Why not Docker with seccomp/AppArmor?

Container security relies on kernel namespaces and syscall filtering, all enforced by a kernel shared with the host. A kernel vulnerability can bypass all of it. ConTree runs each execution in a microVM with its own kernel, so a kernel exploit inside one execution stays behind that VM's hardware boundary.

Can I run servers or open ports?

No. ConTree is designed for batch execution, not hosting services. Executions have no inbound network access. Outbound may be restricted or allowed depending on deployment configuration.

What exactly is versioned/snapshotted?

The filesystem. After each execution, modified files are captured into a new image. Process memory, running services, and network state are not preserved. Think of it as committing your working directory after each command.

How do I retrieve files/artifacts from execution?

Use the Inspect API. After execution completes, the result contains a new image UUID. Call `GET /inspect/{uuid}/list?path=/output` to list files, then `GET /inspect/{uuid}/download?path=/output/result.json` to retrieve them.
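A small helper for building those two URLs might look like this. The paths follow the answer above; `BASE_URL` is a placeholder host, and how the request is authenticated is not covered here.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://contree.example"  # placeholder host, replace with your deployment


def inspect_url(image_uuid: str, action: str, path: str) -> str:
    """Build a GET /inspect/{uuid}/list or /inspect/{uuid}/download URL."""
    if action not in ("list", "download"):
        raise ValueError(f"unsupported action: {action}")
    # Keep "/" readable in the query string; everything else is percent-encoded.
    query = urlencode({"path": path}, safe="/")
    return f"{BASE_URL}/inspect/{quote(image_uuid, safe='')}/{action}?{query}"
```

For example, `inspect_url(new_image, "list", "/output")` followed by `inspect_url(new_image, "download", "/output/result.json")` mirrors the two calls described above.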

What's the maximum execution time?

Configurable per request via the `timeout` field. If not specified, a system default applies. Executions that exceed the timeout are terminated and marked with `timed_out: true` in the result.
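An agent usually wants to tell a timeout apart from an ordinary failure before deciding whether to retry. A minimal sketch, assuming the result is a decoded JSON object; `timed_out` is the flag named above, while `exit_code` is an assumed field name.

```python
from typing import Any, Dict


def outcome(result: Dict[str, Any]) -> str:
    """Classify a finished execution as 'timeout', 'ok', or 'error'."""
    if result.get("timed_out") is True:
        return "timeout"
    # "exit_code" is an assumed field name for the process exit status.
    return "ok" if result.get("exit_code") == 0 else "error"
```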

How is stdout/stderr handled?

Captured and returned in the operation result. Output is base64-encoded if it contains binary data; otherwise it is returned as ASCII text. A configurable truncation limit (default 1 MB, max 10 MB) prevents memory issues with verbose programs.
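On the client side, both cases can be normalized to bytes. This sketch assumes the result tells you which encoding was used; the `encoding` values `"base64"` and `"ascii"` are hypothetical labels, not a documented schema.

```python
import base64


def decode_stream(value: str, encoding: str) -> bytes:
    """Return the raw bytes of a captured stdout/stderr stream."""
    if encoding == "base64":  # hypothetical label for binary output
        return base64.b64decode(value)
    if encoding == "ascii":   # hypothetical label for plain-text output
        return value.encode("ascii")
    raise ValueError(f"unknown encoding: {encoding}")
```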

Can I cancel a running execution?

Yes. Call `DELETE /operations/{id}` to cancel a pending or running operation. The operation state will change to `cancelled`.
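With the Python standard library, the cancel call could be prepared like this. `BASE_URL` is a placeholder host, and any authentication headers your deployment requires are omitted.

```python
import urllib.request

BASE_URL = "https://contree.example"  # placeholder host, replace with your deployment


def cancel_request(operation_id: str) -> urllib.request.Request:
    """Prepare DELETE /operations/{id}; send it with urllib.request.urlopen."""
    return urllib.request.Request(
        f"{BASE_URL}/operations/{operation_id}", method="DELETE"
    )
```

This pairs naturally with speculative execution: once one branch succeeds, cancel the operations that are still pending or running.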

What resource metrics are available?

Each completed operation includes: user/system CPU time, max RSS (memory), block I/O, page faults, context switches, wall-clock elapsed time, and exit code/signal.

ConTree - Sandboxes That Branch Like Git