OpenSandbox Architecture
OpenSandbox is a general-purpose sandbox platform for AI applications. It provides client SDKs and tools, protocol definitions, a lifecycle control plane, Docker and Kubernetes runtime backends, and in-sandbox execution components for commands, files, code interpreters, browser automation, desktop environments, and training workloads.
This document describes the current repository architecture and the main boundaries between public contracts, server implementation, runtime providers, and sandbox data-plane components.
Architecture Overview
OpenSandbox is organized around six practical surfaces:
- Client surface - SDKs, the osb CLI, and the MCP server used by applications, agents, and operators.
- Protocol surface - OpenAPI contracts under specs/ for lifecycle, diagnostics, in-sandbox execution, and egress policy.
- Lifecycle control plane - the FastAPI server under server/ that authenticates requests, validates config, persists server-managed records, and delegates lifecycle work to a configured runtime service.
- Runtime backends - Docker for local and single-host deployments, and Kubernetes through workload providers such as BatchSandbox and kubernetes-sigs/agent-sandbox.
- Sandbox data plane - the user workload container plus the injected execd daemon, optional Jupyter/code-interpreter runtime, volumes, and optional egress sidecar.
- Network and security plane - endpoint resolution, server proxying, Kubernetes ingress gateway routing, secure endpoint access, egress policy enforcement, resource limits, and secure container runtimes.
The split is intentional: SDKs and tools should depend on the public contracts, the server should own lifecycle orchestration, runtime providers should own platform-specific resource creation, and execd/egress should own operations that happen from inside the sandbox network and filesystem namespace.
1. Client Surface
The client surface is the developer-facing entry point for OpenSandbox.
1.1 Sandbox SDKs
The sandbox SDKs wrap lifecycle operations and in-sandbox operations behind language-native APIs:
- Python: sdks/sandbox/python
- JavaScript/TypeScript: sdks/sandbox/javascript
- Java/Kotlin: sdks/sandbox/kotlin
- C#/.NET: sdks/sandbox/csharp
- Go: sdks/sandbox/go
Common capabilities include:
- Create, list, inspect, pause, resume, renew, and delete sandboxes.
- Resolve service endpoints for sandbox ports.
- Execute commands with streamed output and background status/log polling.
- Manage files and directories.
- Read resource metrics from execd.
- Inspect or patch runtime egress policy when an egress sidecar is attached.
Generated OpenAPI clients live beside handwritten adapters. Generated code handles ordinary request/response APIs; handwritten layers cover SDK ergonomics, streaming, transport lifecycle, error mapping, and high-level models.
1.2 Code Interpreter SDKs
The code-interpreter SDKs build on the sandbox SDKs and execd code execution APIs. They manage code execution contexts and expose language-oriented code execution helpers.
The official code-interpreter image is under sandboxes/code-interpreter/. It provides Python, Java, Node.js, and Go runtimes, and Jupyter kernels for Python, Java, TypeScript/JavaScript, Go, and Bash. Exact language versions are image-controlled and selected through environment variables such as PYTHON_VERSION, JAVA_VERSION, NODE_VERSION, and GO_VERSION.
1.3 CLI and MCP
The osb CLI under cli/ is a terminal interface for day-to-day sandbox operations:
- osb sandbox: lifecycle and endpoint management
- osb command: command execution, background logs, and shell sessions
- osb file: file and directory operations
- osb egress: runtime egress policy inspection and mutation
- osb devops: low-level diagnostics
- osb skills: OpenSandbox-specific agent skill installation
The MCP server under sdks/mcp/sandbox/python exposes focused sandbox lifecycle, command, and text-file tools to MCP-capable clients such as Claude Code and Cursor.
2. Protocol Surface
OpenSandbox treats specs/ as the public contract source of truth.
2.1 Lifecycle API
specs/sandbox-lifecycle.yml defines the lifecycle API, served by the server with the base path /v1.
Main resource groups:
- Sandboxes: create from an image or snapshot, list, get, delete, pause, resume, renew expiration, and resolve port endpoints.
- Snapshots: create a persistent snapshot from a sandbox, list snapshots, get snapshot state, and delete snapshots.
Important request features:
- image or snapshotId startup source.
- entrypoint, environment variables, metadata, and opaque extensions.
- resourceLimits for CPU, memory, GPU, and future resource keys.
- platform constraints.
- volumes for host paths, platform-managed named volumes/PVCs, and OSSFS.
- networkPolicy for egress sidecar configuration.
- secureAccess for Kubernetes ingress gateway deployments that require endpoint credentials.
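As a rough illustration of how these fields combine, the sketch below assembles a create request body as a plain dictionary. Only the top-level field names come from the list above. The nested shapes, the example values, and the rule that exactly one of image or snapshotId is supplied are assumptions made for illustration; specs/sandbox-lifecycle.yml remains the authority on the real schema.

```python
from typing import Any, Dict, List, Optional


def build_create_request(
    image: Optional[Any] = None,
    snapshot_id: Optional[str] = None,
    entrypoint: Optional[List[str]] = None,
    resource_limits: Optional[Dict[str, str]] = None,
    metadata: Optional[Dict[str, str]] = None,
    extensions: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble an illustrative create-sandbox body (nested shapes are assumed)."""
    # Assumption: a sandbox starts from exactly one source, image or snapshot.
    if (image is None) == (snapshot_id is None):
        raise ValueError("provide exactly one of image or snapshot_id")

    body: Dict[str, Any] = {}
    if image is not None:
        body["image"] = image  # passed through opaquely; real shape is spec-defined
    else:
        body["snapshotId"] = snapshot_id

    # Optional fields are only emitted when the caller supplied them.
    optional_fields = {
        "entrypoint": entrypoint,
        "resourceLimits": resource_limits,
        "metadata": metadata,
        "extensions": extensions,
    }
    for name, value in optional_fields.items():
        if value:
            body[name] = value
    return body


# Hypothetical values, for illustration only.
request_body = build_create_request(
    image="example.invalid/python:3.11",
    entrypoint=["sleep", "infinity"],
    resource_limits={"cpu": "1", "memory": "512Mi"},
)
```

Keeping the builder free of transport code makes the payload easy to inspect or log before it is sent to POST /v1/sandboxes.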
2.2 Diagnostics API
specs/diagnostic-api.yml defines best-effort diagnostic descriptors for sandbox logs and events. The server also exposes practical DevOps diagnostics routes that return plain text for operators and AI troubleshooting workflows.
2.3 Execd API
specs/execd-api.yaml defines the in-sandbox execution API exposed by components/execd/.
Main capabilities:
- Health check: GET /ping
- Code contexts and execution: /code/contexts, /code/context, /code
- Bash sessions: /session
- Commands: /command, command status, and background command logs
- Files and directories: /files/*, /directories
- Metrics: /metrics, /metrics/watch
Command and code execution use Server-Sent Events for streaming output. The current execd implementation also includes interactive PTY WebSocket endpoints under /pty for long-lived shell sessions.
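To make the streaming model concrete, here is a minimal sketch of a client-side Server-Sent Events reader. It assumes standard SSE framing (data: lines terminated by a blank line). The JSON payload shape in the sample is hypothetical; the real event schema is defined in specs/execd-api.yaml.

```python
import json
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the joined data payload of each Server-Sent Event in a line stream."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space from the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)


# Hypothetical payloads, for illustration only.
sample_stream = [
    'data: {"type": "stdout", "text": "hello"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"type": "stdout", "text": "world"}\n',
    "\n",
]
events = [json.loads(payload) for payload in iter_sse_data(sample_stream)]
```

Because the parser accepts any iterable of lines, it can be fed from an HTTP response body or from a recorded fixture in tests.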
2.4 Egress API
specs/egress-api.yaml defines the runtime policy API exposed directly by the egress sidecar:
- GET /policy
- PATCH /policy
The API is reached by resolving the sandbox endpoint for the egress sidecar port. When sidecar authentication is enabled, callers must include the endpoint headers returned by the lifecycle endpoint resolution API.
3. Lifecycle Control Plane
The lifecycle server under server/ is a FastAPI application. It owns request validation, API-key authentication, server configuration, lifecycle orchestration, endpoint formatting, diagnostics, and server-managed persistence.
3.1 Server Structure
Key packages:
- opensandbox_server/main.py: app startup, middleware, router registration, runtime validation, and renew-intent startup.
- opensandbox_server/api/: lifecycle routes, proxy routes, pool routes, diagnostics routes, and request/response schemas.
- opensandbox_server/services/: lifecycle service interfaces and Docker/Kubernetes implementations.
- opensandbox_server/services/k8s/: Kubernetes workload providers, endpoint resolution, volume/egress helpers, informer support, and provider-specific mapping.
- opensandbox_server/repositories/: persistence adapters, currently used for server-managed snapshot records.
- opensandbox_server/integrations/renew_intent/: optional auto-renew-on-access integration.
- opensandbox_server/middleware/: API-key authentication and request ID middleware.
3.2 Runtime Service Selection
The server selects exactly one lifecycle implementation from [runtime].type:
- docker -> DockerSandboxService
- kubernetes -> KubernetesSandboxService
Both implementations satisfy the same SandboxService interface, so API routes stay thin and delegate behavior to services. Runtime-specific details stay behind the service boundary.
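The selection pattern can be sketched as follows. The class names match the ones above, but the single create_sandbox method, its signature, and the registry dictionary are hypothetical simplifications rather than the server's actual code.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class SandboxService(ABC):
    """Shared lifecycle interface; the method shown here is illustrative."""

    @abstractmethod
    def create_sandbox(self, request: dict) -> str:
        """Create a sandbox and return an identifier."""


class DockerSandboxService(SandboxService):
    def create_sandbox(self, request: dict) -> str:
        # A real implementation would talk to the Docker daemon.
        return "docker-sandbox"


class KubernetesSandboxService(SandboxService):
    def create_sandbox(self, request: dict) -> str:
        # A real implementation would delegate to a workload provider.
        return "kubernetes-sandbox"


# Hypothetical registry keyed by the [runtime].type value.
_SERVICES: Dict[str, Type[SandboxService]] = {
    "docker": DockerSandboxService,
    "kubernetes": KubernetesSandboxService,
}


def select_service(runtime_type: str) -> SandboxService:
    """Return exactly one service implementation for the configured runtime."""
    try:
        return _SERVICES[runtime_type]()
    except KeyError:
        raise ValueError(f"unsupported runtime type: {runtime_type!r}") from None
```

Routes written against SandboxService never branch on the runtime type, which is what keeps them thin.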
3.3 Server Persistence
The [store] configuration selects the server-managed metadata store. The default is SQLite at ~/.opensandbox/opensandbox.db. Snapshot metadata is the first persisted server resource; future persistent records should reuse the same repository boundary.
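A repository boundary of this kind might look like the sketch below, which uses the standard library sqlite3 module. The table name, columns, and method names are hypothetical; the real adapters live under opensandbox_server/repositories/.

```python
import sqlite3
from typing import Dict, Optional


class SnapshotRepository:
    """Illustrative persistence adapter for snapshot records."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS snapshots ("
            "id TEXT PRIMARY KEY, sandbox_id TEXT NOT NULL, state TEXT NOT NULL)"
        )

    def save(self, snapshot_id: str, sandbox_id: str, state: str) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO snapshots (id, sandbox_id, state) "
                "VALUES (?, ?, ?)",
                (snapshot_id, sandbox_id, state),
            )

    def get(self, snapshot_id: str) -> Optional[Dict[str, str]]:
        row = self._conn.execute(
            "SELECT id, sandbox_id, state FROM snapshots WHERE id = ?",
            (snapshot_id,),
        ).fetchone()
        if row is None:
            return None
        return {"id": row[0], "sandbox_id": row[1], "state": row[2]}

    def delete(self, snapshot_id: str) -> bool:
        with self._conn:
            cursor = self._conn.execute(
                "DELETE FROM snapshots WHERE id = ?", (snapshot_id,)
            )
        return cursor.rowcount > 0
```

Callers depend only on save, get, and delete, so a different store could be substituted behind the same boundary without touching lifecycle code.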
3.4 Server Proxy and Endpoint Resolution
The lifecycle endpoint API returns the reachable address for a service port inside a sandbox. Depending on runtime and configuration, the endpoint may be:
- A Docker host/bridge mapped endpoint.
- A Kubernetes ingress gateway endpoint.
- A server-proxied URL under /sandboxes/{sandboxId}/proxy/{port} when use_server_proxy=true.
The server proxy supports HTTP and WebSocket traffic and is also integrated with optional renew-on-access behavior.
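For the server-proxied case, the path can be built mechanically from the sandbox ID and port. The helper below is a hypothetical sketch; it leaves any base path or host prefix to the caller, since the document only gives the /sandboxes/{sandboxId}/proxy/{port} portion.

```python
from urllib.parse import quote


def server_proxy_path(sandbox_id: str, port: int, rest: str = "") -> str:
    """Build the server-proxy path for a sandbox port, plus an optional sub-path."""
    if not sandbox_id:
        raise ValueError("sandbox_id must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Percent-encode the ID so it always occupies exactly one path segment.
    path = f"/sandboxes/{quote(sandbox_id, safe='')}/proxy/{port}"
    rest = rest.lstrip("/")
    return f"{path}/{rest}" if rest else path
```

For example, server_proxy_path("sb-123", 8080, "api/health") yields /sandboxes/sb-123/proxy/8080/api/health.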
4. Runtime Backends
4.1 Docker Runtime
The Docker runtime is the local and single-host backend. It talks directly to the Docker daemon and manages containers, timers, labels, volumes, ports, optional sidecars, and snapshots.
Core responsibilities:
- Pull public or private images, including per-request registry authentication.
- Create containers with CPU, memory, GPU, platform, capability, AppArmor, seccomp, PID, and secure-runtime settings.
- Stage the execd binary from [runtime].execd_image into the sandbox and install a bootstrap launcher before starting the user entrypoint.
- Support network modes host, bridge, and custom user-defined networks.
- Allocate host ports for execd and user service endpoints in non-host network modes.
- Restore expiration timers for existing managed containers after server restart.
- Support host bind mounts, Docker named volumes via the pvc volume model, and OSSFS-backed mounts.
- Attach an egress sidecar when networkPolicy is requested and Docker networking is compatible.
- Create Docker-backed persistent snapshots as local images and restore sandboxes from those snapshot images.
Docker pause/resume uses container-level pause/resume. Docker snapshots are exposed through the public snapshot API.
4.2 Kubernetes Runtime
The Kubernetes runtime delegates actual workload creation to a workload provider selected by kubernetes.workload_provider.
Supported providers:
- batchsandbox - the default provider backed by OpenSandbox's Kubernetes controller and BatchSandbox CRD.
- agent-sandbox - a provider for kubernetes-sigs/agent-sandbox.
The Kubernetes server path handles:
- Kubernetes client initialization and optional informer-backed reads.
- Workload creation from image requests; snapshotId startup resolves to a stored restorable image when the snapshot record supports restore.
- Template merging for BatchSandbox and agent-sandbox manifests.
- Per-request image pull secrets where the provider supports them.
- Resource limits and GPU translation to Kubernetes extended resources.
- Platform constraints and RuntimeClass integration for secure runtimes.
- Volumes, egress sidecars, and secure endpoint access annotations.
- Endpoint resolution through direct workload data or ingress gateway configuration.
- Pause/resume delegation to providers.
- Plain-text diagnostics from Kubernetes resources.
4.3 BatchSandbox Controller
The Kubernetes controller under kubernetes/ implements OpenSandbox-specific CRDs for high-throughput and pooled sandbox delivery:
- BatchSandbox: create one or many sandbox replicas from a pod template.
- Pool: maintain pre-warmed resources for fast allocation.
- SandboxSnapshot: internal rootfs snapshot records used by Kubernetes pause/resume.
BatchSandbox supports both template-based creation and pool-based creation via extensions.poolRef. It also supports optional task orchestration for batch and RL-style workloads.
Kubernetes pause/resume is implemented through rootfs snapshots for BatchSandbox.spec.replicas=1: pause commits the sandbox root filesystem to an OCI image and releases runtime resources; resume rewrites the workload template to use the snapshot image and recreates the runtime while preserving the sandbox ID.
The public snapshot API currently has a Docker-backed runtime implementation. Kubernetes pause/resume uses the controller's internal SandboxSnapshot flow; a general Kubernetes implementation of the public snapshot API is a separate runtime concern.
5. Sandbox Data Plane
Each sandbox runs the user's image and entrypoint, with OpenSandbox control processes injected around it.
5.1 Execd
components/execd/ is a Go daemon built with Gin. It runs inside the sandbox and exposes the execution API.
Responsibilities:
- Shell command execution with SSE streaming.
- Background command status and incremental log retrieval.
- Persistent bash sessions.
- Interactive PTY sessions over WebSocket.
- File and directory operations.
- Jupyter-backed code contexts and code execution.
- Local CPU/memory metrics and optional OpenTelemetry metrics export.
- Optional shared access token enforcement through X-EXECD-ACCESS-TOKEN.
In Docker, the server stages execd into the container and installs a bootstrap script. In Kubernetes BatchSandbox template mode, an init container copies execd and bootstrap.sh from the configured execd_image into an emptyDir volume mounted by the main sandbox container.
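The optional access token amounts to a shared-secret header check. execd itself is written in Go; the Python sketch below only illustrates the idea from both sides, and the helper names are hypothetical. Only the header name comes from this document.

```python
import hmac
from typing import Dict, Mapping, Optional

EXECD_TOKEN_HEADER = "X-EXECD-ACCESS-TOKEN"


def execd_headers(
    access_token: Optional[str] = None,
    extra: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Client side: attach the token header only when a token is configured."""
    headers = dict(extra or {})
    if access_token:
        headers[EXECD_TOKEN_HEADER] = access_token
    return headers


def is_authorized(headers: Mapping[str, str], expected_token: Optional[str]) -> bool:
    """Daemon side: enforce the token only when one has been configured."""
    if not expected_token:
        return True  # enforcement is optional
    presented = headers.get(EXECD_TOKEN_HEADER, "")
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

A real HTTP stack would match header names case-insensitively; the plain dictionary lookup here is a simplification.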
5.2 Code Interpreter Runtime
The code-interpreter sandbox image starts Jupyter inside the sandbox. execd talks to Jupyter over HTTP/WebSocket and translates Jupyter kernel messages into OpenSandbox streaming events.
The Code Interpreter SDKs are optional high-level clients. The lower-level execution API remains available through the sandbox SDKs and direct execd clients.
5.3 Volumes
The lifecycle API exposes runtime-neutral volume models:
- host: bind a permitted host path.
- pvc: platform-managed named storage. Docker maps this to a Docker named volume; Kubernetes maps it to a PersistentVolumeClaim.
- ossfs: mount Alibaba Cloud OSS through the server/runtime integration.
Runtime providers validate and materialize these volume definitions differently, but the API shape stays shared.
5.4 Egress Sidecar
components/egress/ enforces outbound network policy from the sandbox network namespace.
Capabilities:
- FQDN and wildcard-domain allow/deny rules.
- dns mode for DNS filtering.
- dns+nft mode for DNS plus nftables enforcement of resolved IPs and CIDR/IP rules where supported.
- Runtime policy inspection and patching through /policy.
- Optional sidecar authentication.
- Optional platform-enforced always-allow and always-deny overlays.
- Experimental transparent HTTPS MITM mode.
Docker starts the egress sidecar as a separate container and runs the main sandbox container in the sidecar network namespace. Kubernetes appends the egress sidecar to the pod spec and drops NET_ADMIN from the main sandbox container so only the sidecar mutates network rules.
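FQDN and wildcard-domain rules can be pictured with a small matcher. The sketch below is not the sidecar's implementation: the Rule shape, the first-match-wins ordering, the default action, and the choice that a wildcard covers subdomains but not the bare domain are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rule:
    action: str  # "allow" or "deny"
    target: str  # exact FQDN, or a wildcard such as "*.example.com"


def matches(target: str, host: str) -> bool:
    """Return True when host is covered by an exact or wildcard target."""
    host = host.lower().rstrip(".")
    target = target.lower().rstrip(".")
    if target.startswith("*."):
        # "*.example.com" covers "api.example.com" but not "example.com".
        return host.endswith(target[1:])
    return host == target


def evaluate(host: str, rules: Iterable[Rule], default_action: str = "deny") -> str:
    """Assumed semantics: the first matching rule decides, else the default."""
    for rule in rules:
        if matches(rule.target, host):
            return rule.action
    return default_action


# Hypothetical policy, for illustration only.
policy = [
    Rule("deny", "internal.example.com"),
    Rule("allow", "*.example.com"),
    Rule("allow", "pypi.org"),
]
```

With this policy, api.example.com is allowed, internal.example.com is denied by the earlier rule, and any unlisted host falls through to the default.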
6. Networking and Access
6.1 Ingress
components/ingress/ is a Kubernetes-oriented HTTP/WebSocket reverse proxy. It watches sandbox resources and routes traffic to sandbox ports.
Supported routing modes:
- Header mode: OpenSandbox-Ingress-To: <sandbox-id>-<port> or host parsing.
- URI mode: /<sandbox-id>/<port>/<path>.
- Wildcard host mode through server endpoint formatting.
For BatchSandbox, ingress reads endpoint data from the sandbox.opensandbox.io/endpoints annotation. For agent-sandbox, it reads status.serviceFQDN.
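The header and URI routing formats can be parsed as sketched below. The function names are hypothetical, and splitting the header value on its last hyphen is an assumption made so that sandbox IDs containing hyphens still parse; the ingress component may resolve targets differently.

```python
from typing import Tuple


def parse_header_target(value: str) -> Tuple[str, int]:
    """Parse '<sandbox-id>-<port>' from an OpenSandbox-Ingress-To header value."""
    sandbox_id, sep, port = value.strip().rpartition("-")
    if not sep or not sandbox_id or not port.isdigit():
        raise ValueError(f"invalid ingress target: {value!r}")
    return sandbox_id, int(port)


def parse_uri_target(path: str) -> Tuple[str, int, str]:
    """Parse '/<sandbox-id>/<port>/<path>' into (sandbox_id, port, upstream_path)."""
    parts = path.lstrip("/").split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1].isdigit():
        raise ValueError(f"invalid ingress path: {path!r}")
    # Everything after the port segment is forwarded to the sandbox service.
    upstream_path = "/" + parts[2] if len(parts) == 3 else "/"
    return parts[0], int(parts[1]), upstream_path
```

For example, parse_header_target("sb-abc-8080") gives ("sb-abc", 8080), and parse_uri_target("/sb-abc/8080/api/v1") gives ("sb-abc", 8080, "/api/v1").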
6.2 Secure Access
secureAccess is currently supported for Kubernetes sandboxes exposed through ingress gateway mode. When enabled, the server provisions endpoint credentials and returns required headers with endpoint responses. Signed route tokens are also supported when gateway secure-access signing keys are configured.
6.3 Auto-Renew on Access
The optional renew-intent integration extends sandbox TTL when access is observed. It can be triggered by server proxy requests or by ingress gateway events delivered through Redis. Per-sandbox opt-in is controlled by the extensions["access.renew.extend.seconds"] create parameter.
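One way to picture the opt-in is as a pure function from the create-time extensions and an observed access time to a new expiration. Only the extension key comes from this document; treating its value as a number of seconds to extend from the access time, and never shortening an existing expiration, are assumptions of this sketch.

```python
from typing import Mapping

EXTEND_KEY = "access.renew.extend.seconds"


def renewed_expiry(
    extensions: Mapping[str, str], current_expiry: float, now: float
) -> float:
    """Return the expiration (epoch seconds) after an observed access."""
    raw = extensions.get(EXTEND_KEY)
    if raw is None:
        return current_expiry  # sandbox did not opt in
    extend_seconds = int(raw)
    if extend_seconds <= 0:
        return current_expiry
    # Assumed rule: extend from the access time, but never move expiry earlier.
    return max(current_expiry, now + extend_seconds)
```

Keeping the calculation side-effect free lets the same logic serve both trigger paths, server proxy requests and ingress gateway events.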
7. Core Flows
7.1 Sandbox Creation
Client / SDK / CLI / MCP
-> POST /v1/sandboxes
-> FastAPI lifecycle server validates request and config
-> selected runtime service creates Docker container or Kubernetes workload
-> runtime stages execd and optional egress/volume/network configuration
-> sandbox reaches Running or reports Failed with status reason/message
Creation is asynchronous from the API perspective. Clients should poll GET /v1/sandboxes/{sandboxId} or use SDK readiness helpers.
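A readiness helper for this asynchronous flow can be sketched as a polling loop. The fetch function is injected so the example stays independent of any HTTP client. The Running and Failed states come from the flow above; nesting them under a status object and the Pending placeholder state are assumptions.

```python
import time
from typing import Callable, Dict


def wait_until_running(
    fetch_sandbox: Callable[[], Dict],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Poll until the sandbox is Running; raise on Failed or on timeout."""
    deadline = clock() + timeout_s
    while True:
        sandbox = fetch_sandbox()  # e.g. wraps GET /v1/sandboxes/{sandboxId}
        status = sandbox.get("status", {})
        state = status.get("state")
        if state == "Running":
            return sandbox
        if state == "Failed":
            raise RuntimeError(
                f"sandbox failed: {status.get('reason')}: {status.get('message')}"
            )
        if clock() >= deadline:
            raise TimeoutError(f"sandbox not running after {timeout_s} seconds")
        sleep(interval_s)


# Usage with canned responses standing in for real API calls.
_responses = iter([
    {"status": {"state": "Pending"}},
    {"status": {"state": "Running"}},
])
ready = wait_until_running(lambda: next(_responses), sleep=lambda _s: None)
```

Injecting sleep and clock keeps the loop testable without real delays.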
7.2 Command, File, and Code Execution
Client
-> resolve execd endpoint from sandbox metadata or server proxy
-> call execd API with X-EXECD-ACCESS-TOKEN when required
-> execd runs command, file operation, session, PTY, or Jupyter code execution
-> execd streams SSE/WebSocket output or returns structured responses
7.3 Service Exposure
Client
-> GET /v1/sandboxes/{sandboxId}/endpoints/{port}
-> server returns Docker-mapped, ingress-gateway, or server-proxy endpoint
-> client includes returned headers when secure access or sidecar auth requires them
-> HTTP/WebSocket traffic reaches the target sandbox port
7.4 Egress Policy
Create request with networkPolicy
-> server validates [egress] config
-> runtime attaches egress sidecar with initial policy
-> sandbox outbound DNS/network traffic is filtered by sidecar
-> clients may resolve the egress endpoint and PATCH /policy at runtime
7.5 Pause, Resume, and Snapshots
Pause / resume
-> lifecycle server delegates to runtime provider
-> Docker pauses/resumes the container
-> BatchSandbox uses rootfs snapshot commit/recreate for supported single-replica sandboxes
Public snapshot API
-> server persists snapshot metadata
-> Docker runtime commits the sandbox to a restorable image
-> create-from-snapshot resolves that image and starts a new sandbox
8. Design Principles
Protocol First
Public behavior starts from OpenAPI contracts in specs/. SDKs and clients should align to those contracts. When generated output is wrong, fix the source spec and regenerate; patching the generated code alone is not a sufficient fix.
Control Plane vs Data Plane
The lifecycle server should orchestrate and validate. Platform-specific provisioning belongs in runtime services/providers. In-sandbox operations belong in execd and egress sidecars.
Runtime Neutral API, Runtime Specific Execution
The lifecycle API uses shared concepts such as resource limits, volumes, endpoints, network policy, and metadata. Docker and Kubernetes can materialize those concepts differently while preserving the API contract.
Secure Defaults with Explicit Escape Hatches
The server supports API-key authentication, startup guardrails for unauthenticated mode, resource limits, capability drops, optional secure runtimes, egress controls, endpoint headers, and platform-specific network isolation. Less restrictive modes are intended for local development or explicit operator choice.
Observable Failures
Sandbox state includes state, reason, message, and transition time. execd exposes metrics, the server exposes diagnostics, ingress/egress/execd support logs and OpenTelemetry metrics where implemented, and request IDs are propagated for debugging.
9. Common Use Cases
- Coding agents: run Claude Code, Gemini CLI, Codex CLI, Qwen Code, Kimi CLI, or other agent tools in isolated sandboxes.
- AI code execution: execute model-generated code with command/file/code-interpreter APIs and streamed feedback.
- Browser automation: run Chrome or Playwright with controlled filesystem and network behavior.
- Remote development: expose VS Code Web, desktops, VNC, or development servers through sandbox endpoints.
- RL and evaluation workloads: use Kubernetes BatchSandbox, Pool, and task orchestration for high-throughput sandbox delivery.
- Enterprise isolation: combine secure runtimes, ingress, egress, endpoint access headers, and Kubernetes deployment controls.
10. References
- Root README
- Sandbox Lifecycle Spec
- Diagnostics Spec
- Sandbox Execution Spec
- Egress Spec
- Server Documentation
- Server Configuration
- execd Documentation
- Ingress Documentation
- Egress Documentation
- Kubernetes Controller
- Pause and Resume
- Secure Container Runtime Guide
- Network Isolation for Kubernetes
- CLI README
- MCP README
- Examples
This page is sourced from:
docs/architecture.md