Deploy anywhere
The SDK is headless and runtime-agnostic — the same code runs in a browser tab, a serverless function, or inside a long-lived sandbox. The only design question is where the agent loop runs, because that's what a function time limit applies to.
Pure browser — no backend
Drive the agent straight from the tab against your LLM endpoint. Use a WebContainer (real Node + shell in the browser) or an in-memory/IndexedDB FS for file-only agents. Nothing server-side.
import { query, createOpenAIClient, WebContainerWorkspace } from 'anyclaude-sdk'
// Boot a WebContainer: real Node and a shell inside the tab.
const workspace = await WebContainerWorkspace.boot()
// userKey is the API key supplied by the user; prompt and render() are your own app code.
const llm = createOpenAIClient({ baseUrl: 'https://api.openai.com/v1', model: 'gpt-4o', apiKey: userKey })
// Stream messages (partial ones included) and render each as it arrives.
for await (const m of query({ prompt, workspace, llm, model: 'gpt-4o', includePartialMessages: true })) render(m)
Recipe 1 — Serverless function (browser + function)
Run query() in a serverless function; stream the SDKMessages back to a browser client. Great for "no infra, free tier." The function's wall-clock cap bounds the agent loop, so checkpoint to a SessionStore if a run can exceed it.
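On the browser side, the streamed response has to be split back into individual messages. The following is a minimal sketch, assuming the function replies with newline-delimited JSON (one message per line, as the route in this recipe does). `readNdjson` is a hypothetical helper name, not an SDK export.

```typescript
// Hypothetical helper (not part of anyclaude-sdk): parse an NDJSON byte stream
// into one parsed value per line, tolerating chunks that split a line in two.
async function* readNdjson<T = unknown>(body: ReadableStream<Uint8Array>): AsyncGenerator<T> {
  const reader = body.getReader()
  const decoder = new TextDecoder()
  let buffered = ''
  try {
    while (true) {
      const { value, done } = await reader.read()
      if (done) break
      buffered += decoder.decode(value, { stream: true })
      const lines = buffered.split('\n')
      buffered = lines.pop() ?? '' // the last piece may be an incomplete line
      for (const line of lines) {
        if (line.trim() !== '') yield JSON.parse(line) as T
      }
    }
    buffered += decoder.decode() // flush any bytes held back by the decoder
    if (buffered.trim() !== '') yield JSON.parse(buffered) as T
  } finally {
    reader.releaseLock()
  }
}
```

After a `fetch` POST to the route, usage would look roughly like `for await (const m of readNdjson(res.body)) render(m)`. Buffering the trailing partial line matters because network chunks do not align with message boundaries.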
// app/api/agent/route.ts (Vercel function, streamed)
import { query, createOpenAIClient, MemoryFileSystem, NoopCommandExecutor, composeWorkspace } from 'anyclaude-sdk'
export const maxDuration = 300 // set ~10s below your platform's cap
export async function POST(req: Request) {
  // sessionId only matters once runs are checkpointed to a SessionStore
  const { prompt, sessionId } = await req.json()
  // File-only workspace: in-memory FS plus a no-op command executor
  const workspace = composeWorkspace(new MemoryFileSystem(), new NoopCommandExecutor(), '/work')
  const llm = createOpenAIClient({ baseUrl: process.env.LLM_BASE, model: process.env.LLM_MODEL, apiKey: process.env.LLM_KEY })
  const encoder = new TextEncoder() // reuse one encoder for the whole stream
  const stream = new ReadableStream({
    async start(c) {
      // One JSON-encoded SDKMessage per line (NDJSON)
      for await (const m of query({ prompt, workspace, llm, model: process.env.LLM_MODEL })) c.enqueue(encoder.encode(JSON.stringify(m) + '\n'))
      c.close()
    },
  })
  return new Response(stream, { headers: { 'content-type': 'application/x-ndjson' } })
}
Free-tier function limits
Limits change — set maxDuration ~10s below your platform's configured cap. These are current free-tier defaults:
| Platform (free) | Limit model | Suggested budget |
|---|---|---|
| Vercel Hobby | wall-clock, ~300s (Fluid Compute) | ~290s |
| Appwrite | wall-clock, 30s | ~25s |
| Netlify | sync funcs ~10s (background up to 15 min) | ~8s, or background funcs |
| Cloudflare Workers | CPU-time (I/O waits are free) | I/O-bound agents rarely hit it; or use Durable Objects / Workflows |
| AWS Lambda | wall-clock, 15 min | ~14m50s |
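The table can be encoded as a small lookup so the budget lives in one place. This is only a sketch: the platform keys and the `budgetMs` helper are hypothetical names, and the numbers simply mirror the suggested budgets above, so re-check them against your platform's current limits.

```typescript
// Suggested wall-clock budgets in seconds, copied from the table above.
// Cloudflare Workers is omitted on purpose: its limit is CPU time, not wall-clock.
const SUGGESTED_BUDGET_SECONDS = {
  'vercel-hobby': 290,
  appwrite: 25,
  'netlify-sync': 8,
  'aws-lambda': 14 * 60 + 50,
} as const

type Platform = keyof typeof SUGGESTED_BUDGET_SECONDS

// Hypothetical helper: convert the suggested budget to milliseconds,
// e.g. to pass as a millisecond deadline to the agent loop.
function budgetMs(platform: Platform): number {
  return SUGGESTED_BUDGET_SECONDS[platform] * 1000
}
```

For example, `budgetMs('appwrite')` evaluates to 25000.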
The "survivor" — span the cap (shipped in 0.2.0)
For runs longer than your function's limit, the SDK checkpoints at a turn boundary just before the deadline (query({ maxDurationMs })), persists to a pluggable SessionStore (Supabase / Neon / Vercel KV / Upstash Redis / IndexedDB / memory), and emits a paused message. The continuation runs with query({ resume: true, continueRun: true }) and the same sessionId. The anyclaude-react client (createAgentClient / createEndpointClient) auto-stitches the streams into one — so a 30s Appwrite function runs an unlimited-length agent, transparently.
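To make the stitching concrete, here is a rough sketch of the loop a client could run. It is not the anyclaude-react implementation: `stitchSegments`, `runSegment`, and `isPaused` are hypothetical names, and how a paused message is recognised is left to a predicate because its exact shape is not shown here.

```typescript
// Hypothetical sketch: join several capped invocations into one message stream.
// runSegment(resume) performs one function call and yields its messages;
// resume is false for the first call and true for every continuation.
async function* stitchSegments<M>(
  runSegment: (resume: boolean) => AsyncIterable<M>,
  isPaused: (message: M) => boolean,
  maxSegments = 100, // safety valve so a misbehaving server cannot loop forever
): AsyncGenerator<M> {
  let resume = false
  for (let segment = 0; segment < maxSegments; segment++) {
    let paused = false
    for await (const message of runSegment(resume)) {
      if (isPaused(message)) {
        paused = true // swallow the marker; the caller sees one continuous stream
        continue
      }
      yield message
    }
    if (!paused) return // the run finished inside this segment
    resume = true
  }
  throw new Error(`run still paused after ${maxSegments} segments`)
}
```

On the server, each continuation would map to query({ resume: true, continueRun: true }) with the same sessionId, as described above. The segment limit is a design choice of this sketch, not an SDK setting.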
Recipe 2 — Sandbox-as-runtime (no time limit)
Run query() inside a long-lived sandbox (E2B, Vercel Sandbox, Daytona) and stream to the browser. The sandbox microVM is the compute, so there's no per-invocation function cap — the agent runs as long as the sandbox is alive (configurable, minutes to hours). Best for heavy / long autonomous runs. No survivor needed.
// runs in a process started inside the E2B sandbox
import { query, LocalSandbox, createOpenAIClient } from 'anyclaude-sdk'
const workspace = new LocalSandbox({ cwd: '/home/user' }) // local to the sandbox
const llm = createOpenAIClient({ baseUrl: process.env.LLM_BASE, model: process.env.LLM_MODEL, apiKey: process.env.LLM_KEY })
for await (const m of query({ prompt, workspace, llm, model: process.env.LLM_MODEL })) emit(m) // → exposed port → browser
Gotcha: if the loop runs in a function and only uses E2B for tool execution, the function's time limit still applies; the sandbox being alive doesn't save the orchestrator. The no-limit benefit requires the loop itself to run in the sandbox.
Recipe 3 — Server brain, browser hands (clientTools)
Keep your LLM key and the agent loop on the server, but execute specific tools in the user's browser — e.g. run bash in a WebContainer so the agent edits and runs real code on the client, no sandbox bill. List those names in clientTools: the run pauses with a client_tool_request, the browser executes them, and you resume with clientToolResults.
// server function
for await (const m of query({ prompt, workspace, llm, model, sessionId, clientTools: ['bash'] })) {
  if (m.type === 'system' && m.subtype === 'client_tool_request') return respond(m.requests) // → browser runs them
  emit(m)
}
// next request: the browser sends back the results
query({ workspace, llm, model, sessionId, resume: true, continueRun: true,
  clientTools: ['bash'], clientToolResults })
Full working example: vercel-clienttools.
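In the browser, the requests have to be dispatched to local implementations and collected into clientToolResults. The sketch below assumes each request carries an id, a tool name, and an input, and that each result echoes the id. Those field names and the `runClientTools` helper are guesses for illustration, so check the vercel-clienttools example for the real shapes.

```typescript
// Assumed shapes, for illustration only; the SDK's real types may differ.
interface ClientToolRequest { id: string; name: string; input: unknown }
interface ClientToolResult { id: string; output: string; isError: boolean }
type ToolHandler = (input: unknown) => Promise<string> | string

// Run every requested tool with its local handler and collect the results.
// A missing handler or a thrown error becomes an error result instead of
// aborting the batch, so the agent can see the failure and react to it.
async function runClientTools(
  requests: ClientToolRequest[],
  handlers: Map<string, ToolHandler>,
): Promise<ClientToolResult[]> {
  return Promise.all(requests.map(async (request): Promise<ClientToolResult> => {
    const handler = handlers.get(request.name)
    if (!handler) {
      return { id: request.id, output: `no handler for ${request.name}`, isError: true }
    }
    try {
      return { id: request.id, output: await handler(request.input), isError: false }
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      return { id: request.id, output: message, isError: true }
    }
  }))
}
```

A `bash` handler would typically forward the command to the WebContainer's shell and return its output as a string.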
Recipe 4 — Drop-in Claude Code router (0.6.0)
Run Claude Code itself against any OpenAI-compatible model — DeepSeek, Qwen, GLM, Kimi, local Ollama, OpenRouter. The anyclaude-sdk/anthropic-endpoint subpath bridges the Anthropic Messages API to the SDK's LLMClient: stand up an Anthropic-compatible /v1/messages server, point Claude Code at it with ANTHROPIC_BASE_URL, and every turn is served by your chosen model. Unlike a naive proxy, inline tool-call dialects are recovered into proper tool_use blocks (via model profiles), so tool use actually works on cheap models.
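To illustrate what recovering an inline tool-call dialect means, the sketch below handles one common convention, in which a model writes its calls as JSON inside <tool_call> tags in plain text, and turns them into Anthropic-style tool_use blocks. It is not the SDK's model-profile code: `recoverToolCalls` and the generated ids are hypothetical, and real dialects vary by model.

```typescript
interface ToolUseBlock { type: 'tool_use'; id: string; name: string; input: unknown }
interface Recovered { text: string; toolUses: ToolUseBlock[] }

// Hypothetical sketch: pull <tool_call>{"name": ..., "arguments": ...}</tool_call>
// spans out of plain text. Spans that are not valid JSON are left in the text.
function recoverToolCalls(raw: string): Recovered {
  const toolUses: ToolUseBlock[] = []
  const text = raw.replace(/<tool_call>([\s\S]*?)<\/tool_call>/g, (whole, inner: string) => {
    try {
      const call = JSON.parse(inner) as { name?: unknown; arguments?: unknown }
      if (typeof call.name !== 'string') return whole
      toolUses.push({
        type: 'tool_use',
        id: `toolu_recovered_${toolUses.length}`, // placeholder id scheme
        name: call.name,
        input: call.arguments ?? {},
      })
      return '' // the call now lives in a structured block, not in the text
    } catch {
      return whole // malformed JSON: keep the original text untouched
    }
  })
  return { text: text.trim(), toolUses }
}
```

A naive proxy would pass such text through unchanged, and the caller would never see a tool call to execute.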
// server (Node) — one turn per request; Claude Code runs the loop
import { createOpenAIClient } from 'anyclaude-sdk/llm'
import { anthropicToChat, anthropicSSE } from 'anyclaude-sdk/anthropic-endpoint'
const llm = createOpenAIClient({ baseUrl: 'https://api.deepseek.com/v1', model: 'deepseek-chat', apiKey: process.env.DEEPSEEK_API_KEY })
// POST /v1/messages
const req = anthropicToChat(body) // Anthropic request → ChatMsg[] + tools
for await (const evt of anthropicSSE(llm, req, { model: 'deepseek-chat' })) res.write(evt) // → Anthropic SSE
# then point Claude Code at it:
ANTHROPIC_BASE_URL=http://localhost:8787 ANTHROPIC_API_KEY=dummy claude
A runnable, config-driven version (default / background / long-context routing) is in examples/claude-code-router. The subpath also exports streamResultToAnthropicMessage (non-streaming) and anthropicToolsToDefs.
Working examples
Each is a runnable Vite project (Vercel-ready, or browser-only) in the repo's examples/:
browser-ide: WebContainer IDE with a real shell and Node in the tab, plus file explorer, editor, terminal, and chat. No backend.
vercel-clienttools: Serverless brain + browser-executed bash via clientTools.
vercel-kv-survivor: Survivor across the function cap with Vercel KV.
vercel-supabase-survivor: Survivor with Supabase as the SessionStore.
vercel-indexeddb-survivor: Stateless function + browser IndexedDB session + survivor.
browser-chat: Minimal browser chatbot with useAgent.
claude-code-router: Anthropic-compatible server that runs Claude Code against any OpenAI-compatible model, with tool-dialect recovery.
Hosting the frontend
The client UI is just static files (or a Vite/Next app). Deploy it anywhere — Vercel, Netlify, Cloudflare Pages, or Puter static hosting. This very docs site is served from Puter.