● Claude Code, unbundled

Claude Code agents,
for any LLM, anywhere.

A standalone, headless agent engine — tools, the tool loop, MCP, sub-agents, sessions — that runs against any OpenAI- or Anthropic-compatible endpoint, in the browser (WebContainer), Node, and Bun. No backend, no OAuth, no native binaries.

npm: anyclaude-sdk GitHub

npm install anyclaude-sdk

Live demo — a full IDE in your browser

No install, no backend, no key. A real jsh shell + Node.js via WebContainer, with the agent driving a file explorer, code editor, and terminal — running entirely in your tab.

Ask it to build and run something — e.g. "create an Express server with a /hello route and start it." The agent writes the files, runs npm/node in the WebContainer, and you watch it happen. Built with anyclaude-sdk + anyclaude-react.

Launch the live IDE

First load registers a tiny service worker (for the cross-origin isolation WebContainer needs) and reloads once — then boots in a couple of seconds.

Why anyclaude

The same query() async-generator interface and SDKMessage envelope as the official Claude Agent SDK — but provider-agnostic and runtime-agnostic.

Headless engine

Pure agent logic — bring your own UI (chatbot, research assistant, terminal agent). Zero framework lock-in.

Browser, Node, Bun

Runs in a WebContainer tab, a Node server, or a Bun process — the exact same code.

Any LLM endpoint

OpenAI, Anthropic, xAI, Groq, Together, OpenRouter, Ollama, or a local server. Three transport clients, one interface.

Full toolset

Files (text/image/PDF/notebook), shell, glob/grep, web fetch & search, todos — plus your own custom tools.

Pluggable sandboxes

WebContainer, E2B, Vercel, Daytona, Cloudflare, or the real local OS. A sandbox is just FS + exec.

No backend required

Drive an agent straight from the browser against your LLM key, or run it serverless — your choice.

Quick start

Pick a runtime. Every example is the same three steps: a workspace (files + shell), an LLM client, and query().

import { WebContainer } from '@webcontainer/api'
import { query, WebContainerWorkspace, createOpenAIClient } from 'anyclaude-sdk'

const wc = await WebContainer.boot()
const workspace = new WebContainerWorkspace(wc)

const llm = createOpenAIClient({
  apiKey: import.meta.env.VITE_OPENAI_API_KEY,
  baseUrl: 'https://api.openai.com/v1', // or xAI, Groq, OpenRouter, local…
  model: 'gpt-4o',
})

for await (const msg of query({ prompt: 'List the files and summarize the project', workspace, llm })) {
  if (msg.type === 'assistant') {
    // Print each text block of the assistant turn.
    for (const b of msg.message.content) if (b.type === 'text') console.log(b.text)
  } else if (msg.type === 'result' && msg.subtype === 'success') {
    // Braces matter: without them this else would bind to the inner `if (b.type === 'text')`.
    console.log('Done:', msg.result)
  }
}

import { query, LocalSandbox, createAnthropicClient } from 'anyclaude-sdk'

// Real OS: agent works against your actual filesystem + shell, like Claude Code.
const workspace = new LocalSandbox({ cwd: process.cwd() })
const llm = createAnthropicClient({ apiKey: process.env.ANTHROPIC_API_KEY, model: 'claude-sonnet-4-6' })

for await (const msg of query({ prompt: 'add a --version flag and run the tests', workspace, llm })) {
  if (msg.type === 'assistant')
    for (const b of msg.message.content) if (b.type === 'text') process.stdout.write(b.text)
}

import { query, MemoryFileSystem, NoopCommandExecutor, composeWorkspace, createOpenAIClient } from 'anyclaude-sdk'

// In-memory FS, no shell — great for serverless / Bun functions.
const fs = new MemoryFileSystem()
const workspace = composeWorkspace(fs, new NoopCommandExecutor(), '/home/user')
const llm = createOpenAIClient({ apiKey: process.env.LLM_KEY, baseUrl: 'https://api.x.ai/v1', model: 'grok-build-0.1' })

for await (const msg of query({ prompt: 'Write a haiku to /home/user/poem.txt', workspace, llm, model: 'grok-build-0.1' })) {
  if (msg.type === 'result') console.log(msg.subtype)
}

Core concepts

query() returns an AsyncGenerator<SDKMessage>. You iterate the stream and render each message — the same envelope as the official SDK.

The message stream

for await (const msg of query({ prompt, workspace, llm, includePartialMessages: true })) {
  switch (msg.type) {
    case 'system':       break // init (tools, model, cwd), compact_boundary, local_command_output
    case 'stream_event': break // token deltas (when includePartialMessages: true)
    case 'assistant':    break // a full assistant turn: text + tool_use blocks
    case 'user':         break // synthetic tool_result turns (msg.isSynthetic)
    case 'result':       break // final: subtype 'success' | 'error', usage, total_cost_usd
  }
}
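To make the switch above concrete, here is a minimal rendering sketch. It does not import the SDK: the `Message` and `ContentBlock` types are simplified local stand-ins modeled on the shapes used in this page (assumptions, not the real `SDKMessage` definition), and `renderMessage` is a hypothetical helper name.

```typescript
// Simplified stand-ins for the message shapes shown above (assumed, not the SDK's real types).
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; name: string; input: unknown }

type Message =
  | { type: 'system'; subtype: string }
  | { type: 'assistant'; message: { content: ContentBlock[] } }
  | { type: 'result'; subtype: 'success' | 'error'; result?: string }

// Turn one message into zero or more display lines.
function renderMessage(msg: Message): string[] {
  switch (msg.type) {
    case 'assistant':
      // Text blocks are shown verbatim; tool calls are shown as a short marker.
      return msg.message.content.map((b) => (b.type === 'text' ? b.text : `[tool] ${b.name}`))
    case 'result':
      return [msg.subtype === 'success' ? `Done: ${msg.result ?? ''}` : 'Run failed']
    default:
      return [] // system messages carry metadata; nothing to print in this sketch
  }
}

const demoMessages: Message[] = [
  { type: 'system', subtype: 'init' },
  {
    type: 'assistant',
    message: {
      content: [
        { type: 'text', text: 'Reading the project.' },
        { type: 'tool_use', name: 'read_file', input: { path: 'package.json' } },
      ],
    },
  },
  { type: 'result', subtype: 'success', result: 'Summarized 3 files' },
]

const renderedLines: string[] = []
for (const m of demoMessages) renderedLines.push(...renderMessage(m))
console.log(renderedLines.join('\n'))
```

In a real UI the same function would be called inside the `for await` loop, one message at a time, instead of over a prebuilt array.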

Tools

A Tool is an OpenAI function schema + a run(input, ctx). Ships with the full Claude Code toolset; add your own.

Sandboxes

WebContainer · LocalSandbox · E2B · Vercel · Daytona · Cloudflare. All implement one Sandbox = FileSystem + CommandExecutor.

Filesystems

DexieFileSystem (IndexedDB) · OpfsFileSystem · MemoryFileSystem. Seed a Linux tree with seedLinuxTree().

LLM clients

createOpenAIClient · createAnthropicClient · createResponsesClient — all normalize tools, streaming, and usage.
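The "Sandbox = FileSystem + CommandExecutor" idea above can be sketched roughly as follows. Every name here (`FileSystemLike`, `CommandExecutorLike`, `MapFileSystem`, `RefusingExecutor`, `composeSandbox`) is hypothetical and the method signatures are assumptions; the SDK's real interfaces may differ.

```typescript
// Hypothetical shapes: the SDK's real interfaces may use different names and signatures.
interface FileSystemLike {
  readFile(path: string): Promise<string>
  writeFile(path: string, data: string): Promise<void>
}

interface CommandExecutorLike {
  exec(command: string): Promise<{ stdout: string; exitCode: number }>
}

type SandboxLike = FileSystemLike & CommandExecutorLike

// In-memory file store: enough for runs that need no real disk.
class MapFileSystem implements FileSystemLike {
  private files = new Map<string, string>()

  async readFile(path: string): Promise<string> {
    const data = this.files.get(path)
    if (data === undefined) throw new Error(`ENOENT: ${path}`)
    return data
  }

  async writeFile(path: string, data: string): Promise<void> {
    this.files.set(path, data)
  }
}

// Executor that refuses every command, for environments without a shell.
class RefusingExecutor implements CommandExecutorLike {
  async exec(command: string): Promise<{ stdout: string; exitCode: number }> {
    return { stdout: `shell unavailable: ${command}`, exitCode: 127 }
  }
}

// Compose the two halves into one object an agent loop could use.
function composeSandbox(fs: FileSystemLike, shell: CommandExecutorLike): SandboxLike {
  return {
    readFile: (path) => fs.readFile(path),
    writeFile: (path, data) => fs.writeFile(path, data),
    exec: (command) => shell.exec(command),
  }
}

const demoSandbox = composeSandbox(new MapFileSystem(), new RefusingExecutor())
```

Keeping the two halves separate is what lets a storage backend such as IndexedDB be paired with any executor, or with none at all.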

Custom tools

Define a tool with defineTool() — a name, description, JSON-Schema params, and a run() execution method that can do anything async.

import { query, defineTool } from 'anyclaude-sdk'

const weather = defineTool({
  name: 'get_weather',
  description: 'Current weather for a city',
  parameters: { properties: { city: { type: 'string' } }, required: ['city'] },
  // fetchWeather is your own async helper (not part of the SDK)
  run: async ({ city }, ctx) => ({ content: await fetchWeather(String(city)) }),
})

// extraTools is ADDITIVE — your tools alongside the builtins (read/write/bash/grep…)
query({ prompt, workspace, llm, extraTools: [weather] })

// tools REPLACES the builtins entirely — use when you want a locked-down set
query({ prompt, workspace, llm, tools: [weather] })

extraTools adds; tools replaces. Reach for extraTools in almost every case so you keep the file/shell/search builtins.
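The difference between the two options can be modeled in a few lines. This is an illustrative resolver, not the SDK's internals: `resolveTools`, `ToolLike`, and `ToolOptions` are hypothetical names, and combining both options in one call is an assumption of this sketch.

```typescript
// Hypothetical resolver illustrating the documented semantics, not the SDK's implementation.
interface ToolLike {
  name: string
}

interface ToolOptions {
  tools?: ToolLike[] // replaces the builtins
  extraTools?: ToolLike[] // added alongside the builtins
}

function resolveTools(builtins: ToolLike[], opts: ToolOptions): string[] {
  // `tools` swaps out the builtin set; otherwise the builtins stay.
  const base = opts.tools ?? builtins
  // Assumption: if both options are given, extras are appended to the replacement set.
  const extras = opts.extraTools ?? []
  return [...base, ...extras].map((t) => t.name)
}

const builtinTools: ToolLike[] = [{ name: 'read_file' }, { name: 'bash' }, { name: 'grep' }]
const weatherTool: ToolLike = { name: 'get_weather' }

const additive = resolveTools(builtinTools, { extraTools: [weatherTool] })
// → ['read_file', 'bash', 'grep', 'get_weather']
const replaced = resolveTools(builtinTools, { tools: [weatherTool] })
// → ['get_weather']
console.log(additive, replaced)
```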

Tool selection & disabling

Narrow or disable any tool — builtin or custom — with allow/deny lists (also readable from .claude/settings.json).

// Allowlist: ONLY these tools are available
query({ prompt, workspace, llm, allowedTools: ['read_file', 'grep', 'get_weather'] })

// Denylist: everything except these (applied after allowedTools)
query({ prompt, workspace, llm, disallowedTools: ['bash', 'delete_file'] })
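The ordering noted in the comments above (allowlist first, then denylist) can be sketched as a pure function. `selectTools` is a hypothetical helper written for illustration; how the SDK actually resolves the two lists is not shown in this page.

```typescript
// Hypothetical filter mirroring the comments above: allowlist first, then denylist.
function selectTools(
  available: string[],
  allowedTools?: string[],
  disallowedTools: string[] = [],
): string[] {
  const denied = new Set(disallowedTools)
  // No allowlist means every available tool passes the first stage.
  const allowed = allowedTools ? new Set(allowedTools) : undefined
  return available
    .filter((name) => allowed === undefined || allowed.has(name))
    .filter((name) => !denied.has(name))
}

const allToolNames = ['read_file', 'bash', 'grep', 'delete_file', 'get_weather']

// A name on both lists ends up excluded, because the denylist runs last.
const selected = selectTools(allToolNames, ['read_file', 'grep', 'get_weather'], ['grep'])
// → ['read_file', 'get_weather']
console.log(selected)
```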

Message queue — type while it works

Interject follow-up messages into a live agent run. Queued messages are delivered one per turn boundary: each is injected as a user turn before the next model call, and a run that would otherwise end keeps going while the queue is non-empty.

import { query, MessageQueue } from 'anyclaude-sdk'

const queue = new MessageQueue()
const run = query({ prompt: 'Refactor the auth module', workspace, llm, messageQueue: queue })

// …while it's still streaming/running tools, the user types more:
queue.push('also add rate limiting')
queue.push('and update the README')   // delivered one at a time, at turn boundaries

for await (const msg of run) { /* render… */ }

Perfect for chat UIs: the user never has to wait for the agent to finish before lining up the next instruction.
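The delivery rule can be illustrated with a toy simulation. It is not the SDK's `MessageQueue`: the queue is a plain array, `simulateRun` is a hypothetical function, and `plannedTurns` stands in for however many model calls the agent would have made for the original prompt alone.

```typescript
// Toy model of the delivery rule described above; names here are hypothetical.
function simulateRun(prompt: string, queue: string[], plannedTurns: number): string[] {
  const log: string[] = [`user: ${prompt}`, 'assistant: turn 1']
  let turn = 1
  while (true) {
    // Turn boundary: deliver at most one queued message before the next model call.
    const next = queue.shift()
    if (next !== undefined) {
      log.push(`user: ${next}`)
    } else if (turn >= plannedTurns) {
      break // nothing queued and the agent is finished, so the run ends
    }
    turn += 1
    log.push(`assistant: turn ${turn}`)
  }
  return log
}

// The agent would have stopped after one turn, but two queued messages extend the run.
const queueLog = simulateRun(
  'Refactor the auth module',
  ['also add rate limiting', 'and update the README'],
  1,
)
console.log(queueLog.join('\n'))
```

The log interleaves strictly: one queued message, one model call, then the next queued message, which is the "one per turn boundary" behavior.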

More capabilities

Everything you expect from Claude Code, as composable options.

Sessions & resume

Persist the transcript via a sessionStore and resume a prior session by id.

MCP servers

In-process (createSdkMcpServer) or remote HTTP/SSE, with a CORS mcpProxy for the browser. Tools surface as mcp__server__tool.

Sub-agents

The task tool spawns isolated sub-agents; enable with agents: {}. Custom agent types supported.

Teammates / coordinator

team: true adds a shared mailbox + task board: send_message, task_create/update, dispatch_tasks.

Permissions & plan mode

allow/deny/ask rules, a canUseTool gate, and a plan mode that blocks mutations until you approve.

Slash commands

/help, /compact, /cost, /tools, /model + your own promptCommands.

Background tasks

background: true → run sub-agents off the critical path; poll with task_list/task_output/task_stop.

Cost accounting

Per-run usage + total_cost_usd on the result message, with prefix-cache awareness.
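As one illustration of the "Permissions & plan mode" card above, here is a rough sketch of how allow/deny/ask rules and a plan mode might combine. The precedence (plan mode, then deny, then ask, then allow), the fallback to `ask`, the set of mutating tools, and all names (`decide`, `PermissionRules`, `MUTATING_TOOLS`) are assumptions of this sketch, not documented SDK behavior.

```typescript
type Decision = 'allow' | 'deny' | 'ask'

interface PermissionRules {
  allow: string[]
  deny: string[]
  ask: string[]
}

// Assumed for this sketch: which tools count as mutations.
const MUTATING_TOOLS = new Set(['bash', 'delete_file'])

function decide(tool: string, rules: PermissionRules, planMode: boolean): Decision {
  // Plan mode blocks mutations outright until the plan is approved.
  if (planMode && MUTATING_TOOLS.has(tool)) return 'deny'
  // Assumed precedence: deny beats ask, ask beats allow.
  if (rules.deny.includes(tool)) return 'deny'
  if (rules.ask.includes(tool)) return 'ask'
  if (rules.allow.includes(tool)) return 'allow'
  return 'ask' // assumption: tools with no matching rule fall back to asking
}

const demoRules: PermissionRules = {
  allow: ['read_file', 'bash'],
  deny: ['delete_file'],
  ask: [],
}

console.log(decide('bash', demoRules, true)) // 'deny' (plan mode blocks the mutation)
console.log(decide('bash', demoRules, false)) // 'allow'
console.log(decide('grep', demoRules, false)) // 'ask' (no rule matched)
```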

Deploy anywhere

Because the engine is headless, you choose where the agent loop runs. Two recipes cover everything.

Recipe 1 — Serverless function (browser + a function backend)

The browser drives the UI; a serverless function runs query() and streams back. Great on free tiers — but the function's time limit caps the run.

Platform (free tier)	Function limit	Notes
Vercel Hobby	~300s	Covers most agent runs in a single invocation
Appwrite	30s	Tight — continuation recommended
Netlify	~10s (sync) / 15 min (background)	Use background functions for long runs
Cloudflare Workers	CPU-time (I/O is "free")	LLM-bound runs rarely hit it; or use Durable Objects / Workflows
AWS Lambda	15 min	Effectively unbounded for most agents

Survivor continuation is built in. When a run approaches the function's limit, the agent checkpoints at a turn boundary to a pluggable SessionStore (Supabase · Neon · Vercel KV · Upstash Redis · IndexedDB); the budget is set with query({ maxDurationMs }). The anyclaude-react client then transparently re-requests to continue in a fresh invocation (resume + continueRun). The user sees one continuous stream, so even a 30s function can host an arbitrarily long agent run. Default budget: ~10s under your platform's configured timeout, fully configurable. See Deploy.
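The checkpoint-and-continue idea above can be simulated without any platform. This sketch uses a fake clock (each turn costs a fixed number of milliseconds) and an in-memory store; `Checkpoint`, `MapStore`, `budgetMs`, and `runInvocation` are hypothetical names, and the real SessionStore interface is not shown in this page.

```typescript
// Hypothetical sketch of checkpoint-and-resume; none of these names come from the SDK.
interface Checkpoint {
  sessionId: string
  completedTurns: number
}

class MapStore {
  private data = new Map<string, Checkpoint>()
  save(cp: Checkpoint): void {
    this.data.set(cp.sessionId, cp)
  }
  load(sessionId: string): Checkpoint | undefined {
    return this.data.get(sessionId)
  }
}

// Budget = platform timeout minus a safety margin (the text above cites roughly 10s).
function budgetMs(platformTimeoutMs: number, marginMs = 10000): number {
  return Math.max(0, platformTimeoutMs - marginMs)
}

// One function invocation: resume from the checkpoint, run turns until the work is
// done or the budget is spent, and only ever stop at a turn boundary.
function runInvocation(
  sessionId: string,
  totalTurns: number,
  turnCostMs: number,
  maxDurationMs: number,
  store: MapStore,
): { done: boolean } {
  let elapsed = 0 // simulated clock for this invocation
  let completed = store.load(sessionId)?.completedTurns ?? 0
  while (completed < totalTurns) {
    if (elapsed >= maxDurationMs) {
      store.save({ sessionId, completedTurns: completed })
      return { done: false } // the client would re-request here
    }
    elapsed += turnCostMs
    completed += 1
  }
  store.save({ sessionId, completedTurns: completed })
  return { done: true }
}

// A 30s function, 8s per turn, 10 turns of work: the client keeps re-requesting.
const demoStore = new MapStore()
let invocations = 0
let finished = false
while (!finished) {
  invocations += 1
  finished = runInvocation('s1', 10, 8000, budgetMs(30000), demoStore).done
}
console.log(invocations) // 4 invocations: 3 + 3 + 3 + 1 turns
```

With a 20s budget and 8s turns, the third turn of each invocation ends at 24s, still inside the 30s limit, which is the purpose of the margin.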

Recipe 2 — Sandbox-as-runtime (no time limit)

Run query() inside a long-lived sandbox (E2B / Vercel Sandbox / Daytona) and stream to the browser. The sandbox is the compute, so there's no per-request cap — the agent runs as long as the sandbox is alive.

import { Sandbox } from 'e2b'
import { query, E2BSandbox, createAnthropicClient } from 'anyclaude-sdk'

const sbx = await Sandbox.create()                 // long-lived microVM
const workspace = new E2BSandbox(sbx)
const llm = createAnthropicClient({ apiKey: process.env.ANTHROPIC_API_KEY, model: 'claude-sonnet-4-6' })

// Runs to completion regardless of any function timeout.
for await (const msg of query({ prompt, workspace, llm })) { /* stream to the browser */ }
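One common way to fill in the "stream to the browser" step is Server-Sent Events. The sketch below shows only the framing, under the assumption that each message is JSON-serializable; `toSseFrame` and `parseSseFrame` are hypothetical helpers, and the page does not say which transport the SDK or anyclaude-react actually uses.

```typescript
// Sketch only: assumes messages are plain JSON-serializable objects.
function toSseFrame(msg: Record<string, unknown>): string {
  // One SSE event: a `data:` line with the JSON payload, ended by a blank line.
  // JSON.stringify escapes newlines inside strings, so the payload stays on one line.
  return `data: ${JSON.stringify(msg)}\n\n`
}

// Inverse operation, as a browser-side reader might do for a single frame.
function parseSseFrame(frame: string): unknown {
  const prefix = 'data: '
  if (!frame.startsWith(prefix)) throw new Error('not a data frame')
  return JSON.parse(frame.slice(prefix.length).trim())
}

const sseFrame = toSseFrame({ type: 'result', subtype: 'success' })
console.log(JSON.stringify(sseFrame)) // "data: {\"type\":\"result\",\"subtype\":\"success\"}\n\n"
```

On the server, each message from the `for await` loop would be written to the response as one frame, with the response's Content-Type set to text/event-stream.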
