Claude API — TypeScript SDK#
What it is#
@anthropic-ai/sdk is the official TypeScript client for the Anthropic API — a thin, fully typed wrapper around fetch that exposes messages.create, messages.stream, the Batch API, the Files API, and the rest of the Claude surface. It runs on Node 18+, Deno, Bun, Cloudflare Workers, and modern browsers (with a server-side proxy). Reach for it when your stack is TypeScript/JavaScript instead of Python; the request/response shapes are identical, only the language idioms differ.
Install#
npm install @anthropic-ai/sdk
Output:
added 1 package, and audited 12 packages in 1s
Or with pnpm / bun / yarn:
pnpm add @anthropic-ai/sdk
Output:
+ @anthropic-ai/sdk 0.49.0
Setup#
The client reads ANTHROPIC_API_KEY from process.env by default. Pass apiKey explicitly where the key is not available on process.env, for example Cloudflare Workers (secret bindings), Lambda layers, or browsers behind a proxy.
export ANTHROPIC_API_KEY="sk-ant-api03-…REDACTED…"
Output: (none — exits 0 on success)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
// or: new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY })
Basic message#
The minimum call. Use await client.messages.create({ ... }) with model, max_tokens, and a messages array. The response is typed as Anthropic.Message; response.content[0] is a union of block types, so narrow on its type before reading text.
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: "Explain TypeScript generics in one paragraph." }],
});
const first = response.content[0];
if (first.type === "text") {
console.log(first.text);
}
console.log(response.usage);
Output:
TypeScript generics let a function or type accept other types as parameters, so the
same code can work with different shapes while keeping full static type safety.
Inside the function, the placeholder type behaves like any other type; at the call
site, TypeScript infers the concrete type from the arguments.
{ input_tokens: 14, output_tokens: 65, cache_creation_input_tokens: 0, cache_read_input_tokens: 0 }
Discriminated unions for content blocks#
response.content is Array<TextBlock | ToolUseBlock | ThinkingBlock | ...>. Use block.type as a discriminant to narrow safely.
for (const block of response.content) {
switch (block.type) {
case "text":
console.log("TEXT:", block.text);
break;
case "tool_use":
console.log("TOOL:", block.name, block.input);
break;
case "thinking":
console.log("THINK:", block.thinking.length, "chars");
break;
}
}
Output:
TEXT: Sure — here is the answer.
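One way to make such a switch exhaustive is a `never` check in the default branch, so the compiler flags any block type left unhandled. The sketch below uses simplified local stand-ins (`DemoBlock` and its members are assumptions for illustration, not the SDK's definitions) so that it runs without the package installed.

```typescript
// Simplified local stand-ins for the SDK's content block types (illustrative only).
type DemoTextBlock = { type: "text"; text: string };
type DemoToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type DemoThinkingBlock = { type: "thinking"; thinking: string };
type DemoBlock = DemoTextBlock | DemoToolUseBlock | DemoThinkingBlock;

// Returns a one-line description of a block. The `never` assignment stops
// compiling if a member is added to DemoBlock without a matching case.
function describeBlock(block: DemoBlock): string {
  switch (block.type) {
    case "text":
      return `TEXT: ${block.text}`;
    case "tool_use":
      return `TOOL: ${block.name} ${JSON.stringify(block.input)}`;
    case "thinking":
      return `THINK: ${block.thinking.length} chars`;
    default: {
      const unreachable: never = block;
      throw new Error(`Unhandled block: ${JSON.stringify(unreachable)}`);
    }
  }
}

const demoBlocks: DemoBlock[] = [
  { type: "text", text: "Sure." },
  { type: "tool_use", id: "toolu_demo", name: "get_weather", input: { location: "Toronto" } },
];
const described = demoBlocks.map(describeBlock);
// described: ["TEXT: Sure.", 'TOOL: get_weather {"location":"Toronto"}']
```

The SDK's real union has more members than this stand-in, so against the real types a non-throwing default branch that ignores unknown blocks may be the more practical choice.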
System prompt#
system accepts a string or an array of typed content blocks. The array form is required when you want to attach cache_control to portions of the system prompt.
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 512,
system: "You are a concise TypeScript reviewer. Reply in bullet points only.",
messages: [{ role: "user", content: "How should I model a result that can fail?" }],
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
- Prefer a discriminated union: `{ ok: true; value: T } | { ok: false; error: E }`.
- Avoid throwing for predictable failures — make them part of the return type.
- Use a small `Result<T, E>` helper and `match`/exhaustive switch to consume it.
Multi-turn conversation#
The API is stateless. Maintain your own message array; append each assistant turn verbatim before the next user message.
const messages: Anthropic.MessageParam[] = [
{ role: "user", content: "What is 2 + 2?" },
{ role: "assistant", content: "4" },
{ role: "user", content: "Multiply that by 10." },
];
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 256,
messages,
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
40
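Because the caller owns the transcript, a small wrapper can keep it tidy. The following is a minimal sketch with a hypothetical `Conversation` class and a local `DemoTurn` type standing in for the SDK's message param type. Rejecting consecutive same-role turns is a choice made for this sketch to catch a dropped assistant reply; it is not presented as an API requirement.

```typescript
// Local stand-in for a message param (the SDK's real type also accepts block arrays).
type DemoTurn = { role: "user" | "assistant"; content: string };

// Hypothetical helper that keeps the running transcript in one place.
class Conversation {
  private readonly turns: DemoTurn[] = [];

  // Appends a turn; throws if it repeats the previous turn's role.
  add(role: DemoTurn["role"], content: string): this {
    const last = this.turns[this.turns.length - 1];
    if (last && last.role === role) {
      throw new Error(`Two consecutive ${role} turns; append the other side first`);
    }
    this.turns.push({ role, content });
    return this;
  }

  // Returns a copy so callers cannot mutate the stored transcript.
  toMessages(): DemoTurn[] {
    return this.turns.map((t) => ({ ...t }));
  }
}

const convo = new Conversation()
  .add("user", "What is 2 + 2?")
  .add("assistant", "4")
  .add("user", "Multiply that by 10.");
const demoMessages = convo.toMessages(); // three turns, ready to pass as `messages`
```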
Streaming with helpers#
client.messages.stream(...) returns a MessageStream. The helper exposes .on("text", handler) for per-chunk text callbacks and .finalMessage() for the assembled result; iterating the stream directly yields raw events rather than text (see Low-level events).
const stream = client.messages.stream({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Count to five slowly." }],
});
// Iterating the stream yields event objects, so subscribe to "text" for plain strings.
stream.on("text", (text) => process.stdout.write(text));
// Resolves once the stream has finished and the full message is assembled.
const finalMessage = await stream.finalMessage();
console.log();
console.log("tokens:", finalMessage.usage.output_tokens);
Output:
One... two... three... four... five.
tokens: 19
Low-level events#
For full control — handling content_block_start, input_json_delta for streaming tool input, or rendering thinking blocks — iterate the raw event stream.
for await (const event of stream) {
if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
process.stdout.write(event.delta.text);
}
if (event.type === "content_block_delta" && event.delta.type === "input_json_delta") {
process.stdout.write(event.delta.partial_json);
}
}
See Streaming for the complete event reference.
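Streamed tool input arrives as JSON fragments that are only parseable once the block has finished. The sketch below shows one way to buffer `partial_json` per block index and parse at `content_block_stop`. The event types are simplified local stand-ins and `collectToolInputs` is a hypothetical helper, so treat the shapes as assumptions rather than the SDK's definitions.

```typescript
// Simplified local stand-ins for the streaming events discussed above (illustrative only).
type DemoStreamEvent =
  | { type: "content_block_start"; index: number; content_block: { type: "tool_use"; name: string } | { type: "text" } }
  | { type: "content_block_delta"; index: number; delta: { type: "input_json_delta"; partial_json: string } | { type: "text_delta"; text: string } }
  | { type: "content_block_stop"; index: number };

// Buffers partial_json fragments per block index and parses once the block stops.
function collectToolInputs(events: DemoStreamEvent[]): Map<number, unknown> {
  const buffers = new Map<number, string>();
  const parsed = new Map<number, unknown>();
  for (const ev of events) {
    if (ev.type === "content_block_start" && ev.content_block.type === "tool_use") {
      buffers.set(ev.index, "");
    } else if (ev.type === "content_block_delta" && ev.delta.type === "input_json_delta") {
      buffers.set(ev.index, (buffers.get(ev.index) ?? "") + ev.delta.partial_json);
    } else if (ev.type === "content_block_stop" && buffers.has(ev.index)) {
      const raw = buffers.get(ev.index) ?? "";
      // An empty buffer is treated as a tool call with no arguments.
      parsed.set(ev.index, raw === "" ? {} : JSON.parse(raw));
    }
  }
  return parsed;
}

const demoEvents: DemoStreamEvent[] = [
  { type: "content_block_start", index: 0, content_block: { type: "tool_use", name: "get_weather" } },
  { type: "content_block_delta", index: 0, delta: { type: "input_json_delta", partial_json: '{"location": "Tor' } },
  { type: "content_block_delta", index: 0, delta: { type: "input_json_delta", partial_json: 'onto, Canada"}' } },
  { type: "content_block_stop", index: 0 },
];
const toolInputs = collectToolInputs(demoEvents);
// toolInputs.get(0) is { location: "Toronto, Canada" }
```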
Tool use#
Tools are typed as Anthropic.Tool[]. input_schema is JSON Schema; the SDK neither checks the schema itself nor validates the model's tool input against it, so validate block.input with zod or another validator before using it.
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const tools: Anthropic.Tool[] = [{
name: "get_weather",
description: "Get current weather. Call this when the user asks about weather.",
input_schema: {
type: "object",
properties: {
location: { type: "string", description: "City and country" },
unit: { type: "string", enum: ["celsius", "fahrenheit"] },
},
required: ["location"],
},
}];
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
tools,
messages: [{ role: "user", content: "What's the weather in Toronto?" }],
});
console.log(response.stop_reason);
for (const block of response.content) {
if (block.type === "tool_use") {
console.log(block.name, block.input);
}
}
Output:
tool_use
get_weather { location: 'Toronto, Canada' }
Continue the loop#
Append the assistant turn and a tool_result user turn, then call again.
const toolUse = response.content.find(b => b.type === "tool_use");
if (toolUse && toolUse.type === "tool_use") {
const result = JSON.stringify({ temp: 12, condition: "cloudy" });
const followup = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
tools,
messages: [
{ role: "user", content: "What's the weather in Toronto?" },
{ role: "assistant", content: response.content },
{
role: "user",
content: [{ type: "tool_result", tool_use_id: toolUse.id, content: result }],
},
],
});
const first = followup.content[0];
if (first.type === "text") console.log(first.text);
}
Output:
The current weather in Toronto, Canada is 12°C and cloudy.
See Tool use for the full agentic loop, parallel tools, and tool_choice reference.
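When a response contains several tool_use blocks, each one needs a matching tool_result in the next user turn. A minimal dispatcher sketch follows; `toolHandlers`, `runTools`, and the `Demo*` types are hypothetical names introduced for illustration, and the weather handler returns the same canned data as the example above.

```typescript
// Local stand-ins for the block shapes used in the tool loop (illustrative only).
type DemoToolUse = { type: "tool_use"; id: string; name: string; input: unknown };
type DemoToolResult = { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };
type ToolHandler = (input: unknown) => string;

// Hypothetical registry keyed by tool name.
const toolHandlers = new Map<string, ToolHandler>([
  ["get_weather", () => JSON.stringify({ temp: 12, condition: "cloudy" })],
]);

// Runs each requested tool and returns one tool_result per tool_use block.
// Unknown tools and thrown errors are reported with is_error: true so the
// model can react instead of the loop crashing.
function runTools(requests: DemoToolUse[]): DemoToolResult[] {
  return requests.map((req): DemoToolResult => {
    const handler = toolHandlers.get(req.name);
    if (!handler) {
      return { type: "tool_result", tool_use_id: req.id, content: `Unknown tool: ${req.name}`, is_error: true };
    }
    try {
      return { type: "tool_result", tool_use_id: req.id, content: handler(req.input) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { type: "tool_result", tool_use_id: req.id, content: message, is_error: true };
    }
  });
}

const toolResults = runTools([
  { type: "tool_use", id: "toolu_1", name: "get_weather", input: { location: "Toronto, Canada" } },
  { type: "tool_use", id: "toolu_2", name: "get_time", input: {} },
]);
// toolResults[0] carries the weather JSON; toolResults[1] is flagged is_error: true
```

The resulting array can be used directly as the `content` of the follow-up user message.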
Zod-validated tool input#
Tool input is typed as unknown on the tool_use block. Validate with zod to get a typed object and friendly errors when the model goes off-schema.
import { z } from "zod";
const WeatherInput = z.object({
location: z.string(),
unit: z.enum(["celsius", "fahrenheit"]).optional(),
});
function handleWeather(raw: unknown): string {
const args = WeatherInput.parse(raw); // throws on bad input
return JSON.stringify({ temp: 12, condition: "cloudy", unit: args.unit ?? "celsius" });
}
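Where adding zod is not an option, a hand-written type guard can give the same narrowing for a small schema. This is a sketch only; `WeatherArgs` and `isWeatherArgs` are hypothetical names that mirror the zod schema above, and unlike `parse` the guard returns a boolean instead of throwing.

```typescript
type WeatherArgs = { location: string; unit?: "celsius" | "fahrenheit" };

// Hand-written type guard: true only when `raw` has a string location
// and, if present, a unit that is one of the two allowed values.
function isWeatherArgs(raw: unknown): raw is WeatherArgs {
  if (typeof raw !== "object" || raw === null) return false;
  const candidate = raw as Record<string, unknown>;
  if (typeof candidate.location !== "string") return false;
  return (
    candidate.unit === undefined ||
    candidate.unit === "celsius" ||
    candidate.unit === "fahrenheit"
  );
}

const rawInput: unknown = { location: "Toronto, Canada", unit: "celsius" };
const summary = isWeatherArgs(rawInput)
  ? `${rawInput.location} in ${rawInput.unit ?? "celsius"}`
  : "invalid tool input";
// summary: "Toronto, Canada in celsius"
```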
Vision — image input#
Images attach as content blocks alongside text. Pass base64 for local data or url for a public link.
import fs from "node:fs/promises";
const imageData = (await fs.readFile("chart.png")).toString("base64");
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{
role: "user",
content: [
{
type: "image",
source: { type: "base64", media_type: "image/png", data: imageData },
},
{ type: "text", text: "What trend does this chart show?" },
],
}],
});
const first = response.content[0];
if (first.type === "text") console.log(first.text);
Output:
The chart shows a steady upward trend in monthly active users, with the
sharpest growth between August and October.
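The media_type has to match the actual image format, so hardcoding "image/png" only works for PNG files. A small sketch of deriving it from the file name follows; `mediaTypeFor` is a hypothetical helper, and the list assumes the supported formats are JPEG, PNG, GIF, and WebP.

```typescript
// Image media types assumed to be accepted for base64 sources.
type ImageMediaType = "image/jpeg" | "image/png" | "image/gif" | "image/webp";

const MEDIA_TYPES = new Map<string, ImageMediaType>([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
]);

// Maps a file name to its media type by extension, or throws for anything else.
function mediaTypeFor(fileName: string): ImageMediaType {
  const ext = fileName.split(".").pop()?.toLowerCase() ?? "";
  const found = MEDIA_TYPES.get(ext);
  if (!found) throw new Error(`Unsupported image format: ${fileName}`);
  return found;
}

const chartType = mediaTypeFor("chart.png"); // "image/png"
```

Extension sniffing trusts the file name; inspecting the file's leading bytes would be more robust when inputs come from users.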
PDF input#
PDFs work like images — Claude reads them visually with layout intact. Up to 100 pages per document, 32 MB per file.
import fs from "node:fs/promises";
const pdf = (await fs.readFile("report.pdf")).toString("base64");
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 2048,
messages: [{
role: "user",
content: [
{
type: "document",
source: { type: "base64", media_type: "application/pdf", data: pdf },
cache_control: { type: "ephemeral" },
},
{ type: "text", text: "Summarise in five bullet points." },
],
}],
});
For very large or reused PDFs, upload once via the Files API and reference by file_id.
Extended thinking#
Toggle with thinking: { type: "enabled", budget_tokens: N }. Requires temperature: 1 (the default) and a budget ≥ 1024.
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 16000,
thinking: { type: "enabled", budget_tokens: 10000 },
messages: [{
role: "user",
content: "Twelve coins, one is heavier or lighter. Find it in 3 weighings.",
}],
});
for (const block of response.content) {
if (block.type === "thinking") console.log(`[Thinking: ${block.thinking.length} chars]`);
if (block.type === "text") console.log(block.text);
}
Output:
[Thinking: 4218 chars]
Yes — a classic decision tree solves it in three weighings…
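A pre-flight check can catch a bad budget before the request is sent. The sketch below encodes the 1024 minimum stated above and additionally assumes the budget must stay below max_tokens, since thinking tokens count toward that limit; `checkThinkingBudget` is a hypothetical helper.

```typescript
const MIN_THINKING_BUDGET = 1024;

// Returns null when the pair looks valid, otherwise a human-readable reason.
function checkThinkingBudget(maxTokens: number, budgetTokens: number): string | null {
  if (!Number.isInteger(budgetTokens) || budgetTokens < MIN_THINKING_BUDGET) {
    return `budget_tokens must be an integer >= ${MIN_THINKING_BUDGET}`;
  }
  // Assumption: the thinking budget has to leave room for the visible answer.
  if (budgetTokens >= maxTokens) {
    return "budget_tokens must be less than max_tokens";
  }
  return null;
}

const budgetProblem = checkThinkingBudget(16000, 10000); // null: the values used above pass
```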
Prompt caching#
Mark expensive prefixes with cache_control: { type: "ephemeral" }. The first call writes the cache; subsequent calls within 5 minutes read it at ~10% input cost.
import fs from "node:fs/promises";
const docs = await fs.readFile("manual.txt", "utf8");
const response = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
system: [
{ type: "text", text: "You answer questions about the manual." },
{ type: "text", text: docs, cache_control: { type: "ephemeral" } },
],
messages: [{ role: "user", content: "What does section 4.2 say about resets?" }],
});
console.log(response.usage);
Output (first call):
{ input_tokens: 30, output_tokens: 142, cache_creation_input_tokens: 24512, cache_read_input_tokens: 0 }
Output (second call, same prefix, within 5 min):
{ input_tokens: 30, output_tokens: 138, cache_creation_input_tokens: 0, cache_read_input_tokens: 24512 }
See Prompt caching for breakpoint placement and cost math.
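To see what the two usage objects above mean for the bill, input usage can be converted to base-price token equivalents. This is a rough sketch: the 0.1 read multiplier comes from the ~10% figure above, while the 1.25 write multiplier is an assumption for the 5-minute cache that should be checked against current pricing. `billableInputTokens` and `CacheUsage` are hypothetical names.

```typescript
type CacheUsage = {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
};

// Assumed multipliers relative to the base input price (verify before relying on them).
const CACHE_READ_MULTIPLIER = 0.1;
const CACHE_WRITE_MULTIPLIER = 1.25;

// Expresses input usage as the number of base-price tokens it costs.
function billableInputTokens(u: CacheUsage): number {
  return (
    u.input_tokens +
    u.cache_creation_input_tokens * CACHE_WRITE_MULTIPLIER +
    u.cache_read_input_tokens * CACHE_READ_MULTIPLIER
  );
}

const firstCall = billableInputTokens({
  input_tokens: 30, cache_creation_input_tokens: 24512, cache_read_input_tokens: 0,
});
const secondCall = billableInputTokens({
  input_tokens: 30, cache_creation_input_tokens: 0, cache_read_input_tokens: 24512,
});
// firstCall is 30670; secondCall is about 2481.2
```

Under these assumptions an uncached call would cost 24,542 token equivalents each time, so the cache write pays for itself on the first reuse.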
Token counting#
Estimate input tokens (including system, tools, images) before sending.
const count = await client.messages.countTokens({
model: "claude-opus-4-7",
messages: [{ role: "user", content: "Explain quantum entanglement." }],
});
console.log(count.input_tokens);
Output:
8
Error handling#
The SDK throws typed errors. Catch the specific class to differentiate retryable from permanent failures.
import Anthropic, { APIError, RateLimitError, AuthenticationError } from "@anthropic-ai/sdk";
const client = new Anthropic();
try {
await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: "Hello" }],
});
} catch (err) {
if (err instanceof AuthenticationError) {
console.error("Bad API key");
} else if (err instanceof RateLimitError) {
console.error("Rate limited; back off and retry");
} else if (err instanceof APIError) {
console.error(`API error ${err.status}: ${err.message}`);
} else {
throw err;
}
}
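If the application adds its own retry layer on top of the SDK's built-in retries, it needs a rule for which failures are worth retrying and how long to wait. A minimal sketch follows; `isRetryableStatus` and `backoffMs` are hypothetical helpers, and the status list (408, 409, 429, and 5xx) is an assumption about what is typically transient.

```typescript
// Decides whether an HTTP status is worth retrying. A missing status is
// treated as a connection-level failure and therefore retryable.
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return true;
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

// Exponential backoff with a cap. Jitter is omitted so the result is
// deterministic; production code would usually add some randomness.
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const delays = [0, 1, 2, 3, 4, 5].map((attempt) => backoffMs(attempt));
// delays: [500, 1000, 2000, 4000, 8000, 8000]
```

In the catch block above, `err.status` from an APIError could be passed to `isRetryableStatus` to choose between waiting `backoffMs(attempt)` and giving up.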
Configuring retries and timeouts#
Set maxRetries (default 2) and timeout (default 10 minutes) on the client, or override them for a single request by passing request options as the second argument.
const client = new Anthropic({
  maxRetries: 5,
  timeout: 60_000, // ms
});
// Override for one slow request
const slow = await client.messages.create(
  {
    model: "claude-opus-4-7",
    max_tokens: 8000,
    messages: [{ role: "user", content: "Write a long essay." }],
  },
  { timeout: 180_000 }, // per-request options; applies to this call only
);
Raw HTTP response#
Wrap a request with .withResponse() to access the underlying Response (Fetch API). Useful for reading rate-limit headers.
const { data, response } = await client.messages
.create({
model: "claude-opus-4-7",
max_tokens: 64,
messages: [{ role: "user", content: "ping" }],
})
.withResponse();
console.log(response.status);
console.log(response.headers.get("anthropic-ratelimit-tokens-remaining"));
console.log(data.content[0]);
Output:
200
399136
{ type: 'text', text: 'pong' }
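Header values are strings, so they need parsing before they can drive throttling decisions. The sketch below reads a few rate-limit headers into numbers. `readRateLimits` and `HeaderReader` are hypothetical names, and the requests-remaining and retry-after header names are assumptions alongside the tokens-remaining header used above.

```typescript
// Minimal structural type so the helper accepts a Fetch Headers object or a test stub.
type HeaderReader = { get(name: string): string | null };

type RateLimitInfo = {
  tokensRemaining: number | null;
  requestsRemaining: number | null;
  retryAfterSeconds: number | null;
};

// Converts a header value to a finite number, or null when absent or malformed.
function toNumber(value: string | null): number | null {
  if (value === null || value.trim() === "") return null;
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

function readRateLimits(headers: HeaderReader): RateLimitInfo {
  return {
    tokensRemaining: toNumber(headers.get("anthropic-ratelimit-tokens-remaining")),
    requestsRemaining: toNumber(headers.get("anthropic-ratelimit-requests-remaining")),
    retryAfterSeconds: toNumber(headers.get("retry-after")),
  };
}

// Stub standing in for response.headers from the example above.
const stubHeaders: HeaderReader = {
  get: (name) => (name === "anthropic-ratelimit-tokens-remaining" ? "399136" : null),
};
const limits = readRateLimits(stubHeaders);
// limits.tokensRemaining is 399136; the other two fields are null
```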
Cloudflare Workers#
The SDK runs natively on Workers — no Node-specific imports. Pass the API key from a secret binding.
import Anthropic from "@anthropic-ai/sdk";
export interface Env {
ANTHROPIC_API_KEY: string;
}
export default {
async fetch(req: Request, env: Env): Promise<Response> {
const client = new Anthropic({ apiKey: env.ANTHROPIC_API_KEY });
const body = await req.json<{ message: string }>();
const result = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: body.message }],
});
const first = result.content[0];
return new Response(first.type === "text" ? first.text : "");
},
};
Stream tokens back to a browser using a ReadableStream:
export default {
async fetch(req: Request, env: Env): Promise<Response> {
const client = new Anthropic({ apiKey: env.ANTHROPIC_API_KEY });
const { message } = await req.json<{ message: string }>();
const encoder = new TextEncoder();
const stream = new ReadableStream({
async start(controller) {
const llm = client.messages.stream({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: message }],
});
// Iterating llm yields raw events; the "text" event delivers plain strings.
llm.on("text", (text) => controller.enqueue(encoder.encode(text)));
await llm.finalMessage();
controller.close();
},
});
return new Response(stream, { headers: { "Content-Type": "text/plain" } });
},
};
Bun and Deno#
Both runtimes support the SDK with no extra config.
bun add @anthropic-ai/sdk
Output:
installed @anthropic-ai/sdk@0.49.0
// deno_app.ts — Deno 1.40+
import Anthropic from "npm:@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: Deno.env.get("ANTHROPIC_API_KEY") });
const resp = await client.messages.create({
model: "claude-opus-4-7",
max_tokens: 256,
messages: [{ role: "user", content: "Hello from Deno" }],
});
console.log(resp.content[0]);
deno run --allow-net --allow-env deno_app.ts
Output:
{ type: "text", text: "Hello! How can I help you from Deno today?" }
Express streaming endpoint#
Mirror the Python FastAPI streaming pattern — forward Claude tokens straight to the HTTP response body.
import express from "express";
import Anthropic from "@anthropic-ai/sdk";
const app = express();
app.use(express.json());
const client = new Anthropic();
app.post("/chat", async (req, res) => {
res.setHeader("Content-Type", "text/plain; charset=utf-8");
const stream = client.messages.stream({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [{ role: "user", content: req.body.message }],
});
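// Iterating the stream yields raw events; the "text" event delivers plain strings.
stream.on("text", (text) => res.write(text));
await stream.finalMessage();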
res.end();
});
app.listen(3000);
Test it:
curl -N -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{"message": "Count to three."}'
Output:
One. Two. Three.
Type re-exports#
Useful types you’ll reach for repeatedly. Import as members of the default namespace.
import Anthropic from "@anthropic-ai/sdk";
type Msg = Anthropic.MessageParam;
type Tool = Anthropic.Tool;
type ToolUse = Anthropic.ToolUseBlock;
type ToolResult = Anthropic.ToolResultBlockParam;
type Message = Anthropic.Message;
type Usage = Anthropic.Usage;
type Stream = ReturnType<Anthropic["messages"]["stream"]>; // derived from the method, so it does not depend on a namespace export
Bedrock and Vertex variants#
Separate packages, @anthropic-ai/bedrock-sdk and @anthropic-ai/vertex-sdk, expose the same messages interface for Claude on AWS Bedrock and Google Vertex.
import { AnthropicBedrock } from "@anthropic-ai/bedrock-sdk";
const client = new AnthropicBedrock({ awsRegion: "us-east-1" });
const resp = await client.messages.create({
model: "anthropic.claude-opus-4-7-v1:0",
max_tokens: 256,
messages: [{ role: "user", content: "Hello from Bedrock" }],
});
import { AnthropicVertex } from "@anthropic-ai/vertex-sdk";
const client = new AnthropicVertex({ region: "us-central1", projectId: "my-gcp-project" });
Common pitfalls#
| Pitfall | Symptom | Fix |
|---|---|---|
| Accessing content[0].text without narrowing | TS2339: Property 'text' does not exist | Check block.type === "text" first |
| Missing await on async call | Returns a Promise, breaks downstream code | Request methods such as messages.create return promises; await them |
| Hardcoded key in source | Leaks via git | Use ANTHROPIC_API_KEY or a secret binding |
| Streaming without finalisation | Connection hangs | await stream.finalMessage() after consuming the stream, and always call res.end() |
| Browser bundle includes SDK | API key exposed in JS | Run the SDK server-side; only ship the proxy endpoint to the browser |
| JSON.stringify on Map / class | Tool result content arrives empty ({}) | Stringify plain objects only |
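The last row is easy to reproduce. A short sketch, using a hypothetical `readings` map, shows the lossy call next to one possible fix:

```typescript
const readings = new Map<string, number>([
  ["temp", 12],
  ["humidity", 40],
]);

// Map entries are not own enumerable properties, so they are silently dropped.
const lossy = JSON.stringify(readings); // "{}"

// Converting to a plain object first preserves the entries.
const faithful = JSON.stringify(Object.fromEntries(readings)); // '{"temp":12,"humidity":40}'
```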
Common recipes#
Typed message builder#
import Anthropic from "@anthropic-ai/sdk";
function userTurn(text: string): Anthropic.MessageParam {
return { role: "user", content: text };
}
function assistantTurn(blocks: Anthropic.ContentBlock[]): Anthropic.MessageParam {
return { role: "assistant", content: blocks };
}
function toolResult(id: string, content: string, isError = false): Anthropic.ToolResultBlockParam {
return { type: "tool_result", tool_use_id: id, content, is_error: isError };
}
Result helper#
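import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };
// Wraps a non-streaming call so failures come back as values instead of exceptions.
async function tryClaude(args: Anthropic.MessageCreateParamsNonStreaming): Promise<Result<Anthropic.Message>> {
  try {
    return { ok: true, value: await client.messages.create(args) };
  } catch (err) {
    // Normalise non-Error throwables so the error field always holds an Error.
    return { ok: false, error: err instanceof Error ? err : new Error(String(err)) };
  }
}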
Cost estimator#
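// Prices in USD per million tokens. Illustrative values; check current pricing before relying on them.
const PRICES = {
  "claude-opus-4-7": { in: 15.0, out: 75.0 },
  "claude-sonnet-4-6": { in: 3.0, out: 15.0 },
  "claude-haiku-4-5": { in: 0.8, out: 4.0 },
} as const;
// Estimates cost from uncached input and output tokens only; cache reads and writes are not included.
function estimateCost(model: keyof typeof PRICES, usage: Anthropic.Usage): number {
  const p = PRICES[model];
  return (usage.input_tokens * p.in + usage.output_tokens * p.out) / 1_000_000;
}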
See also#
- Python SDK — same API in Python.
- Streaming — SSE event reference.
- Tool use — schema, agentic loops, tool_choice.
- Batch API — bulk processing at 50% cost.
- Prompt caching — TTL and cost math.
- Files API — upload once, reference many.