Claude API — Files#

What it is#

The Files API lets you upload PDFs, images, and text once and reference them by file_id in any subsequent messages.create call, with no base64 re-encoding and no re-sending of the bytes on each request. The file's tokens still count toward context on every reference; they are served from cache only when you mark the block with cache_control (see Combined with prompt caching). It is the right pattern for RAG over a fixed corpus, repeated questions against the same large PDF, multi-step agent workflows that need to revisit a document, and any workload where the same file is sent to Claude more than once. Files have a 32 MB per-file size cap; PDFs may be up to 100 pages. Uploads count against a per-organization storage budget (multi-GB tier; see your dashboard for current limits).

[!NOTE] The Files API is in beta. Send the header anthropic-beta: files-api-2025-04-14 on every call (upload, list, retrieve, delete, and any messages.create that references a file_id). The SDK helpers client.beta.files.* set this for you automatically.

When to use it#

Scenario	Files API?
Same PDF referenced in 5+ requests	Yes
Persistent document store for an agent	Yes
RAG over a fixed corpus	Yes
Large image library	Yes
One-shot question on a brand-new PDF	No — inline base64 is simpler
Per-user upload with strict privacy isolation	Yes (one file per user)
Files > 32 MB	Pre-chunk; Files API will reject

Supported file types#

Type	MIME	Notes
PDF	`application/pdf`	Up to 100 pages, 32 MB
Image	`image/jpeg`, `image/png`, `image/gif`, `image/webp`	32 MB; same limits as inline images
Text	`text/plain`, `text/markdown`	UTF-8 encoded

Limits#

Limit	Value
Max file size	32 MB
PDF max pages	100
Org storage budget	Tier-dependent (see dashboard)
File expiration	None — files persist until you delete them
Allowed in batch API	Yes
Counts against context window	Yes — same as inline; tokens for the doc are billed every reference
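
The limits and MIME types in the tables above can be checked locally before spending an upload. The following is a minimal sketch, not part of the SDK: preflight and preflight_path are hypothetical helper names, the byte value of "32 MB" is an assumption, and the PDF page limit is not checked because that would need a PDF parser.

```python
import mimetypes
from pathlib import Path

# Values copied from the tables above. Treating "32 MB" as 32 * 1024**2 bytes
# is an assumption; the API may count it differently.
MAX_FILE_BYTES = 32 * 1024 * 1024
SUPPORTED_MIME_TYPES = {
    "application/pdf",
    "image/jpeg", "image/png", "image/gif", "image/webp",
    "text/plain", "text/markdown",
}


def preflight(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    suffix = Path(filename).suffix.lower()
    if suffix in {".md", ".markdown"}:
        # Not every platform's MIME table knows Markdown, so map it by hand.
        mime = "text/markdown"
    else:
        mime = mimetypes.guess_type(filename)[0]

    problems = []
    if mime not in SUPPORTED_MIME_TYPES:
        problems.append(f"unsupported type: {mime}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"too large: {size_bytes} bytes > {MAX_FILE_BYTES}")
    return problems


def preflight_path(path: Path) -> list[str]:
    """Convenience wrapper that reads the name and size from disk."""
    return preflight(path.name, path.stat().st_size)
```

Calling preflight_path(Path("manual.pdf")) before client.beta.files.upload lets a script skip or pre-chunk a file instead of waiting for the API to reject it.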

Upload — Python#

client.beta.files.upload returns a file metadata object (FileMetadata in the Python SDK) whose id you reference elsewhere. Pass either an open binary file handle or a (filename, bytes, mime_type) tuple.

import anthropic

client = anthropic.Anthropic()

# The with-block closes the file handle once the upload completes.
with open("manual.pdf", "rb") as fh:
    uploaded = client.beta.files.upload(file=fh)

print(uploaded.id)
print(uploaded.filename)
print(uploaded.size_bytes)
print(uploaded.mime_type)
print(uploaded.created_at)

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z
manual.pdf
4271088
application/pdf
2026-05-25T13:21:08Z

Upload with explicit metadata#

When the source is bytes already in memory (e.g. a generated PDF), pass a (filename, bytes, mime_type) tuple.

from pathlib import Path

pdf_bytes = Path("manual.pdf").read_bytes()
uploaded = client.beta.files.upload(
    file=("manual.pdf", pdf_bytes, "application/pdf"),
)
print(uploaded.id)

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z

Upload — TypeScript#

The TypeScript SDK accepts File, Blob, ReadStream, or the toFile() helper (which wraps a Buffer / Uint8Array).

import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "node:fs";

const client = new Anthropic();

const uploaded = await client.beta.files.upload({
  file: fs.createReadStream("manual.pdf"),
});

console.log(uploaded.id, uploaded.filename, uploaded.size_bytes);

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088

With an in-memory buffer:

const buffer = await fs.promises.readFile("report.pdf");
const uploaded = await client.beta.files.upload({
  file: await toFile(buffer, "report.pdf", { type: "application/pdf" }),
});

Upload via curl#

For ad-hoc uploads or scripts in languages without an SDK, use a multipart POST.

curl https://api.anthropic.com/v1/files \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: files-api-2025-04-14" \
    -F "file=@manual.pdf"

Output:

{
  "id": "file_01XVnKzQp8mN7vF4LqJ2cR3Z",
  "filename": "manual.pdf",
  "mime_type": "application/pdf",
  "size_bytes": 4271088,
  "created_at": "2026-05-25T13:21:08Z",
  "downloadable": false
}

Reference a file in a message#

Once uploaded, reference the file by file_id in any document or image content block. The model treats it identically to an inline document — same context budget, same visual reading for PDFs.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
            },
            {"type": "text", "text": "Summarise section 4.2 about reset behaviour."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

print(response.content[0].text)

Output:

Section 4.2 describes two reset modes. A soft reset clears the I/O queue and
in-progress transactions but preserves the serial number, calibration data,
and firmware. A hard reset clears everything, reverts firmware to factory
defaults, and requires re-pairing the device.

Reference an image#

with open("chart.png", "rb") as fh:  # close the handle after the upload
    img = client.beta.files.upload(file=fh)

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "file", "file_id": img.id},
            },
            {"type": "text", "text": "What does this chart show?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.content[0].text)

Output:

The chart shows monthly revenue from January through December, with steady
growth from $1.2M to $2.1M and a notable spike in Q4 to $2.8M.
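
The two examples above differ only in the block type: images go in an image block, PDFs and text in a document block. A small sketch of that choice follows; file_content_block and user_turn are hypothetical helper names, and the MIME type is assumed to come from the uploaded file's mime_type field.

```python
def file_content_block(file_id: str, mime_type: str) -> dict:
    """Build the content block that references an uploaded file by id."""
    # Images use an "image" block; PDFs and text use a "document" block.
    block_type = "image" if mime_type.startswith("image/") else "document"
    return {"type": block_type, "source": {"type": "file", "file_id": file_id}}


def user_turn(question: str, files: list[tuple[str, str]]) -> dict:
    """Build one user message; `files` is a list of (file_id, mime_type) pairs."""
    blocks = [file_content_block(fid, mime) for fid, mime in files]
    # Files come first, the question last, matching the examples above.
    return {"role": "user", "content": [*blocks, {"type": "text", "text": question}]}
```

The returned dict can be passed as one element of messages, for example messages=[user_turn("What does this chart show?", [(img.id, img.mime_type)])].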

Reference from TypeScript#

const response = await client.beta.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 2048,
  messages: [{
    role: "user",
    content: [
      { type: "document", source: { type: "file", file_id: uploaded.id } },
      { type: "text", text: "What is the warranty period?" },
    ],
  }],
}, {
  headers: { "anthropic-beta": "files-api-2025-04-14" },
});

const first = response.content[0];
if (first.type === "text") console.log(first.text);

Output:

The standard warranty period is 24 months from the date of purchase, with an
optional extended-warranty plan that adds another 36 months.

Citations#

Enable citations per-document with citations: { enabled: true }. Each text block in Claude’s response carries a citations array linking back to the exact span in the cited file — page numbers for PDFs, character ranges for text.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "title": "Device Manual",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "List the three supported reset modes with the exact wording from the manual."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

for block in response.content:
    if block.type != "text":
        continue
    print(block.text)
    for cite in block.citations or []:
        print(f"  -> p.{cite.start_page_number}-{cite.end_page_number}: {cite.cited_text!r}")

Output:

The manual lists three reset modes:

1. **Soft reset** — "clears the I/O queue but preserves calibration data."
  -> p.12-12: 'clears the I/O queue but preserves calibration data'
2. **Hard reset** — "reverts firmware to factory defaults."
  -> p.13-13: 'reverts firmware to factory defaults'
3. **Recovery reset** — "loaded from the recovery partition."
  -> p.13-13: 'loaded from the recovery partition'
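
The loop above assumes every citation is a PDF page citation. Since text files yield character ranges instead, a formatter that branches on the citation's type is a little more robust. This is a sketch: describe_citation is a hypothetical helper, and the start_char_index / end_char_index field names for text citations are stated here as an assumption about the response shape.

```python
def describe_citation(cite) -> str:
    """Render one citation as 'location: quoted text', whatever its kind."""
    if cite.type == "page_location":
        # PDF citations carry page numbers, printed exactly as returned.
        where = f"pages {cite.start_page_number}-{cite.end_page_number}"
    elif cite.type == "char_location":
        # Plain-text citations carry character offsets (assumed field names).
        where = f"chars {cite.start_char_index}-{cite.end_char_index}"
    else:
        # Unknown kinds fall back to the type name instead of raising.
        where = cite.type
    return f"{where}: {cite.cited_text!r}"
```

Inside the loop, print(f"  -> {describe_citation(cite)}") would replace the page-only f-string.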

List files#

client.beta.files.list pages through uploaded files, with optional after_id / before_id cursors. In Python, iterating over the result fetches further pages automatically.

for f in client.beta.files.list(limit=20):
    print(f.id, f.filename, f.size_bytes, f.created_at)

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088 2026-05-25T13:21:08Z
file_01XVnJyPo7lM6uE3KpI1bQ2Y chart.png    284910 2026-05-25T11:04:33Z
file_01XVnHxNm6kL5tD2JoH0aP1X spec.md       12480 2026-05-24T19:51:02Z

const files = await client.beta.files.list({ limit: 20 });
for (const f of files.data) {
  console.log(f.id, f.filename, f.size_bytes);
}
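
If you want to drive the after_id cursor yourself, for example to checkpoint a long listing, the loop can be sketched as below. iter_all_files is a hypothetical helper; it assumes each page exposes a data list and a has_more flag, so check both against your SDK version.

```python
def iter_all_files(client, page_size: int = 100):
    """Yield every file, requesting one page at a time via the after_id cursor."""
    after_id = None
    while True:
        kwargs = {"limit": page_size}
        if after_id is not None:
            kwargs["after_id"] = after_id
        page = client.beta.files.list(**kwargs)
        yield from page.data
        # Stop when the server reports no further pages (assumed `has_more` field)
        # or when a page comes back empty.
        if not page.has_more or not page.data:
            return
        # The next page starts after the last item already seen.
        after_id = page.data[-1].id
```

Here client is assumed to be an anthropic.Anthropic() instance; the function only touches client.beta.files.list.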

Retrieve metadata#

Fetch a single file’s metadata by ID.

f = client.beta.files.retrieve(uploaded.id)
print(f.filename, f.size_bytes, f.mime_type, f.created_at)

Output:

manual.pdf 4271088 application/pdf 2026-05-25T13:21:08Z

Delete#

Files are billable storage — clean them up when no longer needed.

client.beta.files.delete(uploaded.id)
print("deleted")

Output:

deleted

await client.beta.files.delete(uploaded.id);
console.log("deleted");

Output:

deleted

Combined with prompt caching#

When you reference a file by file_id, the file’s tokens still count toward context — but they cache the same way inline documents do. Attach cache_control to the document block to make the file’s content (which is large and stable) a cached prefix.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "cache_control": {"type": "ephemeral"},
            },
            {"type": "text", "text": "What does the warranty say?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.usage)

Output (first call — writes cache):

Usage(input_tokens=24, output_tokens=78, cache_creation_input_tokens=18420, cache_read_input_tokens=0)

Output (second call within 5 min on the same file):

Usage(input_tokens=24, output_tokens=82, cache_creation_input_tokens=0, cache_read_input_tokens=18420)

See Prompt caching for breakpoint placement.
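
To confirm that the cache is actually being hit, the usage fields printed above can be reduced to a single ratio. A minimal sketch, with cache_hit_ratio as a hypothetical helper name:

```python
def cache_hit_ratio(usage) -> float:
    """Fraction of this request's input tokens that were read from cache."""
    read = usage.cache_read_input_tokens or 0
    written = usage.cache_creation_input_tokens or 0
    # input_tokens covers only the uncached part, so total input is the sum of all three.
    total = usage.input_tokens + read + written
    return read / total if total else 0.0
```

With the two outputs above, the first call gives 0.0 (everything was written to the cache) and the second gives 18420 / 18444, roughly 0.999.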

Using files in the Batch API#

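A batched request may reference uploaded files exactly as a synchronous one would. The file's tokens are counted once per request, so pair it with prompt caching: requests that hit the cache pay the reduced cache-read rate for the document, which adds up across a 10K-item batch. Cache hits inside a batch are best-effort, because requests may run concurrently and in any order.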

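# QUESTIONS is your own list[str] of prompts, defined elsewhere.
# The beta namespace is used here for consistency with the other file-referencing calls.
batch = client.beta.messages.batches.create(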
    requests=[
        {
            "custom_id": f"q-{i}",
            "params": {
                "model": "claude-opus-4-7",
                "max_tokens": 256,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {"type": "file", "file_id": uploaded.id},
                            "cache_control": {"type": "ephemeral"},
                        },
                        {"type": "text", "text": q},
                    ],
                }],
            },
        }
        for i, q in enumerate(QUESTIONS)
    ],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(batch.id)

Output:

msgbatch_01XVnKzQpZ8mN7vF4LqJ2cR3
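
Once the batch has finished, each result has to be matched back to its question through custom_id. The sketch below assumes result entries shaped like those yielded by the SDK's batch results iterator (custom_id, result.type, result.message.content); collect_answers is a hypothetical helper name.

```python
from typing import Optional


def collect_answers(results) -> dict[str, Optional[str]]:
    """Map custom_id -> answer text, or None when the request did not succeed."""
    answers: dict[str, Optional[str]] = {}
    for entry in results:
        if entry.result.type == "succeeded":
            # Join all text blocks; other block types are ignored.
            answers[entry.custom_id] = "".join(
                block.text
                for block in entry.result.message.content
                if block.type == "text"
            )
        else:
            # errored, canceled, or expired requests carry no message.
            answers[entry.custom_id] = None
    return answers
```

Any custom_id that maps to None can be resubmitted in a follow-up batch; the uploaded file stays available for that retry because it persists until deleted.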

Multi-file RAG pattern#

Upload your entire corpus once, then attach the relevant subset (selected by an external retriever — vector DB, BM25, full-text) to each user turn.

import anthropic, json, pathlib

client = anthropic.Anthropic()

# 1. One-time ingest
corpus = pathlib.Path("docs").glob("*.pdf")
manifest: dict[str, str] = {}
for path in corpus:
    with path.open("rb") as fh:  # close each handle once its upload finishes
        f = client.beta.files.upload(file=fh)
    manifest[path.stem] = f.id
pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))

# 2. Per-query: retrieve relevant file_ids however you like
def answer(question: str, file_ids: list[str]) -> str:
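    # One document block per file. Only the last block carries cache_control:
    # a request accepts at most 4 cache breakpoints, and a breakpoint on the
    # final document caches the whole prefix before it. The cache is reused
    # only when the same files are sent in the same order.
    docs = [
        {"type": "document", "source": {"type": "file", "file_id": fid}}
        for fid in file_ids
    ]
    if docs:
        docs[-1]["cache_control"] = {"type": "ephemeral"}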
    resp = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=2048,
        messages=[{"role": "user", "content": [*docs, {"type": "text", "text": question}]}],
        extra_headers={"anthropic-beta": "files-api-2025-04-14"},
    )
    return resp.content[0].text
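
The retriever itself is outside the scope of the Files API, but a toy stand-in makes the pattern above testable end to end. The sketch below scores each manifest entry by keyword overlap between the question and the file stem; pick_file_ids is a hypothetical helper, and a real system would swap in a vector DB or BM25 index.

```python
import re


def pick_file_ids(question: str, manifest: dict[str, str], k: int = 3) -> list[str]:
    """Return up to k file_ids whose stem shares the most words with the question."""
    q_words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for stem, file_id in manifest.items():
        stem_words = set(re.findall(r"[a-z0-9]+", stem.lower()))
        score = len(q_words & stem_words)
        if score:
            scored.append((score, stem, file_id))
    # Highest overlap first; ties broken alphabetically so the order is stable,
    # which also keeps the cached prefix identical across repeated queries.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [file_id for _, _, file_id in scored[:k]]
```

With a manifest loaded from manifest.json, answer(question, pick_file_ids(question, manifest)) ties the two halves together.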

Tool result content with a file#

A tool whose result is a generated PDF or chart can return it as a document/image block with the file’s id — Claude reads it before continuing.

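def render_chart_tool(tool_use_id: str, args: dict) -> dict:
    # tool_use_id comes from the tool_use block in Claude's response,
    # not from the tool's input arguments.
    # ... generate plot.png from args ...
    with open("plot.png", "rb") as fh:
        f = client.beta.files.upload(file=fh)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,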
        "content": [
            {"type": "text", "text": "Chart generated for Q4 metrics."},
            {"type": "image", "source": {"type": "file", "file_id": f.id}},
        ],
    }

Privacy and lifecycle#

Property	Behaviour
Visibility	Files are scoped to your workspace — not shared across orgs
Retention	Files persist until you delete them or until the workspace is wound down
Training	Per Anthropic policy, Files API data is not used for model training
Encryption	Encrypted at rest and in transit
Auditing	All file operations are logged in your org audit log

[!WARNING] Files are workspace-scoped — multiple workspaces in the same org each have their own file store. Plan your workspace structure with this in mind for per-customer isolation.
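
When several customers share one workspace, isolation has to be enforced by your own code, because any API key in that workspace can reference any file in it. One way to sketch that guard is an ownership registry consulted before every request; FileRegistry is a hypothetical class, and it holds its state in memory where a real service would use a database.

```python
class FileRegistry:
    """Tracks which customer owns each file_id and refuses cross-customer use."""

    def __init__(self) -> None:
        self._owner: dict[str, str] = {}

    def register(self, customer_id: str, file_id: str) -> None:
        """Record ownership right after an upload."""
        existing = self._owner.get(file_id)
        if existing is not None and existing != customer_id:
            raise ValueError(f"{file_id} is already owned by another customer")
        self._owner[file_id] = customer_id

    def authorize(self, customer_id: str, file_ids: list[str]) -> list[str]:
        """Return file_ids unchanged, or raise if any is not owned by the customer."""
        for file_id in file_ids:
            if self._owner.get(file_id) != customer_id:
                raise PermissionError(f"{customer_id} may not reference {file_id}")
        return list(file_ids)
```

Passing every list of file ids through authorize before building document blocks keeps a bug in retrieval from leaking one customer's file into another's prompt. Separate workspaces remain the stronger boundary.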

Common pitfalls#

Pitfall	Symptom	Fix
Missing beta header	`400` on upload or message reference	Add `anthropic-beta: files-api-2025-04-14` on every call
Mixing `client.messages` and `client.beta.messages`	File reference rejected	Use `client.beta.messages.create` while the API is in beta
Uploading a file > 32 MB	`413 Payload Too Large`	Split or compress before upload
Referencing a deleted file	`404` mid-message	Track file lifetime; do not delete until all references are settled
Counting on storage being free	Surprise bill	Files count against your storage tier — delete unused ones
Skipping `cache_control` on the document	Re-pay for the file’s tokens every call	Mark the document block cacheable; reuse hits the cache
Treating files as ephemeral	Files persist forever until deleted	Delete from a daily cleanup job for transient files
Re-uploading the same PDF	Wasted bandwidth and storage	Hash before upload; reuse the existing `file_id`
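
The "referencing a deleted file" pitfall comes down to tracking which files are still in use. A minimal in-process sketch follows: FileLifetime is a hypothetical class that counts in-flight uses and defers deletion until the count reaches zero. It is not thread-safe and does not coordinate across processes.

```python
from collections import Counter
from contextlib import contextmanager


class FileLifetime:
    """Defers deletion of a file until no request is still using it."""

    def __init__(self, delete_fn) -> None:
        self._delete = delete_fn          # e.g. client.beta.files.delete
        self._in_use: Counter = Counter()
        self._pending: set[str] = set()

    @contextmanager
    def using(self, file_id: str):
        """Wrap any request that references file_id."""
        self._in_use[file_id] += 1
        try:
            yield file_id
        finally:
            self._in_use[file_id] -= 1
            # Last user gone and a delete was requested meanwhile: do it now.
            if self._in_use[file_id] == 0 and file_id in self._pending:
                self._pending.discard(file_id)
                self._delete(file_id)

    def request_delete(self, file_id: str) -> None:
        """Delete immediately if idle, otherwise once the last user finishes."""
        if self._in_use[file_id] > 0:
            self._pending.add(file_id)
        else:
            self._delete(file_id)
```

A request then runs inside `with lifetime.using(file_id): ...`, and cleanup code calls lifetime.request_delete(file_id) instead of deleting directly.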

Common recipes#

Idempotent upload with content-hash dedupe#

import hashlib, json, pathlib

INDEX = pathlib.Path("file_index.json")

def upload_once(path: pathlib.Path) -> str:
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    index = json.loads(INDEX.read_text()) if INDEX.exists() else {}
    if digest in index:
        return index[digest]
    with path.open("rb") as fh:  # close the handle after the upload
        f = client.beta.files.upload(file=fh)
    index[digest] = f.id
    INDEX.write_text(json.dumps(index, indent=2))
    return f.id
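
upload_once reads the whole file into memory to hash it. When many files are hashed in one run, a chunked digest keeps memory flat. A sketch, with sha256_file as a hypothetical helper that could replace the hashlib.sha256(path.read_bytes()) call:

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunk_size pieces (1 MiB by default)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps calling read() until it returns b"" at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is identical to hashing the full byte string, so an existing file_index.json stays valid.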

Bulk ingest with rate-limit#

import pathlib
import time

def ingest(paths: list[pathlib.Path], pause: float = 0.1) -> dict[str, str]:
    mapping: dict[str, str] = {}
    for p in paths:
        with p.open("rb") as fh:  # close each handle once its upload finishes
            f = client.beta.files.upload(file=fh)
        mapping[p.stem] = f.id
        print(f"uploaded {p.name} -> {f.id}")
        time.sleep(pause)
    return mapping
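
A fixed pause spaces requests out but does not react when the API pushes back. Exponential backoff with jitter is a common complement. The sketch below is generic: with_backoff is a hypothetical helper, and the should_retry predicate is where you would check for a rate-limit error from your client library. The official SDKs also retry some failures on their own, so this matters most for raw HTTP clients or long ingests.

```python
import random
import time


def with_backoff(fn, *, retries: int = 5, base_delay: float = 0.5,
                 max_delay: float = 30.0, should_retry=lambda exc: True,
                 sleep=time.sleep):
    """Call fn(); on a retryable exception wait 0.5s, 1s, 2s, ... (plus jitter) and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors the caller marks as fatal.
            if attempt == retries or not should_retry(exc):
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter of up to half the delay avoids synchronized retries.
            sleep(delay + random.uniform(0, delay / 2))
```

Inside ingest, the upload would be wrapped in a small function that opens the file and calls client.beta.files.upload, and that function passed to with_backoff, so each retry reopens the file from the start.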

TTL cleanup job#

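Delete files older than 30 days. The job below checks age only; whether a file has been re-referenced since upload has to come from your own usage records.

import datetime as dt

CUTOFF = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=30)

def created_time(f) -> dt.datetime:
    # created_at may be a datetime or an ISO-8601 string depending on the SDK version.
    created = f.created_at
    if isinstance(created, str):
        created = dt.datetime.fromisoformat(created.replace("Z", "+00:00"))
    return created

# Collect first, then delete, so deletions do not disturb pagination.
stale = [f for f in client.beta.files.list(limit=200) if created_time(f) < CUTOFF]
for f in stale:
    client.beta.files.delete(f.id)
    print(f"deleted {f.id} ({f.filename})")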

Show what is in storage#

total = 0
by_type: dict[str, int] = {}
for f in client.beta.files.list(limit=200):
    total += f.size_bytes
    by_type[f.mime_type] = by_type.get(f.mime_type, 0) + f.size_bytes

print(f"total: {total / 1_000_000:.1f} MB")
for mime, size in sorted(by_type.items(), key=lambda x: -x[1]):
    print(f"  {mime}: {size / 1_000_000:.1f} MB")

Output:

total: 412.7 MB
  application/pdf: 380.2 MB
  image/png: 18.4 MB
  text/markdown: 14.1 MB
