
Claude API — Files

The Anthropic Files API — upload PDFs, images, and text once and reference them by `file_id` across multiple messages, with citations, lifecycle management, and Workbench integration.



What it is#

The Files API lets you upload PDFs, images, and text once and reference them by file_id in any subsequent messages.create call, with no base64 re-encoding and no repeated upload bandwidth. Referencing a file does not by itself cache its tokens; pair it with prompt caching (see "Combined with prompt caching") to avoid paying full input price on every call. It is the right pattern for RAG over a fixed corpus, repeated questions against the same large PDF, multi-step agent workflows that need to revisit a document, and any workload where the same file is sent to Claude more than once. Files have a 32 MB per-file size cap; PDFs may be up to 100 pages. Uploads count against a per-organization storage budget (multi-GB tier; see your dashboard for current limits).

[!NOTE] The Files API is in beta. Send the header anthropic-beta: files-api-2025-04-14 on every call (upload, list, retrieve, delete, and any messages.create that references a file_id). The SDK helpers client.beta.files.* set this for you automatically.

When to use it#

| Scenario | Files API? |
| --- | --- |
| Same PDF referenced in 5+ requests | Yes |
| Persistent document store for an agent | Yes |
| RAG over a fixed corpus | Yes |
| Large image library | Yes |
| One-shot question on a brand-new PDF | No; inline base64 is simpler |
| Per-user upload with strict privacy isolation | Yes (one file per user) |
| Files > 32 MB | Pre-chunk; Files API will reject |

Supported file types#

| Type | MIME | Notes |
| --- | --- | --- |
| PDF | application/pdf | Up to 100 pages, 32 MB |
| Image | image/jpeg, image/png, image/gif, image/webp | 32 MB; same limits as inline images |
| Text | text/plain, text/markdown | UTF-8 encoded |

Limits#

| Limit | Value |
| --- | --- |
| Max file size | 32 MB |
| PDF max pages | 100 |
| Org storage budget | Tier-dependent (see dashboard) |
| File expiration | None; files persist until you delete them |
| Allowed in batch API | Yes |
| Counts against context window | Yes, same as inline; tokens for the doc are billed on every reference |
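
The limits above can be checked locally before spending bandwidth on an upload that would be rejected. The following is a minimal sketch, not part of the SDK: `preflight` and its constants are hypothetical names, the size cap and MIME list are copied from the tables above (with 32 MB read as 32 × 1024 × 1024 bytes, which is an assumption), and the PDF page count is not checked because that would need a PDF parser.

```python
import mimetypes
from pathlib import Path

# Values copied from the tables above; confirm them against the current docs.
MAX_FILE_BYTES = 32 * 1024 * 1024  # assumption: "32 MB" means binary megabytes
SUPPORTED_MIME_TYPES = {
    "application/pdf",
    "image/jpeg", "image/png", "image/gif", "image/webp",
    "text/plain", "text/markdown",
}
# Extensions resolved here first, because the local mimetypes database
# does not know them on every platform.
EXTRA_TYPES = {".md": "text/markdown", ".webp": "image/webp"}


def preflight(path: Path) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems: list[str] = []
    mime = EXTRA_TYPES.get(path.suffix.lower()) or mimetypes.guess_type(path.name)[0]
    if mime not in SUPPORTED_MIME_TYPES:
        problems.append(f"unsupported type: {mime or 'unknown'}")
    size = path.stat().st_size
    if size > MAX_FILE_BYTES:
        problems.append(f"too large: {size} bytes > {MAX_FILE_BYTES}")
    return problems
```

Call `preflight(path)` before `client.beta.files.upload` and skip or pre-chunk any file that comes back with a non-empty list.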

Upload — Python#

client.beta.files.upload returns a file metadata object whose id you reference elsewhere. Pass either an open binary file handle or a (filename, bytes, mime_type) tuple.

import anthropic

client = anthropic.Anthropic()

# Open the file in a context manager so the handle is closed after the upload
with open("manual.pdf", "rb") as fh:
    uploaded = client.beta.files.upload(file=fh)

print(uploaded.id)
print(uploaded.filename)
print(uploaded.size_bytes)
print(uploaded.mime_type)
print(uploaded.created_at)

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z
manual.pdf
4271088
application/pdf
2026-05-25T13:21:08Z

Upload with explicit metadata#

When the source is bytes (e.g. generated PDF in memory), pass a (filename, bytes, mime) tuple.

from pathlib import Path

pdf_bytes = Path("manual.pdf").read_bytes()
uploaded = client.beta.files.upload(
    file=("manual.pdf", pdf_bytes, "application/pdf"),
)
print(uploaded.id)

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z

Upload — TypeScript#

The TypeScript SDK accepts File, Blob, ReadStream, or the toFile() helper (which wraps a Buffer / Uint8Array).

import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "node:fs";

const client = new Anthropic();

const uploaded = await client.beta.files.upload({
  file: fs.createReadStream("manual.pdf"),
});

console.log(uploaded.id, uploaded.filename, uploaded.size_bytes);

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088

With an in-memory buffer:

const buffer = await fs.promises.readFile("report.pdf");
const uploaded = await client.beta.files.upload({
  file: await toFile(buffer, "report.pdf", { type: "application/pdf" }),
});

Upload via curl#

For ad-hoc uploads or scripts in languages without an SDK, use a multipart POST.

curl https://api.anthropic.com/v1/files \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: files-api-2025-04-14" \
    -F "file=@manual.pdf"

Output:

{
  "id": "file_01XVnKzQp8mN7vF4LqJ2cR3Z",
  "filename": "manual.pdf",
  "mime_type": "application/pdf",
  "size_bytes": 4271088,
  "created_at": "2026-05-25T13:21:08Z",
  "downloadable": false
}
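
For environments where neither curl nor an SDK is available, the same multipart POST can be assembled with the Python standard library. This is a sketch under assumptions: `build_upload_request` is a hypothetical helper, the form field name `file` and the three headers are taken from the curl call above, and the filename is assumed to contain no double quotes. The function only builds the request; it does not send it.

```python
import urllib.request
import uuid

FILES_URL = "https://api.anthropic.com/v1/files"
FILES_BETA = "files-api-2025-04-14"


def build_upload_request(filename: str, data: bytes, mime_type: str,
                         api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload request."""
    boundary = uuid.uuid4().hex
    # One form part named "file", mirroring `-F "file=@manual.pdf"` in curl.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        FILES_URL,
        data=head + data + tail,
        method="POST",
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": FILES_BETA,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending is then a matter of passing the returned object to `urllib.request.urlopen` and decoding the JSON body.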

Reference a file in a message#

Once uploaded, reference the file by file_id in any document or image content block. The model treats it identically to an inline document — same context budget, same visual reading for PDFs.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
            },
            {"type": "text", "text": "Summarise section 4.2 about reset behaviour."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

print(response.content[0].text)

Output:

Section 4.2 describes two reset modes. A soft reset clears the I/O queue and
in-progress transactions but preserves the serial number, calibration data,
and firmware. A hard reset clears everything, reverts firmware to factory
defaults, and requires re-pairing the device.

Reference an image#

img = client.beta.files.upload(file=open("chart.png", "rb"))

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "file", "file_id": img.id},
            },
            {"type": "text", "text": "What does this chart show?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.content[0].text)

Output:

The chart shows monthly revenue from January through December, with steady
growth from $1.2M to $2.1M and a notable spike in Q4 to $2.8M.
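
The document and image examples differ only in the block `type`, so a small helper can pick it from the MIME type recorded at upload. This is a sketch, not an SDK function: `file_content_block` and `user_turn` are hypothetical names, and routing every non-image type (PDF and text) to a `document` block is an assumption based on the two examples above.

```python
def file_content_block(file_id: str, mime_type: str) -> dict:
    """Build a content block that references an uploaded file by ID."""
    # Assumption: images use an "image" block, everything else a "document" block.
    block_type = "image" if mime_type.startswith("image/") else "document"
    return {"type": block_type, "source": {"type": "file", "file_id": file_id}}


def user_turn(question: str, files: list[tuple[str, str]]) -> dict:
    """Assemble one user message: file blocks first, then the question text."""
    blocks = [file_content_block(fid, mime) for fid, mime in files]
    return {"role": "user", "content": [*blocks, {"type": "text", "text": question}]}
```

The returned dict can be passed as one element of the `messages` list in `client.beta.messages.create`.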

Reference from TypeScript#

const response = await client.beta.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 2048,
  messages: [{
    role: "user",
    content: [
      { type: "document", source: { type: "file", file_id: uploaded.id } },
      { type: "text", text: "What is the warranty period?" },
    ],
  }],
}, {
  headers: { "anthropic-beta": "files-api-2025-04-14" },
});

const first = response.content[0];
if (first.type === "text") console.log(first.text);

Output:

The standard warranty period is 24 months from the date of purchase, with an
optional extended-warranty plan that adds another 36 months.

Citations#

Enable citations per-document with citations: { enabled: true }. Each text block in Claude’s response carries a citations array linking back to the exact span in the cited file — page numbers for PDFs, character ranges for text.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "title": "Device Manual",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "List the three supported reset modes with the exact wording from the manual."},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)

for block in response.content:
    if block.type != "text":
        continue
    print(block.text)
    for cite in block.citations or []:
        # end_page_number is exclusive, so the last cited page is one less
        print(f"  -> p.{cite.start_page_number}-{cite.end_page_number - 1}: {cite.cited_text!r}")

Output:

The manual lists three reset modes:

1. **Soft reset** — "clears the I/O queue but preserves calibration data."
  -> p.12-12: 'clears the I/O queue but preserves calibration data'
2. **Hard reset** — "reverts firmware to factory defaults."
  -> p.13-13: 'reverts firmware to factory defaults'
3. **Recovery reset** — "loaded from the recovery partition."
  -> p.13-13: 'loaded from the recovery partition'
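
The loop above assumes every citation points into a PDF. When text files are cited as well, the location fields differ, so a formatter has to branch on the citation type. A minimal sketch, assuming the citation objects expose `type` values of `page_location` (with `start_page_number` / `end_page_number`) and `char_location` (with `start_char_index` / `end_char_index`), and that the end values are exclusive; `describe_citation` is a hypothetical helper and the `SimpleNamespace` objects are stand-ins for real response data.

```python
from types import SimpleNamespace


def describe_citation(cite) -> str:
    """Render one citation as 'location: quoted text'."""
    kind = getattr(cite, "type", None)
    if kind == "page_location":
        first = cite.start_page_number
        last = cite.end_page_number - 1  # assumed exclusive end
        where = f"p.{first}" if first == last else f"pp.{first}-{last}"
    elif kind == "char_location":
        where = f"chars {cite.start_char_index}-{cite.end_char_index}"
    else:
        # Unknown location type: report it rather than guessing at its fields.
        where = f"location type {kind!r}"
    return f"{where}: {cite.cited_text!r}"


# Stand-in objects shaped like the two citation types assumed above.
pdf_cite = SimpleNamespace(type="page_location", start_page_number=12,
                           end_page_number=13, cited_text="clears the I/O queue")
txt_cite = SimpleNamespace(type="char_location", start_char_index=40,
                           end_char_index=75, cited_text="reverts firmware")
print(describe_citation(pdf_cite))
print(describe_citation(txt_cite))
```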

List files#

client.beta.files.list pages through your uploaded files, with optional after_id / before_id cursors.

for f in client.beta.files.list(limit=20):
    print(f.id, f.filename, f.size_bytes, f.created_at)

Output:

file_01XVnKzQp8mN7vF4LqJ2cR3Z manual.pdf 4271088 2026-05-25T13:21:08Z
file_01XVnJyPo7lM6uE3KpI1bQ2Y chart.png    284910 2026-05-25T11:04:33Z
file_01XVnHxNm6kL5tD2JoH0aP1X spec.md       12480 2026-05-24T19:51:02Z

In TypeScript:

const files = await client.beta.files.list({ limit: 20 });
for (const f of files.data) {
  console.log(f.id, f.filename, f.size_bytes);
}
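
When calling the list endpoint without an SDK, the after_id cursor has to be followed by hand. The sketch below shows that loop in isolation. It assumes each page is a dict with a `data` list and a `has_more` flag; `iter_all` and the `fetch_page` callable are hypothetical, and `fetch_page` stands in for whatever HTTP call you make.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every file by following after_id cursors until has_more is false."""
    after_id: Optional[str] = None
    while True:
        page = fetch_page(after_id)
        items = page["data"]
        yield from items
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not items:
            return
        after_id = items[-1]["id"]
```

The Python SDK loop shown earlier (`for f in client.beta.files.list(...)`) handles this for you; the sketch is only relevant for raw HTTP clients.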

Retrieve metadata#

Fetch a single file’s metadata by ID.

f = client.beta.files.retrieve(uploaded.id)
print(f.filename, f.size_bytes, f.mime_type, f.created_at)

Output:

manual.pdf 4271088 application/pdf 2026-05-25T13:21:08Z

Delete#

Files are billable storage — clean them up when no longer needed.

client.beta.files.delete(uploaded.id)
print("deleted")

Output:

deleted

In TypeScript:

await client.beta.files.delete(uploaded.id);
console.log("deleted");

Output:

deleted

Combined with prompt caching#

When you reference a file by file_id, the file’s tokens still count toward context — but they cache the same way inline documents do. Attach cache_control to the document block to make the file’s content (which is large and stable) a cached prefix.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "cache_control": {"type": "ephemeral"},
            },
            {"type": "text", "text": "What does the warranty say?"},
        ],
    }],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(response.usage)

Output (first call — writes cache):

Usage(input_tokens=24, output_tokens=78, cache_creation_input_tokens=18420, cache_read_input_tokens=0)

Output (second call within 5 min on the same file):

Usage(input_tokens=24, output_tokens=82, cache_creation_input_tokens=0, cache_read_input_tokens=18420)

See Prompt caching for breakpoint placement.
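
To see what the two usage lines above mean for cost, the token counts can be converted into "uncached input token" units. This is a rough sketch: the multipliers (1.25× for a cache write, 0.1× for a cache read) are assumptions about 5-minute cache pricing that should be checked against the current price list, and `input_cost_units` is a hypothetical helper.

```python
# Assumed price multipliers relative to a normal input token; verify before relying on them.
CACHE_WRITE_MULTIPLIER = 1.25  # writing a 5-minute cache entry
CACHE_READ_MULTIPLIER = 0.10   # reading a cached prefix


def input_cost_units(input_tokens: int,
                     cache_creation_input_tokens: int = 0,
                     cache_read_input_tokens: int = 0) -> float:
    """Input-side cost expressed in units of one uncached input token."""
    return (
        input_tokens
        + cache_creation_input_tokens * CACHE_WRITE_MULTIPLIER
        + cache_read_input_tokens * CACHE_READ_MULTIPLIER
    )


# Token counts taken from the two Usage lines above.
first = input_cost_units(24, cache_creation_input_tokens=18420)
second = input_cost_units(24, cache_read_input_tokens=18420)
uncached = input_cost_units(24 + 18420)
print(round(first), round(second), round(uncached))  # 23049 1866 18444
```

Under these assumed multipliers the first call costs more than an uncached call, and the cache pays for itself from the second call onward.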

Using files in the Batch API#

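A batched request may reference uploaded files exactly as a synchronous one would. Each request still reads the file and is billed for its tokens; adding cache_control lets requests in the batch share a cached prefix, although cache hits inside a batch are best-effort because requests may be processed concurrently and in any order.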

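# QUESTIONS is assumed to be a list[str] defined elsewhere.
# Use the beta namespace, as with client.beta.messages.create, while the Files API is in beta.
batch = client.beta.messages.batches.create(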
    requests=[
        {
            "custom_id": f"q-{i}",
            "params": {
                "model": "claude-opus-4-7",
                "max_tokens": 256,
                "messages": [{
                    "role": "user",
                    "content": [
                        {
                            "type": "document",
                            "source": {"type": "file", "file_id": uploaded.id},
                            "cache_control": {"type": "ephemeral"},
                        },
                        {"type": "text", "text": q},
                    ],
                }],
            },
        }
        for i, q in enumerate(QUESTIONS)
    ],
    extra_headers={"anthropic-beta": "files-api-2025-04-14"},
)
print(batch.id)

Output:

msgbatch_01XVnKzQpZ8mN7vF4LqJ2cR3

Multi-file RAG pattern#

Upload your entire corpus once, then attach the relevant subset (selected by an external retriever — vector DB, BM25, full-text) to each user turn.

import anthropic, json, pathlib

client = anthropic.Anthropic()

# 1. One-time ingest
corpus = pathlib.Path("docs").glob("*.pdf")
manifest: dict[str, str] = {}
for path in corpus:
    f = client.beta.files.upload(file=path.open("rb"))
    manifest[path.stem] = f.id
pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))

# 2. Per-query: retrieve relevant file_ids however you like
def answer(question: str, file_ids: list[str]) -> str:
    docs = [
        {
            "type": "document",
            "source": {"type": "file", "file_id": fid},
            "cache_control": {"type": "ephemeral"},
        }
        for fid in file_ids
    ]
    resp = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=2048,
        messages=[{"role": "user", "content": [*docs, {"type": "text", "text": question}]}],
        extra_headers={"anthropic-beta": "files-api-2025-04-14"},
    )
    return resp.content[0].text
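
The `answer` function above leaves the choice of `file_ids` to an external retriever. As a placeholder for that step, here is a deliberately naive keyword matcher over the manifest keys. It is a toy stand-in, not a recommendation: `select_file_ids` is a hypothetical helper, and a real deployment would use the vector or BM25 retriever mentioned above.

```python
import re


def select_file_ids(question: str, manifest: dict[str, str], k: int = 3) -> list[str]:
    """Rank manifest entries by how many question words appear in the file stem."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for stem, file_id in manifest.items():
        stem_words = set(re.findall(r"[a-z0-9]+", stem.lower()))
        overlap = len(words & stem_words)
        if overlap:
            scored.append((overlap, stem, file_id))
    # Highest overlap first; ties broken alphabetically by stem for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [file_id for _, _, file_id in scored[:k]]


# Hypothetical manifest in the shape written to manifest.json above.
manifest = {"warranty-terms": "file_a", "reset-guide": "file_b", "install": "file_c"}
print(select_file_ids("What does the warranty say about reset?", manifest))
# ['file_b', 'file_a']
```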

Tool result content with a file#

A tool whose result is a generated PDF or chart can return it as a document/image block with the file’s id — Claude reads it before continuing.

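def render_chart_tool(tool_use_id: str, args: dict) -> dict:
    # tool_use_id comes from the tool_use block Claude emitted, not from the tool's input args
    # ... generate plot.png ...
    with open("plot.png", "rb") as fh:
        f = client.beta.files.upload(file=fh)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,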
        "content": [
            {"type": "text", "text": "Chart generated for Q4 metrics."},
            {"type": "image", "source": {"type": "file", "file_id": f.id}},
        ],
    }

Privacy and lifecycle#

| Property | Behaviour |
| --- | --- |
| Visibility | Files are scoped to your workspace and are not shared across orgs |
| Retention | Files persist until you delete them or until the workspace is wound down |
| Training | Per Anthropic policy, Files API data is not used for model training |
| Encryption | Encrypted at rest and in transit |
| Auditing | All file operations are logged in your org audit log |

[!WARNING] Files are workspace-scoped — multiple workspaces in the same org each have their own file store. Plan your workspace structure with this in mind for per-customer isolation.

Common pitfalls#

| Pitfall | Symptom | Fix |
| --- | --- | --- |
| Missing beta header | 400 on upload or message reference | Add anthropic-beta: files-api-2025-04-14 on every call |
| Mixing client.messages and client.beta.messages | File reference rejected | Use client.beta.messages.create while the API is in beta |
| Uploading a file > 32 MB | 413 Payload Too Large | Split or compress before upload |
| Referencing a deleted file | 404 mid-message | Track file lifetime; do not delete until all references are settled |
| Counting on storage being free | Surprise bill | Files count against your storage tier; delete unused ones |
| Skipping cache_control on the document | Re-pay for the file's tokens every call | Mark the document block cacheable; reuse hits the cache |
| Treating files as ephemeral | Files persist forever until deleted | Delete from a daily cleanup job for transient files |
| Re-uploading the same PDF | Wasted bandwidth and storage | Hash before upload; reuse the existing file_id |
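
The "referencing a deleted file" pitfall comes down to knowing when no request still needs a file. One way to track that inside a single process is a reference counter, sketched below. `FileRefs` is a hypothetical class, not an SDK feature; it is not thread-safe and does not survive restarts, so a multi-worker deployment would need a shared store instead.

```python
class FileRefs:
    """Count in-flight references per file_id so deletes wait until the count is zero."""

    def __init__(self) -> None:
        self._counts: dict[str, int] = {}

    def acquire(self, file_id: str) -> None:
        """Record that one more request is about to reference the file."""
        self._counts[file_id] = self._counts.get(file_id, 0) + 1

    def release(self, file_id: str) -> bool:
        """Drop one reference; return True when no references remain."""
        remaining = self._counts.get(file_id, 0) - 1
        if remaining <= 0:
            self._counts.pop(file_id, None)
            return True
        self._counts[file_id] = remaining
        return False

    def in_use(self, file_id: str) -> bool:
        return self._counts.get(file_id, 0) > 0
```

Call `acquire` before each request that references the file, `release` once the response has arrived, and delete the file only when `release` returns True.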

Common recipes#

Idempotent upload with content-hash dedupe#

import hashlib, json, pathlib

INDEX = pathlib.Path("file_index.json")

def upload_once(path: pathlib.Path) -> str:
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    index = json.loads(INDEX.read_text()) if INDEX.exists() else {}
    if digest in index:
        return index[digest]
    f = client.beta.files.upload(file=path.open("rb"))
    index[digest] = f.id
    INDEX.write_text(json.dumps(index, indent=2))
    return f.id
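
`upload_once` reads the whole file into memory to hash it, which is fine for small documents. For files near the size cap, the digest can be computed in chunks instead. A minimal sketch using only the standard library; `sha256_file` is a hypothetical helper that could replace the `hashlib.sha256(path.read_bytes())` call above.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is identical to hashing the full byte string, so an existing index keyed by digest stays valid.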

Bulk ingest with rate-limit#

import time

def ingest(paths: list[pathlib.Path], pause: float = 0.1) -> dict[str, str]:
    mapping: dict[str, str] = {}
    for p in paths:
        f = client.beta.files.upload(file=p.open("rb"))
        mapping[p.stem] = f.id
        print(f"uploaded {p.name} -> {f.id}")
        time.sleep(pause)
    return mapping
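
A fixed pause spaces requests out but does not help when an individual upload fails transiently. A retry wrapper with exponential backoff and jitter is one common complement. The sketch below is generic and makes no claim about which exceptions the SDK raises: `with_backoff` is a hypothetical helper, and `retry_on` should be narrowed to the error types you actually consider retryable.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(call: Callable[[], T],
                 retries: int = 5,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep,
                 retry_on: tuple[type[BaseException], ...] = (Exception,)) -> T:
    """Run call(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise RuntimeError("unreachable")  # loop always returns or raises
```

Inside `ingest`, the upload line could then become `f = with_backoff(lambda: client.beta.files.upload(file=p.open("rb")))`.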

TTL cleanup job#

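Delete files older than 30 days. The filter below uses created_at only; the file metadata shown earlier has no last-referenced field, so tracking re-references is left to your application.

import datetime as dt

# timezone.utc works on every supported Python version (dt.UTC needs 3.11+)
CUTOFF = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=30)

for f in client.beta.files.list(limit=200):
    created = f.created_at
    if isinstance(created, str):
        # Raw JSON timestamps end in "Z"; the SDK may already hand back a datetime
        created = dt.datetime.fromisoformat(created.replace("Z", "+00:00"))
    if created < CUTOFF:
        client.beta.files.delete(f.id)
        print(f"deleted {f.id} ({f.filename})")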

Show what is in storage#

total = 0
by_type: dict[str, int] = {}
for f in client.beta.files.list(limit=200):
    total += f.size_bytes
    by_type[f.mime_type] = by_type.get(f.mime_type, 0) + f.size_bytes

print(f"total: {total / 1_000_000:.1f} MB")
for mime, size in sorted(by_type.items(), key=lambda x: -x[1]):
    print(f"  {mime}: {size / 1_000_000:.1f} MB")

Output:

total: 412.7 MB
  application/pdf: 380.2 MB
  image/png: 18.4 MB
  text/markdown: 14.1 MB

See also#