Fast Mode — "Stuff Anthropic Published That Nobody Reads"

Thread Copy

I read every page of the Anthropic docs so you don't have to.

Here's what I found that nobody's talking about:

1/ Opus has a speed dial.

Set speed: "fast" in your API call → 2.5x faster output tokens per second.

Same weights. Same intelligence. Just faster inference.

$30 input / $150 output per MTok (6x standard).

It's a beta research preview — you have to join a waitlist.

But the fact that you can make the SAME model go 2.5x faster by paying more? That's a pricing model nobody expected from Anthropic.

Source

platform.claude.com/docs/en/build-with-claude/fast-mode

Key Facts

  • Parameter: speed: "fast"
  • Beta header: fast-mode-2026-02-01
  • Waitlist: claude.com/fast-mode
  • 2.5x faster output tokens per second (OTPS, not TTFT)
  • 6x standard pricing ($30/$150 per MTok vs $5/$25)
  • Same model weights and behavior
  • Opus 4.6 only
  • Separate rate limits from standard
  • Not available on Batch API
  • Prompt cache doesn't carry between fast/standard

Graphics

  • v1 (infographic): clean data layout, side-by-side comparison
  • v2 (illustrated): stick figure turning a dial, hand-drawn style