How much does a Claude Code session cost?

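A typical 50-message Claude Code session using Claude Opus costs approximately $2.10. Prompt caching accounts for most of the savings: it cuts the cost of the repeated system prompt by roughly 87.5% over a session, because cache reads are billed at a 90% discount (a $3.00 per million token base rate drops to $0.30), partly offset by a one-time 25% premium on the first cache write. Actual costs depend on context length and number of tool calls.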

How does Claude Code prompt caching work?

Claude Code uses Anthropic's prompt caching feature to cache the static portion of the system prompt. The first 15KB of the system prompt (core instructions identical for all users) is cached at the API level. Subsequent calls read this cache at a 90% discount: at a $3.00/MTok base input rate, cached input costs $0.30/MTok.
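As a rough illustration of the mechanism described above, the sketch below builds a request body in which the static part of the system prompt carries a cache breakpoint. The `cache_control` field follows Anthropic's public Messages API; the function name `buildSystemBlocks`, the prompt strings, and the model id are placeholders invented for this example and are not taken from Claude Code's source.

```typescript
// Sketch only: prompt text and model id are placeholders, not Claude Code's real values.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

// Hypothetical helper: puts the static text first and marks it as cacheable.
function buildSystemBlocks(staticPrompt: string, dynamicPrompt: string): SystemBlock[] {
  return [
    // Cache breakpoint: the prefix up to and including this block can be reused.
    { type: "text", text: staticPrompt, cache_control: { type: "ephemeral" } },
    // Per-session content comes after the breakpoint, so it does not change the cached prefix.
    { type: "text", text: dynamicPrompt },
  ];
}

const requestBody = {
  model: "placeholder-model-id",
  max_tokens: 1024,
  system: buildSystemBlocks("STATIC CORE INSTRUCTIONS", "cwd: /tmp/project"),
  messages: [{ role: "user", content: "Hello" }],
};
```

Ordering matters here: because the cache matches on a prefix, anything that varies per user or per session has to sit after the cached block.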

What is Claude Code microcompaction and how much does it cost?

Microcompaction is Claude Code's technique for shrinking the conversation context when it approaches the model's context window limit. A Haiku agent (the cheapest model at $0.80/MTok input) summarizes older conversation turns, compressing them into a compact summary. A single microcompaction operation costs approximately $0.001.
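The idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm in Claude Code's source: the names `Turn`, `estimateTokens`, and `compactIfNeeded` are hypothetical, the 4-characters-per-token estimate is a crude heuristic, and `summarize` stands in for a call to a cheap model.

```typescript
type Turn = { role: "user" | "assistant"; text: string };

// Crude heuristic (assumption): about 4 characters per token.
const estimateTokens = (turns: Turn[]): number =>
  Math.ceil(turns.reduce((sum, t) => sum + t.text.length, 0) / 4);

// Hypothetical helper: folds older turns into one summary turn once the
// estimated context size exceeds `limitTokens`, keeping the newest turns verbatim.
function compactIfNeeded(
  turns: Turn[],
  limitTokens: number,
  keepRecent: number,
  summarize: (older: Turn[]) => string,
): Turn[] {
  if (estimateTokens(turns) <= limitTokens || turns.length <= keepRecent) {
    return turns; // under budget, or nothing old enough to fold
  }
  const older = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  const summary: Turn = {
    role: "user",
    text: `[Summary of earlier turns] ${summarize(older)}`,
  };
  return [summary, ...recent];
}
```

Keeping the most recent turns untouched is a design choice in this sketch: the summary loses detail, so the turns the model is most likely to need stay word for word.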

What model does Claude Code use for background tasks?

Claude Code uses Claude Haiku for all background and forked agent tasks: memory extraction, session memory, auto-dream consolidation, speculative execution, and microcompaction. At the output rates listed in this chapter ($4/MTok for Haiku versus $75/MTok for Opus), Haiku is roughly 19x cheaper, which makes background operations negligible in cost.

CHAPTER 05 — SOURCE ANALYSIS

Cost Optimization Secrets

Claude Code's source reveals exactly what you pay — and five engineering tricks that slash the bill. Prompt caching alone saves 87.5% on input tokens. A Haiku-powered compaction costs a tenth of a cent. A 50-message Opus session lands at roughly $2.10. Here is how it all works, extracted directly from modelCost.ts and the surrounding infrastructure.

What does Claude Code actually cost?

The exact numbers live in services/api/modelCost.ts. The output price is the one that hurts: Opus costs $75 per million output tokens.

Output price per model:

Opus 4.6: $75/MTok
Sonnet 4.6: $15/MTok
Haiku 4.5: $4.0/MTok

Cache writes cost 25% more than regular input (one-time penalty to prime the cache). After that, every cache read is 90% cheaper. Source: services/api/modelCost.ts
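To see how the write premium and the read discount combine over a session, here is a small worked calculation. The two multipliers come from the paragraph above; the function name `prefixCost` is hypothetical, the $3.00/MTok base rate is the figure quoted earlier in the text, and the 4,000-token prefix and 50-request session are illustrative assumptions.

```typescript
// Multipliers as stated in the text: cache writes cost 25% more, cache reads 90% less.
const CACHE_WRITE_MULTIPLIER = 1.25;
const CACHE_READ_MULTIPLIER = 0.1;

// Cost in USD of sending the same prefix of `prefixTokens` tokens `requests` times.
function prefixCost(
  baseRatePerMTok: number,
  prefixTokens: number,
  requests: number,
  cached: boolean,
): number {
  const perRequest = (baseRatePerMTok * prefixTokens) / 1_000_000;
  if (!cached || requests === 0) return perRequest * requests;
  // One write to prime the cache, then discounted reads for every later request.
  return perRequest * (CACHE_WRITE_MULTIPLIER + CACHE_READ_MULTIPLIER * (requests - 1));
}

// Illustrative session: 4,000-token prefix, 50 requests, $3.00/MTok base rate.
const uncached = prefixCost(3.0, 4000, 50, false);
const withCache = prefixCost(3.0, 4000, 50, true);
const savings = 1 - withCache / uncached; // about 0.877
```

Under these assumptions the effective saving on the prefix over 50 requests is close to 88%, slightly below the 90% per-read discount because of the one-time write premium. Caching already pays for itself on the second request, since 1.25 + 0.1 is less than 2.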

5 Strategies Claude Code Uses to Keep Costs Down

These are not guidelines for users; they are engineering decisions baked into the source.

The Hidden Cost: Cache Misses

The entire caching strategy depends on the static system prompt staying stable. If Anthropic updates it (adding a new tool description, changing a safety instruction, or fixing a bug), every cached entry for every user is invalidated at once. The first request after an invalidation pays full price: ~$0.12 on Opus for the 8KB prompt alone. This is why the static portion of the system prompt is engineered to change as rarely as possible; frequent updates would negate the caching benefit entirely.

Cache TTL: 1 hour · Boundary marker: __SYSTEM_PROMPT_DYNAMIC_BOUNDARY__
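A minimal sketch of how a prompt might be split at that marker follows. Only the marker string comes from the text; the function name `splitSystemPrompt`, the return shape, and the handling of a missing marker are assumptions made for illustration and may differ from the logic in utils/api.ts.

```typescript
const BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

// Hypothetical helper: everything before the marker is treated as the stable,
// cacheable part; everything after it is per-session content.
function splitSystemPrompt(prompt: string): { staticPart: string; dynamicPart: string } {
  const idx = prompt.indexOf(BOUNDARY);
  if (idx === -1) {
    // No marker: this sketch treats the whole prompt as static (an arbitrary choice).
    return { staticPart: prompt, dynamicPart: "" };
  }
  return {
    staticPart: prompt.slice(0, idx),
    dynamicPart: prompt.slice(idx + BOUNDARY.length),
  };
}

// Two sessions with different dynamic content still share an identical static part.
const sessionA = splitSystemPrompt(`core rules\n${BOUNDARY}\ncwd: /a`);
const sessionB = splitSystemPrompt(`core rules\n${BOUNDARY}\ncwd: /b`);
const sharesPrefix = sessionA.staticPart === sessionB.staticPart; // true
```

The comparison at the end is the point of the split: as long as the static part is byte-for-byte identical, sessions can hit the same cache entry regardless of what follows the marker.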

Build Your Session Estimate

The defaults below represent a typical 50-message Opus session with active caching.

Session Cost Calculator

Opening message (system prompt + first message, no cache): $0.0800 each, subtotal $0.080
Subsequent messages (cached system prompt, Opus): $0.0400 each, subtotal $1.200
Subagent calls (Haiku/Sonnet, isolated context): $0.0050 each, subtotal $0.075
Compaction events (Haiku summarization): $0.0010 each, subtotal $0.003
Memory extractions (background Haiku, every response): $0.0005 each, subtotal $0.025

Estimated session total (Opus 4.6, with prompt caching active): $1.38
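The estimate is a plain sum of unit cost times count, which the sketch below reproduces. The unit costs are the calculator's; the counts are not shown on the page and are inferred here by dividing each subtotal by its unit cost, so treat them as assumptions. The names `LineItem` and `sessionTotal` are hypothetical.

```typescript
type LineItem = { label: string; unitCost: number; count: number };

// Counts are inferred from subtotal / unit cost (assumption, not from the source).
const defaults: LineItem[] = [
  { label: "Opening message", unitCost: 0.08, count: 1 },
  { label: "Subsequent messages", unitCost: 0.04, count: 30 },
  { label: "Subagent calls", unitCost: 0.005, count: 15 },
  { label: "Compaction events", unitCost: 0.001, count: 3 },
  { label: "Memory extractions", unitCost: 0.0005, count: 50 },
];

// Sums unit cost times count over all line items, in USD.
function sessionTotal(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.unitCost * item.count, 0);
}

const total = sessionTotal(defaults);
const formatted = `$${total.toFixed(2)}`; // "$1.38"

// Same session, but with 49 subsequent messages (50 messages in total).
const fiftyMessages = defaults.map((item) =>
  item.label === "Subsequent messages" ? { ...item, count: 49 } : item,
);
const fiftyTotal = sessionTotal(fiftyMessages); // about 2.143
```

The inferred defaults cover 31 messages in total. Raising the subsequent-message count to 49, for 50 messages overall, brings the estimate to about $2.14, which is close to the roughly $2.10 quoted for a 50-message session.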

Key Source Files (v2.1.88)

services/api/modelCost.ts: Exact pricing per model, cache multipliers
utils/api.ts: Cache splitting logic, boundary marker
services/compaction.ts: Microcompaction algorithm, Haiku model choice
services/budget.ts: Token budget tracking, status bar display
constants/prompts.ts: System prompt assembly, static/dynamic split