RANDOM CHAOS

Claude Code Burns Through Pro Max 5x Quota in 90 Minutes

via Hacker News

Original source: Pro Max 5x quota exhausted in 1.5 hours despite moderate usage (Hacker News)

A detailed bug report shows that Anthropic’s Claude Code CLI can exhaust a Pro Max 5x (Opus) subscription quota in as little as 1.5 hours of moderate use. The reporter’s analysis of local session logs points to a likely culprit: cached input tokens appear to count against the rate limit at the full token rate, rather than at the expected reduced rate of 1/10 that reflects their actual cost. With a 1M context window, each API call can send up to 960k tokens, and normal tool-heavy workflows make 200+ calls per hour. At those upper bounds, the counted total approaches 192M tokens per hour.
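To make the arithmetic concrete, here is a rough sketch comparing the two accounting schemes the report contrasts. The 960k tokens per call and 200 calls per hour come from the summary above. The 95% cache-hit fraction and the function name are assumptions chosen for illustration only, and nothing here reflects how Anthropic actually meters usage.

```python
def counted_tokens_per_hour(tokens_per_call, calls_per_hour,
                            cached_fraction, cache_weight):
    """Tokens charged against the quota in one hour.

    cache_weight is the multiplier applied to cached input tokens:
    1.0 means cached tokens count in full, 0.1 means they count at 1/10.
    """
    cached = tokens_per_call * cached_fraction
    uncached = tokens_per_call - cached
    return calls_per_hour * (uncached + cached * cache_weight)


TOKENS_PER_CALL = 960_000  # upper bound per call, from the report
CALLS_PER_HOUR = 200       # lower bound for tool-heavy work, from the report
CACHED_FRACTION = 0.95     # ASSUMPTION: share of each call served from cache

# Scenario the report suspects: cached tokens counted at the full rate.
full_rate = counted_tokens_per_hour(
    TOKENS_PER_CALL, CALLS_PER_HOUR, CACHED_FRACTION, cache_weight=1.0)

# Scenario the reporter expected: cached tokens counted at 1/10.
reduced_rate = counted_tokens_per_hour(
    TOKENS_PER_CALL, CALLS_PER_HOUR, CACHED_FRACTION, cache_weight=0.1)

print(f"full rate:    {full_rate:,.0f} tokens/hour")
print(f"reduced rate: {reduced_rate:,.0f} tokens/hour")
print(f"ratio:        {full_rate / reduced_rate:.1f}x")
```

Under these assumed inputs, full-rate counting charges roughly seven times as many tokens per hour as reduced-rate counting. The gap grows as the cache-hit fraction rises, which is why long sessions that reuse most of their context would be hit hardest.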

The report identifies several compounding factors. Background Claude Code sessions left open in other terminals silently consume the shared quota even without active user interaction; in this case they accounted for 78% of post-reset usage. Auto-compaction events fire without user action and produce the most expensive single API calls, because each one sends the full pre-compaction context. The 1M context window, marketed as a premium feature, paradoxically accelerates quota depletion under these conditions.

An Anthropic engineer acknowledged the reports and confirmed that the company is investigating. Early mitigations include UX nudges to clear stale sessions and a possible reduction of the default context window to 400k tokens. The issue highlights a growing tension in AI tooling: as context windows expand and agentic workflows multiply API calls, subscription models struggle to keep pace with actual token throughput.

This is an AI-generated summary. Read the original for the full story.