Claude Code Users Face Token Drain Crisis
Anthropic has acknowledged that developers using its Claude Code AI coding assistant are burning through usage quotas far faster than expected, calling the issue the team's "top priority." The admission, made on the company's official Reddit forum on Sunday, comes amid a wave of complaints from paying subscribers who say they are being locked out of the tool within hours or even minutes of starting work.
A Perfect Storm of Factors
The rapid quota drain appears to stem from a convergence of events. On March 28, a two-week promotional period that doubled Claude usage limits during off-peak hours expired, returning all Free, Pro, Max, and Team subscribers to standard allotments. Separately, Anthropic engineer Thariq Shihipar announced last week that the company was tightening five-hour session limits during peak weekday hours — 5 a.m. to 11 a.m. Pacific Time — to manage growing demand. He estimated that roughly 7 percent of users would hit session limits they previously would not have encountered.
But users say something beyond the policy changes is wrong. One Max 5x subscriber, paying $100 per month, reported on Reddit that their quota was consumed in just 19 minutes. Another developer on a $200-per-year Pro plan said they could only use Claude for 12 out of every 30 days. A widely shared post noted that simply sending "hello" to Claude Code consumed about 2 percent of a session's token budget.
Cache Bugs Under the Hood
A detailed technical investigation posted to Reddit on March 30 points to two bugs in Claude Code's prompt caching system as a likely culprit. Prompt caching is designed to dramatically reduce costs by reusing previously sent content — cached token reads cost just one-tenth the price of uncached input. When the cache breaks, costs can silently balloon by 10 to 20 times.
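The economics described above can be sketched with some simple arithmetic. The per-token prices below are hypothetical placeholders chosen only to illustrate the one-tenth cached-read discount; they are not Anthropic's actual rates:

```python
# Hypothetical per-token prices illustrating the 10x cached-read discount.
# These are placeholder numbers, not Anthropic's published rates.
UNCACHED_INPUT_PRICE = 3.00 / 1_000_000        # $ per uncached input token (assumed)
CACHED_READ_PRICE = UNCACHED_INPUT_PRICE / 10  # cached reads cost one-tenth

def request_cost(context_tokens: int, cache_hit: bool) -> float:
    """Cost of resending a conversation context of `context_tokens` tokens."""
    price = CACHED_READ_PRICE if cache_hit else UNCACHED_INPUT_PRICE
    return context_tokens * price

# A long coding session can accumulate hundreds of thousands of context tokens,
# all of which are resent with every request.
context = 200_000
print(f"cache hit:  ${request_cost(context, cache_hit=True):.2f}")
print(f"cache miss: ${request_cost(context, cache_hit=False):.2f}")
```

Because the full context rides along on every request, a broken cache multiplies the cost of each one, which is how a session budget can vanish in minutes rather than hours.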
The first bug reportedly involves the standalone Claude Code binary, which inserts a unique billing hash into request headers that changes with each session, invalidating the cross-session cache. The second affects the --resume command, which causes a complete cache miss on the entire conversation history, forcing the system to reprocess hundreds of thousands of tokens from scratch.
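A minimal sketch of why a per-session header would defeat cross-session caching, assuming the cache key is derived from request headers together with the prompt prefix (the actual key derivation is not public, and the header and function names here are hypothetical):

```python
import hashlib
import uuid

def cache_key(headers: dict, prompt_prefix: str) -> str:
    """Hypothetical cache-key derivation: folding volatile request headers
    into the key changes the key even when the prompt is identical."""
    material = "".join(f"{k}={headers[k]};" for k in sorted(headers)) + prompt_prefix
    return hashlib.sha256(material.encode()).hexdigest()

prompt = "<long system prompt and conversation history>"

# A header that stays stable across sessions yields the same key: cache hit.
stable_a = cache_key({"billing-hash": "abc123"}, prompt)
stable_b = cache_key({"billing-hash": "abc123"}, prompt)
print(stable_a == stable_b)   # identical prompt + stable header -> same key

# A billing hash regenerated each session yields a fresh key every time,
# so the identical prompt never matches the cache: guaranteed miss.
fresh_a = cache_key({"billing-hash": uuid.uuid4().hex}, prompt)
fresh_b = cache_key({"billing-hash": uuid.uuid4().hex}, prompt)
print(fresh_a == fresh_b)
```

The same logic explains the --resume symptom: if anything in the resumed request differs from what was originally cached, the entire conversation history is reprocessed as uncached input.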
"Downgrading to 2.1.34 made a very noticeable difference," one developer said, referring to an older version of Claude Code that predates the bugs, which were introduced in version 2.1.69.
A Broader Industry Tension
The episode highlights a growing tension across the AI industry between subscription models that promise generous access and the real cost of serving large language models. Google faced similar backlash earlier this month when users of its Antigravity coding tool protested pricing changes. Anthropic, which does not publish exact token limits for its subscription tiers, instead describes them in relative terms — the Pro plan offers "at least five times" the free tier, for instance — leaving developers with limited ability to plan their usage.
"We're actively investigating... it's the top priority for the team, and we know this is blocking a lot of people," an Anthropic representative wrote. No timeline for a fix has been given.
