- Pricing Structure Shift: Anthropic has transitioned its enterprise plan from flat seat fees to usage-based billing, requiring customers to pay for Claude, Claude Code, and Cowork at standard API rates.
- Service Reliability Issues: Anthropic's API posted 98.95% uptime over a recent 90-day period, significantly below the 99.99% industry standard, leading some enterprise clients to migrate to alternative providers.
- Compute Resource Constraints: Rising costs and supply shortages for hardware, specifically Blackwell GPUs, alongside increasing power grid demands, have necessitated the removal of subsidized flat-rate access.
- Agentic Workload Impact: The adoption of automated agentic workflows has caused a substantial surge in token consumption, rendering previous fixed-price subscription models financially unsustainable for the provider.
- Industry-Wide Transition: Reflecting broader market conditions, other major AI organizations are similarly moving away from flat-fee subscriptions toward usage-metered pricing models for compute-heavy services.
Anthropic has restructured its enterprise plan to bill Claude, Claude Code, and Cowork usage separately from seat fees, moving its largest business customers to per-token pricing at standard API rates, according to the company's updated enterprise help documentation. Organizations on older seat-based plans with fixed usage allowances must migrate by their next contract renewal or lose the grandfathered terms. The Information first reported the shift, describing it as tied to a deepening compute crunch that will raise bills for heavy business users significantly.
That is the official version of the story. The unofficial version starts with David Hsu.
David Hsu prefers Claude Opus 4.6. He said so publicly. Then he moved his company off it.
Hsu runs Retool, the software platform. He told the Wall Street Journal that Anthropic's model was better, but the service kept dying. His customers could not ship code when Claude choked. So he picked OpenAI, the inferior option, because the inferior option stayed up.
That is the tell.
Anthropic's API uptime over the 90 days ending April 8 was 98.95 percent, according to WSJ reporting. Established cloud providers commit to 99.99. One extra nine does not sound like much until you translate it. At Anthropic's rate, that is roughly 92 hours of downtime a year. At the 99.99 standard, 53 minutes. For an enterprise buyer, that is the difference between a tool and a liability.
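The downtime arithmetic is easy to verify. A minimal sketch, using only the uptime percentages from the reporting and the standard hours-per-year constant:

```python
# Convert an uptime percentage into expected downtime per year.
HOURS_PER_YEAR = 365 * 24  # 8,760

def downtime_per_year(uptime_pct: float) -> float:
    """Return expected downtime in hours for a given uptime percentage."""
    return (1 - uptime_pct / 100) * HOURS_PER_YEAR

anthropic = downtime_per_year(98.95)  # ~92 hours a year
standard = downtime_per_year(99.99)   # ~0.88 hours, about 53 minutes

print(f"98.95% uptime -> {anthropic:.0f} hours of downtime a year")
print(f"99.99% uptime -> {standard * 60:.0f} minutes of downtime a year")
```

The gap looks small as a percentage and enormous as a wall-clock number, which is why enterprise buyers fixate on the extra nine.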
And then, quietly, Anthropic changed how it bills.
The Breakdown
- Anthropic quietly moved enterprise seats to usage-based billing, unbundling Claude, Claude Code, and Cowork from flat fees.
- Claude API uptime hit 98.95% over 90 days ending April 8, far below the 99.99% standard, pushing Retool founder David Hsu to switch to OpenAI.
- Revenue more than tripled from $9B to $30B in four months, but session caps, cache TTL cuts, and OpenClaw metering show the subsidy bleeding.
- Expect every major AI provider running agentic workloads to move to usage-based billing within six months.
AI-generated summary, reviewed by an editor. More on our AI guidelines.
What Anthropic stopped pretending
The enterprise help center now reads like a concession letter dressed as documentation. "The seat fee only covers access to the platform and doesn't include any usage," it says. "All usage across Claude, Claude Code, and Cowork is billed separately at standard API rates, based on what your team actually consumes." Anthropic is not framing the change as a price hike. The direction is unmistakable anyway. Run rate at the end of 2025 was $9 billion. Run rate today is $30 billion. And the company's message to its biggest customers is: budget more for compute.
That is not a typo. Revenue more than tripled in four months. The response was a billing restructure, not a victory lap.
The open bar was always a subsidy
For about eighteen months, the AI industry sold a story. The story was not that the models were good. The story was that the pricing was stable.
A $20 Claude Pro subscription gave you access to one of the best coding models in the world, running on compute that cost Anthropic significantly more than you paid. Power users always understood this. The arithmetic was never a secret. A single engineer on Claude Code burns through far more inference than a Pro fee could ever cover. The subscription was an open bar. Anthropic was buying the drinks.
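That arithmetic can be sketched directly. The per-token prices and daily usage below are illustrative assumptions, not Anthropic's actual figures; the point is only the order of magnitude between what a heavy agentic user consumes and what a flat fee collects:

```python
# Back-of-envelope: a heavy Claude Code user's inference cost at API rates,
# versus a $20/month flat subscription. All inputs are illustrative.
PRICE_PER_MTOK_IN = 3.00    # assumed $ per million input tokens
PRICE_PER_MTOK_OUT = 15.00  # assumed $ per million output tokens

daily_input_mtok = 20   # assumed: 20M input tokens/day (agent loops, large contexts)
daily_output_mtok = 1   # assumed: 1M output tokens/day
working_days = 22

monthly_cost = working_days * (daily_input_mtok * PRICE_PER_MTOK_IN
                               + daily_output_mtok * PRICE_PER_MTOK_OUT)
subscription = 20.00

print(f"Inference cost at API rates: ${monthly_cost:,.0f}/month")
print(f"Flat subscription revenue:   ${subscription:.0f}/month")
```

Under these assumptions a single heavy user costs dozens of times what the subscription brings in. Tweak the inputs however you like; the sign of the gap does not change.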
Open bars work until somebody shows up thirsty. In this case, the thirsty somebody is the agent.
Agentic workflows do not sip. They chain tools across steps and run loops without asking. They spawn subagents carrying their own contexts. Cached tokens burn by the hundred thousand. One engineer running Claude Code overnight can consume the token budget of 200 casual chat users. OpenAI, running the same math on its own platform, watched token usage jump from 6 billion per minute in October to 15 billion per minute by late March. That is not growth. That is a break in the load curve.
Blackwell GPU rental prices climbed 48 percent in two months. CoreWeave raised prices more than 20 percent late last year and now forces smaller customers onto three-year contracts. Bank of America expects compute demand to outstrip supply through 2029. PJM, the grid operator for the eastern US, warned Friday that it needs 15 gigawatts of additional power tied to AI by early 2027.
The subsidy stopped being theoretical. It started bleeding.
The open bar closes
Watch the pattern. In late March, Anthropic quietly tightened five-hour session limits for Pro and Max users during peak hours, 5 a.m. to 11 a.m. Pacific, weekdays. About seven percent of users started hitting caps they had not hit before, per the company's own disclosure. Claude Code's prompt-cache time-to-live shifted back from one hour to five minutes in early March, driving up quota burn for long coding sessions. OpenClaw, a popular agent framework, got pulled out of the flat-fee bundle on April 4 and moved into usage-metered billing, with heavy users facing bills potentially 50 times higher.
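The cache TTL change compounds quietly in long sessions. A rough sketch of why, where the session's call gaps are invented for illustration: a cache that expires every five minutes forces a fresh cache write almost every time the user pauses to edit or test, while a one-hour TTL almost never does.

```python
# How often a prompt cache expires during a long coding session depends on
# the TTL versus the gaps between model calls. Illustrative numbers only.
def cache_rewrites(gaps_minutes: list[float], ttl_minutes: float) -> int:
    """Count the calls that find the cache expired and must re-write it."""
    return sum(1 for gap in gaps_minutes if gap > ttl_minutes)

# A 20-call session, with gaps while the user edits, reads, and runs tests.
gaps = [2, 8, 3, 12, 1, 6, 4, 15, 2, 7, 3, 9, 1, 11, 5, 6, 2, 8, 4, 10]

print("5-minute TTL: ", cache_rewrites(gaps, 5), "cache re-writes")
print("60-minute TTL:", cache_rewrites(gaps, 60), "cache re-writes")
```

Because cache writes are billed at a premium over cache reads, every extra re-write is extra quota burned for identical work, which is exactly the "quota burn" long-session users reported.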
Each move was defended as a "product change" or an "optimization." None was framed as a price hike. Put them together and the shape is obvious. Anthropic is lifting features out of the subscription, welding meters onto each one, and sending customers the difference.
You can feel the institutional anxiety in the public responses. Anthropic engineers denied, on X and GitHub, that the company had degraded Claude. They are not wrong about the model. They are cornered on the framing. The model is the same. The economics underneath it are not.
One AMD senior director filed a data-heavy GitHub complaint on April 2, analyzing 6,852 Claude Code session files to argue Claude had stopped reasoning as deeply. Anthropic's Claude Code lead walked through her analysis, conceded that thinking-effort defaults had been lowered on March 3 to "the best balance across intelligence, latency and cost," and explained the rest as UI changes. Translate: we dialed down the compute and hoped nobody would notice.
People noticed. That is why the usage-based migration is happening in public now. Enterprise buyers will not accept a silent shrinkflation. They want the meter visible, the billing legible, and someone else to blame when the monthly number gets ugly.
Who pays for the buildout
The shift does real work for Anthropic. It transfers compute cost from balance sheet to invoice. It lets the company keep signing million-dollar accounts without needing to pre-buy the GPU capacity to serve them flat-rate. It gives Anthropic a clean story for Wall Street. Every new revenue dollar is now backed by a compute dollar, metered and paid.
It breaks something else, though. Claude's growth engine was individual developers falling in love with Claude Code on their own dime, dragging it into their startups, then pushing employers to buy enterprise seats. That pipeline runs on a cheap Pro plan with enough headroom to experiment. Tighten the plan, meter the agents, add surprise quota math, and you poison the upstream.
Business Insider spoke with three of those users last week. One restructured his entire workday around limit resets. Another now breaks a single project into four micro-chats to conserve tokens. A third simply stops working when the cap hits, because manual coding feels pointless after Claude has spoiled him. The affection is intact. The relationship is not.
The crack nobody is naming
Here is the part Anthropic will not say out loud. Axios went hunting for a historical comparison and came back empty. Google's search-advertising ramp was the previous record. Anthropic covered nearly four times that ground in a single quarter. Snowflake took a decade to reach a billion in run rate. Anthropic added twenty-nine billion in roughly a year. More than 1,000 customers now pay over a million dollars a year for Claude.
None of those numbers mean anything if the service cannot stay up. 98.95 percent is the crack. Enterprise buyers have been trained by twenty years of cloud discipline to treat that figure as disqualifying, and some are already voting with their wallets. Ramp card data shows Anthropic gaining on OpenAI in business spend, yes. That data lags the service degradation. The next three months test whether customer affection for Claude outruns exhaustion with its downtime.
Expect two things from here. First, every major AI provider running agentic workloads moves to usage-based enterprise billing within six months. OpenAI already started, shifting Codex from flat-message to token metering in early April. GitHub tightened Copilot limits on April 10. Windsurf swapped credits for daily quotas. The flat-fee era is over. The memo is circulating.
Second, Anthropic keeps telling two stories at once. The growth story for investors goes like this. Thirty billion dollars, 1,000 million-dollar accounts, a three-year curve that would embarrass Rockefeller. The rationing story for users lives somewhere else entirely. Capacity tightens during peak hours. Effort defaults quietly dropped in March. Session caps hit sooner than they used to. OpenClaw got its own meter. Both stories are true. Neither is sustainable without the other eventually catching up.
David Hsu made his call already. He kept the inferior model because it stayed up. That is the verdict that should scare Anthropic more than any benchmark screenshot on X. It is not a performance problem. It is a trust problem. You cannot patch trust with a pricing page.
Frequently Asked Questions
What actually changed in Anthropic's enterprise pricing?
The enterprise plan's seat fee now only covers platform access. Usage across Claude, Claude Code, and Cowork is billed separately at standard API rates based on actual consumption. Organizations on older seat-based plans with fixed usage allowances must migrate by their next contract renewal or lose the grandfathered terms.
Why did Retool's founder switch from Claude to OpenAI?
David Hsu told the Wall Street Journal he preferred Anthropic's Opus 4.6 model, but the service kept going down. Anthropic's API uptime over the 90 days ending April 8 was 98.95 percent, compared to the 99.99 percent standard that established cloud providers maintain. That gap translates to roughly 92 hours of downtime a year versus about 53 minutes.
What is the compute crunch behind the pricing shift?
Blackwell GPU rental prices climbed 48 percent in two months. CoreWeave raised prices more than 20 percent late last year and now forces smaller customers into three-year contracts. Bank of America expects demand to outstrip supply through 2029. PJM, the eastern US grid operator, is looking for 15 gigawatts of additional power tied to AI by early 2027.
What did Anthropic change about Claude Code sessions?
In late March, Anthropic tightened five-hour session limits for Pro and Max users during weekday peak hours of 5 a.m. to 11 a.m. Pacific. About 7 percent of users started hitting caps they had not hit before. Claude Code's prompt-cache time-to-live shifted from one hour back to five minutes in early March, driving up quota burn for long coding sessions.
Are other AI providers moving to usage-based billing too?
Yes. OpenAI shifted Codex from flat-message pricing to token metering in early April and introduced a new $100 Pro tier for compute-heavy coding. GitHub tightened Copilot limits on April 10. Windsurf replaced its credit system with daily and weekly quotas in March. The flat-fee subscription era for agentic workloads is effectively over.
Marcus Schuler
San Francisco
Editor-in-Chief and founder of Implicator.ai. Former ARD correspondent and senior broadcast journalist with 10+ years covering tech. Writes daily briefings on policy and market developments. Based in San Francisco. E-mail: editor@implicator.ai
