
OpenAI, Anthropic Forecast $121B Compute Burn, Revealing AI's True Cost

Internal forecasts from OpenAI and Anthropic reveal that the core challenge of modern AI has shifted from selling the technology to financing the immense compute required for training and inference, with OpenAI alone projecting $121B in compute spending for 2028.

Gala Smith & AI Research Desk · 5 min read · AI-Generated

A Wall Street Journal report, based on internal financial documents, reveals that the fundamental economics of leading AI companies are dominated by an unprecedented scale of compute expenditure. For OpenAI and Anthropic, the race is no longer purely about algorithmic innovation but about securing the capital to finance the massive GPU clusters required to train each new generation of models and serve user queries.

The Core Financial Challenge

The internal forecasts indicate that the hardest operational challenge is no longer creating demand for AI but paying for the compute to satisfy it. This represents a pivotal shift in the industry's bottleneck from research talent and data to pure financial capital for hardware.

  • OpenAI's Projections: The company expects to spend $121 billion on compute in 2028. Even with significantly higher sales revenue, this would result in an operating burn of approximately $85 billion that year (a back-of-the-envelope reconciliation follows this list).
  • Anthropic's Pattern: While operating at a smaller scale, Anthropic's internal projections show the same fundamental pattern: compute costs are the dominant expense, outstripping revenue growth over the near to medium term.

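To put the reported figures together: only the $121B compute line and the ~$85B burn come from the report; lumping everything else into a single "other opex" bucket is an illustrative assumption, not disclosed accounting.

```python
# Back-of-the-envelope reconciliation of OpenAI's reported 2028 projections.
# Only compute_spend and operating_burn are reported figures; treating all
# remaining costs as one "other opex" bucket is an illustrative assumption.

compute_spend = 121e9    # projected 2028 compute cost (reported)
operating_burn = 85e9    # projected 2028 operating burn (reported)

# operating burn = compute + other opex - revenue
# => revenue - other opex = compute - burn
net_of_other_costs = compute_spend - operating_burn
print(f"revenue minus non-compute opex ≈ ${net_of_other_costs / 1e9:.0f}B")
# => ~$36B: even tens of billions in sales cannot cover the compute bill
```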
The Two Profit Views

The financial presentations from both companies consequently offer two distinct profit-and-loss pictures, highlighting the tension between their research ambitions and business viability:

  1. Ex-Research Compute View: This presentation strips out the cost of compute used for fundamental research and for training new frontier models. Under this lens, the core business of serving inference on existing models looks closer to viable software economics.
  2. Full Cost View: This includes all research compute and shows how far these companies remain from profitability under normal software-company economics, where gross margins are typically high and R&D is a smaller fraction of revenue.

The existence of these dual narratives underscores a central question for investors: are they funding a software company or a capital-intensive infrastructure venture whose primary product is intelligence itself?
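A minimal sketch of how the two presentations diverge, with every input figure a hypothetical placeholder rather than a reported number:

```python
# Minimal sketch of the two P&L views described above.
# All input figures are hypothetical placeholders, not reported numbers.

def operating_profit(revenue, inference_compute, research_compute,
                     other_opex, include_research=True):
    """Operating profit under the full-cost or ex-research-compute view."""
    costs = inference_compute + other_opex
    if include_research:
        costs += research_compute
    return revenue - costs

rev, inf, res, other = 30e9, 12e9, 40e9, 10e9  # hypothetical USD figures

print(f"ex-research view: {operating_profit(rev, inf, res, other, False)/1e9:+.0f}B")
print(f"full-cost view:   {operating_profit(rev, inf, res, other, True)/1e9:+.0f}B")
# ex-research: +8B (software-like); full cost: -32B (deep operating loss)
```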

gentic.news Analysis

This financial revelation crystallizes a trend our reporting has tracked for over a year: the industrialization of AI. The frontier is no longer defined solely by a novel transformer variant or a clever training trick, but by who can assemble and afford the largest, most efficient compute clusters. This aligns with our December 2025 coverage of the "Exaflop Era," where we noted that training runs for models like GPT-5 and Claude 4 were exceeding sustained exaflop-months of compute, a scale previously reserved for national supercomputing labs.

The reported $121B compute spend for 2028 by OpenAI is a staggering figure that contextualizes the company's aggressive fundraising and its deepening partnership with Microsoft, which provides not just capital but preferential access to Azure's AI infrastructure. Similarly, Anthropic's reliance on Amazon and Google for compute underscores that the competitive landscape is now a tripartite struggle between model labs (OpenAI, Anthropic), cloud hyperscalers (Microsoft, Google, Amazon), and chip providers (NVIDIA, AMD, and custom silicon efforts).

For practitioners, this signals a hardening reality. The era of easily replicating state-of-the-art results from academic papers on a modest cluster is over for frontier models. The barrier to entry is now fundamentally financial. This will likely accelerate the bifurcation of the AI ecosystem into a handful of well-capitalized players developing frontier models and a broader market fine-tuning and deploying these models for specific applications. The innovation pipeline may increasingly focus on inference efficiency and mixture-of-experts architectures that reduce operational costs, as the marginal return on simply scaling dense models further faces severe economic headwinds.
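To see why mixture-of-experts architectures attack operational cost, note that per-token inference compute scales with the number of active parameters (roughly 2 FLOPs per active parameter per token for a decoder forward pass). A rough sketch, with both model sizes hypothetical:

```python
# Why sparsity (mixture-of-experts) cuts serving cost: per-token compute
# scales with ACTIVE parameters (~2 FLOPs per active parameter per token
# in a decoder forward pass). Both model sizes below are hypothetical.

def flops_per_token(active_params: float) -> float:
    return 2 * active_params

dense_active = 1_000e9                  # dense model: all 1T params active
moe_total, moe_active = 1_000e9, 120e9  # MoE: 1T total, 120B active per token

ratio = flops_per_token(moe_active) / flops_per_token(dense_active)
print(f"MoE needs ~{ratio:.0%} of the dense model's per-token compute")
# => ~12%: a similar capacity class at a fraction of the serving bill
```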

Frequently Asked Questions

What does "$121B in compute spending" mean?

It refers to the projected total cost OpenAI expects to pay for access to the processing power (primarily GPUs like NVIDIA's H100/H200 or Blackwell chips) needed to train its future models and run inference for its users (like ChatGPT queries) in the year 2028. This includes costs paid to cloud providers like Microsoft Azure or for running owned/leased hardware.

Why is AI compute so expensive?

Training state-of-the-art large language models requires running thousands of specialized chips for months, consuming massive amounts of electricity. Serving inference to millions of users also requires a perpetually running, globally distributed fleet of these expensive chips. The demand for these high-end AI accelerators far exceeds supply, keeping prices extremely high.
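A rough sense of the scale involved, with every number below an illustrative assumption rather than a disclosed figure:

```python
# Order-of-magnitude cost of a single large training run; every number
# here is an illustrative assumption, not a disclosed figure.

gpus = 25_000             # accelerators in the training cluster
days = 90                 # duration of the run
usd_per_gpu_hour = 2.50   # blended rate for a high-end accelerator

run_cost = gpus * days * 24 * usd_per_gpu_hour
print(f"one training run ≈ ${run_cost / 1e6:.0f}M")  # ≈ $135M
# A frontier lab runs many such experiments per year, on top of a 24/7
# inference fleet, which is how annual spend reaches tens of billions.
```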

How can these companies survive with such huge losses?

They are currently surviving on massive venture capital and strategic investment from technology giants (e.g., Microsoft, Amazon, Google) who are betting on the long-term strategic value of controlling frontier AI. The path to sustainability assumes either revolutionary leaps in AI efficiency, the ability to charge much more for AI services, or that AI becomes so critical that loss-leading is an acceptable strategy for platform control.

What is the difference between "research compute" and other compute?

Research compute is the processing power used to train new, unknown frontier models from scratch—a high-risk, experimental cost. Other compute is used for inference, which is running already-trained models to answer user prompts. Inference is more predictable and, in theory, can be directly tied to revenue from user subscriptions or API calls.
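Because inference is metered per request, its economics reduce to a per-token margin. A minimal sketch, with all prices and throughput figures hypothetical:

```python
# Toy per-token inference margin; all prices and throughput figures are
# hypothetical, chosen only to show the shape of the calculation.

price_per_1m_tokens = 10.00      # hypothetical API price charged to users
gpu_hour_cost = 2.50             # hypothetical cost of one GPU-hour
tokens_per_gpu_hour = 1_500_000  # hypothetical serving throughput per GPU

cost_per_1m = gpu_hour_cost * 1_000_000 / tokens_per_gpu_hour
margin = 1 - cost_per_1m / price_per_1m_tokens
print(f"serving cost ≈ ${cost_per_1m:.2f}/1M tokens, gross margin ≈ {margin:.0%}")
# Unlike research compute, this cost is incurred only when revenue-generating
# requests arrive, which is why the ex-research view looks healthier.
```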


AI Analysis

The WSJ report isn't just a financial disclosure; it's a thermodynamic truth bomb for the AI industry. It validates the shift from a research-driven to a capital-driven paradigm, a trend we first analyzed in our deep-dive 'The Capital Stack: How $200B is Fueling the AI Arms Race.' The eye-watering $121B compute forecast for 2028 explains the frantic vertical integration moves we've seen: OpenAI's pursuit of its own chip projects, Microsoft's massive data center build-out, and Anthropic's tight coupling with Amazon's Bedrock. This financial reality directly contradicts the 'software-like margins' narrative often associated with AI and instead paints a picture closer to semiconductor fabrication or aerospace, industries with colossal upfront CAPEX.

For the technical community, this has profound implications. Research will be increasingly gated by compute budgets, not ideas. We should expect a surge in work focused on post-training compression, sparsity, and dynamic inference: techniques that improve the 'miles per gallon' of these expensive models. The race for alternative, cheaper hardware (like Groq's LPUs or neuromorphic chips) will intensify.

Furthermore, the dual P&L presentations highlight an ongoing tension: investors are being asked to fund a long-term moonshot (AGI) while being shown a path to profitability on today's products. This delicate balancing act will define the IPO readiness of these firms, a topic we explored in 'The Road to IPO: Can AI Labs Become Public Companies?'
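As one example of the 'miles per gallon' lever, post-training quantization shrinks a model's weight footprint roughly in proportion to bit-width, which directly reduces the minimum number of accelerators needed to host it. A rough sketch, with the model size and hardware figures as illustrative assumptions:

```python
# Rough effect of post-training quantization on serving footprint.
# Model size and per-GPU memory are illustrative assumptions.
import math

params = 400e9        # hypothetical model size (parameters)
gpu_memory_gb = 80    # memory per accelerator (an 80 GB-class GPU)
usable = 0.8          # leave ~20% headroom for KV cache and activations

for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    weights_gb = params * bytes_per_param / 1e9
    gpus = math.ceil(weights_gb / (gpu_memory_gb * usable))
    print(f"{name}: ~{weights_gb:.0f} GB of weights, >= {gpus} GPUs to hold them")
# Halving bit-width roughly halves the minimum fleet for a given model:
# one of the few levers that attacks the inference bill directly.
```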
