A version of this article originally appeared in Quartz’s members-only Weekend Brief newsletter.
Sam Altman has been complaining for months about Nvidia chip shortages slowing ChatGPT releases. OpenAI’s new $10 billion partnership with Broadcom for custom chips shows the company is finally doing something about it.
The partnership, revealed earlier this month after Broadcom’s earnings call, underscores a growing reality: Companies are desperate to escape what the industry calls the “Nvidia tax” — the chip giant’s approximately 60% gross margins on processors that have become essential for AI development.
Altman has been vocal about GPU shortages bogging down ChatGPT releases, writing on X that the company was “out of GPUs” and needed to add “tens of thousands” more to roll out new features. But OpenAI isn’t alone in its frustration. Across Silicon Valley and beyond, a push is underway to develop new chips that break Nvidia’s stranglehold. This is especially true for inference, the phase when AI systems actually answer questions or create content for users.
While Nvidia chips will continue to dominate AI training, new inference chips could save companies tens of billions of dollars and significantly reduce energy consumption. The math is compelling: Inference happens every time someone asks ChatGPT a question or generates an image, making it far more frequent than the one-time training process.
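A rough back-of-envelope sketch of why that frequency matters, using hypothetical placeholder numbers rather than figures from OpenAI or Nvidia:

```python
# Hypothetical illustration: why inference spending can dwarf training spending.
# All numbers below are placeholder assumptions, not figures from the article.

training_cost = 100e6    # assume a one-time training run costs $100 million
queries_per_day = 1e9    # assume 1 billion user queries per day
cost_per_query = 0.002   # assume $0.002 of compute per query served

annual_inference_cost = queries_per_day * 365 * cost_per_query

print(f"One-time training cost: ${training_cost:,.0f}")
print(f"Annual inference cost:  ${annual_inference_cost:,.0f}")
# Under these assumptions, a single year of inference (~$730 million) costs
# several times the one-time training run, which is why even modest per-query
# savings from cheaper inference chips add up quickly.
```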
The Broadcom chip isn’t designed to challenge Nvidia directly, according to the Wall Street Journal, but rather to “plug the gaps” in OpenAI’s hardware needs. This hybrid approach reflects the broader industry strategy — not necessarily replacing Nvidia entirely, but reducing dependency through specialized alternatives.
Companies like Positron claim their chips can deliver two to three times better performance per dollar and three to six times better energy efficiency than Nvidia’s next-generation systems. Groq, founded by Google’s former AI chip development head, claims its specialized chips can make ChatGPT run more than 13 times faster.
The big cloud providers aren’t waiting for startups to solve their Nvidia dependency. Google, Amazon, and Microsoft are all developing inference-focused chips for their internal AI tools and cloud services. These multi-year, well-funded efforts represent a direct challenge to Nvidia’s dominance in the inference market.
Even Qualcomm is returning to datacenter products after abandoning the server market in 2018. CEO Cristiano Amon recently teased plans focusing on “clusters of inference that is about high performance at very low power.”
But some are joining forces rather than fighting. Intel announced this week it will build custom chips that integrate with Nvidia’s systems, with Nvidia taking a $5 billion stake in Intel as part of the deal.
In China, where U.S. export restrictions limit access to advanced AI chips, Alibaba and Baidu have begun using internally designed chips to train AI models, partially replacing Nvidia processors. Alibaba’s AI chip is now reportedly competitive with Nvidia’s H20, the most powerful processor the company can sell in China.
India’s semiconductor ambitions add another dimension to the global competition, with 10 projects, $18 billion in investment commitments, and more than $7 billion in allocated subsidies. The Indian diaspora could prove crucial, with executives noting that a third of Nvidia’s engineers and senior leadership are Indian, potentially easing global customers’ concerns about working with new Indian chipmakers.
“When the chips are down, count on India,” Indian Prime Minister Narendra Modi said at a semiconductor conference this month.
Nvidia isn’t standing still. The company claims its latest Blackwell systems deliver 25 times better inference efficiency per watt than previous generations. But speed isn’t everything: Nvidia’s general-purpose chips can handle whatever new AI model comes next, while specialized inference chips are built around today’s models. In a fast-moving industry where AI architectures change constantly, those specialized chips could become obsolete overnight.
It’s also not just the chips. CUDA, Nvidia’s software platform for programming its processors, is what most AI developers already know how to use. Switching means retraining teams on unfamiliar tools, creating another barrier for potential defectors.
The challengers’ track record so far is also underwhelming. Google has been developing TPUs for nearly a decade, while Microsoft and Amazon have poured billions into their own chips over several years without much to show for it yet. Research shows that roughly 90% of AI papers still cite Nvidia hardware, a dominance that has barely budged despite years of well-funded competition.
Still, analysts predict that while Nvidia will supply 100% of the training market, it will capture only 50% of the inference market over the long term. That still leaves almost $200 billion in annual chip spending up for grabs by 2028, enough to motivate companies to keep trying to capture a slice of the chip pie.
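Read back as arithmetic, those figures imply a rough size for the total inference market. The sketch below simply inverts the article’s own numbers; the implied total is an inference, not an independently sourced estimate:

```python
# Back out the implied 2028 inference market from the article's figures:
# if the ~50% of inference that Nvidia is predicted NOT to capture comes to
# roughly $200 billion a year, the implied total inference market is ~$400B.
up_for_grabs = 200e9            # ~$200B/year contestable by 2028 (from the article)
nvidia_inference_share = 0.5    # analysts' long-term prediction cited above

implied_total = up_for_grabs / (1 - nvidia_inference_share)
print(f"Implied total inference chip market by 2028: ${implied_total / 1e9:.0f}B per year")
```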
The inference computing battle represents more than just technical competition. It’s about reshaping the economic structure of AI, reducing dependence on a single supplier, and making advanced AI capabilities accessible to a broader range of companies and countries. As OpenAI’s partnership with Broadcom demonstrates, even the most successful AI companies are betting that custom silicon will beat off-the-shelf dominance.