According to media reports citing multiple informed sources, OpenAI is dissatisfied with some of NVIDIA's latest artificial intelligence chips and has been actively seeking alternatives since last year, a development that could complicate the relationship between two of the most closely watched companies in the AI boom.
This strategic shift stems from OpenAI's growing focus on chips built for specific tasks within AI inference. Inference is the computation an AI model, such as the one behind the ChatGPT application, performs when responding to user questions and requests. While NVIDIA still dominates the market for chips used to train large AI models, inference is emerging as a new competitive battleground.
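For readers new to the distinction, the sketch below separates the two phases using a toy linear model in Python with NumPy. Every name and number here is illustrative only and bears no relation to OpenAI's actual models or workloads.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))           # toy "model": a single weight matrix

def forward(x):
    return x @ W                      # inference is just this forward pass

# --- Training: adjust W repeatedly against known examples
# (compute-heavy, done once up front) ---
X = rng.normal(size=(100, 4))
Y = X @ np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])  # made-up targets
for _ in range(500):
    grad = X.T @ (forward(X) - Y) / len(X)   # gradient of mean squared error
    W -= 0.1 * grad

# --- Inference: apply the frozen weights to each new request
# (latency-sensitive, repeated for every user query) ---
new_request = rng.normal(size=(1, 4))
answer = forward(new_request)
```

Training happens once and rewards raw throughput; inference runs on every user request, so its speed is what a chatbot user actually feels.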
Analysts suggest that the decision by OpenAI and other companies to seek alternatives in the inference chip market represents a significant test of NVIDIA's dominance in the AI sector.
On Monday, NVIDIA's stock closed down nearly 2.9%.
OpenAI and NVIDIA remain in negotiations over the investment.
Last September, NVIDIA said it planned to commit up to $100 billion to OpenAI as part of a deal that would grant NVIDIA equity in the startup while giving OpenAI the capital to purchase advanced chips.
In the meantime, OpenAI has reached agreements with companies such as AMD to procure GPUs that compete with NVIDIA's offerings. However, sources familiar with the matter say that OpenAI's continually evolving product roadmap has also changed the kinds of computing resources it needs, complicating the negotiations with NVIDIA and slowing their progress.
Last Saturday, NVIDIA CEO Jensen Huang downplayed reports of tensions with OpenAI, calling such claims "complete nonsense" and reaffirming NVIDIA's intention to proceed with the substantial investment in OpenAI. NVIDIA said in a statement, "Customers continue to choose NVIDIA for inference because we deliver the best performance and total cost of ownership at scale."
In a separate statement, an OpenAI spokesperson said the company relies on NVIDIA for the vast majority of its inference computing clusters and that NVIDIA offers the best performance per dollar for inference tasks.
Insiders said OpenAI is dissatisfied with the response speed of NVIDIA's hardware in certain areas, such as software development and interactions between AI and other software. OpenAI is seeking new hardware that could eventually handle roughly 10% of its inference computing needs.
Reports indicate that OpenAI had discussions with startups including Cerebras and Groq about collaborating to obtain chips offering faster inference speeds. However, NVIDIA secured a $20 billion licensing agreement with Groq, which effectively ended the talks between OpenAI and Groq.
Executives in the chip industry suggest that NVIDIA's swift move to secure the deal with Groq appears aimed at consolidating its technology portfolio and enhancing competitiveness within the rapidly evolving AI sector. NVIDIA stated in its announcement that Groq's intellectual property is highly complementary to NVIDIA's product roadmap.
NVIDIA's GPUs are exceptionally well suited to processing the massive datasets required to train large AI models like the ones behind ChatGPT, and they have formed a critical foundation of the global AI boom to date. As the technology matures, however, the focus is increasingly shifting toward inference, drawing conclusions from already-trained models, which could mark a new phase for AI.
Since last year, while seeking GPU alternatives, OpenAI has focused in particular on chipmakers that integrate large amounts of fast memory, known as SRAM (static random-access memory), directly onto a single piece of silicon. Packing as much of this expensive SRAM as possible onto each chip can provide speed advantages when chatbots and other AI systems are handling millions of user requests.
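A back-of-the-envelope sketch shows why capacity is the catch with this approach. The model size and per-chip SRAM figure below are assumptions chosen for illustration, not disclosed specifications of any vendor.

```python
# Rough capacity check: can a model's weights live entirely in on-chip SRAM?
# All figures are illustrative assumptions, not vendor-confirmed specs.
params = 70e9                 # assume a 70-billion-parameter model
bytes_per_param = 2           # assume 16-bit (2-byte) weights
weights_gb = params * bytes_per_param / 1e9    # 140 GB of weights

sram_per_chip_gb = 0.23       # assumed on-chip SRAM per accelerator
chips_needed = weights_gb / sram_per_chip_gb   # ~609 chips just to hold the weights

print(f"{weights_gb:.0f} GB of weights -> ~{chips_needed:.0f} chips of SRAM")
```

The trade-off: SRAM is far faster than external DRAM but far smaller and costlier per gigabyte, so these architectures spread a model across many chips to keep every weight in fast memory.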
Inference places higher demands on memory than training does because the chips spend relatively more time fetching data from memory than performing mathematical operations. NVIDIA's and AMD's GPU designs rely on external memory, and the extra data-transfer time can slow users' interactions with chatbots.
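A rough estimate illustrates why. When a model generates text one token at a time, each token requires streaming essentially all of the weights from memory while performing only about two operations per weight, so memory bandwidth, not arithmetic, sets the pace. The bandwidth and compute figures below are assumptions for illustration, not the specifications of any particular chip.

```python
# Why token-by-token inference is memory-bound: a roofline-style estimate.
# All numbers are illustrative assumptions.
params = 70e9                     # model parameters
bytes_per_param = 2               # 16-bit weights
flops_per_token = 2 * params      # ~2 operations per parameter per generated token

mem_bandwidth = 3.0e12            # assumed external-memory bandwidth, bytes/s
compute_peak = 1.0e15             # assumed peak compute, operations/s

# Time to stream the weights once vs. time to do the math (one request at a time):
t_memory = params * bytes_per_param / mem_bandwidth   # ~47 ms per token
t_compute = flops_per_token / compute_peak            # ~0.14 ms per token

print(f"memory: {t_memory*1e3:.1f} ms/token vs compute: {t_compute*1e3:.2f} ms/token")
```

Under these assumed numbers the chip spends over 300 times longer waiting on memory than computing, which is exactly the gap that on-chip SRAM designs aim to close.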
According to sources, within OpenAI this issue is particularly evident in Codex, a code-generation product the company is actively promoting. OpenAI staff have attributed some of Codex's performance shortcomings to its NVIDIA GPU-based hardware.
Last month, OpenAI CEO Sam Altman noted that customers using OpenAI's programming models "pay a high premium for the speed of coding work." One way OpenAI is addressing this demand is through its recently announced cooperation agreement with Cerebras. For the average ChatGPT user, speed is not as critical a factor.
In contrast, competing products like Anthropic's Claude and Google's Gemini rely more heavily on Google's self-developed TPUs for deployment. TPUs are specifically designed for the computations required by inference and may offer performance advantages over general-purpose AI chips like NVIDIA's GPUs.
After OpenAI made its reservations about NVIDIA's technology clear, NVIDIA approached companies specializing in high-SRAM chips, including Cerebras and Groq, to discuss possible acquisitions. Informed sources said Cerebras declined the acquisition offer and instead entered the commercial partnership with OpenAI that was announced last month.
Media reports indicated that Groq had also held discussions with OpenAI about providing computing power and attracted investor interest for a funding round that would value the company at approximately $14 billion.
However, by December NVIDIA had secured a license to Groq's technology in a non-exclusive, all-cash deal. Although the agreement leaves other companies free to license Groq's technology as well, Groq is now shifting its focus toward selling cloud software, as NVIDIA has reportedly hired away Groq's chip design staff.