Stepping up its push to compete with Nvidia in the market for chips used for artificial-intelligence training and inference applications, Intel this morning unveiled the Gaudi 3, an AI accelerator chip. The company contends it's both faster and more efficient than Nvidia's H100 GPUs -- and "highly competitive" with Nvidia's recently unveiled Blackwell class GPUs.
Gaudi 3 will start shipping later this year, replacing Intel's current Gaudi 2 chip. Intel says it has commitments from four of the most important players in AI servers -- Dell Technologies, Hewlett Packard Enterprise, Super Micro Computer and Lenovo -- to build Gaudi 3-based systems.
Intel made the announcement Tuesday at its Intel Vision customer event, taking place this week in Phoenix, just down the road from the company's new chip fabs in Chandler, Ariz.
Intel asserts that Gaudi 3 is up to 1.7 times faster than Nvidia's H100 at training large language models, and up to 1.3 times faster at inference than Nvidia's H200, a chip aimed specifically at inference rather than training. Against the H100, Gaudi 3 is up to 1.5 times faster for inference applications, Intel says.
The company also says that Gaudi 3 is up to 2.3 times more power efficient than the Nvidia H100 at running large language models.
Intel said it started sampling Gaudi 3 chips for air-cooled systems in the first quarter, with liquid-cooled versions offered this quarter. The company will begin volume production of the air-cooled version in the third quarter, with the liquid-cooled version shipping in the fourth quarter.
Asked in a media briefing about how Gaudi 3 will compare with Nvidia's new and speedier Blackwell chips, Intel said that "we do expect it to be highly competitive," adding that Gaudi 3 is "a strong offering" that provides customers with a compelling alternative to Nvidia GPUs at reasonable total cost of ownership and high power efficiency.
The company also announced a new server CPU -- the Xeon 6 -- which it says offers "high AI performance, cloud scalability and energy efficiency spanning data center, cloud, and edge workloads."