Ampere and Qualcomm aren’t the most obvious of partners. Both, after all, offer Arm-based chips for running data center servers (though Qualcomm’s largest market remains mobile). But as the two companies announced today, they are now combining forces to offer an AI-focused server that uses Ampere’s CPUs and Qualcomm’s Cloud AI 100 Ultra chips for AI inferencing, that is, for running models rather than training them.
Like every other chip manufacturer, Ampere is looking to profit from the AI boom. The company’s focus, however, has always been on fast and power-efficient server chips, so while it can use the Arm IP to add some AI-acceleration features to its chips, that’s not necessarily a core competency. That’s why Ampere decided to work with Qualcomm (and with Supermicro to integrate the two solutions), Ampere CTO Jeff Wittich tells me.
“The idea here is that while I’ll show you some great performance for Ampere CPUs running AI inferencing on just the CPUs, if you want to scale out to even bigger models — multi-100 billion parameter models, for instance — just like all the other workloads, AI isn’t one size fits all,” Wittich told TechCrunch. “We’ve been working with Qualcomm on this solution, combining our super efficient Ampere CPUs to do a lot of the general purpose tasks that you’re running in conjunction with inferencing, and then using their really efficient cards, we’ve got a server-level solution.”
As for partnering with Qualcomm, Wittich said that Ampere wanted to put together best-of-breed solutions.
“[It’s a r]eally good collaboration that we’ve had with Qualcomm here,” he said. “This is one of the things that we’ve been working on, and I think we share a lot of really similar interests, which is why I think that this is really compelling. They’re building really, really efficient solutions in a lot of different parts of the market. We’re building really, really efficient solutions on the server CPU side.”
The Qualcomm partnership is part of Ampere’s annual roadmap update. Part of that roadmap is the new 256-core AmpereOne chip, built using a modern 3nm process. Those new chips are not quite generally available yet, but Wittich says they are ready at the fab and should roll out later this year.
On top of the additional cores, the defining feature of this new generation of AmpereOne chips is its 12-channel DDR5 RAM, which lets Ampere’s data center customers better tune memory access to their users’ needs.
The sales pitch here isn’t just performance, though, but the power consumption and cost to run these chips in the data center. That’s especially true when it comes to AI inferencing, where Ampere likes to compare its performance against Nvidia’s A10 GPUs.
It’s worth noting that Ampere is not sunsetting any of its existing chips in favor of these new ones. Wittich stressed that the older chips still have plenty of use cases.
Ampere also announced another partnership today. The company is working with NETINT to build a joint solution that pairs Ampere’s CPUs with NETINT’s video processing chips. This new server will be able to transcode 360 live video channels in parallel, all while also using OpenAI’s Whisper speech-to-text model to subtitle 40 streams.
“We started down this path six years ago because it is clear it is the right path,” Ampere CEO Renee James said in today’s announcement. “Low power used to be synonymous with low performance. Ampere has proven that isn’t true. We have pioneered the efficiency frontier of computing and delivered performance beyond legacy CPUs in an efficient computing envelope.”