Qualcomm Centriq Aims At Intel Xeon With Competitive Performance And Low Power
Opinions expressed by Forbes Contributors are their own.
Paul Teich, Contributor
Qualcomm formally launched its Centriq 2400 server system-on-chip (SoC) this week in San Jose, California, in the heart of Silicon Valley. While Qualcomm initially disclosed its intent to sell server SoCs two years ago, its ecosystem partners and customers universally mentioned working with Qualcomm for three years or more to bring the Centriq product line to market.
The technical merits of Qualcomm’s Centriq 2400 boil down to a very simple comparison – Centriq 2400 performance per hardware thread is solidly in Intel Xeon Scalable performance per thread territory, while consuming a lot less power. And yes, it also costs a lot less.
For example, Intel has historically been very proud of Xeon’s floating-point performance. Using preliminary SPECfp_rate2006 numbers (“estimated” because they have not yet gone through SPEC’s reporting process) in a 48 hardware thread comparison, Qualcomm’s Centriq 2460 posted a score of 607 (48 single-threaded cores with a list price of $1,995) to Xeon Scalable Platinum 8160’s score of 534 (24 dual-threaded cores with a recommended list price of $4,700 as of 11/9/2017). That’s a roughly 14% better score, without considering price or power consumption.
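The comparison can be reproduced from the figures quoted above. The short Python sketch below is only an illustration: the "score per $1,000 of list price" metric and all variable names are assumptions introduced here, not something Qualcomm or SPEC reports, and the scores themselves remain preliminary estimates.

```python
# Figures as quoted in the article; scores are preliminary SPECfp_rate2006 estimates.
chips = {
    "Centriq 2460": {"score": 607, "list_price_usd": 1995},
    "Xeon Platinum 8160": {"score": 534, "list_price_usd": 4700},
}


def score_per_kilodollar(chip):
    """Benchmark score per $1,000 of list price (an illustrative metric, not a SPEC one)."""
    return chip["score"] / (chip["list_price_usd"] / 1000)


centriq = chips["Centriq 2460"]
xeon = chips["Xeon Platinum 8160"]

# Raw score advantage, ignoring price and power.
score_advantage_pct = (centriq["score"] / xeon["score"] - 1) * 100

# How much more benchmark score each list-price dollar buys on Centriq.
value_ratio = score_per_kilodollar(centriq) / score_per_kilodollar(xeon)

print(f"Score advantage: {score_advantage_pct:.1f}%")
print(f"Score per $1,000, Centriq vs. Xeon: {value_ratio:.2f}x")
```

List price is only one input to what a cloud provider actually pays, so the ratio is best read as a rough indicator rather than a total-cost comparison.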
While Qualcomm refers to Centriq as its fifth generation of ARM core design, Centriq is legitimately Qualcomm’s second-generation ARM 64-bit server design. Centriq 2400 is not a rushed “first silicon” product. While Qualcomm showed a couple of synthetic benchmarks to prove a couple of architectural points, everyone I talked to at the event is evaluating Centriq 2400 in production environments on real workloads.
Converting evaluations into production deployments may take up to a year for some customers. This is normal, as service providers must be absolutely sure that new hardware can survive in their production environment. Qualcomm’s brand and commitment get it in the door, but Centriq must survive grueling evaluation cycles before cloud service providers deploy it at any kind of scale. As of the launch event, Qualcomm stated it had started taking orders for production Centriq 2400 parts. That starts the evaluation cycle.
In their own words
A lot of partners and potential customers spoke at the launch event. Press release quotes must be approved by legal and marketing departments; I find that on-stage ad libs say much more. Here are the top quotes I paid attention to:
“Let me congratulate you on launching Centriq 2400. It’s the highest performance ARM processor and can actually run workloads that I care about.” –Dr. Leendert van Doorn, Distinguished Engineer, Microsoft Azure
Dr. van Doorn doesn’t mince words, ever. His team has evaluated all the available ARM server products. Earlier this year, Microsoft gave a nod to Qualcomm’s Centriq, Cavium’s ThunderX2, AMD’s EPYC and Intel’s Xeon Scalable processors as future processor alternatives for specific workloads in its Azure public cloud. Today’s launch fills in a final, vital piece of that competitive field.
Dr. Leendert van Doorn and Anand Chandrasekher at Qualcomm Centriq launch
“In our application we get basically the same performance that we get from the Intel chips that we are deploying in the field right now with less than half of the usage in terms of power. We cut our [server] power bill in half.” … “If you have a workload that is something like CloudFlare’s, then it is a no-brainer…” –Matthew Prince, CEO, CloudFlare
I have never seen a CEO of a cloud services company bounce around a stage with such enthusiasm for a new processor. Mr. Prince looks at Qualcomm’s Centriq as a tool to improve his bottom line and his quality of service. He showed his real-world NGINX workload performance numbers to back up his enthusiasm. Read the information-packed slide he presented (below, based on engineering samples); it is stunning.
Matthew Prince at Qualcomm Centriq launch
“We are preparing to do first integration and testing of this, we’ll start it and will actually publish all of the results of the testing of this in our environment. And then will expand that to multiple ODMs and OEMs once we’ll be able to release this, when it’s complete.” –Yuval Bachar, President and Chairman of the Board, Open19 Foundation; Global Data Center Infrastructure Architect, LinkedIn
LinkedIn created and is funding the Open19 Foundation to open its datacenter architecture as an alternative to Open Compute Project (OCP) standards. (‘19’ refers to the rack width in inches.) Mr. Bachar was very enthusiastic about Centriq, both for its potential in LinkedIn workloads and in the broader Open19 ecosystem he also works with.
Yuval Bachar at Qualcomm Centriq launch
What differentiates Qualcomm Centriq?
It took me a little while to warm up to exactly why Qualcomm was making such a big deal about manufacturing Centriq in a 10 nm process. We now know that Samsung is manufacturing Centriq in its advanced 10 nm 3D FinFET process.
Anand Chandrasekher holding a Centriq wafer manufactured by Samsung
Qualcomm’s single-threaded ‘Falkor’ core implements the 64-bit ARMv8 instruction set – and only the 64-bit instructions. Qualcomm jettisoned all the legacy 32-bit logic to streamline the core as much as possible. This is a sound strategy for an instruction set entering a new market that has no legacy code to support.
The Centriq SoC contains 18 billion transistors in just under 400 square millimeters of silicon. That is the baseline contribution of the 10 nm FinFET process – density. Those transistors implement 24 pairs of Falkor cores, for 48 total cores with an equal number of hardware threads (ARM cores are not multithreaded). Each pair of cores shares 512 KB of L2 cache, and so the SoC contains a total of 12 MB of L2 cache. There is also 60 MB of unified L3 cache.
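As a quick check of the arithmetic, the totals follow directly from the per-pair figures. The Python sketch below uses only the numbers quoted in this article; the constant names are illustrative, and "just under 400 square millimeters" is treated as an upper bound of 400, so the density it prints is a lower bound rather than a published specification.

```python
# Figures from the article; names are illustrative.
CORE_PAIRS = 24
CORES_PER_PAIR = 2
L2_PER_PAIR_KB = 512
L3_MB = 60
TRANSISTORS = 18e9
DIE_AREA_MM2_UPPER = 400  # "just under 400" mm^2, so treat 400 as an upper bound

total_cores = CORE_PAIRS * CORES_PER_PAIR

# 24 pairs x 512 KB, converted from KB to MB.
total_l2_mb = CORE_PAIRS * L2_PER_PAIR_KB / 1024

# Unified L3 averaged over all cores (L3 is shared, so this is only an average).
l3_per_core_mb = L3_MB / total_cores

# Dividing by the upper-bound area gives a lower bound on transistor density.
min_density_millions_per_mm2 = TRANSISTORS / DIE_AREA_MM2_UPPER / 1e6

print(f"Cores: {total_cores}")
print(f"Total L2: {total_l2_mb:.0f} MB")
print(f"L3 per core (average): {l3_per_core_mb:.2f} MB")
print(f"Density: at least {min_density_millions_per_mm2:.0f} million transistors per mm^2")
```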
Those 18 billion transistors run at a base frequency of 2.2 GHz with a thermal design power (TDP) of 120 W. Qualcomm does not rate TDP at some average value. Datacenters plan for full utilization of their servers, so Qualcomm measures Centriq TDP at high utilization. For many real-world workloads, customers have already measured Centriq drawing much less power than that. For example, CloudFlare measured Centriq’s power consumption at 72 W running NGINX.
Centriq’s turbo clock mode is different, as well. Qualcomm calls Centriq’s turbo mode “constant peak frequency.” Centriq boosts the frequency of each individual core to a maximum of 2.6 GHz, and all the cores in an SoC can be boosted to that maximum if a workload’s power consumption profile permits. As a cloud runs a workload, utilization of individual server nodes may vary, but Centriq’s peak core frequency within a server node will not vary inside its 120 W power envelope. That delivers a predictable performance per core for cloud service providers, something they don’t get with Intel Xeon Scalable’s highly variable individual core turbo frequencies.
Samsung’s 10 nm process technology enables Centriq to deliver high server workload performance under high utilization with consistent low power consumption.
“We feel confident that Qualcomm’s continued investment in this domain will yield a competitive hardware and software ecosystem across multiple generations.” –Dr. Weifeng Zhang, Senior Director, Alibaba Infrastructure Services Group
This is Alibaba’s way of recognizing Qualcomm’s commitment to Centriq’s future roadmap. While the launch event was about the journey to get to launch and the competitiveness of the first production generation, all the potential customers expect to see a long-term commitment to keep the Centriq product line evolving and competitive.
On the system research and development (R&D) front, Qualcomm is working with HPE and the Gen-Z Consortium on new memory interconnect technologies. HPE has committed to ship early access Cloudline platforms to customers in early 2018.
Mr. Chandrasekher divulged code names for Qualcomm’s next generation Centriq core and SoC – the core is “Saphira” and the SoC is “Firetail.” (Centriq 2400 was code named “Amberwing.”) He also hinted at a next generation beyond Saphira/Firetail, but did not elaborate.
Creating a new server product line from scratch is not a sprint; it is a marathon. Qualcomm demonstrated a depth of commitment to the server market at its Centriq product line launch. We will watch closely for Centriq production deployments in 2018.
-- The author and members of the TIRIAS Research staff do not hold equity positions in any of the companies mentioned. TIRIAS Research tracks and consults for companies throughout the electronics ecosystem from semiconductors to systems and sensors to the cloud.