From: toccodolce | 5/14/2024 7:23:28 AM | GREASING THE SKIDS TO MOVE AI FROM INFINIBAND TO ETHERNET
Just about everybody, including Nvidia, thinks that in the long run, most people running most AI training and inference workloads at any appreciable scale – hundreds to millions of datacenter devices – will want a cheaper alternative for networking AI accelerators than InfiniBand.
While Nvidia has argued that InfiniBand represents only 20 percent of the cluster cost and that it boosts the performance of AI training by 20 percent – and therefore pays for itself – you still have to come up with that 20 percent of the cluster cost, which is considerably higher than the 10 percent or lower that is normal for clusters based on Ethernet. Ethernet, for its part, has feeds and speeds that, on paper and often in practice, make it a slightly inferior technical choice.
But, thanks in large part to the Ultra Ethernet Consortium, the several issues with Ethernet running AI workloads are going to be fixed, and we think this will also help foment greater adoption of Ethernet for traditional HPC workloads – well above and far beyond the adoption of the Cray-designed “Rosetta” Ethernet switch and “Cassini” network interface cards that comprise the Slingshot interconnect from Hewlett Packard Enterprise, and not counting the machines in the middle of the bi-annual Top500 rankings of “supercomputers” that do not really do either HPC or AI as their day jobs and are a publicity stunt by vendors and nations.
How Ethernet is evolving was the most important topic on the most recent call with Wall Street by Arista Networks, which was going over its financial results for the first quarter of 2024, ended in March.
As we previously reported, Meta Platforms is in the process of building two clusters with 24,576 GPUs each, one based on Nvidia’s 400 Gb/sec Quantum 2 InfiniBand (we presume) and one built with Arista Networks’ flagship 400 Gb/sec 7800R3 AI Spine (we know), which is a multi-ASIC modular switch with 460 Tb/sec of aggregate bandwidth that supports packet spraying (a key technology to make Ethernet better at the collective network operations that are central to both AI and HPC). The 7800R3 spine switch is based on Broadcom’s Jericho 2c+ ASIC, not the more AI-tuned Jericho 3AI chip that Broadcom is aiming more directly at Nvidia’s InfiniBand and that is still not shipping in volume in products as far as we know.
The interconnect being built by Arista Networks for the Ethernet cluster at Meta Platforms also includes Wedge 400C and Minipack2 network enclosures that adhere to the Open Compute Project designs favored by Meta Platforms. (The original Wedge 400 was based on Broadcom’s 3.2 Tb/sec “Tomahawk 3” StrataXGS ASIC, and the Wedge 400C used as the top of rack switch in the AI cluster is based on a 12.8 Tb/sec Silicon One ASIC from Cisco Systems. The Minipack2 is based on Broadcom’s 25.6 Tb/sec “Tomahawk 4” ASIC.) It looks like the Wedge 400C and Minipack2 are being used to cluster server hosts and the 7800R3 AI Spine is being used to cluster the GPUs, but Meta Platforms is not yet divulging the details. (We detailed the Meta Platforms fabric and the switches that create it back in November 2021.)
Meta Platforms is the flagship customer for Ethernet in AI, and Microsoft will be as well. But others are also leading the charge. Arista Networks revealed in February that it has design wins for fairly large AI clusters. Jayshree Ullal, co-founder and chief executive officer at the company, provided some insight into how these wins are progressing towards money and how they set Arista Networks up to reach its stated goal of $750 million in AI networking revenue by 2025.
“This cluster,” Ullal said on the call referring to the Meta Platforms cluster, “tackles complex AI training tasks that involve a mix of model and data parallelization across thousands of processors, and Ethernet is proving to offer at least 10 percent improvement of job completion performance across all packet sizes versus InfiniBand. We are witnessing an inflection of AI networking and expect this to continue throughout the year and decade. Ethernet is emerging as a critical infrastructure across both front-end and back-end AI data centers. AI applications simply cannot work in isolation and demand seamless communication among the compute nodes consisting of back-end GPUs and AI accelerators, as well as the front-end nodes like the CPUs alongside storage.”
That 10 percent improvement in completion time is something that is being done with the current Jericho 2c+ ASIC as the spine in the network, not the Jericho 3AI.
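To make the stakes of these dueling claims concrete, here is a minimal back-of-the-envelope sketch in Python. Only the 20 percent and 10 percent cost shares and the two speedup claims come from the article; the $8 million compute spend is a hypothetical figure chosen purely for illustration.

    # Back-of-the-envelope comparison of the two competing claims.
    # The compute spend is hypothetical; only the ratios come from the article.
    compute = 8_000_000                  # hypothetical GPU/compute spend, dollars

    ib_total = compute / (1 - 0.20)      # InfiniBand fabric is ~20% of cluster cost
    eth_total = compute / (1 - 0.10)     # Ethernet fabric is ~10% of cluster cost

    # Nvidia's claim: InfiniBand runs training jobs 20 percent faster.
    print("Nvidia math, $ per unit of work: "
          f"{ib_total / 1.20:,.0f} (IB) vs {eth_total / 1.00:,.0f} (Eth)")

    # Arista's claim: Ethernet job completion is ~10 percent faster.
    print("Arista math, $ per unit of work: "
          f"{ib_total / 1.00:,.0f} (IB) vs {eth_total / 1.10:,.0f} (Eth)")

Under Nvidia's numbers, InfiniBand comes out roughly 6 percent cheaper per unit of work; under Arista's numbers, Ethernet comes out roughly 19 percent cheaper. Which benchmark you believe determines which fabric pays for itself.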
Later in the call, Ullal went into a little more detail about the landscape between InfiniBand and Ethernet, which is useful perspective.
“Historically, as you know, when you look at InfiniBand and Ethernet in isolation, there are a lot of advantages of each technology,” she continued. “Traditionally, InfiniBand has been considered lossless. And Ethernet is considered to have some loss properties. However, when you actually put a full GPU cluster together along with the optics and everything, and you look at the coherence of the job completion time across all packet sizes, data has shown – and this is data that we have gotten from third parties, including Broadcom – that just about in every packet size in a real-world environment, comparing those technologies, the job completion time of Ethernet was approximately 10 percent faster. So, you can look at this thing in a silo, and you can look at it in a practical cluster. And in a practical cluster, we are already seeing improvements on Ethernet. Now, don’t forget, this is just Ethernet as we know it today. Once we have the Ultra Ethernet Consortium and some of the improvements you are going to see on packet spraying and dynamic load balancing and congestion control, I believe those numbers will get even better.”
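For readers unfamiliar with the packet spraying Ullal mentions, here is a toy sketch in Python – not any vendor's implementation – of why it matters. Classic ECMP hashes an entire flow onto one link, so a handful of elephant flows can pile up on the same link while others sit idle; spraying distributes individual packets across all links.

    # Toy model: six large flows across eight links, ECMP hashing vs spraying.
    import random

    random.seed(7)                      # fixed seed so the sketch is repeatable
    LINKS = 8
    flows = [random.randint(500, 1500) for _ in range(6)]   # packets per flow

    # Classic ECMP: every packet of a flow lands on one hashed link.
    ecmp = [0] * LINKS
    for pkts in flows:
        ecmp[random.randrange(LINKS)] += pkts   # random link stands in for the hash

    # Packet spraying: each packet independently round-robins across all links.
    spray = [0] * LINKS
    for p in range(sum(flows)):
        spray[p % LINKS] += 1

    print("ECMP busiest link:   ", max(ecmp), "packets")
    print("Sprayed busiest link:", max(spray), "packets")

The sprayed fabric loads every link almost evenly, while the hashed fabric routinely leaves some links congested and others empty, which is exactly what hurts collective operations. The UEC work on packet spraying, dynamic load balancing, and congestion control that Ullal cites aims to bring this behavior to standard Ethernet.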
And then Ullal talked about the four AI cluster deals that Arista Networks won versus InfiniBand out of five major deals it was participating in. (Presumably InfiniBand won the other deal.)
“In all four cases, we are now migrating from trials to pilots, connecting thousands of GPUs this year, and we expect production in the range of 10K to 100K GPUs in 2025,” Ullal continued. “Ethernet at scale is becoming the de facto network and premier choice for scale-out AI training workloads. A good AI network needs a good data strategy delivered by a highly differentiated EOS and network data lake architecture. We are therefore becoming increasingly constructive about achieving our AI target of $750 million in 2025.”
If Ethernet costs one half to one third as much end to end – including optics and cables and switches and network interfaces – and can do the work faster and, in the long run, with more resilience and at a larger scale for a given number of network layers, then InfiniBand comes under pressure. It already has, if the ratio of four wins out of five on fairly large GPU clusters that Arista Networks cites is representative. Clearly the intent of citing these numbers is to convince us that it is representative, but the market will ultimately decide.
We said this back in February and we will say it again: We think Arista Networks is low-balling its expectations, and Wall Street seems to agree. The company did raise its guidance for revenue growth in 2024 by two points, to between 12 percent and 14 percent, and we think optimism about the uptake of Ethernet for AI clusters – and eventually maybe HPC clusters – is playing a part here.
But here is the fun bit of math: For every $750 million that Arista Networks makes in AI cluster interconnect sales, Nvidia might be losing $1.5 billion to $2.25 billion. In the trailing twelve months, we estimate that Nvidia had $6.47 billion in InfiniBand networking sales against $39.78 billion in GPU compute sales in the datacenter. At a four to one take-out ratio and a steady state market, Nvidia gets to keep about $1.3 billion and the UEC collective gets to keep $1.7 billion to $2.6 billion, depending on how the Ethernet costs shake out. Multiply by around 1.8X to get the $86 billion or so we expect Nvidia to book in datacenter revenues in 2024 and you see the target for InfiniBand sales is more like $12 billion if everything stays the same.
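Here is that arithmetic laid out as a quick sketch in Python, using only the article's own assumptions: a four-to-one take-out ratio against InfiniBand, and Ethernet coming in at one third to one half of InfiniBand's cost.

    # The article's take-out arithmetic, using its stated assumptions.
    ib_ttm = 6.47e9                      # Nvidia trailing-twelve-month InfiniBand sales

    nvidia_keeps = ib_ttm / 5            # Nvidia wins one deal in five
    contested = ib_ttm - nvidia_keeps    # spend that shifts to Ethernet

    # Ethernet delivered at one third to one half of InfiniBand's cost.
    eth_low, eth_high = contested / 3, contested / 2

    print(f"Nvidia keeps:            ${nvidia_keeps / 1e9:.1f}B")
    print(f"UEC vendors book:        ${eth_low / 1e9:.1f}B to ${eth_high / 1e9:.1f}B")
    print(f"Removed from the system: ${(contested - eth_high) / 1e9:.1f}B "
          f"to ${(contested - eth_low) / 1e9:.1f}B")
    print(f"InfiniBand target at 1.8X scale: ${ib_ttm * 1.8 / 1e9:.0f}B")

Note the third line: somewhere between $2.6 billion and $3.5 billion of networking spend simply vanishes rather than changing hands, which is the point made below.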
There is plenty of market share for UEC members to steal, but they will steal it by removing revenue from the system, like Linux did to Unix, not converting that revenue from one type of technology to another. That savings will be plowed back into GPUs.
In the meantime, Arista Networks turned in a pretty decent quarter with no real surprises. Product sales were up 13.4 percent to $1.33 billion, and services revenues rose by 35.3 percent to $242.5 million. Software subscriptions, which are counted within products, were $23 million, so total annuity-like revenues accounted for $265.6 million, up 45.6 percent year on year. Total revenues were up 16.3 percent to $1.57 billion. Net income rose by 46.1 percent to $638 million, and Arista Networks exited the quarter with $5.45 billion in cash and what we estimate is somewhere around 10,000 customers. We think Arista had about $1.48 billion in datacenter revenues and an operating income of around $623 million for this business, which is what we care about. Campus and edge are interesting, of course, and we hope they will grow and be profitable, too, for Arista Networks and others.
To: toccodolce who wrote (833) | 5/14/2024 7:49:43 AM | From: toccodolce | POET Announces Design Win and Collaboration with Foxconn Interconnect Technology for High-speed AI Systems
TORONTO, May 14, 2024 (GLOBE NEWSWIRE) -- POET Technologies Inc. (“POET” or the “Company”) (TSX Venture: PTK; NASDAQ: POET), the designer and developer of the POET Optical Interposer™ and Photonic Integrated Circuits (PICs) for the data center, telecommunication and artificial intelligence markets, today announced that Foxconn Interconnect Technology (“FIT”), a market leader of interconnect solutions for communication infrastructure and several other large, high-growth markets, has selected POET’s optical engines, which are silicon photonics integrated circuits (Silicon PIC), for its 800G and 1.6T optical transceiver modules.
POET and FIT have entered into a collaboration to develop 800G and 1.6T pluggable optical transceiver modules using POET optical engines with an aim to address the growth in demand from cutting-edge AI applications and high-speed data center networks. As part of the collaboration, POET will develop and supply its silicon photonics integrated circuit optical engines based on the patented POET Optical Interposer™ technology and FIT, one of the world’s leading manufacturers of interconnect technologies, will design and supply the high-speed pluggable optical transceivers for delivery to some of the largest end customers in the world.
"The growth in demand from emerging applications such as artificial intelligence and machine learning requires continuous innovation to keep pace with power and cost requirements,” said Joseph Wang, CTO at FIT. “We are excited to partner with POET on this development. POET’s hybrid-integration platform technology will enable us to use best-of-breed components and ramp to high volume at a much faster pace and in a cost-efficient manner.”
“POET’s vision is to ‘semiconductorize’ photonics by integrating electronic and photonic components on the interposer to enable wafer-scale assembly,” said Dr. Suresh Venkatesan, POET’s Chairman and CEO. “We are honored to work with an industry leader like FIT, capable of ramping to high volume production with its expertise in transceiver design and manufacturing. We look forward to expanding our collaboration to future projects once this initial project is complete.”
POET’s transmit optical engines integrate externally modulated lasers (EMLs), EML drivers, monitor photodiodes, optical waveguides, thermistors and an optical multiplexer, where applicable, onto an optical interposer-based PIC. The receive optical engines integrate high-speed photodiodes, transimpedance amplifiers, optical waveguides and optical demultiplexers, where applicable. All components are passively assembled on the interposer at wafer scale using standard pick-and-place semiconductor equipment. Passive alignment of the photonic elements and the use of high-speed RF traces between the electronic and photonic components to avoid wire-bonds are two hallmarks of the technology.
POET expects to complete the design of the optical engines for FIT by Q3 2024 and start optical engine production at its joint venture, Super Photonics Xiamen, by Q4 2024.
The global optical transceiver market for 800G and 1.6T speeds is projected to grow at a CAGR of 33%, from $2.5 billion in 2024 to $10.5 billion by 2029, according to LightCounting’s market forecast.
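As a quick sanity check, those forecast endpoints are consistent with the stated growth rate:

    # Sanity check: $2.5B growing at a 33% CAGR for five years (2024 to 2029).
    start, cagr, years = 2.5e9, 0.33, 5
    print(f"${start * (1 + cagr) ** years / 1e9:.1f}B")   # prints $10.4B, i.e. ~$10.5B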
To: SGJ who wrote (835) | 5/14/2024 8:38:51 AM | From: toccodolce | Not only are they huge, but Foxconn is the largest contract electronics manufacturer in the world.
Latest NR from Foxconn
Foxconn upbeat on AI demand, stands by Sharp following writedown
- Q1 net profit T$22.01 bln vs T$29.31 bln analysts' forecast
- Foxconn expects Q2 revenue to grow significantly
- Reaffirms commitment to Sharp, saying "worst is behind"
TAIPEI, May 14 (Reuters) - Apple (AAPL.O) supplier Foxconn (2317.TW) said on Tuesday it remained confident about strong AI server demand this year driving revenue, and pledged to stand by Japan's Sharp after taking a large, profit-impacting writedown last year. Foxconn, the world's largest contract electronics maker and Apple's top iPhone manufacturer, said on an earnings call it expected flat consumer electronics demand, but reiterated it saw significant growth in 2024 revenue given the artificial intelligence (AI) applications boom.
"The visibility for this year has improved compared to in March, mainly thanks to strong AI server demand," company spokesman James Wu told a post-earnings conference call, pointing to a better business outlook but without providing detailed numbers. Foxconn said it expects revenue for the second quarter to grow significantly from a year earlier, broadly in line with previous guidance, with revenue for smart computer electronics likely to be flattish.
It also forecast demand for consumer electronics to be flat this year. It does not provide numerical guidance. For the first three months of 2024, Foxconn reported a 72% rise in profit coming off a low base from the same period a year earlier, but the growth was lower than expected. Apple's quarterly results and forecast beat modest expectations this month, and CEO Tim Cook said revenue growth would return in the current quarter.
In a separate statement, Foxconn, whose earnings took a hit last year from a T$17.3 billion ($533.9 million) writedown related to its 34% stake in Sharp Corp (6753.T), said it was committed to the Japanese electronics maker, describing it as an "important asset". "The worst is behind Sharp. Its future only gets better from here," Foxconn Chairman Young Liu said, adding that the Japanese company's Sakai factory would be transformed into an AI data centre.
Liu did not appear on the earnings call. Foxconn said he was in Europe on a business trip, but did not give details.

The Taiwanese company said net profit for the January-March quarter rose to T$22.01 billion from T$12.8 billion in the same period a year earlier, when earnings were hit by the Sharp writedown. While Foxconn's quarterly profit missed the T$29.31 billion forecast by analysts, it was the firm's third consecutive quarterly profit rise. In the first quarter, consumer electronics including smartphones accounted for 48% of its revenue, while cloud and networking products, including servers, contributed 28%. The company, formally called Hon Hai Precision Industry Co Ltd, said in March that it expected a significant rise in revenue this year driven by booming AI server demand.

Foxconn also wants to replicate the success it has had with iPhones with electric vehicles (EVs), saying on the call it expected EV sales to be expanded to markets including Southeast Asia, the United States and Europe, though it did not provide a time frame. Wu said recent price cuts for EVs have presented an opportunity for Foxconn, which he said was in talks with some 20 to 30 companies, including traditional car makers and start-ups, for possible collaboration. "That creates opportunities for outsourcing, which is good for Foxconn," Wu said, referring to the price cuts.

Foxconn's shares have risen 65% so far this year, driven by its rosy AI outlook, far outperforming a 17% gain for the broader market.
The cloud and networking segment, which made up 28% of Foxconn's revenue in the first quarter, will comprise a more significant share of overall revenue by 2025, Wu added.
From: macnai | 5/14/2024 11:45:15 AM | "The growth in demand from emerging applications such as artificial intelligence and machine learning requires continuous innovation to keep pace with power and cost requirements,” said Joseph Wang, CTO at FIT. “We are excited to partner with POET on this development. POET’s hybrid-integration platform technology will enable us to use best-of-breed components and ramp to high volume at a much faster pace and in a cost-efficient manner.”
From: SGJ | 5/14/2024 4:07:37 PM | I think the big price move on this news is going to be on Thursday. Needs to percolate. POET isn't a household name ... yet