From: Glenn Petersen | 5/24/2023 4:46:03 AM | BABA is going to take its cloud unit public:
Alibaba to cut 7% of workforce in its cloud unit as it pursues IPO for the division
PUBLISHED TUE, MAY 23 2023 11:01 AM EDT UPDATED TUE, MAY 23 2023 11:10 AM EDT Arjun Kharpal @ARJUNKHARPAL CNBC.com
KEY POINTS
-- Alibaba is cutting 7% of the workforce in its cloud computing division as the unit gears up for an initial public offering.
-- This comes after it announced plans in March to split the company into six business units each with their own chief executive and board of directors.
-- Last week, the company announced plans for a full spin-off of its cloud computing unit and said it intends for the division to become an independent publicly listed company.
Alibaba is cutting 7% of the workforce in its cloud computing division as the unit gears up for an initial public offering.
The move, confirmed to CNBC by a person familiar with the matter who preferred to remain anonymous because they were not able to speak publicly, will see the Chinese e-commerce giant offer severance packages to those affected. Alibaba has begun informing staff of the layoffs and is also helping them to move to different positions internally if they desire, the same source added.
This comes after it announced plans in March to split the company into six business units each with their own chief executive and board of directors.
Last week, the company announced plans for a full spin-off of its cloud computing unit and said it intends for the division to become an independent publicly listed company. Alibaba aims to complete the spin-off within the next 12 months.
Alibaba’s CEO Daniel Zhang has long seen cloud computing as a key part of the e-commerce giant’s future, but it currently accounts for just 9% of the group’s total revenue. And revenue has been slowing significantly over the last few quarters. In fact, revenue fell 2% year-on-year in the first quarter of the year.
Zhang said on the company’s earnings call last week that this was “partially due to our proactive move to adjust our revenue structure and focus on high-quality growth, and also a result of external changes in market environment and customer composition.”
TikTok owner ByteDance began moving its international operations off Alibaba’s cloud, which continues to weigh on the company’s cloud business.
Still, Alibaba has made some headway with its cloud business over the past few years. It is the number one player by market share in China and number two in Asia-Pacific, just behind Amazon, according to Synergy Research Group. However, on a global level, it still trails giants Amazon, Microsoft and Google.
Alibaba to cut 7% of workforce in its cloud unit as it pursues IPO (cnbc.com)
From: Glenn Petersen | 6/25/2023 2:26:43 PM | Amazon’s vision: An AI model for everything
Reed Albergoth Semafor June 23, 2023
THE SCENE Matt Wood, vice president of product for Amazon Web Services, is at the tip of the spear of Amazon’s response in the escalating AI battle between the tech giants.
Much of the internet already runs on AWS’s cloud services, and Amazon’s long-game strategy is to create a single point of entry for companies and startups to tap into a rapidly increasing number of generative AI models, both of the open-source and closed-source variety.
Wood discussed this and other topics in an edited conversation below.
THE VIEW FROM MATT WOOD Q: Microsoft and Google are both nipping at your heels by offering these huge AI models. How does AWS view this market?
A: I have not seen this level of excitement and engagement from customers since the very earliest days of AWS. We have over 100,000 customers today that routinely use AWS to drive their machine-learning capabilities and these generative AI systems.
One of the interesting differences with these generative models is that they make machine learning easier than ever before to use and apply. We built a capability that we call Bedrock, which provides the very easiest way for developers to build new experiences using this technology on AWS. You just provide a prompt, select which model you want to use, and we give you the answer.
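As a rough illustration of that prompt-in, answer-out flow, here is a minimal sketch using the boto3 "bedrock-runtime" client; the specific model ID and request/response fields below are assumptions for illustration and differ by model provider.

```python
# Minimal sketch of the Bedrock "prompt -> model -> answer" flow described above.
# Assumes boto3 with the bedrock-runtime client and an Amazon Titan text model;
# the model ID and the payload/response field names are illustrative and vary by provider.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",  # assumed model ID; any Bedrock model can be selected
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "inputText": "Summarize the key risks in this supplier contract: ...",
        "textGenerationConfig": {"maxTokenCount": 256, "temperature": 0.2},
    }),
)

result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])  # field names assumed from Titan's response schema
```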
Where we kind of think of things a little differently is that it doesn't seem that there's going to be one model to rule them all. As a result, our approach is to take the very best, most promising, most interesting models and to operationalize them so customers can really use them in production. Customers can combine models from Amazon and from third parties in ways that are interesting and novel.
Q: How many models are there now?
A: On Bedrock, we have models from Amazon we call Titan. We provide models from Anthropic, AI21 Labs, which has great support for different languages. We’ve got models from Stability AI, and we’ll have more coming in the future.
Q: So you’re basically curating the best models out there?
A: Indeed. But there’s an old Amazon adage that these things are usually an “and” and not an “or.” So we’re doing both. It's so early and it's so exciting that new models are emerging from industry and academia virtually every single week. But some of them are super early and we don’t know what they’re good at yet.
So we have SageMaker JumpStart, which has dozens of foundational models, many of which have already been trained on AWS so they're already up there. That’s where we’ll have a kind of marketplace that customers can just jump on and start using them. For example, the Falcon model, which is on the top of the leaderboards in terms of capability, was trained on AWS inside SageMaker and today is available inside SageMaker JumpStart.
You can think of JumpStart as the training field for these models and the ones that prove to be really differentiated and capable and popular, we’ll take those and elevate them into Bedrock, where we’ll provide a lot of the operational performance and optimization to deliver low latency, low cost and low power utilization for those models.
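For a concrete sense of that JumpStart workflow, here is a minimal sketch with the SageMaker Python SDK that deploys a JumpStart-catalog model such as Falcon and queries it; the model ID and instance type are assumptions and the current catalog may differ.

```python
# Minimal sketch of picking up a foundation model from SageMaker JumpStart, as
# described above. Assumes the SageMaker Python SDK in an AWS account with
# SageMaker permissions; the model ID and instance type are illustrative.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="huggingface-llm-falcon-7b-bf16")  # assumed catalog ID
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # GPU instance; the right size depends on the model
)

# Query the deployed endpoint; the payload follows the Hugging Face text-generation schema.
output = predictor.predict({"inputs": "Explain what a foundation model is in one sentence."})
print(output)

# Tear the endpoint down when finished to stop incurring charges.
predictor.delete_endpoint()
```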
Q: If AWS is offering all of these open-source models on its platform, is there concern that AWS could be held responsible for the safety of those models?
A: No. It really helps if you think of them less as this very loaded term of artificial intelligence, and more just applied statistics. It's just a statistical parlor trick, really. You can kind of think of them a little like a database. We want to enable those sorts of capabilities for customers. And so if you think of it that way, it makes a lot more sense as to where the responsibility lies for their usage. We’re engaged in all of those policy discussions and we’ll see how it plays out.
Q: Who do you think will use SageMaker JumpStart and who will use Bedrock?
A: There are going to be people who have the skills, interests, and the investment to actually go and take apart these models, rebuild them, and combine them in all these interesting ways. It's one of the benefits of open-source models. You can do whatever you want with them.
But then for the vast majority of organizations, they just want to build with these things and want to know it has low latency, is designed for scale and has the operational performance you would expect from AWS, and that the data is secure.
Q: There’s a debate over which is better: Fine tuning/prompting very large, powerful and general large language models for specific purposes, or using more narrowly focused models and fine tuning them even further. It sounds like you believe the latter option is going to win.
A: You can’t fine tune GPT-4. What we found is that in the enterprise, most of those customers have very large amounts of existing private data and intellectual property. And a lot of the advantages and the opportunity that they see for generative AI is in harnessing that private data and IP into new internal experiences, or new product categories or new product capabilities.
The ability to take that data and then take a foundational model and just contribute additional knowledge and information to it very quickly and very easily, and then put it into production very quickly and very easily, then iterate on it in production very quickly and very easily. That's kind of the model that we're seeing.
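The interview does not name a specific mechanism, but one common way to contribute private knowledge to a foundation model without retraining it is to retrieve relevant internal documents and prepend them to the prompt (fine-tuning is the other obvious route). The sketch below shows that retrieval-augmented pattern in plain Python, with generate() standing in for any hosted model call.

```python
# Hedged sketch of grounding a foundation model in private data by stuffing
# retrieved documents into the prompt (retrieval-augmented generation).
# The interview doesn't specify a mechanism; fine-tuning is the other common route.
# `generate` is a stand-in for any hosted model call (e.g., a Bedrock invoke_model wrapper).
from typing import Callable, List

def retrieve(query: str, documents: List[str], k: int = 3) -> List[str]:
    """Toy retriever: rank private documents by word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(documents, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def answer_with_private_data(query: str, documents: List[str],
                             generate: Callable[[str], str]) -> str:
    """Build a prompt that carries the company's own text, then call the model."""
    context = "\n---\n".join(retrieve(query, documents))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)
```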
Q: Can you give me any customer examples that stand out?
A: It’s super early and we’re still in limited preview with Bedrock. What has struck me is just the diversity and the breadth of the use cases that we're seeing. A lot of folks are using these in the kind of unsexy but very important back end.
So personalization, ranking, search and all those sorts of things. We're seeing a lot of interest in expert systems. So chat and question-answer systems. But we’re also seeing a lot of work in decision-making support. So, decomposing and solving more complicated problems and then automating the solution using this combination of language models under the hood.
Q: What’s the vision for who will best be able to take advantage of these products? Do you see a possibility that startups could basically just form a company around these APIs on Bedrock?
A: There are going to be waves and waves of startups that have an idea or an automation that they want to bring into companies, or an entirely new product idea that's enabled through this.
An interesting area is larger enterprises that are very text heavy. So anywhere there is existing text is fertile soil for building these sorts of systems.
And what's super interesting is that we're seeing a lot of interest from organizations in regulated fields that maybe traditionally don't have the best reputation for leaning into or being forward-thinking in terms of cutting-edge technologies. Banking, finance, insurance, financial services, healthcare, life sciences, crop sciences.
They are so rich in the perfect training data. Volumes and volumes of unstructured text, which is really just data represented in natural language. And what these models are incredibly capable at is distilling the knowledge and the representation of that knowledge in natural language, and then exposing it in all of these wonderful new ways that we're seeing.
Q: Were you surprised about how quickly this all happened?
A: ChatGPT may be the most successful technology demo since the original iPhone introduction. It puts a dent in the universe. Nothing is the same once you see that. I think it causes, just like the iPhone, so many light bulbs to go off across so many heads as to what the opportunity here was, and what the technology was really ready for.
The real explosive growth is still ahead of us. Because customers are going through those machinations now. They're starting to find what works and what doesn't work, and they're starting to really move very quickly down that path.
Q: AWS already changed the world once by enabling the cloud, which enabled everything from Uber to Airbnb. How is the world going to change now that internet companies have access to potentially thousands of powerful models through the cloud?
A: I can't think of a consumer application or customer experience that won't have some usage for the capabilities inside these models. They'll drive a great deal of efficiency, a great deal of reinvention. Some of that's going to move from point and click to text. These things are just so compelling in the way they communicate through natural language.
Then the question is how do these models evolve? By using these models, they get better. They learn what works and what doesn't work. These capabilities will get better much more quickly than we've ever seen before.
Q: These models create the possibility for a new kind of internet, where instead of going to Google and getting a set of links, you’re kind of talking to different agents and behind the scenes all these transactions are happening. And a lot of that is going to happen on AWS. That has huge implications for search and advertising. Is that how you see this evolving?
A: You’re still going to have to go to a place to start your exploration in any domain, whether it is where you want to buy, or what you want to learn about. I don't know if there's going to be some big, single point of entry for all of this. That seems very unlikely to me. That's not usually how technology evolves. Usually technology evolves to be a lot more expansive. I think there's a world in which you see not one or two entry points, but thousands, tens of thousands, or millions of new entry points.
Amazon’s vision: An AI model for everything | Semafor
From: Glenn Petersen | 7/12/2023 4:58:24 AM | The AI Boom Is Here. The Cloud May Not Be Ready.
Traditional cloud infrastructure wasn’t designed to support large-scale artificial intelligence. Hyperscalers are quickly working to rebuild it.
By Isabelle Bousquette Wall Street Journal July 10, 2023 12:00 pm ET
Many companies say the cloud is their go-to when it comes to training and running large AI applications—but today, only a small portion of existing cloud infrastructure is actually set up to support that. The rest is not.
Now cloud providers, including Amazon Web Services, Microsoft Azure and Google Cloud, are under pressure to change that calculus to meet the computing demands of a major AI boom—and as other hardware providers see a potential opening.
“There’s a pretty big imbalance between demand and supply at the moment,” said Chetan Kapoor, director of product management at Amazon Web Services’ Elastic Compute Cloud division.
Most generative AI models today are trained and run in the cloud. These models, designed to generate original text and analysis, can be anywhere from 10 to 100 times bigger than older AI models, said Ziad Asghar, senior vice president of product management at Qualcomm Technologies, adding that the number of use cases as well as the number of users are also exploding.
“There is insatiable demand” for running large language models right now, including in industry sectors like manufacturing and finance, said Nidhi Chappell, general manager of Azure AI Infrastructure.
It is putting more pressure than ever on a limited amount of computing capacity that relies on an even more limited number of specialized chips, such as graphics chips, or GPUs, from Nvidia. Companies like Johnson & Johnson, Visa, Chevron and others all said they anticipate using cloud providers for generative AI-related use cases.

Google’s data center in Eemshaven, Netherlands. Google Cloud Platform said it is making AI infrastructure a greater part of its overall fleet. PHOTO: UTRECHT ROBIN/ABACA/ZUMA PRESS

But much of the infrastructure wasn’t built for running such large and complex systems. Cloud sold itself as a convenient replacement for on-premise servers that could easily scale up and down capacity with a pay-as-you-go pricing model. Much of today’s cloud footprint consists of servers designed to run multiple workloads at the same time that leverage general-purpose CPU chips.
A minority of it, according to analysts, runs on chips optimized for AI, such as GPUs and servers designed to function in collaborative clusters to support bigger workloads, including large AI models. GPUs are better for AI since they can handle many computations at once, whereas CPUs handle fewer computations simultaneously.
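The parallelism point is easy to see empirically. The sketch below times the same large matrix multiplication on a CPU and, if one is present, a CUDA GPU using PyTorch; actual speedups depend entirely on the hardware involved.

```python
# Rough illustration of why GPUs suit AI workloads: the same large matrix
# multiplication is timed on the CPU and, if available, on a CUDA GPU.
# Any speedup observed is hardware-specific and proves nothing general.
import time
import torch

def time_matmul(device: str, n: int = 4096) -> float:
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
    return time.perf_counter() - start

print(f"CPU: {time_matmul('cpu'):.3f}s")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.3f}s")
```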
At AWS, one cluster can contain up to 20,000 GPUs. AI-optimized infrastructure is a small percentage of the company’s overall cloud footprint, said Kapoor, but it is growing at a much faster rate. He said the company plans to deploy multiple AI-optimized server clusters over the next 12 months.

A snapshot of last year’s AWS re:Invent conference in Las Vegas, NV. Amazon Web Services plans to deploy multiple AI-optimized server clusters over the next 12 months. PHOTO: NOAH BERGER/REUTERS
Microsoft Azure and Google Cloud Platform said they are similarly working to make AI infrastructure a greater part of their overall fleets. However, Microsoft’s Chappell said that that doesn’t mean the company is necessarily moving away from the shared server—general purpose computing—which is still valuable for companies.
Other hardware providers have an opportunity to make a play here, said Lee Sustar, principal analyst at tech research and advisory firm Forrester, covering public cloud computing for the enterprise.
Dell Technologies expects that high cloud costs, linked to heavy use—including training models—could push some companies to consider on-premises deployments. The computer maker has a server designed for that use.
“The existing economic models of primarily the public cloud environment weren’t really optimized for the kind of demand and activity level that we’re going to see as people move into these AI systems,” Dell’s Global Chief Technology Officer John Roese said.
On premises, companies could save on costs like networking and data storage, Roese said.
Cloud providers said they have several offerings available at different costs and that in the long term, on-premises deployments could end up costing more because enterprises would have to make huge investments when they want to upgrade hardware.
Qualcomm said that in some cases it might be cheaper and faster for companies to run models on individual devices, taking some pressure off the cloud. The company is currently working to equip devices with the ability to run larger and larger models.

Hewlett Packard Enterprise headquarters in Spring, TX. HPE is rolling out its own public cloud service, powered by a supercomputer, that will be available to enterprises looking to train generative AI models in the second half of this year. PHOTO: MARK FELIX/BLOOMBERG NEWS
And Hewlett Packard Enterprise is rolling out its own public cloud service, powered by a supercomputer, that will be available to enterprises looking to train generative AI models in the second half of 2023. Like some of the newer cloud infrastructure, it has the advantage of being purposely built for large-scale AI use cases, said Justin Hotard, executive vice president and general manager of High Performance Computing, AI & Labs.
Hardware providers agree that it is still early days and that the solution could ultimately be hybrid, with some computing happening on the cloud and some on individual devices, for example.
In the long term, Sustar said, the raison d’être of cloud is fundamentally changing from a replacement for companies’ difficult-to-maintain on-premise hardware to something qualitatively new: Computing power available at a scale heretofore unavailable to enterprises.
“It’s really a phase change in terms of how we look at infrastructure, how we architected the structure, how we deliver the infrastructure,” said Amin Vahdat, vice president and general manager of machine learning, systems and Cloud AI at Google Cloud.
Write to Isabelle Bousquette at isabelle.bousquette@wsj.com
Copyright ©2023 Dow Jones & Company, Inc. All Rights Reserved.
Appeared in the July 11, 2023, print edition as 'The Cloud May Not Be Ready for the Boom in AI'.
The AI Boom Is Here. The Cloud May Not Be Ready. - WSJ
From: Glenn Petersen | 8/26/2023 6:25:18 AM | A Startup in the New Jersey Suburbs Is Battling the Giants of Silicon Valley
It’s not just Nvidia. There is another big winner of the AI boom—and it’s now competing with some of the world’s most valuable companies.
By Ben Cohen Wall Street Journal Aug. 25, 2023 8:30 am ET
Michael Intrator wasn’t planning to leave his job in finance to work in the world’s hottest industry. Until the day his office nearly overheated.
He was running a natural-gas fund in 2017 when he stumbled into another market that was about to become huge. Intrator and his colleague Brian Venturo purchased their first GPU, the graphics chip they were using for cryptocurrency mining that has since become essential to a more lucrative business: artificial intelligence.
“All of a sudden it went from a GPU to a bunch of GPUs to the pool table being covered in GPUs,” Intrator said. He walked into work one morning that summer, after the air conditioning had been turned off for the weekend, only to find that the servers running their growing collection of GPUs made their Wall Street office feel more like a sauna. He freaked out. “We’re going to burn down this skyscraper,” he thought.
This suddenly terrifying hobby that started with a single chip soon became a company with tens of thousands of them. It’s based far from Silicon Valley and even farther from Taiwan, where chip-fabrication plants churn out today’s vital technology. But its office suite in the New Jersey suburbs is the global headquarters for one of the biggest winners of the AI boom.
Few companies have seen their value change as much in the past year as CoreWeave, a specialized cloud provider offering access to the advanced chips, futuristic data centers and accelerated computing that fuel generative artificial intelligence. It owns the mighty GPUs that have become engines of modern innovation, and CoreWeave sells time and space on its supercomputers to clients in desperate need of the processing power that AI requires.
It’s how a business that most people have never heard of is playing an influential supporting role in a tech revolution.
CoreWeave has quickly and improbably become one of the largest GPU providers and leaders in the arms race for AI infrastructure. It raised more than $400 million this spring from chip maker Nvidia and other investors. It secured another $2.3 billion in debt financing this summer to open data centers by essentially turning chips into financial instruments, using its stash of highly coveted Nvidia semiconductors as collateral. Now it’s racing to keep pace with the fastest software-adoption curve in history.

Infrastructure is not the first thing that comes to mind when you think about generative artificial intelligence—the tech behind chatbots, productivity tools and buzzy startups in every field. But you wouldn’t be thinking about generative artificial intelligence without solid infrastructure.
“It’s similar to electricity: Do you think of the power plant when you flip a light switch?” said Brannin McBee, CoreWeave’s chief strategy officer and third co-founder. “What we’re doing right now is building the electricity grid for the AI market. If this stuff doesn’t get built, then AI will not be able to scale.”
Building that stuff means the success of CoreWeave is by definition interwoven with the success of the entire AI economy.
GPUs that can be used for AI became the world’s most precious asset this year, six years after CoreWeave was founded by former commodities traders.
These chips have enough power to train large-language models and perform AI’s unfathomably complex tasks at ludicrous speeds. The market for this scarce resource is controlled by Nvidia, the top-performing stock in the S&P 500 index this year even before it reported another blowout quarter on Wednesday that once again smashed expectations. Nvidia’s gain in value in 2023 alone is greater than the market capitalization of almost any American company.
But the supply of AI chips isn’t nearly enough to meet the world’s demands. Elon Musk has declared that it’s harder to buy GPUs than drugs, and there’s a whiteboard behind Venturo’s desk in CoreWeave HQ that puts it another way: “I have not been asked for more GPUs in ___ days.”
That number has been zero since last summer.

Nvidia GPUs like this one are essential for AI applications like chatbots and image generators. PHOTO: I-HWA CHENG/BLOOMBERG NEWS
The mad scramble across Silicon Valley sparked by the AI frenzy has created an opportunity for the Nvidia-backed startup based in Roseland, N.J.
“You’re seeing a whole crop of new GPU-specialized cloud-service providers,” said Jensen Huang, Nvidia’s chief executive, on the company’s earnings call this week. “One of the famous ones is CoreWeave—and they’re doing incredibly well.”
CoreWeave’s rivals in AI infrastructure operations include major cloud providers like Microsoft, Google and Amazon, and it’s hard for any startup to compete with one of the richest companies on the planet, let alone several. The private company does not disclose financial details, but recent fundraising valued CoreWeave around $2 billion. So how can it avoid getting crushed by giants worth trillions?
“It’s a question I answer all the time,” said Intrator, CoreWeave’s chief executive.
He says AI presents challenges that legacy cloud platforms were not designed to handle. This kind of radical transformation in any business can turn the playing field upside down and give upstarts an edge over incumbents forced to adapt. In fact, Intrator likened CoreWeave to the unlikely leader of another market.
“GM can make an electric car,” he said, “but that doesn’t make it a Tesla.”
CoreWeave has fewer employees (250) than clients (700), but it has deals with Inflection AI and even OpenAI backer Microsoft. The company’s executives say they offer customized systems and a wider range of chips in more configurations than servers equipped for general-purpose computing. That flexibility makes their product more efficient for a variety of applications from rendering images to discovering molecules.
This didn’t seem possible and definitely wasn’t some grand plan in 2017, when CoreWeave’s founders were mining Ethereum for themselves. “It very much started as: How do I make an extra $1,000 so that I can pay my mortgage?” said Venturo, the chief technology officer. But they became fascinated by the mechanics of crypto at the exact moment that crypto prices were going bananas. They could buy a GPU on Monday and it would pay for itself by Thursday. So they kept buying. Before long they had hundreds of GPUs and no clue where to put them.
When they decided to leave Wall Street, they did something very Silicon Valley: They moved their hardware into a garage. Except this garage was in New Jersey—and it belonged to Venturo’s grandfather.
Soon they turned a side hustle into a proper business, named the company Atlantic Crypto and renamed it CoreWeave.
“We’re not very good at naming companies,” Intrator said.
They were better at running one. When crypto prices crashed back to earth in 2018 and 2019, they diversified into other, less-volatile fields that needed lots of GPU computing. They focused on three markets where they could fill a void: media and entertainment, life sciences and, yes, artificial intelligence.
It turned out to be perfect timing. The decision to stockpile those chips positioned the company for what Nvidia is calling “a new computing era,” but even CoreWeave’s own executives admit they couldn’t have predicted the intensity of the AI fervor.
“If anyone said they thought this is what success would look like, they’re lying,” Venturo said.
CoreWeave has upgraded from consumer GPUs in a sweltering office to enterprise GPUs in sprawling data centers around the country with the cooling, power and hundreds of miles of fiber-optic cable to keep running around the clock.
Last summer, right around the release of popular AI image generators like Stable Diffusion and Midjourney, CoreWeave’s executives invested heavily in Nvidia’s latest and fastest H100 chips. Only when ChatGPT dropped last fall did they realize they hadn’t invested nearly enough.
“We had spent $100 million on H100s,” Venturo said. “But the ChatGPT moment was when I was, like: Everything we’ve thought from a scale perspective may be totally wrong. These people don’t need 5,000 GPUs. They need five million.”
Being totally wrong has rarely been so valuable.
One of the many lessons from the speculative manias that have gripped tech in recent years is that it’s wise to be skeptical of any company riding a wave of exuberance. Even if AI really is the next big thing, smaller players could get wiped out. A trillion-dollar company can afford to wait and then dominate. It’s not exactly the most comfortable position for CoreWeave.
“It feels like we are dancing between the feet of elephants,” McBee said.
But that happens to be another measure of success. CoreWeave’s founders no longer have to worry about locating the nearest fire extinguisher. Now they’re trying to not get trampled.
Write to Ben Cohen at ben.cohen@wsj.com
A Startup in the New Jersey Suburbs Is Battling the Giants of Silicon Valley - WSJ (archive.ph)
From: Glenn Petersen | 11/28/2023 2:20:18 PM | Amazon announces new AI chip as it deepens Nvidia relationship
PUBLISHED TUE, NOV 28 2023 11:41 AM EST UPDATED 2 HOURS AGO Jordan Novet @JORDANNOVET CNBC.com
KEY POINTS
- Amazon Web Services announced Trainium2, a chip for training artificial intelligence models, and it will also offer access to Nvidia’s next-generation H200 Tensor Core graphics processing units.
- AWS will host a special computing cluster for Nvidia to use.
- For now, AWS customers can start testing new general-purpose Graviton4 chips.
Amazon’s AWS cloud unit announced its new Trainium2 artificial intelligence chip and the general-purpose Graviton4 processor during its re:Invent conference in Las Vegas on Tuesday. The company also said it will offer access to Nvidia’s latest H200 AI graphics processing units.
Amazon Web Services is trying to stand out as a cloud provider with a variety of cost-effective options. It won’t just sell cheap Amazon-branded products, though. Just as in its online retail marketplace, Amazon’s cloud will feature top-of-the-line products. Specifically, that means highly sought-after GPUs from top AI chipmaker Nvidia.
The dual-pronged approach might put AWS in a better position to go up against its top competitor. Earlier this month Microsoft took a similar dual-pronged approach by revealing its inaugural AI chip, the Maia 100, and also saying the Azure cloud will have Nvidia H200 GPUs.
The Graviton4 processors are based on Arm architecture and consume less energy than chips from Intel or AMD. Graviton4 promises 30% better performance than the existing Graviton3 chips, enabling what AWS said is better output for the price. Inflation has been higher than usual, inspiring central bankers to hike interest rates. Organizations that want to keep using AWS but lower their cloud bills to better deal with the economy might wish to consider moving to Graviton.
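For teams considering that move, adopting Graviton largely means choosing an Arm-based instance type and an Arm-compatible machine image. The sketch below launches a Graviton-family EC2 instance with boto3; the AMI ID is a placeholder, and the instance type shown is a Graviton3 generation, since Graviton4 instances were still in preview at the time of the article.

```python
# Hedged sketch of launching an Arm-based (Graviton-family) EC2 instance with boto3.
# The AMI ID is a placeholder and must be replaced with an arm64 image valid in your region;
# the instance type shown is Graviton3-based, as Graviton4 types were still in preview.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",  # placeholder: an arm64 AMI for your region
    InstanceType="m7g.large",         # Graviton3-based general-purpose instance
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```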
More than 50,000 AWS customers are already using Graviton chips. Startup Databricks and Amazon-backed Anthropic, an OpenAI competitor, plan to build models with the new Trainium2 chips, which will boast four times better performance than the original model, Amazon said.
AWS said it will operate more than 16,000 Nvidia GH200 Grace Hopper Superchips, which contain H100 GPUs and Nvidia’s Arm-based general-purpose processors, for Nvidia’s research and development group. Other AWS customers won’t be able to use these chips.
Demand for Nvidia GPUs has skyrocketed since startup OpenAI released its ChatGPT chatbot last year, wowing people with its abilities to summarize information and compose human-like text. It led to a shortage of Nvidia’s chips as companies raced to incorporate similar generative AI technologies into their products.
Normally, the introduction of an AI chip from a cloud provider might present a challenge to Nvidia, but in this case, Amazon is simultaneously expanding its collaboration with Nvidia. At the same time, AWS customers will have another option to consider for AI computing if they aren’t able to secure the latest Nvidia GPUs.
Amazon is the leader in cloud computing but has been renting out GPUs in its cloud for over a decade. In 2018 it followed cloud challengers Alibaba and Google in releasing an AI processor that it developed in-house, giving customers powerful computing at an affordable price.
AWS has launched more than 200 cloud products since 2006, when it released its EC2 and S3 services for computing and storing data. Not all of them have been hits. Some go without updates for a long time and a rare few are discontinued, freeing up Amazon to reallocate resources. However, the company continues to invest in the Graviton and Trainium programs, suggesting that Amazon senses demand.
AWS didn’t announce release dates for virtual-machine instances with Nvidia H200 chips, or instances relying on its Trainium2 silicon. Customers can start testing Graviton4 virtual-machine instances now before they become commercially available in the next few months.
Amazon reveals Trainium2 AI chip while deepening Nvidia relationship (cnbc.com)
To: Glenn Petersen who wrote (1669) | 11/28/2023 4:16:29 PM | From: ibyte | Is government able to know the inner workings? Unlikely. The "cloud" providers have all kinds of personal data and imagery, and there is almost no admission or acknowledgement.
From: Glenn Petersen | 11/29/2023 10:19:55 AM | Why Amazon and Nvidia Need Each Other
Amazon’s in-house chip program can’t fully match Nvidia’s AI dominance, while chip maker needs to keep its U.S. business strong
By Dan Gallagher Heard on the Street Wall Street Journal Nov. 29, 2023 6:00 am ET

Nvidia CEO Jensen Huang, right, joined AWS CEO Adam Selipsky at the annual AWS re:Invent conference on Tuesday. PHOTO: AMAZON

Amazon (AMZN) needed to put on a good AI show this week. It got a little help from a surprising friend. The e-commerce titan also happens to run the world’s largest cloud-computing business. In fact, Amazon’s AWS unit now generates significantly more annual revenue than IBM and Oracle and comes second only to Microsoft in the market for business-focused software and related services. But Amazon has also been perceived as lagging behind its largest cloud rival in the field of generative artificial intelligence, given Microsoft’s aggressive push into the technology since the public launch of ChatGPT almost exactly one year ago.
Hence, Amazon used its annual AWS re:Invent conference on Tuesday to lean hard into generative AI. It even announced its own chatbot called Q, which looks like a business-focused version of Microsoft’s Copilot. But most notable was the appearance of Nvidia (NVDA) Chief Executive Officer Jensen Huang, who joined AWS CEO Adam Selipsky on stage at the Las Vegas event to announce an “expanded collaboration” between the two companies. That will include AWS being the first cloud provider to launch services with Nvidia’s new H200 “superchips” that will start shipping next year.
Tech executives often cross-pollinate each other’s trade shows, and the occasions are generally not worthy of note. But this was the first time Huang has appeared at the annual confab for AWS, which has been a major customer of Nvidia’s data-center business over the last several years.
And it came amid rumors of growing friction between the two companies, as Amazon has gone further than its cloud rivals in designing its own in-house chips, while Nvidia has been pushing into offering cloud-computing services of its own. Amazon even used the same keynote on Tuesday to announce the fourth version of its Graviton processor and the second version of its Trainium accelerator—the latter of which competes with Nvidia’s chips in the training of AI models.
Amazon CEO Andy Jassy bragged on the company’s third-quarter call last month that its Trainium chips “have better price performance characteristics than the other options out there, but also the fact that you can get access to them”—the last part a dig at the well-known shortage of Nvidia’s very in-demand chips. That fed further the view that Amazon might have been on the outs with the supplier of a vital AI component. Nvidia mentioned Microsoft 10 times in its own earnings call last week compared with just one mention of Amazon.
Yet the truth is that the two companies very much need each other. Nvidia’s early moves in artificial intelligence have given the company a strong position that can’t be fully replicated even by in-house chips from established tech giants, who can custom design silicon for their own networks. That has been readily apparent in Nvidia’s recent financial results; the chip maker’s data-center sales have quadrupled over the past two quarters compared with the same period last year. Nvidia credited cloud-service providers as accounting for about half those sales.
But Nvidia also can’t afford to alienate the biggest buyer in the market. Amazon’s total annual capital expenditures have been more than twice that of Microsoft’s over the past four years, and market-research firm Dell’Oro Group estimates that the data-center portion of Amazon’s capex totaled $29 billion last year—39% above Microsoft’s estimated spend for the year. Nvidia is also facing the prospect that its sales to China could take a serious hit because of new export controls, making a strong relationship with a huge U.S. customer even more important. Friends at home count for a lot these days.
Write to Dan Gallagher at dan.gallagher@wsj.com
Why Amazon and Nvidia Need Each Other - WSJ (archive.ph)