
Technology Stocks: NVIDIA Corporation (NVDA)

From: Frank Sully, 9/15/2021 7:21:28 AM
What to Expect At ‘NVIDIA GTC 2021?’


The event will include a keynote from NVIDIA CEO and founder Jensen Huang, alongside a wide variety of sessions and talks with luminaries around the world.

One of the most awaited AI conferences for developers, NVIDIA GTC 2021, is just around the corner. The event is expected to bring together thousands of innovators, researchers, thought leaders, and decision-makers to showcase the latest technology innovations in AI, computer graphics, data science, etc.

The event, to be held from November 8 to 11, 2021, will include a keynote from NVIDIA CEO and founder Jensen Huang, alongside a wide variety of sessions and talks with luminaries from around the world. According to the event website, Huang will share the latest developments driving rapid technology advancement, along with new offerings to help solve the world’s toughest challenges.

The line-up for GTC 2021 includes Stanford professor Fei-Fei Li, Epic Games’ Tim Sweeney, OpenAI’s Ilya Sutskever, NVIDIA’s director of ML research Anima Anandkumar, Deloitte AI Institute’s Beena Ammanath, World Economic Forum’s Kay Firth-Butterfield, NVIDIA CTO Michael Kagan, ServiceNow chief AI officer Vijay Narayanan, NVIDIA’s director of AI Gal Chechik, and others.

What to expect?


NVIDIA recently pulled off a stunt without a glitch: Huang’s replica spoke to the audience for 14 seconds (from 1:02:41 to 1:02:55) while introducing a CPU designed for terabyte-scale accelerated computing. It was possible thanks to Omniverse, one of the world’s first simulation and collaboration platforms, which delivers the foundation of the metaverse.

Now, NVIDIA looks to expand its Omniverse platform in partnership with Blender and Adobe. At GTC 2021, NVIDIA is likely to announce major plans to scale its platforms across enterprises and industries, alongside showcasing the nuances of the technology.

Creating an immersive experience on the Omniverse platform also requires better screens, head-mounted displays or VR headsets, and mixed reality devices. NVIDIA might reveal partnerships with companies providing such products to fuel the metaverse revolution in the coming months.


Last year, Huang announced the NVIDIA Maxine platform. It is a cloud-native streaming video AI platform that enables service providers to bring new AI-powered capabilities to more than 30 million web meetings. Besides this, NVIDIA had also announced the launch of NVIDIA Jarvis conversational AI, which was still in the open beta phase. At GTC 2021, NVIDIA is most likely to announce more deep learning tools that build multimodal conversational AI applications that deliver real-time performance on GPUs.

GPUs, Gaming & Other Hardware Updates

NVIDIA is looking to launch a new version of its GeForce RTX 2060 graphics card, which could double the memory capacity, to address the current shortage of gaming graphics cards. The RTX 2060 12 GB card is in the works and is expected to launch early next year.

As per Videocardz, NVIDIA plans to produce a large stock of its RTX 2060 GPUs for launch in January 2022. The move is an attempt to tackle the graphics card and GPU shortages expected in the first half of next year. Previously, production of the RTX 2060 was resumed twice to meet the demand of gamers in Asia-Pacific markets. However, production was later halted to focus on the Ampere GPUs, including the GeForce RTX 3060 and RTX 3060 Ti.

Source: Videocardz.com

As per Greymon55 and RedGamingTech, the NVIDIA RTX 30 Super refresh is expected to launch in early 2022.

Another leaker, Kopite7kimi, has also revealed that both RTX 30 Super laptops and GA103 for desktop are being launched early next year.

Besides this, more information on RTX 40 series, a successor to Ampere codenamed Lovelace, is also being released. Greymon55, in a Twitter post, said that the next-gen GPUs are expected to be launched in October 2022. However, this claim does not refer only to NVIDIA Lovelace.

Hopefully, NVIDIA will make more announcements at GTC 2021 about the launch of next-generation GPUs, gaming technology and graphics cards.

NVIDIA DRIVE Solutions

Earlier, NVIDIA announced that Volvo Cars, Zoox, and SAIC are using its latest NVIDIA DRIVE solutions to power their next-generation AI-based self-driving or autonomous vehicles. Its design-win pipeline for NVIDIA DRIVE now totals more than $8 billion over the next six years, fueling a whole new range of next-gen cars, trucks, robotaxis and new energy vehicles (NEVs).

Huang had said that transportation is becoming a technology industry. Besides having amazing AI technologies and autonomous driving, vehicles would be ‘programmable platforms’ to offer ‘software-driven services.’ “The business models of ‘transportation’ will be reinvented,” Huang said in April earlier this year.

Recently, NVIDIA announced the next generation of its DRIVE platform, called Atlan. It is the first 1,000 TOPS automotive processor, providing a 4x performance increase over the previous Orin. It also features a next-generation GPU architecture, new Arm CPU cores, and new deep learning and computer vision accelerators. It is also equipped with BlueField, with the full programmability required to prevent cyberattacks and data breaches.

The company is targeting automakers’ 2025 models with the new system-on-a-chip (SoC), ensuring it does not hamper sales of Orin, the previous generation of the platform. Orin is currently being used by leading automakers for production timelines starting in 2022.

At the upcoming GTC 2021, we can expect more updates related to autonomous driving systems, the latest DRIVE updates, and its partnership with companies in revolutionizing the future of mobility.

Data Processing Unit

Recently, NVIDIA announced BlueField-3, a next-generation data processing unit that accelerates software-defined networking, storage, and security. It has 16 Arm A78 cores and accelerates networking traffic at a 400 Gbps line rate. It is also the first DPU to support fifth-generation PCIe and offer time-synchronized data centre acceleration.

BlueField-3 (Source: NVIDIA)

Last year, Huang said that BlueField-4 would be available by 2023 and would support the CUDA parallel programming platform and NVIDIA AI, turbocharging in-network computer vision. More updates related to BlueField and next-gen data processing units are expected at the upcoming GTC event.

AMIT RAJA NAIK Amit Raja Naik is a senior writer at Analytics India Magazine, where he dives deep into the latest technology innovations. He is also a professional bass player.


From: Frank Sully, 9/15/2021 9:33:39 AM
Medical AI Needs Federated Learning, So Will Every Industry

Results published today in Nature Medicine demonstrate that federated learning builds powerful AI models that generalize across healthcare institutions, a finding that shows promise for further applications in energy, financial services, manufacturing and beyond.

September 15, 2021 by Mona Flores

A multi-hospital initiative sparked by the COVID-19 crisis has shown that, by working together, institutions in any industry can develop predictive AI models that set a new standard for both accuracy and generalizability.

Published today in Nature Medicine, a leading peer-reviewed healthcare journal, the collaboration demonstrates how privacy-preserving federated learning techniques can enable the creation of robust AI models that work well across organizations, even in industries constrained by confidential or sparse data.

“Usually in AI development, when you create an algorithm on one hospital’s data, it doesn’t work well at any other hospital,” said Dr. Ittai Dayan, first author on the study, who led AI development at Mass General Brigham and this year founded healthcare startup Rhino Health.

“But by developing our model using federated learning and objective, multimodal data from different continents, we were able to build a generalizable model that can help frontline physicians worldwide,” he said.

Other large-scale federated learning projects are already underway in the healthcare industry, including a five-member study for mammogram assessment and pharmaceutical giant Bayer’s work training an AI model for spleen segmentation.

Beyond healthcare, federated learning can help energy companies analyze seismic and wellbore data, financial firms improve fraud detection models, and autonomous vehicle researchers develop AI that generalizes to different countries’ driving behaviors.

Federated Learning: AI Takes a Village

Companies and research institutions developing AI models are typically limited by the data available to them. This can mean that smaller organizations or niche research areas lack enough data to train an accurate predictive model. Even large datasets can be biased by an organization’s patient or customer demographics, specific data-recording methods or even the brand of scientific equipment used.

To gather enough training data for a robust, generalizable model, most organizations would need to pool data with their peers. But in many cases, data privacy regulations limit the ability to directly share data — like patient medical records or proprietary datasets — on a common supercomputer or cloud server.

That’s where federated learning comes in.

Dubbed EXAM (for EMR CXR AI Model), the new study in Nature Medicine, led by Mass General Brigham and NVIDIA, brought 20 hospitals across five continents together to train a neural network that predicts the level of supplemental oxygen a patient with COVID-19 symptoms may need 24 and 72 hours after arriving at point-of-care settings like the emergency department. It’s among the largest, most diverse clinical federated learning studies to date.

Many Hands Make AI Work

Federated learning enabled the EXAM collaborators to create an AI model that learned from every participating hospital’s chest X-ray images, patient vitals, demographic data and lab values — without ever seeing the private data housed in each location’s private server.

Every hospital trained a copy of the same neural network on local NVIDIA GPUs. During training, each hospital periodically sent only updated model weights to a centralized server, where a global version of the neural network aggregated them to form a new global model.

It’s like sharing the answer key to an exam without revealing any of the study material used to come up with the answers.
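The weight-sharing loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article says the server "aggregated" the weights but does not say how, so the sample-count weighting below (in the style of federated averaging) is an assumption, and `local_update`, `federated_average`, the toy gradients and the site sizes are all hypothetical names and numbers, not part of EXAM or NVIDIA Clara.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # layer name -> flat list of parameters


def local_update(weights: Weights, gradient: Weights, lr: float = 0.1) -> Weights:
    """One hypothetical local training step (plain gradient descent) on private data."""
    return {
        name: [w - lr * g for w, g in zip(values, gradient[name])]
        for name, values in weights.items()
    }


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Build a global model by averaging site weights, weighted by sample count (assumed)."""
    total = sum(site_sizes)
    global_weights: Weights = {}
    for name in site_weights[0]:
        n_params = len(site_weights[0][name])
        global_weights[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(n_params)
        ]
    return global_weights


# Two made-up sites start from the same global model.
global_model: Weights = {"layer": [1.0, 2.0]}
site_a = local_update(global_model, {"layer": [1.0, 0.0]})
site_b = local_update(global_model, {"layer": [0.0, 2.0]})

# Only weights and sample counts reach the server; raw records stay on site.
new_global = federated_average([site_a, site_b], site_sizes=[100, 300])
print(new_global)
```

In a real deployment this round would repeat many times, with each site resuming training from the latest global weights.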

“The results of the EXAM initiative show it’s possible to train high performing and generalizable AI models in healthcare without private identifiable data exchanging hands, thus upholding data privacy,” said Dr. Brad Wood, coauthor and director of the NIH Center for Interventional Oncology and Chief of Interventional Radiology at the NIH Clinical Center.

“The findings are impactful well beyond this cross-hospital model for COVID-19 predictions, and showcase federated learning as a promising solution for the field in general,” he continued. “This provides the framework toward more effective and compliant big data sharing, which may be required to realize the potential of AI deep learning in medicine.”

The global EXAM model, shared with all participating sites, resulted in a 16 percent improvement of the AI model’s average performance. Researchers saw an average increase of 38 percent in generalizability when compared to models trained at any single site.

Each participating hospital saw improved performance with the global federated learning model (green), compared to the model trained only on local data (blue). Figure originally published in Nature Medicine. The performance boost was especially dramatic for hospitals with smaller datasets, visible in the chart above.

“Federated learning allows researchers all over the world to collaborate on a common objective: to develop a model that learns from and generalizes to everyone’s data,” said Sira Sriswasdi, co-director of the Center for AI in Medicine at Chulalongkorn University and King Chulalongkorn Memorial Hospital in Thailand, one of the 20 hospitals that collaborated on EXAM. “With NVIDIA GPUs and the NVIDIA Clara software, participating in the study was an easy process that yielded impactful results.”

Hospitals, Startups Pursue Further EXAMination

Bringing together collaborators across North and South America, Europe and Asia, the original EXAM study took just two weeks of training to achieve high-quality prediction of patient oxygen needs, an insight that can help physicians determine the level of care a patient requires.

Since then, its collaborators validated that the AI model may generalize and perform well in settings independent from sites that helped build and train the model. Three additional hospitals in Massachusetts — Cooley Dickinson Hospital, Martha’s Vineyard Hospital and Nantucket Cottage Hospital — tested EXAM and discovered that the neural network performed well on their independent unseen data, too.

Cooley Dickinson Hospital found that the model predicted ventilator need within 24 hours of a patient’s arrival in the emergency room with a sensitivity of 95 percent and a specificity of over 88 percent. Similar results were found in the U.K., at Addenbrookes Hospital in Cambridge.
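For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how sensitivity and specificity are computed from binary predictions. The labels below are made up purely for illustration and are not data from the study; `sensitivity_specificity` is a hypothetical helper name.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = needed a ventilator."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn)  # share of actual positives that were caught
    specificity = tn / (tn + fp)  # share of actual negatives correctly cleared
    return sensitivity, specificity


# Made-up labels for illustration only; not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A sensitivity of 95 percent therefore means the model flagged 95 of every 100 patients who went on to need a ventilator, while a specificity above 88 percent means it correctly cleared more than 88 of every 100 who did not.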

Mass General Brigham plans to deploy EXAM in the near future, said Dr. Quanzheng Li, scientific director of the MGH & BWH Center for Clinical Data Science, who developed the original model. Along with Lahey Hospital & Medical Center and the U.K.’s NIHR Cambridge Biomedical Research Center, the hospital network is also working with NVIDIA Inception startup Rhino Health to run prospective studies using EXAM.

The original EXAM model was trained retrospectively using records of past COVID-19 patients, so researchers already had the ground-truth data on how much oxygen a patient ended up needing. This prospective research instead applies the AI model to data from new patients coming into the hospital, a further step toward deployment in a real-world setting.

“Federated learning has transformative power to bring AI innovation to the clinical workflow,” said Fiona Gilbert, chair of radiology at the University of Cambridge School of Medicine. “Our continued work with EXAM aims to make these kinds of global collaborations repeatable and more efficient, so that we can meet clinicians’ needs to tackle complex health challenges and future epidemics.”

The EXAM model is publicly available for research use through the NVIDIA NGC software hub. Businesses and research institutions getting started with federated learning can use the NVIDIA AI Enterprise software suite of AI tools and frameworks, optimized to run on NVIDIA-Certified Systems.

Learn more about the science behind federated learning in this paper, and read our whitepaper for an introduction to federated learning using the NVIDIA Clara AI platform.


From: Frank Sully, 9/15/2021 9:57:50 AM
China’s Domestically Produced GPUs Now As Fast As NVIDIA’s GeForce GTX 1080, JM9 Series GPU Tapes Out

By Hassan Mujtaba / Sep 15, 2021 09:34 EDT

China has taped out its first domestically produced GPU that is as fast as the NVIDIA GeForce GTX 1080 and AMD Radeon RX Vega 64 graphics cards. The GPU, which falls under the JM9 series, has been produced by the Chinese firm Jingjia Micro and has been under production for a little over 2 years.

The Jingjia Micro JM9 Series comprises two chips: an entry-level version known as 'JM9231', which would offer performance similar to the NVIDIA GeForce GTX 1050 graphics card, and a higher-end 'JM9271', which would offer the performance level of a GeForce GTX 1080 graphics card. Jingjia went through some development hell to get these GPUs out in the market, as they were intended for a launch in 2020 but the first chip is taping out in late Q3 2021.

On September 14, Jingjiawei, a domestic GPU chip company, issued an announcement stating that the company’s new-generation graphics processing chip has completed the tape-out and packaging phases. The product has not yet completed testing, has not yet entered mass production or external sales, and will not have a major impact on the company’s current performance; the degree of impact on the company’s future performance is still unpredictable.

Based on the new statement from the manufacturer, it looks like while Jingjia Micro has taped out its very first JM9 series GPU, it is still far from launching an actual product, since the firm has not yet completed testing and has not planned out mass production or external sales. So we are looking at at least 1 year before we can see this product in action in the retail segment. It is still a big achievement for the Chinese domestic GPU market, which is now on par with high-end GPUs that are 2 generations old.


From: Frank Sully, 9/15/2021 2:06:54 PM
NVIDIA to Drive “Advances for Decades to Come,” Time Magazine Writes

September 15, 2021


Highlighting NVIDIA’s fast-growing impact, Time magazine Wednesday named NVIDIA CEO Jensen Huang to its list of most influential people of 2021. NVIDIA has enabled a revolution that “allows phones to answer questions out loud, farms to spray weeds but not crops, doctors to predict the properties of new drugs—with more wonders to come,” Andrew Ng writes in a story featured on the cover of the iconic weekly magazine’s latest issue.

“Artificial intelligence is transforming our world,” writes Ng, who is founder of DeepLearning.AI, founder and CEO of Landing AI, and chairman and co-founder of Coursera. “The software that enables computers to do things that once required human perception and judgment depends largely on hardware made possible by Jensen Huang.”

Huang was one of seven honored on Time’s cover for Time’s annual issue on the world’s 100 most influential people. Others profiled include U.S. President Joe Biden, Tesla CEO Elon Musk, Buccaneers Quarterback Tom Brady and singer Billie Eilish.

“In 2003, amid great skepticism, Huang directed his company Nvidia to adapt chips designed to paint graphics on computer screens, known as graphics processing units or GPUs, to perform other, more general-purpose computing tasks,” Ng explains.

“The resulting advancements—and powerful chips—laid a foundation that could accommodate much bigger neural networks, the programs behind much of today’s AI,” Ng writes.

Huang’s gambit worked, Ng explains, because he is among the world’s “most technically savvy CEOs.” He’s also “a compassionate steward of his employees and a generous supporter of education in science and technology.”

“With still-emerging AI technologies creating an insatiable hunger for more computation, Huang’s team is well-positioned to keep driving technological advances for decades to come,” Ng concludes.


From: Frank Sully, 9/16/2021 1:23:28 PM
Cerebras’ Wafer-Scale Engine AI System Is Now Available in the Cloud

By Tiffany Trader

September 16, 2021

Five months ago, when Cerebras Systems debuted its second-generation wafer-scale silicon system (CS-2), co-founder and CEO Andrew Feldman hinted at the company’s coming cloud plans, and now those plans have come to fruition. Today, Cerebras and Cirrascale Cloud Services are launching the Cerebras Cloud @ Cirrascale platform, providing access to Cerebras’ CS-2 Wafer-Scale Engine (WSE) system through Cirrascale’s cloud service.

The physical CS-2 machine – sporting 850,000 AI optimized compute cores and weighing in at approximately 500 lbs – is installed in the Cirrascale datacenter in Santa Clara, Calif., but the service will be available around the world, opening up access to the CS-2 to anyone with an internet connection and $60,000 a week to spend training very large AI models.

“For training, we have not found latency to be an issue,” said Cirrascale CEO PJ Go in a media pre-briefing, held in conjunction with the AI Hardware Summit this week.

Feldman agreed, adding, “If you’re going to run your training for 20 hours or more, the speed of light to get from Cleveland to San Jose is probably not too big issue.”

Cirrascale’s Cerebras Cloud customers will gain full access to Cerebras’ software and compiler package.

“The compiler toolset sits underneath the cloud toolset that Cirrascale has developed,” said Feldman. “And so you will enter, you’ll gain access to a compute cluster, storage, a CS-2; you will run your compile stack, you will do your work, you will be checkpointed and stored in the Cirrascale infrastructure, it will be identified so you can get back to that work later. All of that has been integrated.”

The environment supports familiar frameworks such as TensorFlow and PyTorch, and the Cerebras Graph Compiler automatically translates the practitioner’s neural network from their framework representation into a CS-2 executable. This eliminates the need for cluster orchestration, synchronization and model tuning, according to Cerebras.

With a weekly minimum buy-in (pricing is set at $60,000 per week, $180,000 per month or $1,650,000 per year), Cirrascale customers get access to the entire CS-2 system. “The shareable model is not for us,” said Feldman. The raison d’etre of the wafer-scale system is “to get as big of a machine as you can to solve your problem as quickly as you can,” he told HPCwire.

Discounts are provided for multi-month or multi-year contracts. Cerebras does not disclose list prices for its CS systems, but buying a CS-2 system outright will set you back “several million dollars,” according to Feldman.
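As a rough check on how large those discounts are, the quoted monthly and yearly prices can be compared with what the same period would cost at the weekly rate. This is a back-of-the-envelope sketch: the 52-week year (and so roughly 4.33 weeks per month) is an assumption, and `discount_vs_weekly` is a hypothetical helper, not a Cirrascale pricing formula.

```python
# Quoted list prices in USD, taken from the article.
WEEKLY = 60_000
MONTHLY = 180_000
YEARLY = 1_650_000

WEEKS_PER_YEAR = 52  # assumption: a 52-week year
WEEKS_PER_MONTH = WEEKS_PER_YEAR / 12  # assumption: about 4.33 weeks per month


def discount_vs_weekly(price: float, weeks: float) -> float:
    """Fractional saving of a longer contract versus paying the weekly rate throughout."""
    return 1 - price / (WEEKLY * weeks)


monthly_discount = discount_vs_weekly(MONTHLY, WEEKS_PER_MONTH)
yearly_discount = discount_vs_weekly(YEARLY, WEEKS_PER_YEAR)

print(f"monthly contract: {monthly_discount:.1%} below the weekly rate")
print(f"yearly contract:  {yearly_discount:.1%} below the weekly rate")
```

Under these assumptions the monthly price works out to roughly 30 percent less than four-plus weeks at the weekly rate, and the yearly price to a little under half of 52 weekly payments.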

Both CEOs agreed that “try before you buy” was one of the motivations of the Cerebras Cloud offering, converting renters who are impressed by what CS-2 can do into buyers of one or more systems. But the companies also expect a good share of users to stick with the cloud model.

A preference for OPEX is one reason, but it’s also an issue of skills and experience. Driving home this point, Feldman said, “A little known fact about our industry is how few people can actually build big clusters of GPUs, how rare it is — the skills that are necessary, not just the money. The skills to spread a large model over more than 250 GPUs is probably resident in a couple of dozen organizations in the world.”

Cerebras Cloud offers to streamline this process by making the performance available via a cloud-based hardware and software infrastructure with the billing, storage and other services accessible via the Cirrascale portal. “It was an obvious choice for us in extending our reach to different types of customers,” Feldman said.

Cerebras’ first CS system deployments were on-premises in the government lab space (the U.S. DOE was a foundational win, announced at the 2019 AI Hardware Summit) and commercial sites, mainly pharma (GlaxoSmithKline is a customer). By making CS-2 accessible as a hosted service, Cerebras is going after a broader set of organizations, from startups to Fortune 500 companies.

“We’ve been working on this partnership for some time,” said Andy Hock, vice president of product at Cerebras Systems, in a promo video. “We’re beginning with a focus on training large natural language processing models, like BERT, from scratch and we’ll expand our offering from there.”

“The Cerebras CS-2 handles a type of workload that we cannot do on GPUs today,” said David Driggers, founder and CTO, Cirrascale. “[It’s] a very-large scale-up scenario, where we’ve got a model that just does not parallelize and yet it’s managing to deal with a very large amount of data. So the largest NLP models today require a tremendous amount of data input as well as a tremendous amount of calculation. This is very difficult to do on a cluster due to the IO communication that is required. The Cerebras CS-2 allows us to leverage the very large memory space, the large built-in networking and the huge amount of cores to be able to scale NLP to heights that we haven’t been able to do before.”

Analyst Karl Freund (principal, Cambrian AI Research), who was on the pre-briefing call, gave the partnership his nod of approval. “Cerebras seems to be firing on all cylinders of late, with customer wins, the 2nd gen WSE, and most recently their audacious claims that they are building a brain-scale AI 1000 times larger than anything we have seen yet,” he told HPCwire.

“What you have is a very hot commodity (their technology) that a lot of people want to experiment with, but who do not want to spend the very big bucks it would take to buy and deploy a CS-2. Enter Cirrascale, and their CS-2 cloud offering, which will make it easier and at least somewhat more affordable for scientists to get their hands on the biggest, fastest AI processor in the industry. This will undoubtedly create new opportunities for Cerebras going forward, both in the cloud and on-premises.”

Asked about the risk that today’s AI silicon won’t be suitable for future AI models, Freund said, “if anything, Cerebras is the company whose architecture is skating to where the puck is going: huge AI.”


From: Frank Sully, 9/16/2021 2:35:04 PM
NVIDIA Corporation's (NVDA) Management Presents at Piper Sandler 2021 Virtual Global Technology Conference (Transcript)

Sep. 14, 2021 8:32 PM ET

NVIDIA Corporation (NVDA)

NVIDIA Corporation (NASDAQ: NVDA) Piper Sandler 2021 Virtual Global Technology Conference Call, September 14, 2021 2:00 PM ET

Company Participants

Manuvir Das - Vice President of Enterprise Computing

Conference Call Participants

Harsh Kumar - Piper Sandler & Co.

Harsh Kumar

Thanks, everybody, for joining us for a very exciting session that's coming up now. We are very fortunate to be joined by Manuvir Das, who is the Vice President of Enterprise Computing at NVIDIA. NVIDIA is, of course, the largest – single largest market cap company, doing some extremely exciting things, of course, through all of its businesses, but I think the most exciting things, no one will argue with this, are happening with what they're doing within the data center, where Manuvir is deep into it.

So with that, I'm going to turn it over to Manuvir. He's got a short slide deck that he wants to talk about. And Manuvir, the floor is yours.

Manuvir Das

Thank you so much, Harsh, for having me and for giving NVIDIA this opportunity to talk to the audience. It's a real privilege. I thought what I'd do at the outset is just share with you the big-picture view of what NVIDIA is doing and where we are headed in the data center and with artificial intelligence before we do some Q&A here. So I'll start with a statement about what we are sharing in the slides as we always do.

So the first picture I have here is something we've shared before, when we announced a new software product from NVIDIA called NVIDIA AI Enterprise. And I thought I would start with this to just level set. This is the news we've shared prior and why we did this work, right? So if you think about the state of the union for artificial intelligence in the enterprise, for enterprise customers at large, we are at a state today where we've had a lot of success with early adopters.

There is a few thousand companies across the world that have had great success improving their business, improving the experience of their customers with artificial intelligence, but the broad base of the enterprise customers is yet to adopt AI, right? And what is the fundamental reason for this?

The fundamental reason is that there are two very different sets of people within every enterprise company. On the one hand, you have the data scientists. These are the people who understand AI, who understand the tools, Jupyter notebooks, all these kinds of things. They do the development of new AI capabilities, and they move fast and they are pretty agile, and they are the state-of-the-art cutting-edge, doing new things every night.

On the other hand, you’ve got IT administrators, who are accountable and responsible for making sure that the actual applications running in the enterprise data center are safe, secure, stable because the business of the company depends on it, right, and the experience of the customer depends on it. And these two personas, these two worlds are pretty much apart because the one world of the data scientist wants to use the tools and frameworks that they are comfortable with, whereas the IT administrator is used to a different model for how to deploy applications. And there is a disconnect because IT does not know how to pick up what the data scientists produced and the data scientists don't know how to operate in a world where IT lives.

And so we created NVIDIA AI Enterprise to address this gap. And what we did is we took NVIDIA’s AI software for training, for inference, for data science, and we made it work on top of VMware vSphere, which is sort of the de facto platform in the data center. If you look at any enterprise data center today, you will find virtualized servers running VMware vSphere. And so that's what this picture shows, right? And it achieves two things at the same time.

On the one hand, for the data scientists, they see all the tools and frameworks that they are comfortable and experienced with to do their work. That's the layer in green provided by NVIDIA. On the other hand, for the IT administrator, it's the same VMware vSphere environment, they are used to the same tooling, how do I provision, how do I get access to people, but now with these new workloads for AI. And so this is really a way of bringing these two worlds together. So this is what we've announced earlier this year in conjunction with VMware, which is NVIDIA AI Enterprise, really NVIDIA’s way of becoming mainstream for enterprise customers for making AI a mainstream workload for enterprise customers.

Now, this is just actually the beginning. And so what I really wanted to share with you today was that this is something NVIDIA has been thinking about and working on for many years, right? And what we realized is that mainstream artificial intelligence in enterprise data centers is a full stack problem. Of course, you need the right hardware, that is the layer I've shown you in green. But then you also need all of these pieces of software, sort of the operating system of AI, all the essential tools, so that you can run your different AI workloads.

And then finally, if you think about it, there are just different use cases, whether it's Vision AI detecting interesting things that are going on in video feeds or cybersecurity, finding attacks that are happening in your data center. And so you would love to have pieces of software that are customized frameworks for each of these use cases that are easy to adopt.

And so I've drawn this abstract picture for you that is representative, you can think of it as a brick wall, right, that if you really want to solve the artificial intelligence problem end-to-end, you need to fully construct this brick wall of all these different boxes to get a complete solution. And at NVIDIA, that is exactly what we've done.

This is the same picture, but I've replaced every one of those abstract concepts with an NVIDIA product: at the bottom, hardware products, but in the middle and at the top, all software products that NVIDIA has produced over the last few years, and especially over the last year, to really complete this brick wall. This is not a vision slide. This is an execution slide. All of these things I'm showing you on this slide today already exist, are already usable by customers.

The fact of the matter is that today NVIDIA is much more a software company than a hardware company. We have thousands of software engineers within NVIDIA who work on this – on all of these things every day. And so we built this entire stack: a set of frameworks for these different use cases, the essential software that allows all of this to run on mainstream servers, as I said, in conjunction with folks like VMware, Cloudera et cetera, and all of the hardware.

And then what we announced recently was we have a partnership with Equinix to put all of this technology, the hardware and the software into Equinix data centers around the world. So that for customers as they get going, it's very easy for them to start their journey where NVIDIA has pre-deployed all of these things for them. And then as they proceed in their journey, of course, they can procure and deploy these things for themselves in their own data centers or in a colocation facility.

Before I come back to you, Harsh, the final point I wanted to make was that NVIDIA is a pretty fast moving company, right? This is our general philosophy. And so I did this exercise for myself: if I were to show you this same picture from last year, but only show you the things that were in execution mode, that we had actually produced, what would this slide actually look like? And this is what it would look like. Whereas today it looks like this, right?

And so I just wanted to end by making this point that NVIDIA is an R&D-first, innovation-first company. The business results we have today are based on the work we've done in the last few years. And what our teams are working on every day today, all of these software stacks that we have been producing and are putting out are to unlock the opportunity in the years ahead. And that's what we are really focused on as a company.

So Harsh, that's what I had as a bit of an opening context setting statement, if you will. Artificial intelligence in the enterprise data center is a full stack problem. It's an end-to-end problem. It requires a broad ecosystem. This is where NVIDIA is focused. We've built the hardware. We've built the software. We've created an ecosystem. We have more than 2.5 million developers who use different parts of our stack to develop their own applications and solutions.

And that's our contribution to make AI feasible for enterprise customers. And with that, we have a go-to-market motion that is in conjunction with established partners, the OEMs who produce servers, folks like VMware who produce software stacks for the data center. And we are really looking forward to this journey of democratizing AI for enterprise customers over the next few years.

And so with that, I'll stop sharing my slide deck and hand back to you, Harsh.

Question-and-Answer Session

Q - Harsh Kumar

Manuvir, that is simply incredible to see the number of products you guys have introduced in just the last 12 months to be able to fill up the gaps of where you were and where you're trying to go. And that brings us to an interesting topic. There has been a lot of changes here not just with COVID, but just generally data center is always morphing, always changing. Can you talk about how it's changing and what are the large changes that are happening in the industry that sort of you wake up and think about and say, this is the kind of direction that NVIDIA maybe needs to think about going into?

Manuvir Das

Yes. That's a great question, Harsh. And it's amazing how much the landscape of data centers has changed in the last decade. You know, you’ll hear some of these buzzwords these days, like cloud, Kubernetes, containers, all these things. What's the common thread to all of that, right? The common thread to all of it is that for quite some time, computing in the data center was done in a scale-up manner. You take one server, you run your application on it, and as the applications get more demanding, you make your server bigger and bigger and more capable, right? And then you buy a few of these servers and they're super expensive.

And then what happened with the advent of the public cloud was the proliferation of a different model, which is scale-out rather than scale-up. Instead of having one giant server, let me have many small servers that cooperate to run a workload, right? This is what in computer science for decades has been referred to as distributed computing that the public cloud already did. And Kubernetes and containers are just a mechanism for building your application as a distributed computing workload, right? And this is how data centers have really evolved in the last decade. So what does this mean? This means now that when you run an application, instead of running on one server, you're running on a set of servers that are working in conjunction to run your workload.
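The scale-up versus scale-out contrast described above can be illustrated with a minimal Python sketch. This is not anything shown in the talk: the function names are hypothetical, the "workload" is a toy sum of squares, and threads in a pool merely stand in for the many small cooperating servers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(items, num_workers):
    """Deal the items round-robin into num_workers shards (hypothetical helper)."""
    return [items[i::num_workers] for i in range(num_workers)]


def process_shard(shard):
    """Toy stand-in for the work one small server does on its slice of the data."""
    return sum(x * x for x in shard)


def run_scaled_up(items):
    """Scale-up: one big machine processes the whole workload by itself."""
    return sum(x * x for x in items)


def run_scaled_out(items, num_workers):
    """Scale-out: many small workers each take a shard, then results are combined."""
    shards = split_into_shards(items, num_workers)
    # Threads are only an illustration here; in a real data center each
    # worker would be a separate server reached over the network.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_shard, shards))
    return sum(partials)


if __name__ == "__main__":
    workload = list(range(100))
    print(run_scaled_up(workload))
    print(run_scaled_out(workload, num_workers=4))
```

Both functions give the same answer. The difference is that in the scale-out version the data has to move between workers and be recombined, which is why the networking, security, and storage between servers become part of the computing problem.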

So when you think about computing now, you're not just thinking about building the best server, you have to think about the networking because the data is flowing across all these servers. You have to think about security because if you – if a malicious thing intercepts one server, they have access to all the other servers. You have to think about how you store your data, so it's accessible to all the servers, right?

So computing is really evolving to data center scale. Every workload runs within a complete data center rather than a single server. And so because of that, you have to solve this as a full stack problem. You have to think about what's the right servers, what's the right networking gear, what's the right networking software so that it goes fast, what is the software stack for orchestrating the workload and running the workload. You have to put it all together, right? And this is how NVIDIA has really evolved, that we've become a full stack company for that reason.

Now, the other thing I would say, Harsh, as you know, our genesis at NVIDIA was as a hardware company, right, with the GPU. So the other insight that we had in NVIDIA was that in order to make this full stack go, you're going to need three essential components in every server. Of course, you need a CPU, which is what applications have traditionally run on. You need a GPU, which is the way of accelerating the workload, so you can do more in every server, and then you need this new form factor that we call a DPU, a data processing unit, which sits on the network interface and really runs not the workload, but the infrastructure of the data center itself, okay? So every server needs a CPU, a GPU, and a DPU in conjunction, this is our vision of the data center. And this is why we, of course, have GPUs. We have the BlueField DPU from NVIDIA.

We also recently announced that we are working on a CPU optimized for artificial intelligence called the Grace CPU based on Arm technology. And we really see this as the future direction of the data center where every server will have a CPU, a GPU and a DPU inside, right?

So just to summarize all that, I would say, because I know I said a lot there, Harsh. We really think that computing going forward in the data center becomes a data center scale problem, a full stack problem. We believe every server needs to have a CPU, GPU and a DPU inside of it as the essential hardware components and then you need the right layers of software that I showed on my slides to bring it all together within the data center.

Harsh Kumar

Amazing, Manuvir, it seems like the opportunity is getting bigger and bigger as the data center compute sort of gets distributed and flattens out, if you will. So you guys, I'm sure, talk to a lot of customers, and I'm sure the highest end customers actually come to you with their problems and say, this is kind of what we need to solve. What are you seeing in terms of what's actually strategically important to the customers? And what areas are these customers emphasizing versus deemphasizing, particularly as a result of, for example, COVID-19 that we're caught up in right now?

Manuvir Das

Right. I think and you mentioned the pandemic and that's had two profound impacts, Harsh, that we have seen from talking to customers, right? And they are two sides of the same coin, which is namely that the amount of in-person connection has gone down dramatically, right?

One side of the coin is for the companies doing their own work and their own business across the employees, et cetera. The employees are not able to sit in a room together, right? So the question is how could the company remain as productive as before, even though the employees are all in different places and working from home, right? That's one consideration.

The second consideration is the company's engagement with their customer base has also changed because of the pandemic, right? It's become much more online and digital even more so than before. And so with that change in how they are interacting with customers, what should they do, right? So let me just take a minute to break down each of these, right?

So if I take the first one, which is that employees are not all sitting in the same room together, instead our approach at NVIDIA with our customers is – instead of looking at this as a loss, this is actually a forcing function for a new opportunity for companies that there are actually – a technology can allow companies to be far more productive by leveraging people all over the world, rather than just leveraging people in the room. And this is why we created a platform called NVIDIA Omniverse, which we now make available to enterprise customers.

And the way to think about NVIDIA Omniverse is it is a digital real-time remote collaboration environment for people working on the same project. It could be engineers designing a building together. It could be designers creating the facade of a display somewhere together. And with Omniverse all of these people can essentially log into the same place, they can collaborate in real-time, one person makes a change and other person can see the change, right? So it creates a whole new model for collaboration and working together, right? And this is why we put so much emphasis on Omniverse. It's a big, big initiative for NVIDIA. And of course, there's a bit of a bias here because for such a model to work well, one of the core technologies you need is really good graphics, and that's something NVIDIA knows a thing or two about, right? So it’s a natural pathway, but it's also a distributed computing problem, it’s also a scale, it's a data center scale problem because you're running this giant sort of thing that different people can connect to and work on, right? So that's one change, Harsh. So that is within the company's work. That's why we did Omniverse.

And of course, there've been technologies for remote work like VDI that NVIDIA has been working on for quite some time with our GPU technology and we continue to do that, right? And we see a lot of adoption, for example, of workstations now because if you think about it, if you're an employee working from home, right, you need a proper workstation in your home, if you're going to do all your work from home, right? And it changes the dynamic there, right? So that's the one side.

The other side is the company's engagement with their customer base, which is now much more digital and online than it was even two years ago, right? And so you see – look at how we're doing this conference right now, right? We’re on a video conferencing technology. These have proliferated, right? But you see things like the need to converse with your customers, what is called conversational AI. With so many more customer conversations, you don't have enough humans in your company to do all these conversations, so you need some automation, you need AI, you need a chatbot that can interact with your customers on your website, right, so you can handle more requests and more inquiries.

So on that side of the coin, we've seen a lot more companies now interested in adopting AI because they see it as a way to greatly enhance their communication with their customer base in this new era where their customer is relatively disconnected from them physically, right? So those are the two things I would point out.

Harsh Kumar

And what about – I mentioned that, maybe something that's become important versus something that's become less important to the customer. Can you talk about – if you have any example, I would appreciate it of something that the customers are not as focused on today as they were maybe before?

Manuvir Das

Yes. I know you'd love an answer to that one, Harsh, but I'm going to pass on that because for better or worse, I happen to be in a position where, when I talk to customers, it's mostly about the things they want to do now, and so that's what they are really focused on.

Harsh Kumar

That's fair. Let's talk about off of that first question as well. Big companies that want complex things done come to NVIDIA, and I suspect that you're probably more of a partner today and increasingly in the future than you were before because they’re actually coming to you saying, we need this X, Y, Z and help us with that. And you're sort of involved earlier on, is that – am I correct in thinking about it correctly? And is it really happening? And are you seeing enhanced interaction with your customers on a daily basis with the requirements that they want to fulfill?

Manuvir Das

I think it's a great observation, Harsh, that you have. It is true. I will put it to you this way, right? AI is actually – AI is hard for any customer to implement. That's the truth of it, right? And we sort of went through this phase where we were just proving it out. The technology was complex and we were working with a small number of customers who really needed it. Like, for example, I'm an online shopping site and I need to recommend to a customer what they should buy next. And I know that AI will help me, and as painful or difficult as it might be, I'll just jump in and do it. And so those are the people we worked with earlier, right? But that has now evolved.

And as we’ve sort of broadened our reach across the enterprise customer, they're not looking for point pieces of technology, deploy this software or put in this piece of hardware. They want a solution, right? They want to solve a business problem. And so more and more, we find our conversations to be of that nature. Hey, this is my use case. This is the problem I'm trying to solve. Tell me, what is the recipe? What hardware do I need? What software do I need? What ISV application vendor do I need to work with? What data sets do I need to acquire in order to do my training? It's a complete discussion.

We believe in this so much, Harsh, that if you look inside NVIDIA, we have a very large organization, a dedicated organization of what we call solution architects. And these are people, they're not sellers, they're not sales reps, they're not product engineers, they sit in the middle. And what they do is every customer conversation begins with what's the problem you're trying to solve. Our solution architects will sit down with you almost as consultants, right, and as partners, and will design the solution with you. And as we design the solution with you, maybe you use our technology, maybe you won't use our technology. That's fine. Either way, if you adopt AI, as NVIDIA, we’re super excited about that, right?

And we do think we have good piece of technology. Even for folks like myself, Harsh, like when we go and have conversations at the executive level with the customer, right, I never have a conversation as a vendor. I never have a conversation about here is a product I want to sell you, right? My conversation is what's the problem you're trying to solve? How have you architected things so far? We think you might want to architect your infrastructure and data center to go solve this problem. And if we align on that, maybe we can be of help to you with some parts of that architecture, right? That's going to hold it.

Harsh Kumar

It's an amazing way to think about customer interaction because the customer in this situation will more than likely feel you're there to help them with their issues, rather than just trying to sell a product, like you put it best.

Manuvir Das

Yes. Harsh, if you don't mind, I might get into trouble for saying this, but my boss, Jensen, who is the CEO of NVIDIA, you know, I would say the one word he uses the most in meetings with folks like myself at NVIDIA is empathy. That's sort of the most important word in his dictionary. And it starts with that, have some empathy for the customer, right, understand the situation they're in, what problem they're trying to solve, what opportunity they're trying to take advantage of and then make it.

Harsh Kumar

And I bet you, it allows NVIDIA to connect at a completely different level versus the rest of the vendors. Let's move on to software, you mentioned software earlier on. So NVIDIA, I've noticed, more over the last three years is bringing more and more software to the marketplace, specifically as it relates to AI, which is, I would say, a core competency of NVIDIA. Can you talk about NVIDIA’s AI software? What is the differentiating factor here? Where are we in the adoption curve? And like, if I dreamed the dream, it’s a long question, but if I dreamed the dream, what is the opportunity for NVIDIA here?

Manuvir Das

Yes. So I will do this in reverse order with the punch line, Harsh. I think we believe that if we execute well on our plans, there is at least a multi-billion dollar incremental software opportunity here on top of what we are already doing, right, because today our revenue in enterprise AI is primarily based on the hardware that we provide, the GPUs and the networking gear, et cetera. But if you think about it – a simple way to think about it is if you look inside an enterprise data center, there are certain layers of software. For example, VMware or SAP, et cetera, that are deployed across servers, and there's a commercial model for that software. And the reason is because that software solves a very important problem for the customer, which is, how do I run my workloads? And the software is almost more important than the hardware because the software is what the customer is experiencing.

And the customer has an expectation that the software is supported. It has a certain level of quality and performance. It is updated regularly, those sorts of things, right? And that's why there's a commercial model for the software. And we are now entering that world for the first time with NVIDIA, right? To date, we have produced software, but it's been made available to the community, to other people, to make their life easier. But now for the first time with NVIDIA AI Enterprise, we really have a similar kind of product that can be sold because a customer can rely on it, right? And there's a simple math you can do about how many servers there are in the world, how many servers we expect would be useful to AI, what sort of licensing you could do for the software on every server that would be fair to the customer, then you multiply those things out, and it's at least multiple billions of dollars of incremental revenue for that layer of the software, right? So I'd stop with that.
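The "simple math" described here is a three-factor multiplication, sketched below in Python. The speaker gives no input figures, so the function name, parameter names, and the toy numbers in the example are all hypothetical placeholders chosen only to show the shape of the calculation. They are not NVIDIA's numbers and should not be read as a market estimate.

```python
def incremental_software_revenue(total_servers, ai_share, license_per_server):
    """Multiply out the three factors named in the talk.

    total_servers      -- how many servers there are (placeholder input)
    ai_share           -- fraction of those servers expected to be useful for AI
    license_per_server -- annual software license per server (placeholder input)
    """
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be a fraction between 0 and 1")
    ai_servers = total_servers * ai_share
    return ai_servers * license_per_server


if __name__ == "__main__":
    # Deliberately tiny toy inputs: 1,000 servers, half of them used for AI,
    # and a license of 100 currency units per server per year.
    print(incremental_software_revenue(1_000, 0.5, 100))
```

The sketch mainly shows that the estimate is linear in each factor, so doubling the share of servers that are useful for AI doubles the result.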

Now, to your first question, let’s call it, where are we with this, right? The truth of the matter is we are in early days, right? This year is when we have rolled out the software, in fact, NVIDIA AI Enterprise went to general availability just last month, right? And it's just beginning to roll out now. There's a new version of VMware that supports that, which has been rolling out. So we are in the beginning of this journey, but we certainly expect that there will be broad adoption. And you can think of that adoption on two fronts, Harsh. One is the software itself being adopted.

But the other thing is what the software is really doing: it's making it possible that you can take your regular mainstream servers you have in your data center today that you would normally not think of using for AI, but now you can use them for AI. So it expands the balloon in two different ways. One way is there's this new thing called the software. That is a commercial proposition. But the second is that the software brings a lot more servers into the picture to be used for AI and so it expands the balloon and it expands the reach, right? So that's kind of what I would say. We see it as a big opportunity. We’re early in the adoption curve. We are in the steep part of the S-curve, if you will.

And then just one thing I might say about the piece parts themselves. The simplest way I would describe this is when you adopt AI, you need to do two things. Number one, with your data scientists and other people you need to develop the AI. And then once you've developed it, you've got these great models, then you need to deploy the AI within applications so that you can actually use the AI, for example, to see what's going on in your weekends, okay?

And so essentially we produced two platforms. We have something called NVIDIA Base Command, which is what the enterprise uses to develop the AI. And we've got a platform called NVIDIA Fleet Command, which you use to then deploy your AI out to all the places where you need to deploy it out, right. So that's the highest level, the simplest way of thinking about our platform. There's Base Command, there’s Fleet Command, and we're very excited about these, but as I said, we are very much in the early stages.

Harsh Kumar

Manuvir, just one more thing on that. Do you think there's anybody in the space that's even close to the level of work you guys are doing? I know, historically, NVIDIA has been just a pioneer in AI on the hardware side, and now I see this focus on the software side. Is there anyone even in the zip code of where NVIDIA operates in bringing the complete package together?

Manuvir Das

Yes. We – of course, I am biased with it. But I would say we do not think so, right? But I will elaborate on that, right? If you think about the picture shown in my slide, the point we made was this is really a full stack problem from the piece parts of the hardware to the systems, to the low level of software, to the frameworks on top. We're the only company on the planet that has been working on all of these levels.

And as I said, my slide was not a vision slide. My slide was a reality slide of the things we've been. Now, we operate at all these different levels and we are big believers in the ecosystem, whether it's cloud service providers or server manufacturers or whatever, right? So our model is we are happy to partner with anybody at any level. For example, you might be a company that focuses on building frameworks, the top level. But then we have APIs, so you can use the middle layer of our software as the basis for developing your frameworks.

You might be a system manufacturer like a Dell or HPE, you can incorporate our GPUs and our DPUs into your servers, right? So there are certainly companies at every layer. In fact, we fostered that ecosystem very intentionally, but we believe we are really the only company on the planet, Harsh, that has focused on the entire stack, right? And that's why we need to really optimize it and tailor it for these businesses.

Harsh Kumar

Well, absolutely. No question about it. You guys have been there at the forefront with compute and with AI for a very long time already. Well, you brought up something Manuvir, earlier on, that was fascinating Omniverse. How does Omniverse fit into your software strategy? You've talked in terms of collaboration, but obviously there's got to be a longer game plan I would think if NVIDIA is putting so much upfront into it. What is the opportunity for adoption for this in the next couple of years? And so then maybe I'll hit you with that first and then go from there.

Manuvir Das

Yes. I'll do this one backwards too with the punch line – punch line first, Harsh. Our math, basically when we look at the target audience for Omniverse in the work that we've done, we think there's about 20 million designers and engineers out there for whom Omniverse would be a great platform for them to do their day-to-day work. And if you just do some simple math of a subscription-based model that we've already put out, and that fits the norms and standards of the industries, if you will, this is again definitely a multi-billion dollar net incremental market opportunity from the use of Omniverse, right? So that's one way of answering the question.

The other way of answering is, as you pointed out, I talked about collaboration and that's certainly a use case of remote collaboration, but do we see a bigger opportunity, right? The bigger opportunity we see, Harsh, is that one way actually of tying together everything that NVIDIA has done from its inception as a company, whether it's graphics or AI or robotics or self-driving cars or any of these things, is that fundamentally we're a simulation company, okay. We build technologies in different domains that allow you to simulate something without actually having to do it. That's the core of our technology. Like, for example, think about our platform for self-driving cars, yes, you can drive cars around and you can capture what's happening in the roads and make your cars better. Of course, we do that. But we also have a complete simulation platform that you could use to do miles and miles of driving “without actually driving,” right? So you can learn them more.

So we really believe that going forward, no matter what industry you're in, as the world evolves, simulation would become more and more routine as the basis for how you're productive. And really what Omniverse is, it has dramatically changed the state-of-the-art in terms of being a platform for simulation, for real-time simulation, so you can actually model things and see what's happening, right? And we think that is a massive opportunity that goes beyond just the real-time collaboration.

Harsh Kumar

Thank you. Thank you for that. In the most recent earnings call, I think Jensen focused a lot on software for one reason or the other. And then in your presentation, you're talking a lot about software, so we see the change happening. My question is about, when do you think in the future how far out are we before you start generating meaningful amount of opportunity in revenues from this software stack that NVIDIA is bringing to the table?

Manuvir Das

Yes. I think I'll answer that for you. I'll apologize and answer that for you in a relatively generic way, Harsh, instead of putting specific numbers, right? I think this is definitely a journey that we are beginning. We are on the steep part of the curve. We are seeing massive interest, so we know we're heading in the right direction. But certainly right now, our revenue is primarily driven by the things we have been working on over many years, right? And these things will begin to pay off as we go forward.

But as I said, what I quoted to you for both NVIDIA AI Enterprise as well as for Omniverse Enterprise, these are multi-billion dollar opportunities. We see these as very real opportunities, right? Yes, I would also say, Harsh, that – I also want to paint the picture accurately for the audience, right? There's in fact a next level of software opportunity for NVIDIA that is in some ways more powerful than what I described, right? So what I’ve been talking about here is sort of the essential software for artificial intelligence or for collaboration and simulation with Omniverse. But if you think of the real AI journey, what is the real AI journey about? It's about saying that in every walk of life, no matter what industry your company is in, there are certain functions that humans are performing, right? And each of those functions is one-by-one.

Then if we can figure out a way to automate that function with AI, then you can do it much more cost-effectively and you can free up your humans to focus on other things. A good example is that you can use NVIDIA’s software frameworks to look at x-rays and detect whether there's a fracture in the person's bone, right? That's something that today a radiologist has to do, but you can take that function and you can automate it, right?

In the space of retail, you can look at the camera feeds from across the store and determine who's shopping for what, and what are they walking out of the store with, right? Instead of having humans in a backroom having to sit there and look at the videos with weary eyes all the time, right? So one-by-one, you can take each of these human functions and replace them with some NVIDIA software.

So now the question you ask is what is the potential business value of the software? The business value of the software is not a function of how much did it cost NVIDIA in terms of engineers to develop the software. The business value is in terms of how valuable is it to that enterprise customer to replace that human function or augment that human function with this automated software, right? And so we see a rich landscape of business opportunity from the software there that we are yet to unlock, right? And that's a whole other domain of opportunities.

Harsh Kumar

So I wanted to shift gears a little bit, Manuvir. We are at the last kind of seven, eight minutes, and I wanted to hit upon this. So of late, since maybe the acquisition of Mellanox, we hear NVIDIA talk a lot about SmartNICs and DPUs. And I guess, connectivity is a core theme now you touched upon with the distributed compute channel methodology. Can you update us on how you feel, A, about the importance of things like SmartNICs and DPUs, maybe what's the difference between the two? And then where you are in the roadmap as a company on these two particular connectivity products?

Manuvir Das

Yes. Let me do that, Harsh. So firstly, starting with – let’s just disambiguate these things, SmartNICs and DPUs because there's a number of people, number of companies out there that work on SmartNICs, right? So I think the best way to think about it is SmartNICs are sort of step one, which is to say, I've got a network interface card. The data is flowing through there. If I put a little bit of computing power, maybe some Arm – Arm CPU cores over there, there's some more processing I can do on the data as it's flowing through the network.

Now we took this to the next level and created this concept of the DPU. Our DPU product family is called BlueField. And the idea of the DPU is it has so much horsepower in that processor that what it actually does is it takes over the functions of the data center itself. So we've heard a lot in the last decade about software-defined data centers. What does that really mean? What that means is that all these things you’re doing in your data center, firewalls and things for which you had this dedicated hardware, have now turned into software running on the server itself. But as this happened, more and more of this load was going to the server, which meant that there was less and less room for the applications themselves to actually run. So whereas you would have needed five servers to run an application, you now need 10 because of the servers being consumed on this stuff.

And what our DPU really does is it says, offload all of that work onto this other processor. Move it there. You free up the CPU and the main server to run your workload. And the way we build the DPU, it actually accelerates, it’s like the GPU. If you take the firewall software and you move it from the CPU to the DPU, it’s not just shifting the problem, it runs 100x faster and so you need much less silicon in the DPU to do the job than you would have on the CPU, right? So it actually saves money in the data center, right? So this is why we are so high on the DPU because it can dramatically change the way data centers are architected. So our view is every server needs a DPU.

Now, two specific things we have done here, Harsh, that we think distinguish NVIDIA. The first thing is we learned a great lesson from when we did GPUs. We created a software SDK interface called CUDA, which was a simple way for developers to interact with GPU. We said, no matter what GPU you use, CUDA is CUDA, right? So it makes your work portable. We've done the same thing here with DPUs. We've created an SDK called DOCA, and it's a consistent SDK across our DPU family. And so again, what we say to the ecosystem is program to this API, this SDK and your work will translate as we make better and better DPUs and your software will just become better.

And the proof point of this, the second point I want to make is we have a roadmap. We are already working on BlueField-3, the third generation. We've already announced the architecture of BlueField-4, right? And it's not just making that processor better, but we now are working on versions of that processor where we've actually got the GPU capabilities inside the DPU as well, right? So you can do AI now inside the network, right? So think about what that enables, right? So that's how I'd summarize it, Harsh.

On the one hand, we have a rich hardware roadmap for how much more powerful DPUs are becoming. But what we've created is an interface called DOCA that rides along. So for the ecosystem, you just develop once and as the processor gets better, your software will just get better along the way.
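The five-versus-ten server example in the answer above can be sketched as simple capacity arithmetic. This is an illustrative model only, not NVIDIA's sizing method: `servers_needed` and `infra_fraction` are hypothetical names, and the 50 percent overhead is merely what the five-to-ten example implies. The sketch also ignores the cost of the DPU itself.

```python
import math

def servers_needed(app_servers: int, infra_fraction: float) -> int:
    """Servers required when a fraction of each CPU is spent on
    data center infrastructure work (firewalls and similar functions).

    app_servers    -- servers the application would need on its own
    infra_fraction -- share of each server consumed by infrastructure
                      software (hypothetical parameter, 0 <= f < 1)
    """
    if not 0 <= infra_fraction < 1:
        raise ValueError("infra_fraction must be in [0, 1)")
    return math.ceil(app_servers / (1 - infra_fraction))

# Half of every CPU consumed by infrastructure software: 5 servers become 10.
without_offload = servers_needed(5, 0.5)

# Infrastructure work moved off the host CPU: back to 5 servers.
with_offload = servers_needed(5, 0.0)

print(without_offload, with_offload)
```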

Harsh Kumar

Manuvir, it's amazing. You described it so well. I think maybe 15, 17 years ago, maybe even 20 years ago, when the first NICs were coming out, I was trying to understand what they did. And the point was, it takes away some of the complex functionality off of the CPU and does it for the CPU. And it seems like the same thing is happening, except the functions are getting more complex, they are software richer, but the basic functionality is the same, but we're moving up the stack, which is great for companies like you and actually makes the data center simpler in some ways because like you said, it's more cost-effective. And so anyways, fantastic stuff. So a lot to think about there, a lot to unpack.

Manuvir, as always, pleasure to have you. Thank you so much for your time. Thank you, anybody that joined in and listened to this presentation, and we really appreciate your time. Thank you, Manuvir.

Manuvir Das

Thank you, Harsh. It was my pleasure. And on behalf of Jensen and the entire team at NVIDIA, really appreciate the opportunity to be with you today.

Harsh Kumar

Thank you so much. Take care.


From: Frank Sully9/17/2021 3:29:43 PM
   of 2296
Pushing Forward the Frontiers of Natural Language Processing

September 16, 2021 by ASHRAF EASSA

Idea generation, not hardware or software, needs to be the bottleneck to the advancement of AI, Bryan Catanzaro, vice president of applied deep learning research at NVIDIA, said this week at the AI Hardware Summit.

“We want the inventors, the researchers and the engineers that are coming up with future AI to be limited only by their own thoughts,” Catanzaro told the audience.

Catanzaro leads a team of researchers working to apply the power of deep learning to everything from video games to chip design. At the annual event held in Silicon Valley, he described the work that NVIDIA is doing to enable advancements in AI, with a focus on large language modeling.

CUDA Is for the Dreamers

Training and deploying large neural networks is a tough computational problem, so hardware that’s both incredibly fast and highly efficient is a necessity, according to Catanzaro.

But, he explained, the software that accompanies that hardware might be even more important to unlocking further advancements in AI.

“The core of the work that we do involves optimizing hardware and software together, all the way from chips, to systems, to software, frameworks, libraries, compilers, algorithms and applications,” he said. “We optimize all of these things to give transformational capabilities to scientists, researchers and engineers around the world.”

This end-to-end approach yields chart-topping performance in industry-standard benchmarks, such as MLPerf. It also ensures that developers aren’t constrained by the platform as they aim to advance AI.

“CUDA is for the dreamers, CUDA is for the people who are thinking new thoughts,” said Catanzaro. “How do they think those thoughts and test them efficiently? They need something general and flexible, and that’s why we build what we build.”

Large Language Models Are Changing the World

One of the most exciting areas of AI is language modeling, which is enabling groundbreaking applications in natural language understanding and conversational AI.

The complexity of large language models is growing at an incredible rate, with parameter counts doubling every two months.

A well-known example of a large and powerful language model is GPT-3, developed by OpenAI. Packing 175 billion parameters, it required 314 zettaflops (a zettaflop here being 10^21 floating point operations) to train.
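As a rough cross-check on that figure, a common rule of thumb puts training compute for a dense transformer at about 6 floating point operations per parameter per training token. The sketch below applies it. Note the assumptions: the 300-billion-token count comes from OpenAI's published GPT-3 setup, not from this article, and the factor of 6 is an approximation. The result lands at about 315 zettaflops, close to the 314 quoted.

```python
# Rule-of-thumb estimate of total training compute for a dense transformer.
PARAMS = 175e9               # parameter count, from the article
TOKENS = 300e9               # training tokens (assumption, not stated in the article)
FLOPS_PER_PARAM_TOKEN = 6    # approx. 2 for the forward pass + 4 for the backward pass

def training_flops(params: float, tokens: float) -> float:
    """Approximate total floating point operations needed for training."""
    return FLOPS_PER_PARAM_TOKEN * params * tokens

total_flops = training_flops(PARAMS, TOKENS)
zetta = total_flops / 1e21   # 1 zetta = 10^21

print(f"about {zetta:.0f} x 10^21 floating point operations")
```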

“It’s a staggering amount of compute,” Catanzaro said. “And that means language modeling is now becoming constrained by economics.”

Estimates suggest that GPT-3 would cost about $12 million to train and, Catanzaro observed, the rapid growth in model complexity means that, despite NVIDIA’s tireless work to advance the performance and efficiency of its hardware and software, the cost to train these models is set to grow.

And, according to Catanzaro, this trend suggests that it might not be too long before a single model might require more than a billion dollars’ worth of computer time to train.

“What would it look like to build a model that took a billion dollars to train a single model? Well, it would need to reinvent an entire company, and you’d need to be able to use it in a lot of different contexts,” Catanzaro explained.

Catanzaro expects that these models will unlock an incredible amount of value, inspiring continued innovation. During his talk, Catanzaro showed an example of the surprising capabilities of large language models to solve new tasks without being explicitly trained to do so.

After inputting just a few examples into a large language model — four sentences, with two written in English and their corresponding translations into Spanish — he then entered an English sentence, which the model then translated into Spanish properly.

The model was able to do this despite never being trained to do translation. Instead, it was trained — using, as Catanzaro described, “an enormous amount of data from the internet” — to predict the next word that should follow a given sequence of text.

To perform that very generic task, the model needed to come up with higher-level representations of concepts, such as the existence of languages in general, English and Spanish vocabularies and grammar, and the concept of a translation task, in order to understand the query and properly respond.
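The prompt described above (two English sentences with their Spanish translations, followed by a new English sentence) can be sketched as plain text assembly. The sentences and the `English:`/`Spanish:` labels below are made up for illustration; the article does not give the actual prompt Catanzaro used, and no model is called here.

```python
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot translation prompt as plain text.

    examples -- list of (english, spanish) pairs shown to the model
    query    -- the new English sentence the model should continue from
    """
    lines = []
    for english, spanish in examples:
        lines.append(f"English: {english}")
        lines.append(f"Spanish: {spanish}")
    # The prompt ends on an open "Spanish:" line, so the most likely
    # next words are the translation of the query.
    lines.append(f"English: {query}")
    lines.append("Spanish:")
    return "\n".join(lines)

# Illustrative sentences only (not from the talk).
examples = [
    ("Good morning.", "Buenos días."),
    ("Where is the library?", "¿Dónde está la biblioteca?"),
]
prompt = build_few_shot_prompt(examples, "Thank you very much.")
print(prompt)
```

The model is still only predicting the next word; the examples in the prompt are what steer that prediction toward a translation.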

“These language models are first steps towards generalized artificial intelligence with few shot learning, and that is enormously valuable and very exciting,” explained Catanzaro.

A Full-Stack Approach to Language Modeling

Catanzaro then went on to describe NVIDIA Megatron, a framework created by NVIDIA using PyTorch “for efficiently training the world’s largest, transformer-based language models.”

A key feature of NVIDIA Megatron, which Catanzaro notes has already been used by various companies and organizations to train large transformer-based models, is model parallelism.

Megatron supports both inter-layer (pipeline) parallelism, which allows different layers of a model to be processed on different devices, as well as intra-layer (tensor) parallelism, which allows a single layer to be processed by multiple different devices.
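A toy illustration of the two forms of parallelism, in plain Python with lists standing in for tensors and function calls standing in for devices. This is a sketch of the idea only, not Megatron's implementation; all names and the tiny weight matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(w, parts):
    """Split a weight matrix into equal column slices, one per device."""
    width = len(w[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(parts)]

def concat_columns(blocks):
    """Stitch per-device outputs back together along the column axis."""
    return [sum((blk[r] for blk in blocks), []) for r in range(len(blocks[0]))]

x = [[1.0, 2.0], [3.0, 4.0]]                         # batch of two inputs
w1 = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]    # layer 1 weights (2 -> 4)
w2 = [[1.0], [2.0], [0.0], [1.0]]                    # layer 2 weights (4 -> 1)

# Intra-layer (tensor) parallelism: each "device" holds a column slice of
# layer 1 and computes part of that single layer's output.
shards = split_columns(w1, parts=2)
h_tensor = concat_columns([matmul(x, shard) for shard in shards])

# Inter-layer (pipeline) parallelism: "device 0" owns layer 1 and
# "device 1" owns layer 2; activations flow from one stage to the next.
def stage0(inputs):
    return matmul(inputs, w1)

def stage1(hidden):
    return matmul(hidden, w2)

y_pipeline = stage1(stage0(x))

print(h_tensor)     # same values as the unsplit layer 1 output
print(y_pipeline)
```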

Catanzaro further described some of the optimizations that NVIDIA applies to maximize the efficiency of pipeline parallelism and minimize so-called “pipeline bubbles,” during which a GPU is not performing useful work.

A batch is split into microbatches, the execution of which is pipelined. This boosts the utilization of the GPU resources in a system during training. With further optimizations, pipeline bubbles can be reduced even more.

Catanzaro described an optimization, recently published, that entails “round-robining each (pipeline) stage among multiple GPUs so that we can further reduce the amount of pipeline bubble overhead in this schedule.”
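To see why microbatching and the round-robin schedule both help, here is a back-of-envelope model of the bubble, assuming every microbatch takes the same time on every stage and ignoring communication cost. The function name and the example numbers are illustrative, not figures from the talk.

```python
def bubble_fraction(stages: int, microbatches: int, chunks_per_device: int = 1) -> float:
    """Fraction of a training step that a pipeline stage sits idle.

    stages            -- number of pipeline stages (devices in the pipeline)
    microbatches      -- microbatches the batch is split into
    chunks_per_device -- model chunks assigned round-robin to each device
                         (1 = plain pipeline schedule)

    Idle time is measured in units of one microbatch's forward+backward pass.
    """
    idle = (stages - 1) / chunks_per_device
    return idle / (microbatches + idle)

no_microbatching = bubble_fraction(stages=8, microbatches=1)
with_microbatches = bubble_fraction(stages=8, microbatches=32)
round_robin = bubble_fraction(stages=8, microbatches=32, chunks_per_device=4)

for label, value in [
    ("single batch", no_microbatching),
    ("32 microbatches", with_microbatches),
    ("32 microbatches, 4 chunks per device", round_robin),
]:
    print(f"{label}: {value:.1%} idle")
```

The price of the smaller bubble in the last case is more frequent communication between devices, since each device hands off activations more often.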

Although this optimization puts additional stress on the communication fabric within the system, Catanzaro showed that, by leveraging the full suite of NVIDIA’s high-bandwidth, low-latency interconnect technologies, this optimization is able to deliver sizable speedups when training GPT-3 style models.

Catanzaro then highlighted the impressive performance scaling of Megatron on NVIDIA DGX SuperPOD, achieving 502 petaflops sustained across 3,072 GPUs, representing an astonishing 52 percent of Tensor Core peak at scale.
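The 52 percent figure can be reproduced with a few lines of arithmetic. One assumption is needed that the article does not state: a per-GPU Tensor Core peak of 312 teraflops, which is the published dense FP16/BF16 peak of the A100.

```python
SUSTAINED_PFLOPS = 502        # from the article
GPUS = 3072                   # from the article
PEAK_TFLOPS_PER_GPU = 312     # assumption: A100 dense FP16/BF16 Tensor Core peak

# Aggregate theoretical peak across the whole system, in petaflops.
peak_pflops = GPUS * PEAK_TFLOPS_PER_GPU / 1000

# Share of that peak actually sustained during training.
utilization = SUSTAINED_PFLOPS / peak_pflops

# Sustained throughput per GPU, in teraflops.
per_gpu_tflops = SUSTAINED_PFLOPS * 1000 / GPUS

print(f"peak: {peak_pflops:.0f} PFLOPS, utilization: {utilization:.0%}, "
      f"per GPU: {per_gpu_tflops:.0f} TFLOPS")
```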

“This represents an achievement by all of NVIDIA and our partners in the industry: to be able to deliver that level of end-to-end performance requires optimizing the entire computing stack, from algorithms to interconnects, from frameworks to processors,” said Catanzaro.


From: Frank Sully9/17/2021 7:55:39 PM
   of 2296
SambaNova Brings Custom Silicon To Bear on High-End AI Workloads

Alex Woodie

With its own custom silicon for AI workloads and a $5 billion valuation, it seems likely you’ll be hearing more about the Silicon Valley startup SambaNova Systems and its complete AI stack in the years to come.

SambaNova Systems was founded in 2017 by an all-star cast of processor experts, including Rodrigo Liang, who led the development of 12 generations of SPARC processors at Sun Microsystems and Oracle; Stanford University professor Kunle Olukotun, who’s been called the “father of the multi-core processor;” and Chris Ré, a Stanford associate professor who was awarded the MacArthur Fellowship.

At SambaNova, these chip heavyweights developed their own custom silicon. According to SambaNova Vice President of Product Marshall Choy, existing processors just don’t cut it for modern AI workloads.

“We prototyped this stuff on CPUs, GPUs, FPGAs–you name it–and it quickly became clear that with AI being more probabilistic and less deterministic than transactional processing, all these other traditional processor architectures just weren’t right,” Choy said. “There’s too much overhead for loads and stores and stuff like that, and not enough flexibility and configurability of the silicon. And so we thought, ‘Oh [shoot], we gotta build another chip!’”

But don’t make the mistake of thinking that SambaNova is just another chip company. While it did develop its Reconfigurable Dataflow Unit (RDU) with a 7nm process, and contract with TSMC to manufacture it, the company doesn’t actually sell the chip. Instead, the company built a complete machine learning stack around this processor.

SambaNova Systems co-founders (l to r): Chief Technologist Kunle Olukotun; CEO Rodrigo Liang; and Chris Ré, head of engineering

The company sells this combined hardware and software stack in one of two ways: in pre-assembled racks that companies can roll into their data centers, called the DataScale offering; or via the software-as-a-service (SaaS) delivery route, where all customers do is call the stack via APIs, which it calls Dataflow-as-a-Service. (Customers can also get the hardware behind the DaaS offering installed on-prem behind their firewall, and have SambaNova manage it, providing a blended approach.)

What sets SambaNova apart from other vendors chasing AI opportunities is its capability to deliver accuracy and performance at scale for computer vision, NLP, and machine learning projects, according to Choy.

For example, in computer vision, its DataScale and DaaS offerings are able to train and infer on very high-resolution images, including those 4K and above. By comparison, most other commercially available solutions require the image to be downscaled or chopped up into multiple images before it will fit into memory, Choy said.

“We can train a model with what we call the true resolution of the image,” he said. “So without down-sampling it, without tiling it, all the way up to 60k by 40k images generated by a satellite and anything below that.”

While customers can make their AI work by downscaling images, they will lose potentially valuable accuracy, Choy said. Tiling an image also introduces the need to hand label many more images before feeding it into the model, he said. And it also runs the risk of missing important details that exist in the original image if it happens to be split in that particular place, potentially missing the cancer tumor or manufacturing defect that the AI was designed to detect.
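Some quick arithmetic shows the scale involved for the 60k by 40k satellite image Choy mentions. The 512-pixel tile size, the three color channels, and the bytes-per-value figures are assumptions chosen for illustration, not numbers from SambaNova.

```python
import math

def tile_count(width: int, height: int, tile: int) -> int:
    """Number of square tiles needed to cover an image (edge tiles padded)."""
    return math.ceil(width / tile) * math.ceil(height / tile)

def raw_bytes(width: int, height: int, channels: int = 3, bytes_per_value: int = 1) -> int:
    """Size of the uncompressed pixel data in bytes."""
    return width * height * channels * bytes_per_value

WIDTH, HEIGHT = 60_000, 40_000    # image size from the article
TILE = 512                        # hypothetical tile size

tiles = tile_count(WIDTH, HEIGHT, TILE)
as_uint8_gb = raw_bytes(WIDTH, HEIGHT) / 1e9                        # 8-bit RGB
as_float32_gb = raw_bytes(WIDTH, HEIGHT, bytes_per_value=4) / 1e9   # float32 input

print(f"{tiles} tiles of {TILE}x{TILE}")
print(f"{as_uint8_gb:.1f} GB as 8-bit RGB, {as_float32_gb:.1f} GB as float32")
```

Every tile boundary is a place where a feature of interest could be cut in two, and the input alone runs to gigabytes before any activations are stored.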

With 1.5TB of memory per RDU, SambaNova is able to bring large amounts of memory to bear on AI problems (Source: SambaNova Hot Chips presentation)

“That’s a core advantage of something like this,” Choy said of SambaNova’s approach. “You basically get out-of-memory errors with other platforms. So it’s literally enabling people to do things that they cannot do today and deliver results that were unattainable prior.”

Among the handful of customers that SambaNova can disclose are a pair of national laboratories. Lawrence Livermore National Lab is using a DataScale cluster with a pair of workloads, including a modeling and simulation workload for physics research, and another for anti-viral research for COVID-19. The system is paired with LLNL’s Corona supercomputer.

“We’re offloading certain parts of the larger mod-sim workload onto a machine learning framework,” Choy said. “We’re doing large outer loops of training with many, many dozens of inner loops of inferencing, and then feeding the results back to the main simulation, which is then speeding up the overall simulation by about 5x, according to the customer.”

Argonne National Lab also has a DataScale deployment in its AI testbed.

Other current customers include unnamed banks, which are using SambaNova offerings for anomaly detection and fraud detection, as well as to speed up claims processing. SambaNova also has customers in the high-speed trading arena, but Choy doesn’t know what they’re using it for. “I have no idea what their model is,” he said. “They’ll never tell anybody.”

Organizations with more established data science programs will be more likely to buy the shrink-wrapped DataScale offering, enabling their teams of data scientists to bring their own in-house models developed in Python and PyTorch, and benefit from the increases in performance and accuracy that SambaNova can provide, without the overhead and complexity of assembling, integrating, and maintaining their own infrastructure.

“And then there’s many other people who are purely looking at outcomes,” Choy said. “What do they care if it’s BERT model, an LSTM model, or a GPT model for language processing? They just want to have the best results. And so they’re basically offloading all that work to SambaNova and we’re just providing a results-oriented outcome to consume.”

These types of customers are more likely to buy the DaaS offering, which the company introduced in late 2020.

SambaNova can train and infer on images with up to 50,000 pixels across (Source: SambaNova Hot Chips presentation)

“We had a bunch of other folks that we were talking to who said, look, this sounds really great, but…I’m not Google. I don’t have 3,000 data scientists. I don’t have 300 data scientists. I don’t even have 30. I’ve got three [data scientists] and budget plans to expand that team to six people in the next year or so. And so how do I use this?

“That’s where we said, look, we’re just going to up level the abstraction level of the system beyond the hardware, beyond the models themselves, and just give you API calls,” he continued. “This makes it accessible to people who maybe don’t know much about AI at all.”

To be sure, SambaNova is not a silver bullet for AI. It’s not handling every aspect of the machine learning process. It’s up to customers to bring good, clean data to the party. And as Choy explained, the company isn’t providing MLOps tools or anything like that (although it is looking to participate in that growing ecosystem).

But if your data is in fairly reasonable shape, the company can help you automate decisions with it using AI.

“I’ve got a bounty of PhDs who are keeping up with and driving the latest trends in these areas,” Choy said. “We give you the model. You don’t have to worry about model selection, model tuning, model maintenance, with all the cost and time related to that. We just [run] your custom data sets.”

In April, the Palo Alto, California company announced the closing of a Series D round in the amount of $676 million at a valuation of $5.1 billion. The round was led by SoftBank, with participation by new investors Temasek and the Government of Singapore Investment Corp. (GIC), along with existing investors BlackRock, Intel Capital, GV (formerly Google Ventures), Walden International, and WRVI.

While building your own chip is a capital-intensive business, the more than $1 billion in total investments ($1.1 billion to be exact) shows that venture capitalists have a lot of faith in SambaNova’s approach. With AI expected to generate trillions of dollars in new value in the years to come, it may not be a bad investment.


From: Frank Sully9/17/2021 8:33:35 PM
1 Recommendation   of 2296
842 Chips Per Second: 6.7 Billion Arm-Based Chips Produced in Q4 2020

By Anton Shilov February 13, 2021

Arm-based chips surpass x86, ARC, Power, and MIPS-powered chips, combined


Being the most popular microprocessor architecture, Arm powers tens of billions of devices sold every year. The company says that in the fourth quarter of 2020 alone, the Arm ecosystem shipped a record 6.7 billion Arm-based chips, which works out to an amazing production rate of 842 chips per second. This means that Arm outsells all other popular CPU instruction set architectures — x86, ARC, Power, and MIPS — combined.

6.7 Billion Arm Chips Per Quarter

Arm's Cortex-A, Cortex-R, Cortex-M, and Mali IP power thousands of processors, controllers, microcontrollers, and graphics processing units from over 1,600 companies worldwide. As the world is rapidly going digital, demand for all types of chips is at an all-time high, giving a great boost to Arm given the wide variety of applications its technologies are used for.

Arm says that as many as 842 chips featuring its IP were sold every second in the fourth quarter of 2020. Meanwhile, it is noteworthy that although Arm’s Cortex-A-series general-purpose processor cores get the most attention from the media (because they are used inside virtually all smartphones shipped these days), Arm’s most widely used cores are its Cortex-M products for microcontrollers that are virtually everywhere, from thermometers to spaceships. In Q4 alone, 4.4 billion low-power Cortex-M-based microcontrollers were sold.
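The per-second rate follows directly from the quarterly total. A short check, assuming the quarter is the calendar quarter from October 1 through December 31, 2020:

```python
from datetime import date

CHIPS_SHIPPED = 6.7e9    # Arm-based chips in Q4 2020, from the article
CORTEX_M = 4.4e9         # Cortex-M-based microcontrollers, from the article

# Length of the calendar quarter (assumed: Oct 1 through Dec 31, 2020).
quarter_days = (date(2021, 1, 1) - date(2020, 10, 1)).days
seconds_in_quarter = quarter_days * 24 * 60 * 60

chips_per_second = CHIPS_SHIPPED / seconds_in_quarter
cortex_m_share = CORTEX_M / CHIPS_SHIPPED

print(f"{quarter_days} days, {chips_per_second:.1f} chips per second")
print(f"Cortex-M share of shipments: {cortex_m_share:.0%}")
```

Rounded down, the rate matches the 842 chips per second in the headline, and Cortex-M parts account for roughly two thirds of the total.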

"The record 6.7 billion Arm-based chip shipments we saw reported last quarter is testament to the incredible innovation of our partners: from technology inside the world’s number one supercomputer down to the tiniest ultra-low power devices," said Rene Haas, president of IP Products Group at Arm. "Looking ahead, we expect to see increased adoption of Arm IP as we signed a record 175 licenses in 2020, many of those signed by first-time Arm partners."


From: Frank Sully9/18/2021 9:52:10 AM
   of 2296
Chip makers like Nvidia are set to soar as semiconductor sales reach $544 billion in 2021, Bank of America says

Carla Mozée

Sep. 18, 2021, 08:30 AM

Nvidia is a top stock pick for Bank of America.

Krystian Nawrocki/Getty Images
  • Bank of America on Friday raised its 2021 outlook for sales growth in the semiconductor industry to 24% from 21%.
  • Semiconductor companies have newfound pricing power in the ongoing global chip shortage.
  • Nvidia, ON Semiconductor and KLA-Tencor are among the investment bank's top stock picks in the sector.

Bank of America bumped up its sales outlook for the semiconductor industry as it sees growing demand for chips that make computers and cars run, and named Nvidia and auto chip supplier ON Semiconductor among its top stock picks heading into the final quarter of 2021.

The persistent global chip shortage that has dogged companies ranging from automakers, to video game publishers, to consumer electronics producers, has contributed to strengthening sales for chip companies. BofA expects above-trend growth to last through next year and now projects total industry sales in 2021 to increase by 24% to $544 billion, up from its previous view for an increase of 21%.

"We remain firmly in the stronger-for-longer camp for semis given their critical role in the rapidly digitizing global economy and the newfound pricing power and supply discipline of this remarkably profitable industry operating with a very lean supply chain," said analysts led by Vivek Arya in a Friday research note.

BofA's semiconductor analysts outlined their playbook for investors heading into the fourth quarter. They said that between 2010 and 2020, the fourth and first quarters were the two best quarters to own semiconductor stocks, as the PHLX Semiconductor Sector index outperformed the benchmark S&P 500.

There are three hot spots in the industry: computing, which includes cloud services and AI, gaming and networking; cars; and capex, or capital spending by businesses and the government.

In the computing group, BofA raised its price target on Nvidia to $275 from $260 and said the graphics-cards maker is a top pick along with AMD and Marvell. In the car group, it increased its price target on top pick ON Semiconductor to $60 from $55.

The investment bank called KLA-Tencor its top pick in the capex segment and raised its price target by 6% to $450 from $425. The stock traded around $369 on Friday.
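The percentages in the article can be checked against the dollar figures it quotes. The implied 2020 sales base and the implied upside for KLA-Tencor below are derived values, not numbers stated by BofA.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Industry sales: 24% growth to $544 billion implies this 2020 base (derived).
implied_2020_sales_bn = 544 / 1.24

# Price-target increases quoted in the article, as (old, new) in dollars.
targets = {
    "Nvidia": (260, 275),
    "ON Semiconductor": (55, 60),
    "KLA-Tencor": (425, 450),
}
raises = {name: pct_change(old, new) for name, (old, new) in targets.items()}

# Gap between KLA-Tencor's Friday price and the new target (derived).
kla_implied_upside = pct_change(369, 450)

print(f"implied 2020 sales: ${implied_2020_sales_bn:.0f} billion")
for name, pct in raises.items():
    print(f"{name}: target raised {pct:.1f}%")
print(f"KLA-Tencor implied upside: {kla_implied_upside:.1f}%")
```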
