NVDA CC: Volta "single largest processor...ever made...3D packaging"
Gotta love an all-fired-up CEO.
I was hoping to sneak in a near-term and a longer-term question. On the near term, you talked about the health on the demand side for Volta. Curious if you're seeing any sort of restrictions on the supply side, whether it's wafers or access to high-bandwidth memory, et cetera. And then the longer-term question really revolves around CUDA. You've talked about that as being a sustainable competitive advantage for you guys entering the year. And now that we've moved beyond HPC and hyperscale training to more into inference and GPU as a service and you've hosted GTCs around the world, curious if you could extrapolate on how you're seeing that advantage and how you've seen it evolve over the year and how you're thinking about CUDA as the AI standard?
Yes, thanks a lot, C.J. Well, everything that we build is complicated. Volta is the single largest processor that humanity has ever made, at 21 billion transistors, 3D packaging, the fastest memories on the planet and all of that in a couple of hundred watts, which basically says it's the most energy-efficient form of computing that the world has ever known. And one single Volta replaces hundreds of CPUs. And so it's energy-efficient, it saves an enormous amount of money and it gets this job done really, really fast, which is just one of the reasons why GPU-accelerated computing is so popular now. With respect to the outlook for our architecture: as you know, we are a one-architecture company. And it's so vitally important. And the reason for that is because there is so much software and so many tools created on top of this one architecture.
On the inference side -- on the training side, we have a whole stack of software and optimizing compilers and numeric libraries that are completely optimized for one architecture called CUDA. On the inference side, the optimizing compilers take these large, huge computational graphs that come out of all of these frameworks, and these computational graphs are getting larger and larger and their numerical precision differs from one type of network to another -- from one type of application to another. Your numerical precision requirements for a self-driving car, where lives are at stake, versus counting the number of people crossing the street -- counting something versus trying to detect and track something very subtle in all weather conditions -- is a very, very different problem.
And so the numeric -- the types of networks are changing all the time, they're getting larger all the time. The numerical precision is different for different applications. And we have different compute performance levels as well as energy availability levels, so these inference compilers are likely to be some of the most complex software in the world. And so the fact that we have one singular architecture to optimize for, whether it's HPC for numerics, molecular dynamics and computational chemistry and biology and astrophysics, all the way to training to inference, gives us just enormous leverage. And that's the reason why NVIDIA could be an 11,000-person company and, arguably, performing at a level that is 10x that.
And the reason for that is because we have one singular architecture that is accruing benefits over time instead of three, four, five different architectures where your software organization is broken up into all these different, small, subcritical-mass pieces. And so it's a huge advantage for us. And it's a huge advantage for the industry.
So people who support CUDA know that the next-generation architecture will just get a benefit and go for the ride that technology advancement provides them and affords them, okay? So I think it's an advantage that is growing exponentially, frankly. And I'm excited about it.