Jen-Hsun Huang - NVIDIA Corp.
Analyst · Joe Moore with Morgan Stanley
Yes. The inference market is going to be very large. And as you know very well, in the future almost every computing device will have inferencing on it. A thermostat will have inferencing on it, a bicycle lock will have inferencing on it, cameras will have inferencing on it, and self-driving cars will have a large amount of inferencing on them. Robots, vacuum cleaners, you name it, smart microphones, smart speakers, all the way into the data center. And so I believe that long-term there will be a trillion devices that have inferencing, connected to edge computing devices near them, connected to cloud computing servers. So that's basically the architecture. And so the largest inferencing platform will likely be Arm devices. I think that goes without saying. Arm will likely be running inferencing networks, 1-bit XNOR, 8-bit, and even some floating-point. It just depends on what level of accuracy you want to achieve, what level of perception you want to achieve, and how fast you want to perceive it. And so the inferencing market is going to be quite large.
We're going to focus on markets where the inferencing precision, the perception scenario, and the performance at which you have to do it are mission-critical. And of course, self-driving cars are a perfect example of that. Robots, manufacturing robots, will be another example of that. You're going to see at our GTC, if you have a chance to see that, that we're working with AI City partners all over the world for end-to-end video analytics, and that requires very high throughput, a lot of computation. And so the examples go on and on, all the way back into the data center.
In the data center, there are several areas where inferencing is quite vital. I mentioned one number earlier, just mapping the earth, mapping the earth at the street level, mapping the earth in HD, at the three-dimensional level, for self-driving cars.
Now, that process is going to require, well, just a pile of GPUs running continuously as we continuously update the information that needs to be mapped. There's inferencing that is called offline inferencing, where you have to retrain a network after you've deployed it, and you would likely retrain and re-categorize, reclassify the data using the same servers that you used for training. And so even the training servers will be used for inferencing. And then lastly, all of the nodes in the cloud will be inferencing nodes in the future. I've said before that I believe that every single node in the cloud data center will have inferencing capability and accelerated inferencing capability in the future. I continue to believe that, and these are all opportunities for us.