It’s clear that there’s a revolution in how artificial intelligence is done with neural networks as opposed to the old school systems of the ’80s and the ’90s. It’s clear that hardware is beginning to evolve, and it’s also quite clear that the way that we power these hardware systems is going to have to change.
GPUs and AI hardware are tremendously power-intensive, and this week we speak with Robert Gendron of Vicor Corporation, a company focused on powering AI systems. Vicor is in partnership with Kisaco Research, which is putting on the 2019 AI Hardware Summit September 17 and 18 in Mountain View, California.
Robert explains why AI hardware needs to be powered differently from traditional manufacturing equipment. He also discusses how power delivery for these systems needs to work if businesses want to reduce energy costs and run AI as efficiently as possible.
Subscribe to our AI in Industry Podcast with your favorite podcast service:
Expertise: AI hardware
Brief Recognition: Robert has been at Vicor for 8 years. Prior to Vicor, he was Director of Sales at United Electronic Industries and Director of Sales at Volterra.
(02:00) What makes AI systems so power-hungry and what does that really mean?
Robert Gendron: We’ve seen GPUs and ASICs used in AI applications where the power demands have increased from a few hundred watts to, with some customers we’re working with now, the thousand-watt range. More important than the power level itself is how that power is being delivered. It’s typically delivered at sub-one volt, and at sub-one volt a thousand watts means about a thousand amps of current. So very high current at a very low voltage.
It’s probably best to think about a light bulb. A typical light bulb is 60 watts, and here we’re talking about a thousand watts that one processor is consuming. That one processor is in, let’s say, a server rack, and then in a server farm. So we’re talking about massive power consumption by a typical AI-type processor.
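The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration of the relationship I = P / V: the 1,000-watt and 60-watt figures come from the interview, while the 12 V and 0.8 V comparison voltages and the function name are assumptions added here.

```python
def supply_current_amps(power_watts: float, voltage_volts: float) -> float:
    """Current needed to deliver a given power at a given voltage (I = P / V)."""
    if voltage_volts <= 0:
        raise ValueError("voltage must be positive")
    return power_watts / voltage_volts


PROCESSOR_POWER_W = 1000.0  # thousand-watt range mentioned in the interview
LIGHT_BULB_W = 60.0         # typical light bulb mentioned in the interview

# 12.0 V and 0.8 V are illustrative assumptions; 1.0 V is the "about one volt" case.
for voltage in (12.0, 1.0, 0.8):
    amps = supply_current_amps(PROCESSOR_POWER_W, voltage)
    print(f"{PROCESSOR_POWER_W:.0f} W at {voltage:.1f} V -> {amps:.0f} A")

print(f"Equivalent 60 W bulbs: {PROCESSOR_POWER_W / LIGHT_BULB_W:.1f}")
```

The lower the delivery voltage, the more current the same wattage requires, which is why a sub-one-volt processor in the thousand-watt range ends up drawing on the order of a thousand amps.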
Ironically, the industry has typically used a solution that’s been around for about two decades now. It’s called a multi-phase power scheme. You can imagine it as one voltage regulator, or power block, that you scale up by paralleling more of these power blocks together to create larger and larger overall power delivery systems.
(04:15) So this is the same thing that would be used at one of these giant server centers for the Facebooks of the world, the Googles of the world?
RG: Yes, exactly. I mean, you probably have it in the laptop that’s in front of you, same with mine. You have this kind of multi-phase scheme where your processor is being powered by, let’s say, three or four of these little power blocks scaled together. But for a GPU, you could have as many as 20 of these power blocks paralleled together supplying the power.
(05:00) What does this new system look like in terms of optimizing power?
RG: Yeah, so the challenge is that the conventional type of system is becoming larger and larger, so it’s harder and harder to keep the power delivery converter close to the processor. As it moves farther and farther away, you create more of a loss in delivering the power to the processor. So not the consumption of the processor itself, but just losses in delivering the power.
We’ve created a technology and a product that actually locates the power regulator, that is, the power delivery system, directly underneath the processor itself, entirely eliminating all of the losses that you would have on the motherboard or PCB.
(06:00) What are the sectors where this is most relevant?
RG: Yeah. So historically AI was more of a supercomputer, a dedicated machine for deep learning and perhaps inferencing. Then we’ve seen in the cloud space, inferencing start to be performed, and now in the cloud space we also see deep learning being performed.
So there’s a massive shift going on from supercomputers to cloud deployment. This is what’s dramatically increasing the adoption of AI, in the cloud of course, but also in support of IoT-type devices.
Now you also mentioned automotive. You have quite a powerful processor in the automobile, but you also have support back in the cloud. So all of these areas are driving larger and larger powered AI devices.
In your home you’ll have thermostats, even your entertainment equipment, and your health devices: health monitoring and exercise-type devices.
There are many things now being tied back to, quote, the cloud, to then be analyzed, monitored, etc. Again, what we’re focused on is the high power of the inferencing and deep learning that’s going on in these large infrastructure deployments, that is, the cloud.
(09:00) What is it about proximity to power that makes it relevant for this new kind of computing that might not have been relevant if we had just kept going with CPUs?
RG: Yeah, I guess the best way to always think about it is like water flow. If you pour water, let’s say down a drain and it’s restricted, that’s resistance, right? The more resistance you have, the less water you can pour down the drain.
The same analogy applies to current being delivered to, let’s say, a processor. The more the drain is open, or the more the pipe is open, the more current can be delivered.
Again, less resistance enables more current and less loss. Loss creates heat, and then you have to get rid of that heat and manage it. So the closer the power converter is to the processor, the fewer losses you create and the less heat you generate in the system.
(10:30) Are there rules of thumb about how much is lost so people can kind of get an idea of what the leakage of this heat and power looks like?
RG: Sure. It’s simple enough. The losses are calculated by a very simple equation, I squared R. That is, you take the current you’re transmitting, you square it, and then you multiply it by the resistance, or loss, of the material. In this case, we’re talking about a PCB, and that tells you how many watts you’re losing because of that distance.
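As a rough illustration of the I squared R rule, the short Python sketch below computes the loss for a few current levels. The path resistance of 0.1 milliohm is a hypothetical value chosen for round numbers, not a figure from the interview or from any Vicor product, and the function name is likewise an assumption.

```python
def conduction_loss_watts(current_amps: float, resistance_ohms: float) -> float:
    """Power lost as heat in a conductor: P_loss = I^2 * R."""
    if resistance_ohms < 0:
        raise ValueError("resistance cannot be negative")
    return current_amps ** 2 * resistance_ohms


# Hypothetical PCB path resistance of 0.1 milliohm (assumption for illustration).
PATH_RESISTANCE_OHMS = 0.0001

# Each doubling of current quadruples the loss over the same path.
for current in (250.0, 500.0, 1000.0):
    loss = conduction_loss_watts(current, PATH_RESISTANCE_OHMS)
    print(f"{current:6.0f} A through {PATH_RESISTANCE_OHMS * 1000:.1f} mOhm -> {loss:6.2f} W lost")

# A shorter path means less resistance; halving the resistance halves the loss.
shorter_path = conduction_loss_watts(1000.0, PATH_RESISTANCE_OHMS / 2)
print(f"  1000 A through half the resistance -> {shorter_path:6.2f} W lost")
```

Because the current is squared, the loss grows much faster than the current itself, while it scales only linearly with resistance. That is why both the thousand-amp current levels and the distance between converter and processor matter.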
(11:00) Is this a GPU or neural net-specific issue, or is power in general going to start moving closer to what it needs to power?
RG: So you have distribution losses as you mentioned in almost any market in any application. So again, industrial, any sort of application you have distribution losses. So anytime you can place the converter or the power delivery closer to the load, whatever it be, a robotic arm, et cetera, yes, that’s more advantageous, you get less loss.
What makes AI unique is that the currents are increasing by so much, again from historically maybe a few hundred amps to now, like I mentioned earlier, about a thousand amps of current and being delivered at sub-one volt.
This is quite a challenge and so as I mentioned, any sort of loss in that type of system can really hurt the overall system. Now then you take that one problem of that one processor being powered and multiply that by a deployment in a cloud-based infrastructure, and it becomes a massive power problem.
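To get a feel for how a per-processor loss multiplies across a deployment, here is a minimal Python sketch. Every number in it is a hypothetical placeholder (100 W of delivery loss per processor, 8 processors per server, 10,000 servers, running continuously); none of these values come from the interview.

```python
HOURS_PER_YEAR = 24 * 365


def fleet_loss_kw(loss_per_processor_w: float,
                  processors_per_server: int,
                  servers: int) -> float:
    """Total delivery loss across a deployment, in kilowatts."""
    return loss_per_processor_w * processors_per_server * servers / 1000.0


def annual_energy_kwh(power_kw: float) -> float:
    """Energy used in a year at constant power (assumes continuous operation)."""
    return power_kw * HOURS_PER_YEAR


# All values below are hypothetical assumptions for illustration only.
loss_kw = fleet_loss_kw(loss_per_processor_w=100.0,
                        processors_per_server=8,
                        servers=10_000)

print(f"Fleet-wide delivery loss: {loss_kw:,.0f} kW")
print(f"Energy lost per year:     {annual_energy_kwh(loss_kw):,.0f} kWh")
```

The sketch ignores the extra cooling needed to remove that heat, so under these assumptions the real cost of the loss would be higher still.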
(13:00) Do you foresee a change in how those systems are powered over the course of the years ahead?
RG: Power typically limits the computational performance of a system or an AI processor. So right now there’s a demand on power, I should say, to make it more dense.
That means getting more power out of a smaller converter, and then getting that power converter closer to the load itself to reduce the distribution losses. This applies to powering the processor, powering the server board, and powering the server rack. You can go all the way back to the power lines themselves and see the exact same situation.
This article was sponsored by Kisaco Research and was written, edited and published in alignment with our transparent Emerj sponsored content guidelines. Learn more about reaching our AI-focused executive audience on our Emerj advertising page.
Header image credit: Forbes