In 2018, James Kobielus wrote an article on the AI market’s shift to workload-optimized hardware platforms, in which he proposed:
Workload-optimized hardware/software platforms will find a clear niche for on-premises deployment in enterprises’ AI development shops. Before long, no enterprise data lake will be complete without pre-optimized platforms for one or more of the core AI workloads: data ingest and preparation, data modeling and training, and data deployment and operationalization.
We are seeing Kobielus’ words come true. In the past year, nearly 100 companies have announced some sort of AI-optimized IP, chip, or system, primarily for inference workloads but also for training. Hyperscalers like Facebook, Amazon, and Google are increasingly talking publicly about “full-stack” optimization of AI, from silicon, through algorithms, up to the application layer.
But what does this look like for other enterprises, those that are just at the beginning of their AI journey? How deeply are they thinking about the platforms upon which their AI applications run?
At the AI Hardware Summit, held at the Computer History Museum in Mountain View, CA, 17-18 September this year, several enterprises including UnitedHealth Group and Medallia will join hyperscalers and AI hardware vendors in discussing how data bottlenecks are affecting the decision to invest in on-premise AI hardware.
We caught up with Naveen Rao, Corporate Vice President and General Manager of the Intel AI Products Group, an organization engaged with a wide range of enterprises and industries working at the forefront of AI, to ask him how workload-optimized hardware is changing the AI landscape, how Intel’s customers are getting started with existing hardware architectures, and whether Kobielus’ vision is coming true.
To what extent is new custom AI hardware (beyond GPU & CPU-based systems) being investigated or deployed by your enterprise customers? Are there particular industries leading the charge?
Deep learning is really coming of age now, and we are entering the next big wave in algorithmic development where companies are training models based on hundreds of thousands or even millions of parameters.
To more effectively and efficiently build those models, new types of hardware are needed. For example, the GPT-2 language model has 1.5 billion parameters, and the amount of compute used to train the largest models is doubling approximately every three months.
That’s staggering. While most of the AI activity today is concentrated in the consumer services sector, within just a few years every industry will feel the positive impacts of AI. Telecom will see tremendous gains from AI as they progress toward autonomous networks, and automotive, healthcare, financial services, and manufacturing will see some of the most interesting use cases.
It has been said that most AI inference in the data center is being done by CPUs. For enterprise customers with on-premise data centers, what is the benefit of replacing their existing compute capabilities with custom-built AI hardware?
It’s actually not about replacing – it’s about making smart use of what you already have, and then adding acceleration in a purposeful way. CPU-based inference is a great solution for a lot of companies, from superusers like Facebook down to an enterprise that is just starting on its AI journey.
For companies that do inference consistently or need near-instantaneous insights at the point of data collection, a dedicated accelerator offers a better ROI. So it’s not about moving inference off CPUs but rather optimizing for specific workload needs and reducing latency by delivering efficient results as close to the data source as possible. As an example, consider an MRI machine that uses AI to interpret patient scans in near-real-time.
What are some of the challenges of full-stack AI optimization from Intel’s view? What are the benefits of overcoming these challenges?
Regardless of hardware architecture, a solid software stack is necessary to ensure developers have a great working experience. As new hardware is developed, its software should ideally integrate with existing toolchains to leverage what developers already know. Success in software development requires extensive system knowledge and open-source influence.
While this combination of talent and expertise is difficult to achieve, doing so means customers and hardware adopters have an easy on-ramp and can train their own developers and data scientists quickly.
Is Intel anticipating a large uptake of on-premise AI infrastructure from the enterprise or will AI compute be consumed via the cloud?
For many, it will be both, and it all depends on the business and application needs. For example, several of our customers are using public cloud to spin up and iterate model training quickly, but then scaling training and deploying via on-premise solutions for more cost-effective scale or lower latency. For some, public cloud is simply not an option because of data privacy, security, or other regulatory constraints.
Which industries are likely to benefit most from bespoke AI hardware, and why?
We like to use the phrase “accelerate with purpose,” because for many customers, running their training on CPUs simply provides a faster path to getting started, and better ROI by utilizing excess capacity within their existing infrastructure. But there are definitely industries and applications where accelerating with AI-specific hardware is a key advantage.
For training, it really depends on the speed and recurrence with which you need to train your model. For inference, applications like Facebook’s algorithms reflect the type of intensive, continuous, high-volume and mission-critical tensor compute where offloading a portion of the process to AI-specific hardware will be the right play for total cost of ownership.
Intel is the headline partner for the AI Hardware Summit in September, and you’re delivering the Day 1 Closing Keynote this year. Can you give us any spoilers on your talk? What would you say to an enterprise decision-maker thinking of attending the summit?
No spoilers! But this will be a great opportunity for enterprises to learn how to think about their data problems, which are the #1 issue facing enterprises as they look to employ AI applications.
AI is a tool, a technique to solve for the exponential increase in data and to help companies manage their own businesses better, as well as provide better solutions to their customers. I will talk about what I’ve seen work well…and not so well…as a way for companies to think through their own AI implementations.
This article was sponsored by Kisaco Research and was written, edited and published in alignment with our transparent Emerj sponsored content guidelines. Learn more about reaching our AI-focused executive audience on our Emerj advertising page.
Header image credit: Analytics India Magazine