
The AI compilation wars: Intel, Google, Microsoft, Nvidia, IBM and others arm for deep learning acceleration

Compilers are among the least sexy tools in any developer’s kitbag. You call on your compiler only when you’re ready to see how well your splendid new app will perform in the field.

Data scientists are the primary developers in this new era of artificial intelligence-driven applications. Consequently, many may greet the news that Intel Corp. has released an open-source neural-network compiler, nGraph, with a shrug at best. Most developers simply use the built-in machine learning compilers that come with whatever development framework they happen to use.

In fact, it’s not even brand-new news: Wikibon discussed nGraph in our November 2017 report on deep learning development frameworks. Nevertheless, it’s significant because cross-platform model compilers such as nGraph, which Intel has now released officially, are harbingers of the new age in which it won’t matter what front-end tool you used to build your AI algorithms and what back-end clouds, platforms or chipsets are used to execute them.

NGraph is one of a growing number of cross-platform AI model compilers available to today’s developers. The growing list of rival offerings includes Amazon Web Services Inc.’s NNVM Compiler, Google Inc.’s XLA and Nvidia Corp.’s TensorRT 3.

What they have in common is that they enable AI models created in one front-end tool — such as TensorFlow, MXNet, PyTorch or CNTK — to be compiled for optimized execution on heterogeneous back-end deep learning runtimes and hardware platforms, including graphics processing units, central processing units, field-programmable gate arrays and so on. They do this by abstracting the core computational-graph structure of a neural network from the various target runtime environments.
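To make that abstraction concrete, here is a toy sketch — not nGraph's, XLA's or TensorRT's actual API, and all names here are illustrative — of the core idea: the model is captured as a framework-neutral computational graph, and separate backends can then lower that same graph to their own execution strategies, including target-specific optimizations such as operator fusion.

```python
# Toy illustration of cross-platform compilation: one neutral graph,
# two "backends" that execute it differently. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                          # operation name, e.g. "add", "relu"
    inputs: list = field(default_factory=list)

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def run_reference(node, env):
    """Naive backend: interpret each op one at a time."""
    if node.op == "input":
        return env[node.inputs[0]]
    args = [run_reference(i, env) for i in node.inputs]
    return {"add": add, "relu": relu}[node.op](*args)

def run_fused(node, env):
    """Optimizing backend: fuse relu(add(a, b)) into a single pass,
    the kind of rewrite a real compiler applies per target platform."""
    if node.op == "relu" and node.inputs[0].op == "add":
        a, b = (run_fused(i, env) for i in node.inputs[0].inputs)
        return [max(0.0, x + y) for x, y in zip(a, b)]
    return run_reference(node, env)  # fall back to the reference path

# The same graph, untouched, runs on both backends with identical results.
x = Node("input", ["x"])
y = Node("input", ["y"])
graph = Node("relu", [Node("add", [x, y])])
env = {"x": [1.0, -2.0, 3.0], "y": [0.5, 0.5, -5.0]}
print(run_reference(graph, env))  # [1.5, 0.0, 0.0]
print(run_fused(graph, env))      # [1.5, 0.0, 0.0]
```

The point of the separation is that the graph never mentions a device: each backend decides for itself how to lower the same ops, which is what lets one model target GPUs, CPUs or FPGAs without being rewritten.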

Some observers have interpreted Intel’s formal release of nGraph as a defensive maneuver to encourage developers to compile their AI for execution on its CPUs, rather than the GPUs of its powerhouse competitor Nvidia. Intel has fueled this perception by reporting significant performance improvements on neural network models compiled with the latest version of nGraph for Intel-based hardware, including its Nervana AI chips and Movidius systems on a chip for edge computing. Clearly, it has every interest in discouraging developers from writing their AI directly to Nvidia’s APIs.

Nvidia gained a tactical advantage in the compilation wars with Google’s recent announcement of TensorFlow integration with Nvidia’s TensorRT library. This enables compilation of neural nets built in the most popular front-end modeling tool for optimized inferencing on Nvidia GPUs. It also creates a runtime for efficient deployment of TensorFlow models on GPUs in production environments. Separately, Google also handed Intel a minor victory, announcing that the two vendors have delivered TensorFlow integration with a faster, more efficient open-source deep learning library from Intel.

However, these sorts of defensive strategies will diminish over time as AI developers demand the ability to abstract their models into discrete subgraphs and then auto-compile them evenhandedly for distributed execution in hybrid AI hardware environments consisting of GPUs, CPUs, FPGAs and other chipsets. Cross-platform AI compilers will become standard components of every AI development environment, enabling developers to access every deep learning framework and target platform without having to know the technical particulars of each environment. This trend is consistent with the discussion of AI developer accessibility by Ritika Gunnar, IBM Corp.’s vice president of products for Watson AI, on SiliconANGLE Media’s video studio theCUBE at the recent Think 2018 conference.
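The subgraph idea above can be sketched in a few lines. This is a hypothetical illustration, not any vendor's actual placement policy: each op in a model is assigned to the device class assumed to suit it best, and consecutive ops placed on the same device are grouped into one subgraph for compilation.

```python
# Hypothetical sketch of subgraph partitioning for hybrid hardware.
# The model, the device names and the placement rules are all illustrative.

# A linear model expressed as a sequence of op names.
MODEL = ["conv", "conv", "relu", "branch", "matmul", "softmax"]

# Assumed placement policy: dense math goes to "gpu", control flow to "cpu".
PLACEMENT = {"conv": "gpu", "relu": "gpu", "matmul": "gpu",
             "softmax": "gpu", "branch": "cpu"}

def partition(ops):
    """Group consecutive ops on the same device into compilable subgraphs."""
    subgraphs = []
    for op in ops:
        device = PLACEMENT[op]
        if subgraphs and subgraphs[-1][0] == device:
            subgraphs[-1][1].append(op)   # extend the current subgraph
        else:
            subgraphs.append((device, [op]))  # start a new subgraph
    return subgraphs

print(partition(MODEL))
# → [('gpu', ['conv', 'conv', 'relu']), ('cpu', ['branch']),
#    ('gpu', ['matmul', 'softmax'])]
```

A real compiler would do this over an arbitrary graph rather than a linear sequence, and would weigh data-transfer costs between devices before splitting, but the output shape is the same: per-device subgraphs that each backend can then optimize independently.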

It’s likely that, within the next two to three years, the AI industry will converge around one open-source cross-compilation framework supported by all front-end and back-end environments. Other industry developments in recent weeks call attention to the opening of the AI cross-compilation ecosystem.

These include:

  • Microsoft’s open-sourcing of a GitHub repo to foster cross-framework benchmarking of GPU-optimized deep learning models.
  • ARM’s partnership with Nvidia to integrate the open-source Nvidia Deep Learning Accelerator architecture into its just-announced Project Trillium platform, designed to enable cross-framework deep learning model compilation for optimized execution in mobile, “internet of things” and other mass-market edge devices.
  • IBM’s launch of the new open-source Fabric for Deep Learning framework, which supports optimized deployment of deep-learning microservices built in TensorFlow, PyTorch, Caffe2 and other frameworks to diverse compute nodes on heterogeneous clusters of GPUs and CPUs via stateless RESTful interfaces over Kubernetes.
  • The Linux Foundation’s launch of the Acumos AI Project, which defines APIs, an open-source framework, and an AI model catalog for framework-agnostic AI app development, chaining and deployment over Kubernetes.

Before long, it will become quaint to claim, as IBM does in this recent article, that a particular deep learning library is faster than a specific framework running on a particular hardware architecture in a particular cloud. Regardless of whether these or any other performance claims have merit, developers will increasingly have the ability to convert an underperforming model automatically for optimized execution on the AI cloud of their choice.
