
The AI compilation wars: Intel, Google, Microsoft, Nvidia, IBM and others arm for deep learning acceleration

Compilers are among the least sexy tools in any developer’s kitbag. You call on your compiler only when you’re ready to see how well your splendid new app will perform in the field.

Data scientists are the primary developers in this new era of artificial intelligence-driven applications. Consequently, many may greet the news that Intel Corp. has released an open-source neural-network compiler, nGraph, with a shrug at best. Most developers simply use the built-in machine learning compilers that come with whatever development framework they happen to use.

In fact, it’s not even brand-new news: Wikibon discussed nGraph in our November 2017 report on deep learning development frameworks. Nevertheless, it’s significant because cross-platform model compilers such as nGraph, which Intel has now released officially, are harbingers of the new age in which it won’t matter what front-end tool you used to build your AI algorithms and what back-end clouds, platforms or chipsets are used to execute them.

NGraph is one of a growing number of cross-platform AI model compilers available to today’s developers. The growing list of rival offerings includes Amazon Web Services Inc.’s NNVM Compiler, Google Inc.’s XLA and Nvidia Corp.’s TensorRT 3.

What they have in common is that they enable AI models created in one front-end tool — such as TensorFlow, MXNet, PyTorch or CNTK — to be compiled for optimized execution on heterogeneous back-end deep learning runtimes and hardware platforms, including graphics processing units, central processing units, field-programmable gate arrays and so on. They do this by abstracting the core computational-graph structure of a neural network from the various target runtime environments.
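To make that abstraction concrete, here is a toy sketch — not nGraph's, XLA's or TensorRT's actual API, and all names here are illustrative — of the core idea: the model is captured as a framework-neutral computational graph, and separate backends can then lower that same graph to their own execution strategies, including target-specific optimizations such as operator fusion.

```python
# Toy illustration of cross-platform compilation: one neutral graph,
# two "backends" that execute it differently. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                          # operation name, e.g. "add", "relu"
    inputs: list = field(default_factory=list)

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def run_reference(node, env):
    """Naive backend: interpret each op one at a time."""
    if node.op == "input":
        return env[node.inputs[0]]
    args = [run_reference(i, env) for i in node.inputs]
    return {"add": add, "relu": relu}[node.op](*args)

def run_fused(node, env):
    """Optimizing backend: fuse relu(add(a, b)) into a single pass,
    the kind of rewrite a real compiler applies per target platform."""
    if node.op == "relu" and node.inputs[0].op == "add":
        a, b = (run_fused(i, env) for i in node.inputs[0].inputs)
        return [max(0.0, x + y) for x, y in zip(a, b)]
    return run_reference(node, env)  # fall back to the reference path

# The same graph, untouched, runs on both backends with identical results.
x = Node("input", ["x"])
y = Node("input", ["y"])
graph = Node("relu", [Node("add", [x, y])])
env = {"x": [1.0, -2.0, 3.0], "y": [0.5, 0.5, -5.0]}
print(run_reference(graph, env))  # [1.5, 0.0, 0.0]
print(run_fused(graph, env))      # [1.5, 0.0, 0.0]
```

The point of the separation is that the graph never mentions a device: each backend decides for itself how to lower the same ops, which is what lets one model target GPUs, CPUs or FPGAs without being rewritten.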

Some observers have interpreted Intel’s formal release of nGraph as a defensive maneuver to encourage developers to compile their AI for execution on its CPUs, rather than the GPUs of its powerhouse competitor Nvidia. Intel has fueled this perception by reporting significant performance improvements on neural network models compiled with the latest version of nGraph for Intel-based hardware, including its Nervana AI chips and Movidius systems on a chip for edge computing. Clearly, it has every interest in discouraging developers from writing their AI directly to Nvidia’s APIs.

Nvidia gained a tactical advantage in the compilation wars with Google’s recent announcement of TensorFlow integration with Nvidia’s TensorRT library. This enables compilation of neural nets built in the most popular front-end modeling tool for optimized inferencing on Nvidia GPUs. It also creates a runtime for efficient deployment of TensorFlow models on GPUs in production environments. Separately, Google also handed Intel a minor victory, announcing that the two vendors have delivered TensorFlow integration with a faster, more efficient open-source deep learning library from Intel.

However, these sorts of defensive strategies will diminish over time as AI developers demand the ability to abstract their models into discrete subgraphs and then auto-compile them evenhandedly for distributed execution in hybrid AI hardware environments consisting of GPUs, CPUs, FPGAs and other chipsets. Cross-platform AI compilers will become standard components of every AI development environment, enabling developers to access every deep learning framework and target platform without having to know the technical particulars of each environment. This trend is consistent with the discussion of AI developer accessibility by Ritika Gunnar, IBM Corp.’s vice president of products for Watson AI, on SiliconANGLE Media’s video studio theCUBE at the recent Think 2018 conference.
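The subgraph idea above can be sketched in a few lines. This is a hypothetical illustration, not any vendor's actual placement policy: each op in a model is assigned to the device class assumed to suit it best, and consecutive ops placed on the same device are grouped into one subgraph for compilation.

```python
# Hypothetical sketch of subgraph partitioning for hybrid hardware.
# The model, the device names and the placement rules are all illustrative.

# A linear model expressed as a sequence of op names.
MODEL = ["conv", "conv", "relu", "branch", "matmul", "softmax"]

# Assumed placement policy: dense math goes to "gpu", control flow to "cpu".
PLACEMENT = {"conv": "gpu", "relu": "gpu", "matmul": "gpu",
             "softmax": "gpu", "branch": "cpu"}

def partition(ops):
    """Group consecutive ops on the same device into compilable subgraphs."""
    subgraphs = []
    for op in ops:
        device = PLACEMENT[op]
        if subgraphs and subgraphs[-1][0] == device:
            subgraphs[-1][1].append(op)   # extend the current subgraph
        else:
            subgraphs.append((device, [op]))  # start a new subgraph
    return subgraphs

print(partition(MODEL))
# → [('gpu', ['conv', 'conv', 'relu']), ('cpu', ['branch']),
#    ('gpu', ['matmul', 'softmax'])]
```

A real compiler would do this over an arbitrary graph rather than a linear sequence, and would weigh data-transfer costs between devices before splitting, but the output shape is the same: per-device subgraphs that each backend can then optimize independently.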

It’s likely that, within the next two to three years, the AI industry will converge around one open-source cross-compilation framework supported by all front-end and back-end environments. Other industry developments in recent weeks call attention to the opening of the AI cross-compilation ecosystem.

These include:

  • Microsoft’s open-sourcing of a GitHub repo to foster cross-framework benchmarking of GPU-optimized deep learning models.
  • ARM’s partnership with Nvidia to integrate the open-source Nvidia Deep Learning Accelerator architecture into its just-announced Project Trillium platform, designed to enable cross-framework deep learning model compilation for optimized execution in mobile, “internet of things” and other mass-market edge devices.
  • IBM’s launch of the new open-source Fabric for Deep Learning framework, which supports optimized deployment of deep-learning microservices built in TensorFlow, PyTorch, Caffe2 and other frameworks to diverse compute nodes on heterogeneous clusters of GPUs and CPUs via stateless RESTful interfaces over Kubernetes.
  • The Linux Foundation’s launch of the Acumos AI Project, which defines APIs, an open-source framework, and an AI model catalog for framework-agnostic AI app development, chaining and deployment over Kubernetes.

Before long, it will become quaint to claim, as IBM does in this recent article, that a particular deep learning library is faster than a specific framework running on a particular hardware architecture in a particular cloud. Regardless of whether these or any other performance claims have merit, developers will increasingly have the ability to convert an underperforming model automatically for optimized execution on the AI cloud of their choice.
