New AI systems on a chip will spark an explosion of even smarter devices

By James Kobielus | May 03, 2018

Artificial intelligence is permeating everybody’s lives through the face recognition, voice recognition, image analysis and natural language processing capabilities built into their smartphones and consumer appliances. Over the next several years, most new consumer devices will run AI natively, locally and, to an increasing extent, autonomously.

But there’s a problem: Traditional processors in most mobile devices aren’t optimized for AI, which tends to consume a lot of processing power, memory, data and battery life on these resource-constrained devices. As a result, AI has tended to execute slowly on mobile and “internet of things” endpoints, while draining their batteries rapidly, consuming inordinate wireless bandwidth and exposing sensitive local information as data makes round trips to the cloud.

That’s why mass-market mobile and IoT edge devices are increasingly coming equipped with systems on a chip that are optimized for local AI processing. What distinguishes AI systems on a chip from traditional mobile processors is that they come with specialized neural-network processors, such as graphics processing units or GPUs, tensor processing units or TPUs, and field-programmable gate arrays or FPGAs. These AI-optimized chips offload neural-network processing from the device’s central processing unit, enabling more local autonomous AI processing and reducing the need to communicate with the cloud.

To some degree, the term “system on a chip” is a misnomer. Where AI is concerned, these products tend to integrate multiple specialized processor chips, such as CPUs, GPUs and digital signal processors, into a mobile or IoT endpoint form factor. The entire configuration is optimized either for very specific workloads, such as face recognition, voice recognition, natural language processing or augmented reality, or for a range of AI functions that can be executed directly on the mobile device. In addition, AI systems on a chip usually come with application programming interfaces, libraries and tools that enable developers either to develop AI from scratch for execution on the devices, or to import machine learning and deep learning models built in TensorFlow, Caffe2, PyTorch and other widely adopted frameworks.

Flood unleashed

In recent months, we’ve seen signs that a growing variety of sophisticated AI systems on a chip are entering the market. Already, one such offering is gaining traction in the market: Apple Inc.’s A11 Bionic SoC, which, among other functions, drives the Face ID authentication feature of the latest generation of iPhones. Likewise, Intel Corp. was one of the first movers in this market when, last year, it released its Movidius Myriad X system on a chip for AI-driven vision processing in smart cameras and other edge devices. Yet another system on a chip, Huawei’s Kirin 970, comes with its own embedded AI neural processing engine. Rumors abound that Google Inc. and Amazon.com Inc. are developing their own AI systems on a chip for their commercial devices.

Another sign that AI systems on a chip are about to flood the market was Nvidia Corp.’s recently announced partnership with ARM Holdings Ltd. to integrate Nvidia’s open-source Deep Learning Accelerator, or NVDLA, into the ARM Project Trillium machine learning platform. Project Trillium enables efficient AI inference, meaning the execution of trained machine learning models, on systems on a chip that incorporate CPUs, GPUs, digital signal processors and hardware accelerators, along with the second-generation ARM object detection processor and open-source ARM neural network software. ARM licenses chip designs to manufacturers, which means that the eventual NVDLA/Trillium system-on-a-chip design will be available for integration in diverse IoT edge devices for high-performance, low-power mobile AI inferencing.

And in yet another significant recent announcement, Qualcomm Inc. launched two new low-power AI systems on a chip: the QCS605 and QCS603. These are designed for mobile and IoT edge inferencing in computer vision applications, especially smart video and still cameras for security, sports, wearables, virtual reality and robotics. Their embedded AI capabilities help the cameras operate in very low-light conditions, ensure image stabilization when cameras are on moving or unsteady platforms, and provide guidance to platforms such as drones to avoid obstacles.

These Qualcomm systems on a chip pack a lot of mobile/IoT endpoint functionality and performance. Delving deeper into the underlying technology, each incorporates Qualcomm Adreno GPUs, multiple Qualcomm Kryo ARM CPU cores, Hexagon 685 Vector Processors, Qualcomm’s Snapdragon Neural Processing Engine, dual 14-bit Spectra 270 image signal processors that support dual 16-megapixel sensors, and onboard Wi-Fi. The eight-core 605 can handle simultaneous 4K (Ultra HD) and 1080p (Full HD) video feeds, each at up to 60 frames per second, as well as a larger number of simultaneous streams at lower resolutions, while the lower-power, smaller-footprint quad-core 603 tops out at simultaneous 4K and 720p streams, each at 30 frames per second. The SoCs can achieve up to 2.1 tera operations per second of compute performance for deep neural network inferences.
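To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It divides the quoted peak of 2.1 tera operations per second across the QCS605's top quoted video configuration. The pixel dimensions (3840×2160 for 4K Ultra HD, 1920×1080 for Full HD) are the standard definitions of those formats rather than numbers from Qualcomm, and the sketch assumes, optimistically, that the peak rate is sustained and that every frame of both streams is analyzed. The function names are illustrative only.

```python
# Back-of-envelope inference budget from the figures quoted above.
# Assumption: peak throughput is sustained and every frame is analyzed.
PEAK_OPS_PER_SECOND = 2.1e12  # "up to 2.1 tera operations per second"

# (width, height, frames per second) for the QCS605's quoted top configuration.
# Assumption: standard Ultra HD and Full HD pixel dimensions.
STREAMS_605 = [(3840, 2160, 60), (1920, 1080, 60)]


def pixels_per_second(streams):
    """Total pixels arriving per second across all video streams."""
    return sum(w * h * fps for w, h, fps in streams)


def ops_per_frame(streams, peak_ops=PEAK_OPS_PER_SECOND):
    """Peak operations available per frame if the budget is split evenly."""
    total_frames = sum(fps for _, _, fps in streams)
    return peak_ops / total_frames


def ops_per_pixel(streams, peak_ops=PEAK_OPS_PER_SECOND):
    """Peak operations available per incoming pixel."""
    return peak_ops / pixels_per_second(streams)


if __name__ == "__main__":
    print(f"pixels/s:   {pixels_per_second(STREAMS_605):,}")
    print(f"ops/frame:  {ops_per_frame(STREAMS_605):.3g}")
    print(f"ops/pixel:  {ops_per_pixel(STREAMS_605):.0f}")
```

Under these assumptions the budget works out to roughly 17.5 billion operations per frame, or a few thousand per pixel. Real workloads would likely see less, since peak figures are rarely sustained and the same silicon also handles image processing and encoding.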

A Qualcomm software development kit enables machine learning and deep learning models created in TensorFlow, Caffe and Caffe2, as well as those built with the Android Neural Networks API and Qualcomm’s own Hexagon Neural Network library, to be ported to the SoC’s AI engine. Typically, AI model training will still need to occur first in the cloud, since the SoCs have not been optimized to run training workloads locally.
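The split between training in the cloud and inferencing on the device can be illustrated with a deliberately tiny Python sketch. It does not use Qualcomm's SDK or any of the frameworks named above. Instead, it assumes a toy logistic-regression model, a JSON string standing in for the exported model file, and two hypothetical functions, `train_in_cloud` and `infer_on_device`, that mark where each step would run.

```python
import json
import math


def train_in_cloud(samples, labels, epochs=200, lr=0.5):
    """Fit a tiny logistic-regression model by gradient descent (the 'cloud' step)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y                     # gradient of the log loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    # The serialized parameters stand in for an exported model file.
    return json.dumps({"weights": w, "bias": b})


def infer_on_device(model_json, x):
    """Load frozen parameters and run the forward pass only (the 'edge' step)."""
    model = json.loads(model_json)
    score = sum(wi * xi for wi, xi in zip(model["weights"], x)) + model["bias"]
    return 1 if score > 0 else 0


# Toy data: the label is 1 when the first feature exceeds the second.
X = [[1.0, 0.0], [0.9, 0.2], [0.1, 0.8], [0.0, 1.0]]
Y = [1, 1, 0, 0]

artifact = train_in_cloud(X, Y)                      # expensive, iterative
predictions = [infer_on_device(artifact, x) for x in X]  # cheap, one pass each
```

The point of the sketch is the asymmetry: training loops over the data many times and updates parameters, while inference is a single multiply-and-add pass over fixed parameters, which is the kind of workload these SoCs are built to accelerate.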

We’re sure to see more AI systems on a chip come to market this year. Many of the AI chipset vendors that have emerged over the past several years are avidly partnering to build systems on a chip for mobile and IoT edge device applications of every variety.

Some leaders in the field, such as Nvidia Chief Executive Jensen Huang, refer to this as a “Cambrian explosion” of new AI-infused species of autonomous device.
