Arm’s latest: A CPU design to better serve AI, ML

Arm Holdings has introduced Armv9, the first overhaul of its CPU architecture in a decade, with heavy emphasis on security and all things artificial intelligence (AI) and machine learning (ML).

Arm, for the unfamiliar, does not make CPUs the way Intel and AMD do. It creates base architectural designs that licensees modify with their own special technological sauce, and it offers variants for high-performance, mobile, embedded, and edge/cloud use.

As part of Arm’s Vision Day event earlier this week, the company announced the first details of the Armv9 architecture, with more to come later this year. The company has to tread cautiously as it is in the process of being acquired by Nvidia, and forces are lining up to oppose the deal.

From an architectural standpoint, v9 probably isn’t as big a jump as v8 was over v7, where the company introduced 64-bit instructions for the first time with the AArch64 instruction set, along with a redesigned execution mode.

Armv9 is built on Armv8 and is backwards compatible, so no software rewrites will be required. The major new technology introduced in Armv9 is the Confidential Compute Architecture and its concept of Realms.

Realms are containerized, isolated execution environments, completely hidden from the operating system and hypervisor. Realms can protect commercially sensitive data and code from the rest of the system while they are in use, at rest, and in transit.

Arm did not detail exactly how it would work. It could be reminiscent of IBM’s hardware partitioning for its z Series mainframes or something more like AMD’s Secure Encrypted Virtualization in its Epyc processors, which fully encrypts each guest under a hypervisor and walls it off from the others. We won’t know for a while.

SVE2 for AI

In addition to security, Armv9 focuses on AI, improved vector instructions, and new digital signal processing (DSP) capabilities. Arm introduced the Scalable Vector Extension, or SVE, back in 2016. The first licensee to actually use it was Fujitsu, which made it a key part of its A64FX CPU, the chip that powers the world’s fastest (for now) supercomputer, Fugaku in Japan.

The first version of SVE lacked certain SIMD instructions. It was great for high-performance computing (HPC), but non-HPC workloads didn’t see the benefits. Arm announced SVE2 in April 2019 with the aim of bringing more complementary scalable SIMD instructions to DSP and ML workloads.

SVE2 enhances the processing ability of 5G systems, virtual and augmented reality, and ML workloads running locally on CPUs, such as image processing and smart-home applications. Over the next few years, Arm says it will further extend the AI capabilities of its technology across its CPUs, Mali GPUs, and Ethos NPUs.

For overall performance, Arm believes its new architecture will allow chip manufacturers to gain more than 30% in compute power over the next two chip generations, not just in mobile CPUs but also in server processors such as AWS’s Graviton and Ampere’s Altra.

