Just for You

Arm Announces Ethos-N78 NPU: Bigger And More Efficient

Yesterday Arm released the new Cortex-A78, Cortex-X1 CPUs and the new Mali-G78 GPU. Alongside the new “key” IPs from the company, we also saw the reveal of the newest Ethos-N78 NPU, announcing Arm’s new second-generation design.

Over the last few years we’ve seen a literal explosion of machine learning accelerators in the industry, with a literal wild west of different IP solutions out there. On the mobile front particularly there’s been a huge amount of different custom solutions developed in-house by SoC vendors, this includes designs such as from Qualcomm, HiSilicon, MediaTek and Samsung LSI. For vendors who do not have the design ability to deploy their own IP, there’s the possibility of licensing something from an IP vendor such as Arm.

Arm’s “Ethos” machine learning IP is aimed at client-side inferencing workloads, originally described as “Project Trillium” and the first implementation seeing life in the form of the Ethos-N77. It’s been a year since the release of the first generation, and Arm has been working hard on the next iteration of the architecture. Today, we’re covering the “Scylla” architecture that’s being used in the new Ethos-N78.

From a very high-level view, what the N78 promises is a quite large boost both in performance and efficiency. The new design scales up much higher than the biggest N77 configuration, now being able to offer 2x the peak performance at up to 10TOPs of raw computational throughput.

Arm has revamped the design of the NPU for better power efficiency, enabled through various new compression techniques as well as an improvement in external memory bandwidth per inference of up to 40%.

Strong points of the N78 are the IP’s ability to scale performance across different configuration options. The IP is available at 4 different performance points, or better said at four different distinct engine configurations, from the smallest config at 1TOPs, to 2, 5 and finally a maximum of 10TOPs. This corresponds to MAC configurations of 512, 1024, 2048 and 4096 units for the totality of the design.

The interesting aspect of scaling bigger…

Source

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *