The first SoC to include a Neural Engine was the A11 Bionic, used in the iPhone 8, 8 Plus, and iPhone X, introduced in 2017. Since then, every Apple A-series SoC has included a Neural Engine.
The Neural Engine is its own block on the SoC, and it is not what runs local LLMs on Macs. It is optimized for power efficiency while running small models, not for large language models.
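For context, the Neural Engine isn't programmed directly; apps reach it through Core ML and can only express a preference for where a model runs. A minimal sketch, assuming a hypothetical compiled model file named `SmallClassifier.mlmodelc`, of how an app would ask Core ML to prefer the Neural Engine over the GPU:

```swift
import CoreML
import Foundation

// Hypothetical path to a small compiled Core ML model.
let modelURL = URL(fileURLWithPath: "SmallClassifier.mlmodelc")

let config = MLModelConfiguration()
// Prefer the Neural Engine (with CPU fallback) and skip the GPU.
// Core ML still decides layer by layer where things actually run;
// large transformer workloads tend to land on the GPU instead,
// which is why local LLM runtimes don't target the ANE.
config.computeUnits = .cpuAndNeuralEngine

do {
    let model = try MLModel(contentsOf: modelURL, configuration: config)
    print("Loaded model, preferred compute units:", config.computeUnits.rawValue)
    _ = model
} catch {
    print("Failed to load model:", error)
}
```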
This change strictly adds matmul acceleration to each GPU core, which is the hardware actually used for LLMs.
The NPU is still there; this adds matmul acceleration directly into each GPU core. It takes roughly 10% more transistors to add these accelerators to the GPU, so it's a significant investment for Apple.
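To make the GPU side concrete, here is a rough sketch of the kind of work those per-core matmul units speed up: a plain float32 GEMM dispatched to the GPU through MetalPerformanceShaders. The matrix size and the use of MPSMatrixMultiplication are illustrative assumptions; LLM runtimes ship their own Metal kernels, but the underlying operation is the same large matrix multiply.

```swift
import Metal
import MetalPerformanceShaders

// Multiply two n x n float32 matrices on the GPU: C = A * B.
let n = 1024

guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else {
    fatalError("No Metal device available")
}

let rowBytes = n * MemoryLayout<Float>.stride
let descriptor = MPSMatrixDescriptor.matrixDescriptor(rows: n,
                                                      columns: n,
                                                      rowBytes: rowBytes,
                                                      dataType: .float32)

// Allocate GPU-visible storage for A, B, and the result C.
func makeMatrix(on device: MTLDevice) -> MPSMatrix {
    let buffer = device.makeBuffer(length: n * rowBytes,
                                   options: .storageModeShared)!
    return MPSMatrix(buffer: buffer, descriptor: descriptor)
}

let a = makeMatrix(on: device)
let b = makeMatrix(on: device)
let c = makeMatrix(on: device)

// C = 1.0 * A * B + 0.0 * C
let matmul = MPSMatrixMultiplication(device: device,
                                     transposeLeft: false,
                                     transposeRight: false,
                                     resultRows: n,
                                     resultColumns: n,
                                     interiorColumns: n,
                                     alpha: 1.0,
                                     beta: 0.0)

let commandBuffer = queue.makeCommandBuffer()!
matmul.encode(commandBuffer: commandBuffer,
              leftMatrix: a, rightMatrix: b, resultMatrix: c)
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

print("GEMM took \(commandBuffer.gpuEndTime - commandBuffer.gpuStartTime) s")
```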