> 90%+ of the core logic area (stuff that is not i/o, power, memory, or clock di...

> 90%+ of the core logic area (stuff that is not i/o, power, memory, or clock distribution) on the GPU are very basic matrix multipliers. >All best possible arithmetic circuits, multipliers, dividers, etc. are public knowledge.

Combine these 2 statements and most GPUs would have roughly identical performance characteristics (performance/Watt, performance/mm2, etc)

And yet, you see that both AMD and Nvidia GPUs (but especially the latter) have seen massive changes in architecture and performance.

As for the 90% number itself: look at any modern GPU die shot and you'll see that 40% is dedicated just to moving data in and out of the chip. Memory controllers, L2 caches, raster functions, geometry handling, crossbars, ...

And within the remaining 60%, there are large amounts of caches, texture units, instruction decoders etc.

The pure math portions, the ALUs, are but a small part of the whole thing.

I don't know enough about the very low level details of CPUs and GPUs to judge which ones are more complex, but in claiming that there's no space for sophistication, I can at least confidently say that I know much more than you.