
This is less a cycle and more a series of common-sense changes.

AMD used to have 5-wide VLIW. Theoretical performance was massive, but NOTHING could take advantage of 5-wide parallelism consistently. When they switched to VLIW-4, essentially zero performance per CU was lost. GCN went the complete opposite direction and got rid of all the VLIW in exchange for tons of flexibility.

It turns out that (as we've known for a very long time) in-order parallelism has diminishing returns. If you look at in-order CPUs, for example, two generic execution ports can be used something like 80% of the time. Moving to three in-order ports drops the usage of the third to something like 15-30%, and a fourth sees single-digit usage.
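To make the diminishing-returns point concrete, here is a toy simulation of in-order multi-issue. It is only a sketch under made-up assumptions: unit latency, a random dependency on one of the previous few instructions, and hypothetical parameters `dep_prob` and `window` that are not taken from any real CPU. It will not reproduce the percentages above, but it shows the same shape: each extra port gets used less than the one before it.

```python
import random

def port_utilization(width, n_instr=100_000, dep_prob=0.5, window=4, seed=0):
    """Fraction of cycles in which each issue port is used (toy in-order model)."""
    rng = random.Random(seed)
    # deps[i] is the index of the instruction that i depends on, or None.
    deps = []
    for i in range(n_instr):
        if i > 0 and rng.random() < dep_prob:
            deps.append(i - rng.randint(1, min(window, i)))
        else:
            deps.append(None)

    used = [0] * width
    cycles = 0
    i = 0
    while i < n_instr:
        start = i  # first instruction issued this cycle
        slot = 0
        while slot < width and i < n_instr:
            d = deps[i]
            if d is not None and d >= start:
                break  # producer issues this same cycle: in-order issue stops here
            used[slot] += 1
            slot += 1
            i += 1
        cycles += 1
    return [u / cycles for u in used]

if __name__ == "__main__":
    for port, util in enumerate(port_utilization(4)):
        print(f"port {port}: {util:.1%}")
```

Because ports are filled in order, port k can only be busy in a cycle where port k-1 is busy too, so utilization can never go up as you add ports. How fast it falls depends entirely on the assumed dependency pattern.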

Adding a second port should guarantee lots of use, but only if the second unit is kept flexible. RDNA3 can only use the second port for a handful of operations, which means that it can't be used anywhere near that 80% metric for most applications. Future versions of RDNA and CDNA with more flexible second ports should start to be able to leverage those extra execution units (if they decide it's worth the transistors).
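The effect of a restricted second port can be sketched the same way. This is a rough dual-issue model, not RDNA3's actual issue logic: `eligible_frac` is a hypothetical knob for the share of instructions the second port is able to execute, and `dep_prob` is an assumed chance that an instruction depends on the one right before it.

```python
import random

def second_port_utilization(eligible_frac, n_instr=50_000, dep_prob=0.3, seed=0):
    """Fraction of cycles in which the second port issues (toy dual-issue model)."""
    rng = random.Random(seed)
    # Draw all randomness up front so different eligible_frac values
    # are compared on the same instruction stream.
    instrs = [(rng.random() < dep_prob, rng.random()) for _ in range(n_instr)]

    cycles = 0
    second_used = 0
    i = 0
    while i < n_instr:
        i += 1  # the first port always takes the next instruction
        if i < n_instr:
            depends_on_prev, roll = instrs[i]
            # The second port issues only if the next instruction is independent
            # of the one just issued AND is an operation the port supports.
            if not depends_on_prev and roll < eligible_frac:
                second_used += 1
                i += 1
        cycles += 1
    return second_used / cycles

if __name__ == "__main__":
    for frac in (0.1, 0.25, 0.5, 1.0):
        print(f"eligible {frac:.0%}: second port busy {second_port_utilization(frac):.1%}")
```

In this model the second port's usage is capped by whichever is scarcer, independent instructions or supported operations, so a narrow operation list wastes most of the port no matter how parallel the code is.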


