
How do the architectural bottlenecks of modified von Neumann architectures and their debuggable instruction pipelines limit computational performance when scaling to larger amounts of off-chip RAM?

Tomasulo's algorithm also centralizes result broadcasts on a common data bus (the CDB, which sits inside the CPU rather than between CPU and RAM); like the CPU-RAM bus, it is a serialization point that must scale with what it serves.
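
A toy sketch of that serialization (the cycles_to_drain helper and all numbers are mine, purely illustrative): with a 1-wide CDB, completion throughput caps at one result per cycle no matter how many functional units finish simultaneously.

    # Toy model of Tomasulo's common data bus (CDB) as a serialization point:
    # k functional units may finish in the same cycle, but only cdb_width
    # results can broadcast per cycle. Helper and numbers are hypothetical.

    def cycles_to_drain(results_ready: int, cdb_width: int = 1) -> int:
        return -(-results_ready // cdb_width)  # ceiling division

    print(cycles_to_drain(8))               # 8 cycles: capped at 1 result/cycle
    print(cycles_to_drain(8, cdb_width=2))  # 4 cycles: widening the bus helps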

Can in-RAM computation handle error correction without falling back on redundant computation and consensus algorithms?

Can on-chip SRAM be built at lower cost?

Von Neumann architecture: https://en.wikipedia.org/wiki/Von_Neumann_architecture#Von_N... :

> The term "von Neumann architecture" has evolved to refer to any stored-program computer in which an instruction fetch and a data operation cannot occur at the same time (since they share a common bus). This is referred to as the von Neumann bottleneck, which often limits the performance of the corresponding system. [4]

> The von Neumann architecture is simpler than the Harvard architecture (which has one dedicated set of address and data buses for reading and writing to memory and another set of address and data buses to fetch instructions).
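
A back-of-envelope model of the two designs just quoted, assuming (hypothetically) that every transfer costs one bus cycle:

    # Von Neumann: a single shared bus carries both instruction fetches and
    # data accesses, so they serialize. Harvard: split buses overlap them.
    # All cycle counts here are hypothetical round numbers.

    def shared_bus_cycles(fetches: int, data_ops: int) -> int:
        return fetches + data_ops          # von Neumann: transfers serialize

    def split_bus_cycles(fetches: int, data_ops: int) -> int:
        return max(fetches, data_ops)      # Harvard: buses run in parallel

    f, d = 1000, 800
    print(shared_bus_cycles(f, d))  # 1800 cycles over one shared bus
    print(split_bus_cycles(f, d))   # 1000 cycles over split buses (1.8x here)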

Modified Harvard architecture > Comparisons: https://en.wikipedia.org/wiki/Modified_Harvard_architecture

C-RAM: Computational RAM > DRAM-based PIM Taxonomy, See also: https://en.wikipedia.org/wiki/Computational_RAM

SRAM: Static random-access memory https://en.wikipedia.org/wiki/Static_random-access_memory :

> Typically, SRAM is used for the cache and internal registers of a CPU while DRAM is used for a computer's main memory.




For whatever reason, Hynix hasn't turned their PIM into a usable product. LPDDR-based PIM is insanely effective for inference. I can't stress this enough. An NPU + LPDDR6 PIM would kill GPUs for inference.
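
Rough intuition for why, assuming batch-1 token generation must stream every weight from memory once per token (the tokens_per_second helper and all figures below are hypothetical round numbers):

    # Why inference is memory-bandwidth-bound: at batch size 1, generating
    # each token streams every weight from memory once, so tokens/s is
    # roughly memory_bandwidth / model_size_in_bytes. PIM raises effective
    # bandwidth by doing the multiply-accumulates next to the DRAM banks.

    def tokens_per_second(bandwidth_gb_s: float, params_billions: float,
                          bytes_per_param: float = 1.0) -> float:
        model_bytes = params_billions * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / model_bytes

    # Hypothetical 8B-parameter model with 8-bit weights:
    print(tokens_per_second(100, 8))    # 12.5 tok/s at 100 GB/s (LPDDR-class)
    print(tokens_per_second(1000, 8))   # 125 tok/s at 1 TB/s effective (PIM-class)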


How many TOPS/W and TFLOPS/W? (tera [floating-point] operations per second per watt; since TOPS is already a rate, dividing by watts yields operations per joule, not per watt-hour)

/? TOPS/W and FLOPS/W: https://www.google.com/search?q=TOPS%2FW+and+FLOPS%2FW :

- "Why TOPS/W is a bad unit to benchmark next-gen AI chips" (2020) https://medium.com/@aron.kirschen/why-tops-w-is-a-bad-unit-t... :

> The simplest method therefore would be to use TOPS/W for digital approaches in future, but to use TOPS-B/W for analogue in-memory computing approaches!

> TOPS-8/W

> [ IEEE should spec this benchmark metric ]

- "A guide to AI TOPS and NPU performance metrics" (2024) https://www.qualcomm.com/news/onq/2024/04/a-guide-to-ai-tops... :

> TOPS = 2 × MAC unit count × Frequency / 1 trillion

- "Looking Beyond TOPS/W: How To Really Compare NPU Performance" (2023) https://semiengineering.com/looking-beyond-tops-w-how-to-rea... :

> TOPS = MACs * Frequency * 2

> [ { Frequency, NNs employed, Precision, Sparsity and Pruning, Process node, Memory and Power Consumption, utilization} for more representative variants of TOPS/W metric ]
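
Putting the quoted formula to work with a hypothetical MAC count, clock, and power budget; note that TOPS/W is just operations per joule:

    # Worked instance of the quoted formula: TOPS = 2 * MACs * frequency / 1e12
    # (each MAC counts as two ops: one multiply plus one add).
    # The MAC count, clock, and power budget below are hypothetical.

    def tops(mac_units: int, freq_hz: float) -> float:
        return 2 * mac_units * freq_hz / 1e12

    def tops_per_watt(mac_units: int, freq_hz: float, watts: float) -> float:
        # Dividing a rate (ops/s) by power (J/s) gives ops per joule.
        return tops(mac_units, freq_hz) / watts

    print(tops(16384, 1.5e9))                # ~49.2 TOPS: 16,384 MACs at 1.5 GHz
    print(tops_per_watt(16384, 1.5e9, 10))   # ~4.9 TOPS/W at a 10 W budget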


Is this fast enough for DDR or SRAM? "Breakthrough in avalanche-based amorphization reduces data storage energy 1e-9" (2024) https://news.ycombinator.com/item?id=42318944



