I have the opposite: my most hated illustration. It's the standard diagram of ho... | Hacker News


eterevsky on Jan 12, 2023 | on: Ask HN: What's your favorite illustration in compu...

I have the opposite: my most hated illustration. It's the standard diagram of how a Transformer language model works (https://www.researchgate.net/figure/Transformer-Language-Mod...). When I was trying to figure out transformers, I saw it in every single paper, and it helped hardly at all. I think I finally got a good understanding only after I looked at a few implementations.

YetAnotherNick on Jan 12, 2023

It is also wrong. The paper shows add-and-norm (post-norm), but the official implementation and all the other good implementations use a pre-norm architecture: https://twitter.com/francoisfleuret/status/14671353665032192...
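
For anyone who, like the parent, finds code easier to read than the diagram, here is a rough sketch of the two orderings being contrasted. It is only an illustration, not taken from any real implementation: `toy_sublayer` is a made-up stand-in for attention or the feed-forward network, and `layer_norm` leaves out the learned scale and shift.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (nearly) unit variance.
    Simplification: no learned scale/shift parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def toy_sublayer(x):
    """Hypothetical stand-in for attention or the feed-forward network."""
    return [2.0 * v + 1.0 for v in x]

def post_norm_block(x, sublayer):
    """Post-norm ("add & norm", as drawn in the diagram):
    LayerNorm(x + sublayer(x))."""
    added = [a + b for a, b in zip(x, sublayer(x))]
    return layer_norm(added)

def pre_norm_block(x, sublayer):
    """Pre-norm: x + sublayer(LayerNorm(x)).
    The residual path itself is never normalized."""
    branch = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, branch)]

x = [1.0, 2.0, 4.0, 8.0]
post = post_norm_block(x, toy_sublayer)
pre = pre_norm_block(x, toy_sublayer)
```

The only difference is where the normalization sits relative to the residual addition: in post-norm the output of every block is normalized, while in pre-norm the input `x` passes through the skip connection untouched and only the branch input is normalized.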
