Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

You left this out

"The transformer processes the entire sequence all at once by using something called self attention"



This is the very next sentence, so it is a little odd that "hence the title" comes before, and not after, "...using something called self attention."

My take is these are nitpicks though. I can't count the number of podcasts I've listened to where the subject is my area of expertise and I find mistakes or misinterpretations at the margins, where basically 90% or more of the content is accurate.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: