
And brains do it on about 20 watts.


Training takes many years though.


Training took hundreds of thousands of years. Everyone just gets a foundation model to fine-tune for another couple of decades before it can start flipping burgers.


It's been something like half a billion years since the first brain.


That was a much smaller model, couldn't do much more than crawl around and run away.


It was still used as part of pretraining the current model.


Nonsense, the current model is a new architectural approach.

It was all explained in that recent paper, "Attention is all your meat"



