Hacker News

> ensemble of transformer models

Isn't that just dropout?




No. Why do you think so?


Geoffrey Hinton describes dropout that way. It's like you're training a different net each time the dropout mask changes.
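A minimal sketch of that view, assuming a single linear unit so the comparison is exact: every dropout mask picks one "thinned" subnetwork, and averaging all of them (weighted by how likely each mask is) gives the same output as one pass with the weights scaled by the keep probability. The weights, inputs, keep probability, and function names are made up for illustration, and the equality is only approximate once nonlinearities are involved.

```python
from itertools import product


def subnet_output(weights, inputs, mask):
    """Output of one thinned subnetwork: inputs with mask 0 are dropped."""
    return sum(m * w * x for m, w, x in zip(mask, weights, inputs))


def average_over_all_masks(weights, inputs, keep_prob):
    """Probability-weighted average over every possible dropout mask."""
    total = 0.0
    for mask in product([0, 1], repeat=len(weights)):
        kept = sum(mask)
        # Each input is kept independently with probability keep_prob.
        prob = keep_prob ** kept * (1 - keep_prob) ** (len(mask) - kept)
        total += prob * subnet_output(weights, inputs, mask)
    return total


def weight_scaled_output(weights, inputs, keep_prob):
    """Single pass with no dropout and weights scaled by keep_prob."""
    return keep_prob * sum(w * x for w, x in zip(weights, inputs))


# Hypothetical values, chosen only for the demonstration.
weights = [0.5, -1.0, 2.0, 0.25]
inputs = [1.0, 2.0, -1.0, 4.0]
keep_prob = 0.8

ensemble_avg = average_over_all_masks(weights, inputs, keep_prob)
single_pass = weight_scaled_output(weights, inputs, keep_prob)
print(ensemble_avg, single_pass)
```

Note that all 16 subnetworks here share the same four weights, which is the main way this differs from a conventional ensemble of separately trained models.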


Dropout is different from an ensemble. It is a regularization method.

It might look like an ensemble because each training step samples a different subnetwork, but an ensemble combines independently trained models, whereas dropout's subnetworks all share one set of weights.


That said, random forests are an internal ensemble, so I guess that could work.

In my mind an ensemble is like a committee. For it to be effective, each member should be independent (able to pick up different signals) and have a greater than random chance of being correct.
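A rough sketch of the committee intuition, assuming an idealized case: an odd number of members who vote independently on a binary question, each correct with the same probability. The function name and the numbers are hypothetical. Real models are never fully independent, so this is an upper bound on the benefit rather than a prediction.

```python
from math import comb


def majority_vote_accuracy(n_members, p_correct):
    """Probability that a strict majority of independent voters is correct."""
    if n_members % 2 == 0:
        raise ValueError("use an odd number of members to avoid ties")
    needed = n_members // 2 + 1
    # Binomial tail: at least `needed` of the members vote correctly.
    return sum(
        comb(n_members, k) * p_correct ** k * (1 - p_correct) ** (n_members - k)
        for k in range(needed, n_members + 1)
    )


# Better-than-chance members improve with committee size,
# coin-flip members stay at chance, worse-than-chance members get worse.
for p in (0.6, 0.5, 0.4):
    print(p, [round(majority_vote_accuracy(n, p), 3) for n in (1, 5, 11)])
```

If the members were perfectly correlated instead, they would all cast the same vote and the committee would be no more accurate than a single member.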


I am aware it is not literally an ensemble model, but Geoffrey Hinton says it achieves the same thing conceptually and practically.



