The vast majority of the data sets are very small (on the order of 100 to 1000 e...

argonaut on April 8, 2016, on: 20 lines of code that beat A/B testing (2012)

The vast majority of the data sets are very small (on the order of 100 to 1000 examples!). In fact, in the paper they discarded some of the larger UCI data sets. So it's not surprising that they found random forests perform better, even though the conventional wisdom is that boosting outperforms them: it's much harder to overfit with forests, and that matters most on data sets this small.