The problem of recommending something (especially something genuinely new) is actually quite complex, which you realize by studying inferential/causal statistics.
A statistician would say that you are making "heroic assumptions" about both the tastes of viewers and the dimensions of the product, especially given that the data you have on those products is actually quite sparse. Heck, there's even an interesting literature in empirical economics (called empirical industrial organization) that tries to estimate prices/demand/costs (which is essentially the same issue as Netflix trying to estimate what you like) and deals with the inherent curse of dimensionality. This becomes really obvious if you actually write down explicitly what you are doing.
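To make the "write it down explicitly" point concrete, here is a rough back-of-the-envelope sketch. It assumes a plain low-rank matrix factorization as the recommender (my stand-in, not a claim about what Netflix or Amazon actually run), and every number in it is made up for illustration. The function name `factorization_budget` is hypothetical too.

```python
def factorization_budget(n_users, n_items, n_ratings, rank):
    """Compare the observed ratings with what a rank-`rank` factorization must estimate.

    Assumes one latent vector of length `rank` per user and per item.
    """
    cells = n_users * n_items            # size of the full user-item matrix
    params = rank * (n_users + n_items)  # free parameters in the two factor matrices
    return {
        "density": n_ratings / cells,               # share of cells actually observed
        "params": params,
        "ratings_per_param": n_ratings / params,    # observations per estimated parameter
    }


# Hypothetical catalogue: 1,000 users, 500 items, 10,000 ratings, 20 latent taste dimensions.
budget = factorization_budget(n_users=1000, n_items=500, n_ratings=10000, rank=20)
print(budget)
```

With these invented numbers only 2% of the matrix is observed and there are three times as many parameters as ratings, so whatever the model outputs is driven mostly by the assumptions (the rank, the regularization) and not by the data.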
Of course, ML engineers are engineers and not really interested in causal inference, so Netflix wants me to watch 12 Hitler documentaries in a row and Amazon recommends that I buy a second fridge right after the first.
Where _does_ collaborative filtering actually work?