My theory is that we're seeing the "exploration" part of exploration vs. exploitation in action [1]. I use YouTube almost exclusively for music videos and tech lectures. One day I watched a music video created by somebody who has uploaded a lot of Jordan Peterson fan content:
https://www.youtube.com/watch?v=GK72U02TUsI
I was getting JP and right-leaning politics recommended to me for a month afterward, even though I never use YouTube to watch JP or political content of any kind. But maybe I was getting that content recommended precisely because I had shown no prior interest in that conceptual space. The recommender could be exploring previously untouched territory in hopes of finding something with high exploitation value.
A value-blind system trying to maximize engagement is going to find that some niche content performs well on some users, and will try to explore that space for other users to see if it has high exploitation value for them too. So start out watching videos about health and it's natural -- from a data perspective -- to surface some conceptually adjacent videos that include anti-vaccination conspiracies. If you watch one or two of those -- even without saving or liking -- then the exploration may expand into conceptually adjacent videos about other Things "They" Don't Want You To Know, and so on.
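To make the exploration/exploitation idea concrete, here's a toy epsilon-greedy sketch. To be clear, this is a textbook bandit strategy, not YouTube's actual recommender (which I know nothing about internally), and the topic names and engagement rates are made up for illustration. With probability epsilon it recommends a random topic, otherwise it recommends whatever has the best estimated engagement so far.

```python
import random


def recommend(estimates, epsilon, rng):
    """Explore a random topic with probability epsilon, else exploit the best-known one."""
    if rng.random() < epsilon:
        return rng.choice(list(estimates))    # explore: any topic, even untouched ones
    return max(estimates, key=estimates.get)  # exploit: highest estimated engagement


def update(estimates, counts, topic, reward):
    """Update the running-average engagement estimate for one topic."""
    counts[topic] += 1
    estimates[topic] += (reward - estimates[topic]) / counts[topic]


def simulate(true_engagement, epsilon=0.1, steps=5000, seed=0):
    """Simulate one user; true_engagement holds hypothetical per-topic watch probabilities."""
    rng = random.Random(seed)
    estimates = {topic: 0.0 for topic in true_engagement}
    counts = {topic: 0 for topic in true_engagement}
    for _ in range(steps):
        topic = recommend(estimates, epsilon, rng)
        # Reward is 1 if the simulated user "watches" the recommendation, else 0.
        reward = 1.0 if rng.random() < true_engagement[topic] else 0.0
        update(estimates, counts, topic, reward)
    return estimates, counts


if __name__ == "__main__":
    # Made-up rates: a user who likes music and tech lectures, and rarely watches politics.
    rates = {"music": 0.6, "tech_lectures": 0.5, "politics": 0.05}
    for eps in (0.0, 0.1):
        _, counts = simulate(rates, epsilon=eps)
        print(f"epsilon={eps}: recommendations per topic = {counts}")
```

With epsilon set to 0 the simulated recommender never leaves the first topic it tries. With any nonzero epsilon, politics keeps getting surfaced now and then even though this user almost never watches it, because the system can't know that without trying.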
The YouTube recommender works pretty well for discovering new music, in my experience. The problem is that the process that gradually takes you from a Top 40 hit to a less popular musician to somebody who recorded their first song last month isn't always benign when "music" is replaced by "health" or "world news."
[1] https://towardsdatascience.com/reinforcement-learning-demyst...