> That's one of the reasons OpenAI are signing big-dollar deals with media companies. The Atlantic, Vox, etc. have really high quality tokens!
They have really high quality tokens in their archive, at any rate. Since a bunch of media outlets have adopted GPT-powered writing tools, future tokens are presumably going to be far less valuable.
You are still looping your own model's output into the training set regardless. Human verification may keep outright errors from creeping in, but it won't stop the model from biasing its own training set.