How can you get the definition of fairness so backwards? Giant corporations provide literally everything you take for granted and they should be punished because you are envious? I don't get it.
There is a reason everyone with over 130 IQ wants to work for them rather than starting their own companies.
They shouldn’t be punished because people are envious, they should be punished because they’re using other people's intellectual property without an agreement in place.
We can’t protect IP only when that benefits big corps. We should protect it always or accept that the world is better if we go in another direction, changing the rules for everybody.
Training on copyrighted data should be legally allowed
- of course exact reproduction of protected content is a no-no
- but learning is ok, as long as it is transformative. User prompts and responses are pushing the model outside its training distribution anyway
- users add their own intent, making usage transformative
- when LLMs synthesize from multiple sources, the result is transformative
- if you try to protect expression it is meaningless now, but if you protect abstract ideas it kneecaps creativity
- the problems of copyright started with the arrival of the internet, not with AI
- royalty revenues are almost zero today, as each new work competes against an unbounded list of other works that have been accumulating online for decades
- because royalties are shit, creatives now focus on ads, and this leads to enshittification, with attention-grabbing junk everywhere; attention is scarce, content is post-scarcity
- we actually like interactive participation more than passive consumption; we now edit Wikipedia, contribute to open source, have papers published for free on arXiv, use social networks where our comments are shared with the world, play games instead of reading books - it is another age, the interactive age
- AI is actually more than an infringement tool, it is useful for many legit purposes
- and AI is the worst possible infringement tool, it can hallucinate details and get things wrong; by comparison, copying is free, easy, and precise to the letter
So the idea that training is infringement is pretty abusive, it tries to make copyright be about abstractions, which is wrong. We can't return to the 1990s, so we have to live with copyright's demise. It's been dying for 3 decades already.
LLMs are allowed to "learn" from all this content because humans are allowed to. Most humans have to access the content legally to learn. But when it comes to training LLMs, it's basically "Copyright lol, yolo".
Is there a reason a human can't torrent movies and say "But I'm just learning from them"?
It's been reduced to zero for 3 decades. When you publish your work, there are a million other works competing for attention. That is the real issue. When you search for an image, you get thousands of images instantly, faster than diffusion models. Content doesn't matter anymore, attention matters, curation matters too.
Even if you forbid AI from training on copyrighted works, people are going to comment about them online, and the model will pick up the ideas. There is no way to protect ideas from spreading and reaching AIs.
How can you get the definition of fairness so backwards? The King provides literally everything you take for granted and he should be punished because you are envious? I don't get it.
There's a reason why every vassal with a sizeable estate wants to be in the King's court rather than starting their own country.