As other posts have said, whether training is fair use is a matter of law, not opinion. You can't appeal to your own authority to make grand statements like this.
Frankly, I don't care that ML/AI _needs_ this to work. That's not my problem. You don't get to circumvent existing agreements (and law) because you believe that ML learning is the same as a human reading a piece of code and then typing it up on the side. Tesla manages just fine by generating their own training data. Other businesses have found partners to acquire data from. The only reason this isn't being immediately addressed is because there is near-zero accountability for license violations in software companies, and ML further obfuscates that.
A human who watches a Hollywood movie, recreates it frame-by-frame (except, say, everyone has cat ears), and declares "nah, this is all my original creation" is an idiot. A human who watches a Hollywood movie and then creates new works within the genre, paying some homage (say, a specific hat, a specific framing of a pivotal scene, or a specific lighting choice) to the original that inspired them, is learning.
I think the law should address multiple things, only one of which is the output of learning. For example, if the copycat human and the original director human both watched the movie by stealing it, that's also bad, especially because the copycat then goes on to make copies of a work they never had legal permission to copy! Copilot effectively cannot tell the difference between an homage and theft.
That's absurd; it's like saying you want an army of robot slaves.
Now, wanting to minimize the impact of robotic competition on human wellbeing is understandable. But the means to that end is declining to recognize the property rights of those who try to privatize the commons.