A few questions I have on this: Is it possible for an LLM of llama3/sonnet3.5/GP...

Workaccount2 · 2025-01-12T16:22:50 1736698970

It's not even clear that training on copyrighted data is a breach of IP law. People start from that assumption so they have an argument, but in reality that question isn't resolved yet, and frankly it looks like the courts will determine that it's not a breach of IP law to train on copyrighted data (but that it is a breach to output it).

jazzyjackson · 2025-01-12T21:55:03 1736718903

How did they get the copyrighted data? Oh, right, they downloaded it without permission.

jorams · 2025-01-12T18:01:10 1736704870

Note that training is not even relevant here. Downloading copyrighted content you don't have the right to download is illegal. Distributing content you don't have the right to distribute is illegal. Meta did both. They did so knowingly, very deliberately even. It is unambiguously copyright infringement, on a massive scale.