
> the people whose work has been stolen to create it

"Stolen" is kind of a loaded word. It implies the content was for sale and was taken without payment. I don't think anyone would accuse a person of stealing if they purchased GRRM's books, studied the prose and then used the knowledge they gained from studying to write a fanfic in the style of GRRM (or better yet, the final 2 books). What was stolen? "the prose style"? Seems too abstract. (yes, I know the counter argument is "but LLMs can do more quickly and at a much greater scale", and so forth)

I generally want less copyright, not more. I'm imagining a dystopian future where every article on the internet has an implicit huge legal contract you enter into like "you are allowed to read this article with your eyeballs only, possibly you are also allowed to copy/paste snippets with attribution, and I suppose you are allowed to parody it, but you aren't allowed to parody it with certain kinds of computer assistance such as feeding text into an LLM and asking it to mimic my style, and..."



AI has been trained on pirated material, and that is very different from someone buying books, reading them, and learning from them. Right now it's still up to the courts what counts as infringing, but at this point even Disney is accusing AI of violating their copyrights: https://www.nytimes.com/2025/06/11/business/media/disney-uni...

AI models output copyrighted material: https://www.nytimes.com/interactive/2024/01/25/business/ai-i... and they can even be ranked by the extent to which they do it: https://aibusiness.com/responsible-ai/openai-s-gpt-4-is-the-...

AI is getting better at data laundering and hiding evidence of infringement, but ultimately it's collecting and regurgitating copyrighted content.


> at this point even Disney is accusing AI of violating their copyrights

"even" is odd there, of course Disney is accusing them of violating copyright, that's what Disney does.

> AI is getting better at data laundering and hiding evidence of infringement, but ultimately it's collecting and regurgitating copyrighted content.

That's not the standard for copyright infringement; AI is a transformative use.

Similarly, if you read a book and learn English or facts about the world by doing that, the author of the book doesn't own what you just learned.


Facts aren't copyrightable. Expression is. LLMs reproduce expression from the works they were trained on. The way they are being trained involves making an unlicensed reproduction of works. Both of those are pretty straightforwardly infringement of an exclusive right.

Establishing an affirmative defense that it's transformative fair use would hopefully be an uphill battle, given that it's commercial, using the whole work, and has a detrimental effect on the market for the work.


> AI is a transformative use.

Reproducing a movie still well enough that I honestly wouldn't know which one is the original is transformative?


The still is not transformative but the model reproducing it is obviously transformative. Other general purpose tools can be used to infringe and yet are non-infringing as well.


If I watch a movie, then draw a near-perfect likeness of the main character from my very good memory, put it on a t-shirt, and sell the t-shirt, that is grounds for a copyright infringement claim if the source isn't yet in the public domain (not guaranteed to succeed, but open to a lawsuit).

If I download all the content from a website whose use policy states that all content is owned by that website and can't be resold, then allow my users to query this downloaded data and receive a detailed summary of all related content, and sell that product, perhaps that is a violation of the use policy.

None of this has been properly tested in the courts yet. Large payments have already been made to Reddit to avoid it, likely because Reddit has the means to fight this in court. My little blog, though, is fair game, because I can't afford to engage.


For sure, it's rich people playing "rules for thee but not for me". What's interesting is that we'll discover which side of the can-afford-to-enforce-its-copyright boundary the likes of the NYTimes fall on.


That’s not “data laundering and hiding evidence of infringement” though.

You’re talking about overt infringement, the GP was talking about covert infringement. It’s difficult to see how something could be covert yet not transformative.


Stolen doesn't imply anything is for sale, does it? Most things that are stolen are not for sale.


I think there is a case to be made that AI companies are taking the content, providing people with a modified version of that content, and not necessarily providing references to the original material.

Much of the content that people create is created to generate revenue. They are denied that revenue when people don't go to their site. One might interpret that as theft. In the case of GRRM's books, I would assume they were purchased and the author received the revenue from the sale.


I think you are missing some context. They were using Anna's Archive! They paid for nothing, downloaded the material in violation of copyright, and processed it. They violated US copyright law even before they actually ingested it!


Yes, there are ethical differences between an individual doing things by hand and a corporation funded by billions of investor dollars doing an automated version of that thing at many orders of magnitude greater scale.

Also, LLMs don’t just imitate style, they can be made to reproduce certain content near-verbatim in a way that would be a copyright violation if done by a human being.

You can excuse it away if you want with reductio ad absurdum arguments, but the impact is distinctly different and calls for different parameters.


> It implies the content was for sale and was taken without payment

that's literally what happened in innumerable individual cases, though.



