I asked this in the other thread (no response, but I was a bit late)
How does anyone using AI like this have confidence that they aren't unintentionally plagiarizing code and violating the terms of whatever license it was released under?
For random personal projects I don't see it mattering that much. But if a large corp is releasing code like this, one would hope they've done some due diligence that they haven't simply stolen the code from some similar repo on GitHub, laundered through an LLM.
The only relevant section of the readme doesn't mention checking similar projects or libraries for common code:
> Every line was thoroughly reviewed and cross-referenced with relevant RFCs, by security experts with previous experience with those RFCs.
> How does anyone using AI like this have confidence that they aren't unintentionally plagiarizing code and violating the terms of whatever license it was released under?
Most of the code generated by LLMs, and especially the code you actually keep from an agent, is mid, replacement-level, boring stuff. If you're not already building projects with LLMs, I think you need to start doing that first before you develop a strong take on this. From what I see in my own work, the code being generated is highly unlikely to be distinguishable. There is more of me and my prompts and decisions in the LLM code than there can possibly be defensible IPR from anybody else, unless the very notion of, like, wrapping a SQLite INSERT statement in Golang is defensible.
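To make "wrapping a SQLite INSERT statement in Golang" concrete, here's a minimal sketch of the kind of commodity code being described (the names, schema, and driver choice are invented for illustration, not taken from any particular project). Code at this level is written essentially the same way by everyone, which is the point about distinguishability:

```go
// Hypothetical illustration: a thin Go wrapper around a SQLite INSERT
// via database/sql. Near-identical versions of this exist everywhere.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3" // driver registration only
)

// insertUser writes one row and returns the generated row ID.
func insertUser(db *sql.DB, name, email string) (int64, error) {
	res, err := db.Exec(
		"INSERT INTO users (name, email) VALUES (?, ?)",
		name, email,
	)
	if err != nil {
		return 0, fmt.Errorf("insert user: %w", err)
	}
	return res.LastInsertId()
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)`); err != nil {
		log.Fatal(err)
	}

	id, err := insertUser(db, "alice", "alice@example.com")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("inserted row", id)
}
```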
The best way I can explain the experience of working with an LLM agent right now is that it is like if every API in the world had a magic "examples" generator that always included whatever it was you were trying to do (so long as what you were trying to do was within the obvious remit of the library).
Safety in the shadow of giant tech companies. People were upset when Microsoft released Copilot trained on GitHub data, but nobody who cared could do anything about it, and nobody who could have done something about it cared, so it just became the new norm.
All of the big LLM vendors have a "copyright shield" indemnity clause for their paying customers - a guarantee that if you get sued over IP for output from their models their legal team will step in to fight on your behalf.
This is an excellent question that the AI-boosters always seem to dance around. Three replies are already saying “Nobody cares.” Until they do. I’d be willing to bet that some time in the near future, some big company is going to care a lot and that there will be a landmark lawsuit that significantly changes the LLM landscape. Regulation or a judge is going to eventually decide the extent to which someone can use AI to copy someone else’s IP, and it’s not going to be pretty.
It just presumes a level of fixation in copyright law that I don’t think is realistic. There was a landmark lawsuit, MAI v. Peak Computer (1993), in which the court held that repairing a computer without the permission of the operating system’s author was copyright infringement, and it didn’t change the landscape at all, because everyone immediately realized it’s not practical for things to work that way (Congress eventually overrode the ruling with a repair exception in the DMCA). There’s no realistic world where AI tools end up being extremely useful but nobody uses them because of a court ruling.
I'm fairly confident that it's not just plagiarizing because I asked the LLM to implement a novel interface with unusual semantics. I then prompted for many specific fine-grained changes to implement features the way I wanted. It seems entirely implausible to me that there could exist prior art that happened to be structured exactly the way I requested.
Note that I came into this project believing that LLMs were plagiarism engines -- I was looking for that! I ended up concluding that this view was not consistent with the output I was actually seeing.
The consensus, right or wrong, is that LLM-produced code (unless repeated verbatim) is equivalent to you or me legitimately stating our novel understanding of mixed sources, some of which may be copyrighted.