Hacker News | new | past | comments | ask | show | jobs | submit | login

You might be able to get your clients to sign something to allow usage, but if you don't, as you say, it doesn't seem wise to vibe code for them. For two reasons:

1. A typical contract transfers the rights to the work. The ownership of AI-generated code is legally a wee bit disputed. If you modify and refactor generated code heavily, it's probably fine, but if you just accept AI-generated code en masse, letting your client think that you wrote it and that it is therefore their copyright, that seems dangerous.

2. A typical contract or NDA also contains non-disclosure terms, i.e. you can't share confidential information, e.g. code (including code you _just_ wrote, due to #1), with external parties or the general public willy-nilly. And I have doubts whether the terms-of-service assurances from OpenAI or Anthropic that your model inputs and outputs will probably not be used for training are legally sufficient.

IANAL, and _perhaps_ I'm wrong about one or both of these, in one or more countries, but by and large I'd say the risk is not worth the benefit.

I mostly use third party LLMs like I would StackOverflow: Don't post company code there verbatim, make an isolated example. And also don't paste from SO verbatim. I tried other ways of using LLMs for programming a few times in personal projects and can't say I worry about lower productivity with these limitations. YMMV.
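To illustrate the "isolated example" idea (all names here are hypothetical, not from any real codebase): instead of pasting internal code that mentions, say, an `AcmeBillingClient` and real invoice data, reduce the question to a small, generic, runnable repro before handing it to an LLM:

```python
# Hypothetical sanitized repro: "does my retry wrapper re-raise too early?"
# Proprietary names and data have been replaced with generic stand-ins.

def retry(fn, attempts=3):
    """Call fn(), retrying on ValueError; re-raise after the last attempt."""
    for i in range(attempts):
        try:
            return fn()
        except ValueError:
            if i == attempts - 1:
                raise

calls = []

def flaky():
    """Stand-in for the real call: fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("transient")
    return "ok"

print(retry(flaky))  # prints "ok" on the third attempt
```

The repro carries the shape of the problem without disclosing anything confidential, which is the same discipline as a good SO question.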

(All this also generally goes for employees with typical employment contracts: It's probably a contract violation.)



Nobody is seriously disputing the ownership of AI generated code. A serious dispute would be a considerable, concerted effort to stop AI code generation in some jurisdiction, one that contrasts with the enormous, ongoing efforts by multiple large players with eye-watering investments to make code generation bigger and better.

Note that this is not a statement about the fairness or morality of LLM building. But to think that the legality of AI code generation is something to reasonably worry about is to bet against multiple large players and their hundreds of billions of dollars in investment right now, and that likely puts you in a bad spot in reality.


> Nobody is seriously disputing the ownership of AI generated code

From what I've been following it seems very likely that, at least in the US, AI-generated anything can't actually be copyrighted and thus can't have ownership at all! The legal implications of this are yet to percolate through the system though.


Only if that interpretation lasts despite likely intense lobbying to the contrary.


Other forms of LLM output are being seriously challenged, however.

https://llmlitigation.com/case-updates.html

Personally I have roughly zero trust in US courts on this type of issue, but we'll see how it goes. Arguably there are cases to be made where LLMs cough up code cribbed from repos under certain licenses without crediting the authors, and so on. It's probably a matter of time until some aggressively litigious actors make serious, systematic attempts at getting money out of this, producing case law as a by-product.

Edit: Oh right, Butterick et al went after Copilot and image generation too.

https://githubcopilotlitigation.com/case-updates.html

https://imagegeneratorlitigation.com/case-updates.html


This is "Kool-Aid" from the supply side of LLMs for coding, IMO. Plenty of people are plenty upset about the capture of code in the GitHub corral, fed into BigCorp$ training systems.

The parent statement reminds me of smug French in a castle north of London circa 1200, with furious locals standing outside the gates, dressed in rags with farm tools as weapons. One well-equipped tower guard says to another: "No one is seriously disputing the administration of these lands."


Your mother was a hamster and your father smelt of elderberries?


I think the comparison falls flat, but it's actually really funny. I'll keep it in mind.


Yes these are indeed the points. I don't really care too much, it would make me a bit more efficient but I'm billing by the hour anyway so I'm completely fine playing by the book.


Not sure I can agree with the "I'm billing by the hour" part.

I mean, sure, but I think of my little agency as providing value for a price. Clients have budgets, they get limited benefits from any software they build, and in order to be competitive against other agencies or their internal teams, I feel we need to provide good bang for the buck overall.

But since it's not all that much about typing in code, and since even that activity isn't all that sped up by LLMs, not if quality and stability matter, I would still agree that it's completely fine.


Yes, it's important of course that I'm efficient, and I am. But my coding speed isn't the main differentiating factor why clients like me.

I meant that I don't care enough to spearhead and drive this effort within the client orgs. They have their own processes, and internal employees would surely also like to use AI, so maybe they'll get there eventually. And meanwhile I'll just use it in the approved ways.


This comes down to a question of what one can prove. NNs are necessarily not explainable, so there wouldn't be much evidence to show in court.


Sure there's evidence: Your statements about this when challenged. And perhaps to a degree the commit log, at least that can arouse suspicion.

Sure, you can say "I'd just lie about it". But I don't know how many people would casually lie in court. I sure wouldn't. Ethics is one thing, but it also takes a lot of guts, considering the possible repercussions.


"I do not recall"


Yup, Gates style would work. But billionaires have a tendency to not get into serious trouble for lying to the public, a court, Congress, and what not. Commoners very much do.


What about 10 years ago when we all copied code from SO? Did we worry about copyright then? Maybe we did and I don’t recall.


“We” took care to not copy it verbatim (it’s the concrete code form that is copyrighted, not the algorithm), and depending on jurisdiction there is the concept of https://en.wikipedia.org/wiki/Threshold_of_originality in copyright law, which short code snippets on Stack Overflow typically don’t meet.


It's roughly the same, legally, and I was well aware of that.

Legally speaking, you also want to be careful about your dependencies and their licenses, a company that's afraid to get sued usually goes to quite some lengths to ensure they play this stuff safe. A lot of smaller companies and startups don't know or don't care.
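As a first pass at that diligence (just a sketch: declared metadata is often incomplete or stale, and a "License" field is not legal advice), Python packages already ship license metadata you can read via the standard library:

```python
# Sketch: list the declared license of every installed distribution,
# using only the stdlib importlib.metadata API.
from importlib.metadata import distributions

def license_report():
    """Map each installed distribution name to its declared license string."""
    report = {}
    for dist in distributions():
        meta = dist.metadata
        name = meta.get("Name", "<unknown>")
        report[name] = meta.get("License") or "UNKNOWN"
    return report

for name, lic in sorted(license_report().items()):
    print(f"{name}: {lic}")
```

Anything reporting "UNKNOWN" (or a copyleft license you didn't expect) is a candidate for a closer manual look.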

From a professional ethics perspective, personally, I don't want to put my clients in that position unless they consciously decide they want that. They hire professionals not just to get work done that they fully understand, but in large part to have someone who tells them what they don't know.


You raise a good point. It was kinda gray in the SO days. You almost always had to change something to get your code to work. But a lot of LLMs can spit out code that you can just paste in. And, I guess, maybe the tests all pass, but if something goes wrong, you, the coder, probably don't know where it went wrong. But if you'd written it all yourself, you could probably guess.

I'm still sorting all this stuff out personally. I like LLMs when I work in an area I know well. But vibing in areas of technology that I don't know well just feels weird.


SO seems different because the author of the post is republishing it. If they are republishing copyrighted material without notice, it seems like the SO author is the one in violation of copyright.

In the LLM case, I think it’s more of an open question whether the LLM output is republishing the copyrighted content without notice, or simply providing access to copyrighted content. I think the former would put the LLM provider in hot water, while the latter would put the user in hot water.



