
They are not engagement bait. That argument, in particular, survived multiple rounds of reviews with friends outside my team who do not fully agree with me about this stuff. It's a deeply sincere, and, I would say for myself, earned take on this.

A lot of people are misunderstanding the goal of the post, which is not necessarily to persuade them, but rather to disrupt a static, unproductive equilibrium of uninformed arguments about how this stuff works. The commentary I've read today has, to my mind, vindicated that premise.



> That argument, in particular, survived multiple rounds of reviews with friends outside my team who do not fully agree with me about this stuff. It's a deeply sincere, and, I would say for myself, earned take on this.

Which argument? The one dismissing all arguments about IP on the grounds that some software engineers are pirates?

That argument is not only unpersuasive, it does a disservice to the rest of the post and weakens its contribution by making you as the author come off as willfully inflammatory and intentionally blind to nuance, which does the opposite of breaking the unproductive equilibrium. It feeds the sense that those in the skeptics camp have that AI adopters are intellectually unserious.

I know that you know that the law and ethics of IP are complicated, that the "profession" is diverse and can't be lumped into a cohesive unit for summary dismissal, and that there are entirely coherent ethical stances that would call for both piracy in some circumstances and condemnation of IP theft in others. I've seen enough of your work to know that dismissing all that nuance with a flippant call to "shove this concern up your ass" is beneath you.


> The one dismissing all arguments about IP on the grounds that some software engineers are pirates?

Yeah... this was a really, incredibly horseshit argument. I'm all for a good rant, but goddamn, man, this one wasn't good. I would say "I hope the reputational damage was worth whatever he got out of it", but I figure he's been able to retire at any time for a while now, so that sort of stuff just doesn't matter anymore to him.


I love how many people have in response to this article tried to intimate that writing it put my career in jeopardy; so forcefully do they disagree with a technical piece that it must somehow be career-limiting.


It's just such a mind-meltingly bad argument, man.

"A whole bunch of folks ignore copyright terms, so all complaints that 'Inhaling most-to-all of the code that can be read on the Internet with the intent to make a proprietary machine that makes a ton of revenue for the owner of that machine and noone else is probably bad, and if not a violation of the letter of the law, surely a violation of its spirit.' are invalid."

When I hear someone sincerely say stuff that works out to "Software licenses don't matter, actually.", I strongly reduce my estimation of their ability to reason well and behave ethically. Does this matter? Probably not. There are many folks in the field who hold that sort of opinion, so it's relatively easy to surround yourself with likeminded folks. Do you hold these sorts of opinions? Fuck if I know. All I know about is what you wrote today.

Anyway. As I mentioned, you're late-career in what seems to be a significantly successful career, so your reputation absolutely doesn't matter, and all this chatter is irrelevant to you.


I don't know who you're quoting, but it's not me.


I'm not quoting anyone. Perhaps wrapping the second paragraph in what I understand to be Russian-style quotes (« ») would have been clearer? Or maybe prepending the words "Your argument ends up being something like " to the second paragraph would have been far clearer? shrug


On HN, the convention is that quotations indicate literal quotation. It's not a reasonable paraphrase of my argument either, but you know that.


> It's not a reasonable paraphrase of my argument either, but you know that.

To quote from your essay:

"But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show."

Man, you might not see the resemblance now, but if you return to it in three to six months, I bet you will.

Also, I was a professional musician in a former life. Given the content of your essay, you might be surprised to know how very, very fast and loose musicians as a class play with copyright laws. In my personal experience, the typical musician paid for approximately zero of the audio recordings in their possession. I'd be surprised if things weren't similar for the typical practitioner of the visual arts.


I agree this is a bad "collective punishment" argument from him, even if I think he's somewhat right in spirit. As a software dev, I don't care in the slightest about LLMs training on code, text, videos, or images; I fully believe it's equivalent to humans perceiving and learning from the output of others. And I know many or most software devs agree on that point, while most artists don't.

I think artists are very cavalier about IP, on average. Many draw characters from franchises that do not allow such drawing, and often directly profit by selling those images. Do I think that's bad? No. (Unless it's copying the original drawing plagiaristically.) Is it odd that most of the people who profit in this way consider generative AI unethical copyright infringement? I think so.

I think hypocrisy on the issue is annoying. Either you think it's fine for LLMs to learn from code, text, images, and video, or you think none of it is. tptacek should bite one bullet or the other.


I don't accept the premise that "training on" and "copying" are the same thing, any more than me reading a book and learning stuff is copying from the book. But past that, I have, for the reasons stated in the piece, absolutely no patience for software developers trying to put this concern on the table. From my perspective, they've forfeited it.


> I don't accept the premise that "training on" and "copying" are the same thing...

Nor do I. Training and copying are clearly different things... and if these tools had never emitted -verbatim- nontrivial chunks of the code they'd ingested, [0] I'd be much less concerned about them. But as it stands now, some-to-many of the companies that build and deploy these machines clearly didn't care to ensure that their machines simply wouldn't plagiarize.

I've a bit more commentary that's related to whether or not what these companies are doing should be permitted here. [1]

[0] Based on what I've seen, when it happens, the output often carries incorrect copyright or license notices, or omits the verbiage that the license of the copied code requires for non-trivial reproductions.

[1] <https://news.ycombinator.com/item?id=44166983>


Who is this "they" who have forfeited it?

What about the millions of software developers who have never even visited a pirate site, much less built one?

Are we including the Netflix developers working actively on DRM?

How about the software developers working on anti-circumvention code for Kindle?

I'm totally perplexed at how willing you are to lump a profession of more than 20 million people all into one bucket and deny all of them, collectively, the right to say anything about IP. Are doctors not allowed to talk about the societal harms of elective plastic surgery because some of them are plastic surgeons? Is anyone with an MBA not allowed to warn people against scummy business practices because many-to-most of them are involved in dreaming those practices up?

This logic makes no sense, and I have to imagine that you see that given that you're avoiding replying to me.


You can say whatever you'd like about IP. You just don't get to tell me how to hear it.


I mean, that's fine I guess? As long as you're aware that you're being totally and utterly irrational about it.


I come from a family of musicians (I'm the only non-musician in it).


Ah good. If one of your family were to bring a plagiarism suit against another musician (or company (regardless of whether that company's music was produced by humans or robots)) that'd clearly ripped off their work, would you decry them as a hypocrite?

If not, why not?

If so, (seriously, earnestly) kudos for being consistent in your thoughts on the matter.


And I'm the only one in mine who isn't either a musician or an author. I'm not sure why you believe that being in a creative family gives you some sort of divine authority to condemn the rest of us for our collective sins.


Are you the only pretentious wanker?


The second paragraph in OP's comment is absolutely a reasonable paraphrase of your argument. I read your version many times over to try to find the substance and... that is the substance. If you didn't mean it to be, then that section needed to be heavily edited.


What really resonated with me was your repeated calls for us at least to be arguing about the same thing, to get on the same page.

Everything about LLMs and generative AI is getting so mushed up by people pulling it in several directions at once, marketing clouding the water, and the massive hyperbole on both sides, that it's nearly impossible to tell if we're even talking about the same thing!


It's a good post and I strongly agree with the part about level setting. You see the same tired arguments basically every day here and in subreddits like /r/ExperiencedDevs. I read a few today and my favorites are:

- It cannot write tests because it doesn't understand intent

- Actually it can write them, but they are "worthless"

- It's just predicting the next token, so it has no way of writing code well

- It tries to guess what code means and will be wrong

- It can't write anything novel because it can only write things it's seen

- It's faster to do all of the above by hand

I'm not sure if it's the issue where they tried Copilot with GPT-3.5 or something, but anyone who uses Cursor daily knows all of the above is false; I make it do these things every day and it works great. There was another comment I saw here or on Reddit about how everyone needs to spend a day with Cursor and get good at understanding how prompting + context works. That is a big ask, but I think the savings are worth it when you get the hang of it.


Yes. It's this "next token" stuff that is a total tell we're not all having the same conversation, because what serious LLM-driven developers are doing differently today than they were a year ago has little to do with the evolution of the SOTA models themselves. If you get what's going on, the "next token" objection misses the point entirely: it's not about the model, it's about the agent.
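To make the model-vs-agent distinction concrete, here's a minimal sketch of an agent loop. This is purely illustrative: `fake_model` is a hypothetical stand-in for a real LLM API call, and `read_file` is an invented tool; the point is that the harness around the model, which executes tools and feeds results back into the context, is doing the work, regardless of how the model predicts tokens internally.

```python
def fake_model(history):
    """Hypothetical stand-in for an LLM call. A real harness would send
    `history` to a model API; here we script two turns for illustration."""
    if not any(msg["role"] == "tool" for msg in history):
        return {"action": "run_tool", "tool": "read_file", "arg": "main.py"}
    return {"action": "done", "answer": "main.py defines main()"}

# Invented example tool; a real agent exposes file reads, shell, tests, etc.
TOOLS = {
    "read_file": lambda path: f"contents of {path}: def main(): ...",
}

def agent_loop(model, max_steps=10):
    history = [{"role": "user", "content": "What does main.py do?"}]
    for _ in range(max_steps):
        step = model(history)
        if step["action"] == "done":
            return step["answer"]
        # The harness, not the model, executes the tool and appends the
        # result to the context -- this loop is the "agent" part.
        result = TOOLS[step["tool"]](step["arg"])
        history.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not finish within max_steps")

print(agent_loop(fake_model))
```

The loop is deliberately dumb; what changed over the past year is the tooling wrapped around this loop (which tools are exposed, how context is assembled), not the token-prediction mechanism itself.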



