Hacker News

From a copyright perspective the only question that matters is this: Do we treat AI models like (Xerox) copiers or do we treat them like artists?

If we treat it like a copier, it's the end user who's responsible when they tell it to produce something that infringes someone else's copyrighted work. No different than if someone walked up to a machine and copied an entire book.

Furthermore, if the end user never distributes the result of a prompt, the question is moot anyway: copyright only matters when something gets distributed. No distribution == no violation of copyright.

If we treat AI like an artist, it is the owner/creator of the AI model that's responsible when it produces something that violates another's copyright. Since it is literally impossible to maintain a database of all copyrighted works in existence (in order to check whether something infringes), this option is untenable. It's not possible to implement unless we go back to requiring that all copyrights be registered (and provide that database to anyone who asks, thus distributing all those copyrighted works, which would defeat the purpose).

I very strongly believe that the courts will ultimately settle on treating AI like a copier. It's a tool/machine and should be treated as such by copyright law.






We treat them as models. We allowed them to be fitted on copyrighted data, arguing the research is an inherent public good. But now that these companies are directly competing with that material's copyright holders, it makes sense to reevaluate that assumption.

A good first step would be to mandate AI labs share their weights and methodology before commercial release or lose that privilege. This would spare universities and non-profits, while requiring commercial labs to contribute something back, be it in licensing fees or usable research.


A good argument. However, comparing an AI model to a Xerox machine is reductive and not a sound metaphor...

It cannot be treated as just a Xerox machine, but it can be treated as a Xerox machine that contains all the copyrighted works it was trained on (saved in the form of weights/biases), from which a user can inventively request combinations. In that case the AI model itself is a distribution of copyrighted works. Encrypting or transforming a copyrighted work and transmitting it is still a violation of copyright (afaik; ianal).

This is all to say, copyright, as it stands, needs heavy reform. I'm rather copyleft, because all of this is vestigial nonsense from an age when 1800s printing presses set the rules, and our thinking hasn't caught up yet.


What would happen if you made a lossy image compression format derived from tons of scraped, copyrighted images?

There's no generative ability, but any time you compress/decompress an image, the codec uses weights and biases learned from copyrighted works.
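A minimal sketch of what such a codec could look like, assuming a PCA-style scheme: a basis is "trained" on a corpus (simulated here with random arrays standing in for scraped images), and every later compress/decompress call runs through those learned weights. All names here are illustrative, not any real format.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a scraped corpus: 1000 flattened 8x8 grayscale patches.
# (In the hypothetical, these would be copyrighted images.)
corpus = rng.random((1000, 64))
mean = corpus.mean(axis=0)

# "Training": learn a compact basis from the corpus via SVD (PCA).
# These vectors are the codec's weights, derived from every image it saw.
_, _, Vt = np.linalg.svd(corpus - mean, full_matrices=False)
basis = Vt[:16]  # keep 16 of 64 components: lossy by construction

def compress(img):
    # 64 pixel values -> 16 coefficients in the learned basis
    return (img - mean) @ basis.T

def decompress(code):
    # approximate reconstruction from the 16 coefficients
    return code @ basis + mean

# A new image (not in the corpus) still passes through the learned weights.
img = rng.random(64)
code = compress(img)
approx = decompress(code)
```

The point of the hypothetical: `basis` is a function of every image in the corpus, yet no individual image can be read back out of it, which is exactly the ambiguity being asked about.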

Is that a violation?


Nothing, and no:

Only distributing it to others is when copyright is an issue. Private translations/transformations are unenforceable. You can mark up a book you've bought as much as you want.

copyright makes no sense;


Treat them as search engines.

This is wrong on so many counts, you should not be giving legal judgments in comments. As one example, “no distribution == no violation of copyright” is incorrect.


