Most argue training with copyrighted data is fair use.
AI companies have all kinds of arguments against paying for copyrighted content
The companies building generative AI tools like ChatGPT say updated copyright laws could interfere with their ability to train capable AI models. Here are comments from OpenAI, StabilityAI, Meta, Google, Microsoft and more.
This. If the model and its parameters are open source and under an unrestricted license, they can scrape anything they want in my opinion. But if they make money with someone's years of work writing a book, then please give that author some money as well.
"But if they make money with someone’s years of work writing a book, then please give that author some money as well."
Why? I've read many books on programming, and now I work as a programmer. The authors of those books don't get a percentage of my income just because they spent years writing the book. I've also read (and written) plenty of open source code over the years, and learned from that code. That doesn't mean I have to give money to all the people who contributed to those projects.
Like with most things, consent and intent matter. I went out on Halloween when I was a kid and got free candy, so why is it bad if I break in and steal other people's candy?
I will never be totally happy with this situation until they're required to offer a free version of all the models that were created with unlicensed content.