Hot take: LLM technology is being purposefully framed as AI to avoid accountability
Which of the following sounds more reasonable?
I shouldn't have to pay for the content that I use to tune my LLM model and algorithm.
We shouldn't have to pay for the content we use to train and teach an AI.
By calling it AI, corporations are able to advocate for a position that's blatantly pro-corporate and anti-writer/artist, and trick people into supporting it under the guise of technological development.
I do. If it's publicly available, individuals should be able to learn from it. Artists don't pay the influences that helped develop their style, and we don't pay the programmers who answer questions on Stack Overflow.
Hell, I'm not sure generative AI should have to pay for training data at all. Requiring payment points to a weakness in the system, but it doesn't fix it - the field is getting away from needing existing datasets. GPT-4 swallowed everything worth swallowing, and it's already training GPT-4.5. This would only make it harder for new players to compete in the generative AI space.
It can't profit only the few, it's too big a force multiplier. Paying up front doesn't fix it, recurring payments don't fix it... That's nothing but a payoff to a few people as this starts to eat the best parts of the job market
We need to think much bigger - we need to look at how we handle ownership as a society.
I'm not really sure that comparing AI to a human artist learning from and being inspired by others quite fits, at least in the context of a commercial AI (one that a company charges others to use). It feels scummy for a company (a for-profit entity) to steal training data from others without consent, and then turn around and charge people for the product it built on that stolen content.
That said, existing copyright law allows for 'fair use', which includes educational purposes. In that light, AI companies could be seen as a sort of AI school program. But the icky part, to me, is that AI is not a person. It can't choose to leave the school. That school can then profit off that student forever and ever.
I feel like the fair use argument for education applies to humans, not AI (at least not until they actually gain sapience). AIs are machines that can be leveraged and exploited by the few and powerful, and that power should come without us subsidizing their development.
Though honestly it's sort of a moot point, because it's already done and we're very unlikely to ever properly charge them now. And now that they have the start, they have a leg up on everyone else. So the morality of how it was built no longer really matters, unless we want to argue AI should all be open source or public domain.