The shady world of Brave selling copyrighted data for AI training

stackdiary.com

I'm fairly certain that I was not the only person in the world who thought to himself, "Did they just yoink the entire Internet and bundle it together

From the article:

"I know for a fact that Wikipedia operates under a CC BY-SA 4.0 license, which explicitly states that if you're going to use the data, you must give attribution. As far as search engines go, they can get away with it because linking back to a Wikipedia article on the same page as the search results is considered attribution.

But in the case of Brave, not only are they disregarding the license - they're also charging money for the data and then giving third parties "rights" to that data."

112 comments
  • Is that shady? Aren't all the other AI companies, plus many other data-gathering services, doing exactly the same thing? We need to wait for the court cases to settle whether AI datasets can use publicly available information for training.

    • Just because multiple companies are doing it doesn't make it less shady. They're literally selling you 'rights' to content that isn't theirs to sell.
