I'm fairly certain that I was not the only person in the world who thought to himself, "Did they just yoink the entire Internet and bundle it together
From the article:
"I know for a fact that Wikipedia operates under a CC BY-SA 4.0 license, which explicitly states that if you're going to use the data, you must give attribution. As far as search engines go, they can get away with it because linking back to a Wikipedia article on the same page as the search results is considered attribution.
But in the case of Brave, not only are they disregarding the license - they're also charging money for the data and then giving third parties "rights" to that data."
Is that shady? Aren't all the other AI companies, plus many other data-gathering services, doing exactly the same thing? We need to wait for the courts to decide whether AI datasets can use publicly available information for training.