I'm wondering about the benchmark too. It's way above my level to figure out how it can be gamed. But, buried in the article:
Moreover, ARC-AGI-1 is now saturating – besides o3's new score, the fact is that a large ensemble of low-compute Kaggle solutions can now score 81% on the private eval.
The most expensive o3 version achieved 87.5%
Man I don't need to be reminded of the sorry state of meat alternatives.
It's bitterly funny to me that fashoid governments started banning cultivated meat, as if the economic and technical issues weren't enough. Ignorant people terrified of threats they made up in their own heads, as always.
The promptfans testing OpenAI Sora have gotten mad that it's happening to them and (temporarily) leaked access to the API.
https://techcrunch.com/2024/11/26/artists-appears-to-have-leaked-access-to-openais-sora/
“Hundreds of artists provide unpaid labor through bug testing, feedback and experimental work for the [Sora early access] program for a $150B valued [sic] company,” the group, which calls itself “Sora PR Puppets,” wrote in a post ...
"Well, they didn't compensate actual artists, but surely they will compensate us."
“This early access program appears to be less about creative expression and critique, and more about PR and advertisement.”
OK, I could give them the benefit of the doubt: maybe they're new to the GenAI space, or the general ML space ... or IT.
But I'm not going to. Of course it's about PR hype.
Also, the image is perfect. I especially like the Joe Kucan-looking general embedded in the Star Trek tactical station. The Technology of Peace ain’t what it used to be, is it?
Is that a screenshot from Command & Conquer 4?
That article gave me whiplash. First part: pretty cool. Second part: deeply questionable.
For example, these two paragraphs from the sections 'problem with code' and 'magic of data':
“Modular and interpretable code” sounds great until you are staring at 100 modules with 100,000 lines of code each and someone is asking you to interpret it.
Regardless of how complicated your program’s behavior is, if you write it as a neural network, the program remains interpretable. To know what your neural network actually does, just read the dataset
Well, "just read the dataset bro" sounds great until you are staring at a dataset with 100,000 examples and someone is asking you to interpret it.
Who knows. The only thing that came to my mind reading that is the joke "statement made by utterly deranged". And then I realized there is no joke.
It was supposed to. I'm just not that good at writing.
Yeah, neural network training is notoriously easy to reproduce /s.
Just a few things can affect results: source data, data labels, network structure, training parameters, version of the training script, versions of libraries, seed for the random number generator, hardware, operating system.
Also, deployment is another can of worms.
Also, even if you have the open-source script, data and labels, there's no guarantee you'll have useful documentation for any of them.
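To make the list above a bit more concrete, here's a rough stdlib-only sketch of what pinning a few of those factors could look like. The function names and the manifest layout are made up for illustration, not taken from any real training framework. It also only covers Python's own RNG: NumPy, PyTorch and friends each need their own seeding, and even then GPU runs may not come out bit-identical.

```python
import hashlib
import json
import platform
import random
import sys


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest so a dataset version can be pinned."""
    return hashlib.sha256(data).hexdigest()


def run_manifest(seed: int, data: bytes, params: dict) -> dict:
    """Collect inputs that can change a training result (hypothetical schema)."""
    return {
        "seed": seed,
        "data_sha256": fingerprint(data),
        "params": params,
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "machine": platform.machine(),
    }


def shuffled_indices(n: int, seed: int) -> list:
    """Shuffle with a private RNG so the order depends only on the seed."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    return idx


if __name__ == "__main__":
    manifest = run_manifest(seed=1234, data=b"example,label\n", params={"lr": 0.01})
    print(json.dumps(manifest, indent=2))
    # Same seed, same order; this says nothing about library or hardware drift.
    print(shuffled_indices(10, 1234) == shuffled_indices(10, 1234))
```

Even with a manifest like this saved next to every run, library versions, labels and deployment details still have to be tracked somewhere, which is kind of the point.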
I am neither left nor right wing, as I’m a libertarian
Ah, yes, the classic "I'm not like the other girls" of politics.
It would be funny if someone was literally beating up servers with a wooden shoe.
Then there is John Michael Greer...
Wow, that's a name I haven't heard in a long time.
A regular contributor at UnHerd...
I did not know that, and I hate that it doesn't surprise me. I tended to dismiss his peak oil doomerism as wishing for some imagined "harmony with nature". This doesn't help with that bias.
The first paragraph surprised me. I didn't know there were still some true believers left.
YAML is great if you need to make simple configuration files
... which is why no one uses it for things like Kubernetes /s
Automattic... that's why there are two t's!? Jesus Christ.
Just something I found in the wild (r/MachineLearning): Please point me in the right direction for further exploring my line of thinking in AI alignment
I'm not a researcher or working in AI or anything, but ...
you don't say