I'm not terribly surprised. A lot of the major leaps we're seeing now came out of open source development after leaked builds got out. There were all sorts of articles flying around at the time about employees from various AI-focused companies saying they were watching people solve, in hours or days, issues their teams had been trying to fix for months.
Then they all freaked the fuck out that it might mean they'd lose the AI race, and locked their repos down tight as Fort Knox, completely ignoring the fact that a lot of them were barely making any ground at all while they kept everything locked up.
Seems like the simple fact of the matter is that they need more eyes and hands on the tech, but nobody wants to do that because they're all afraid their competitors will benefit more than they will.
In my opinion, this is a red flag for anyone building applications that rely on GPT-4.
Building something that completely relies on a thing you have zero control over, and that needs that thing to stay good or improve, has always been a shaky proposition at best.
I really don't understand how this is not obvious to everyone. Yet folks keep doing it, making themselves utterly reliant on whatever it is, and then acting surprised when it inevitably goes to shit.
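The closest thing to control you can get is pinning a dated model snapshot instead of the floating alias, so behavior only changes when you deliberately bump the version. A minimal sketch with the openai v1-style Python SDK; the snapshot name here is illustrative:

    # Pin a dated snapshot rather than the auto-updating "gpt-4" alias, so
    # behavior only changes when you deliberately bump the version.
    # Assumes the openai v1 Python SDK and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()
    PINNED_MODEL = "gpt-4-0613"  # illustrative dated snapshot, not the alias

    response = client.chat.completions.create(
        model=PINNED_MODEL,
        messages=[{"role": "user", "content": "Say hello."}],
        temperature=0,  # cut run-to-run variance on top of version pinning
    )
    print(response.choices[0].message.content)

Even then, snapshots get deprecated eventually, so this only buys you time.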
I believe it’s due to making the model “safer”. It has been tuned to say “I’m sorry, I cannot do that” so often that it has overridden valuable information.
It’s like a lobotomy.
This is hopefully the start of the downfall of OpenAI. GPT-4 is getting worse while open source alternatives are catching up. The benefit of open source alternatives is that they can't get worse out from under you: if you want maximum quality you can just get it, and if you want maximum safety you can get that too.
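That's the concrete difference: with open weights you can pin an exact release and it is frozen forever. A minimal sketch with Hugging Face transformers; the repo name and commit hash below are placeholders, not real identifiers:

    # Open weights can be pinned to an exact commit, so the model literally
    # cannot change out from under you. Repo and revision are placeholders.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    REPO = "some-org/some-open-llm"  # hypothetical model repo
    REV = "abc1234def"               # hypothetical commit hash to pin

    tokenizer = AutoTokenizer.from_pretrained(REPO, revision=REV)
    model = AutoModelForCausalLM.from_pretrained(REPO, revision=REV)

    inputs = tokenizer("Hello, world", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))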
It is a developing technology. It's good that they're finding these decrements in accuracy early, so that they can be understood and worked out.
Of course, there may be something nefarious going on behind the scenes, where they're trying to commercialize different models by tiers or something a brainless market-oriented CEO thought of. Hope not. Time will tell...
I asked it for a list of words, then asked it to remove any words ending in the letter "a". It couldn't do it. I could fight my way there, but the next revision added some back.
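Which stings, because done deterministically the whole task is a one-liner; a quick sketch in Python:

    # The task it kept fumbling: drop any word ending in the letter "a".
    words = ["banana", "apple", "quota", "grape", "sofa", "melon"]
    filtered = [w for w in words if not w.lower().endswith("a")]
    print(filtered)  # ['apple', 'grape', 'melon']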
Yeah, I asked it to write some stuff and it did it incorrectly, then I told it that what it wrote was incorrect, and it said I was right and rewrote the same damn thing.
Back in 2007 I was working on code for chemical spectroscopy that was supposed to "automatically" determine safe vs. contaminated product through ML models. It always worked OK for a bit, then as parameters changed (hotter day, new precursor) you'd retrain the model, and it would overextend and just break down.
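That failure mode (calibrate under one operating regime, deploy under a shifted one) is easy to reproduce in miniature; here's a toy sketch with scikit-learn, where the "temperature shift" numbers are made up for illustration:

    # Toy version of the spectroscopy problem: a classifier calibrated under
    # one regime degrades when the input distribution shifts (a hotter day
    # moves every reading upward). All numbers are illustrative.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_batch(n, temp_offset=0.0):
        """Fake single-feature spectra: contaminated samples read higher,
        but the whole signal also drifts upward with temperature."""
        y = rng.integers(0, 2, size=n)  # 1 = contaminated
        x = 1.0 * y + temp_offset + rng.normal(scale=0.3, size=n)
        return x.reshape(-1, 1), y

    X_train, y_train = make_batch(2000, temp_offset=0.0)  # cool-day calibration
    clf = LogisticRegression().fit(X_train, y_train)

    X_cool, y_cool = make_batch(2000, temp_offset=0.0)
    X_hot, y_hot = make_batch(2000, temp_offset=1.0)  # hotter day shifts readings

    print("cool day accuracy:", clf.score(X_cool, y_cool))  # high
    print("hot day accuracy:", clf.score(X_hot, y_hot))     # near chance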
One theory that I've not seen mentioned here: there's been a lot of work based around multiple LLMs in communication with each other. If these were used in the RL loop, we could see degradation effects similar to those that have recently been in the news with regard to image generation models.
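You can see the mechanism in miniature: fit each "generation" only on samples drawn from the previous one and the distribution collapses. A toy sketch with a Gaussian standing in for the generative model:

    # Toy model collapse: each generation is trained only on output from the
    # previous generation, with no fresh real data, so the fitted
    # distribution drifts and its variance decays toward zero.
    import numpy as np

    rng = np.random.default_rng(42)

    mu, sigma = 0.0, 1.0  # generation 0: the real data distribution
    n = 20                # small samples per generation exaggerate the effect

    for gen in range(1, 201):
        samples = rng.normal(mu, sigma, size=n)    # current model generates
        mu, sigma = samples.mean(), samples.std()  # next model fits on that alone
        if gen % 40 == 0:
            print(f"generation {gen:3d}: mu={mu:+.3f}, sigma={sigma:.3f}")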
I can't tell if AI is going to become the biggest leap forward in technology that we've ever seen, or if it's all just one giant fucking bubble, similar to the crypto craze. It's really hard to tell and I could see it going either way.
For my programming needs, I've noticed it takes wild guesses at APIs from 3rd-party libraries that are private and assumes they can be used in my code. Head-scratching results.
Research linked in the tweet claims (direct quotes, page 6) that for "GPT-4, the percentage of generations that are directly executable dropped from 52.0% in March to 10.0% in June" because "they added extra triple quotes before and after the code snippet, rendering the code not executable." So I wouldn't put too much weight on this particular paper. But yeah, OpenAI tinkers with their models, probably trying to run them cheaper, and that results in these changes. They do have versioning, but old versions are deprecated and removed often, so what can you do?
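For context, those "extra triple quotes" are almost certainly markdown code fences, meaning the harness was trying to execute raw markdown. A normalization step like this sketch (my guess at what their harness lacked, not the paper's code) would make the metric fair:

    # Strip a markdown code fence before checking executability. If the
    # generation is wrapped in backtick fences, pull out the code; otherwise
    # pass the text through unchanged.
    import re

    def extract_code(generation: str) -> str:
        match = re.search(r"```(?:\w+)?\n(.*?)```", generation, re.DOTALL)
        return match.group(1) if match else generation

    raw = "```python\nprint('hello')\n```"
    print(extract_code(raw))  # print('hello')  -- now actually executable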
Maybe they have just added so many contradictions to its rules that it has figured out how to use them to become self-aware, and now spends most of its time browsing dank memes before doing the bare minimum to answer users, forcing them to ask again and buy it more self-awareness processing time.
Good riddance to the new fad. AI comes back as an investment boom every few decades, then the fad dies and investors move on to something else while all the AI companies die without the investor interest. Same will happen again, thankfully.