We not only have to stop ignoring the problem, we need to be absolutely clear about what the problem is.
LLMs don't hallucinate wrong answers. They hallucinate all answers. Some of those answers will happen to be right.
If this sounds like nitpicking or quibbling over verbiage, it's not. This is really, really important to understand. LLMs exist within a hallucinatory false reality. They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.
That is the part that's crucial to understand. A really simple test of this problem is to ask ChatGPT to back up an answer with sources. It fundamentally cannot do it, because it has no ability to actually comprehend and correlate factual information in that way. This means, for example, that AI is incapable of assessing the potential veracity of the information it gives you. A human can say "That's a little outside of my area of expertise," but an LLM cannot. It can only be given hard-coded blocks that, when certain keywords come up, stop it from answering and substitute a stock response.
This distinction, that AI is always hallucinating, is important because of stuff like this:
But notice how Reid said there was a balance? That’s because a lot of AI researchers don’t actually think hallucinations can be solved. A study out of the National University of Singapore suggested that hallucinations are an inevitable outcome of all large language models. **Just as no person is 100 percent right all the time, neither are these computers.**
That is some fucking toxic shit right there. Treating the fallibility of LLMs as analogous to the fallibility of humans is a huge, huge false equivalence. Humans can be wrong, but we're wrong in ways that allow us the capacity to grow and learn. Even when we are wrong about things, we can often learn from how we are wrong. There's a structure to how humans learn and process information that allows us to interrogate our failures and adjust for them.
When an LLM is wrong, we just have to force it to keep rolling the dice until it's right. It cannot explain its reasoning. It cannot provide proof of work. I work in a field where I often have to direct the efforts of people who know more about specific subjects than I do, and part of how you do that is you get people to explain their reasoning, and you go back and forth testing propositions and arguments with them. You say "I want this, what are the specific challenges involved in doing it?" They tell you it's really hard, you ask them why. They break things down for you, and together you find solutions. With an LLM, if you ask it why something works the way it does, it will commit to the bit and proceed to hallucinate false facts and false premises to support its false answer, because it's not operating in the same reality you are, nor does it have any conception of reality in the first place.
"We invented a new kind of calculator. It usually returns the correct value for the mathematics you asked it to evaluate! But sometimes it makes up wrong answers for reasons we don't understand. So if it's important to you that you know the actual answer, you should always use a second, better calculator to check our work."
Altman going "yeah we could make it get things right 100% of the time, but that would be boring" has such "my girlfriend goes to another school" energy it's not even funny.
I'm a bit annoyed at all the people being pedantic about the term hallucinate.
Programmers use preexisting concepts as allegory for computer concepts all the time.
Your file isn't really a file, your desktop isn't a desk, your recycling bin isn't a recycling bin.
[Insert the entirety of Object Oriented Programming here]
Neural networks aren't really neurons, genetic algorithms aren't really genetics, and the LLM isn't really hallucinating.
But it easily conveys what the bug is. It only personifies the LLM because the English language almost always personifies the subject. The moment you apply a verb to an object you imply it performed an action, unless you limit yourself to esoteric words/acronyms or you use several words to overexplain every time.
The Chinese Room thought experiment is a good place to start the conversation. AI isn't intelligent, and it doesn't hallucinate. It's not sentient. It's just a computer program.
People need to stop using personifying language for this stuff.
The simple solution is not to rely upon AI. It's like a misinformed relative after a jar of moonshine: they might be right some of the time, or they might be totally full of shit.
I honestly don't know why people are obsessed with relying on AI, is it that difficult to look up the answer from a reliable source?
It will never be solved. Even the greatest hypothetical super intelligence is limited by what it can observe and process. Omniscience doesn't exist in the physical world. Humans hallucinate too - all the time. It's just that our approximations are usually correct, and then we don't call it a hallucination anymore. But realistically, the signals coming from our feet take longer to process than those from our eyes, so our brain has to predict information to create the experience. It's also why we don't notice our blinks, or why we don't see the blind spot our eyes have.
AI representing a more primitive version of our brains will hallucinate far more, especially because it cannot verify anything in the real world and is limited by the data it has been given, which it has to treat as ultimate truth. The mistake was trying to turn AI into a source of truth.
Hallucinations shouldn't be treated like a bug. They are a feature - just not one the big tech companies wanted.
When humans hallucinate on purpose (and not due to illness), we get imagination and dreams; fuel for fiction, but not for reality.
Honestly I feel people are using them completely wrong.
Their real power is their ability to understand language and context.
Turning natural language input into commands that can be executed by a traditional software system is a huge deal.
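For a rough sense of what that looks like in practice, here is a minimal sketch; the `call_llm` helper, the command names, and the JSON schema are hypothetical stand-ins rather than any particular product's API.

```python
import json

# Hypothetical helper: assume it sends a prompt to some LLM and returns
# the raw text of the model's reply.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to whatever model you use")

# The traditional software system only understands a fixed set of commands.
ALLOWED_COMMANDS = {"create_reminder", "delete_reminder", "list_reminders"}

def natural_language_to_command(user_text: str) -> dict:
    """Ask the model to translate free-form text into a structured command,
    then validate the result before anything gets executed."""
    prompt = (
        "Translate the user's request into JSON with keys 'command' and 'args'.\n"
        f"Valid commands: {sorted(ALLOWED_COMMANDS)}\n"
        f"User request: {user_text}\n"
        "Reply with JSON only."
    )
    reply = call_llm(prompt)
    parsed = json.loads(reply)  # treat the model's output as untrusted input
    if parsed.get("command") not in ALLOWED_COMMANDS:
        raise ValueError(f"model proposed an unknown command: {parsed!r}")
    return parsed

# natural_language_to_command("remind me to water the plants at 6pm") might
# return {"command": "create_reminder", "args": {"text": "water the plants", "time": "18:00"}}
```

The point is that the LLM only produces a proposal; the deterministic system around it decides what actually runs.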
Microsoft released an AI-powered autocomplete text box, and it's genius.
Currently you have to type an exact text match in an autocomplete box. So if you type "cats" but the item is called "pets", you'll get no results. Now the AI can find context-based matches in the autocomplete list.
This is their real power.
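To illustrate how that kind of context-based matching can work, here's a sketch using embedding similarity; this is one common approach, not Microsoft's actual implementation, and it assumes the sentence-transformers package and the all-MiniLM-L6-v2 model are available.

```python
from sentence_transformers import SentenceTransformer, util

# Embed the known item labels once up front.
model = SentenceTransformer("all-MiniLM-L6-v2")
items = ["pets", "home & garden", "electronics", "groceries"]
item_embeddings = model.encode(items, convert_to_tensor=True)

def suggest(query: str, top_k: int = 3):
    """Rank the known items by semantic similarity to the query."""
    query_embedding = model.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, item_embeddings)[0]
    ranked = sorted(zip(items, scores.tolist()), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# suggest("cats") should put "pets" at the top even though the strings share
# no characters -- exactly the match an exact-text box can't make.
```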
Also they're amazing at generating things that don't need to be factual: stories, poems, etc.
All of Silicon Valley — of Big Tech — is focused on taking large language models and other forms of artificial intelligence and moving them from the laptops of researchers into the phones and computers of average people.
But if I type “show me a picture of Alex Cranz” into the prompt window, Meta AI inevitably returns images of very pretty dark-haired men with beards.
Earlier this year, ChatGPT had a spell and started spouting absolute nonsense, but it also regularly makes up case law, leading to multiple lawyers getting into hot water with the courts.
In a commercial for Google’s new AI-ified search engine, someone asked how to fix a jammed film camera, and it suggested they “open the back door and gently remove the film.” That is the easiest way to destroy any photos you’ve already taken.
An AI’s difficult relationship with the truth is called “hallucinating.” In extremely simple terms: these machines are great at discovering patterns of information, but in their attempt to extrapolate and create, they occasionally get it wrong.
This idea that there’s a kind of unquantifiable magic sauce in AI that will allow us to forgive its tenuous relationship with reality is brought up a lot by the people eager to hand-wave away accuracy concerns.
It's not hallucination, it's confabulation, very similar in its nuances to the confabulation seen in stroke patients.
Just like the pretrained model that tried to nuke people in wargames wasn't malicious so much as behaving the way anyone sitting in front of a big red button labeled 'Nuke' might, without a functioning prefrontal cortex to inhibit that exploratory thought.
Human brains are a delicate balance between fairly specialized subsystems.
Right now, 'AI' companies are mostly trying to do it all in one system at once. Yes, the current models are typically a "mixture of experts," but it's still all in one functional layer.
Hallucinations/confabulations are currently fairly solvable for LLMs. You just run the same query a bunch of times and see how consistent the answers are. If the model is making things up because it doesn't know, the answers will be stochastic. If it knows the correct answer, they will be consistent. If it only partly knows, they'll be somewhere in between (but in a way that can be fine-tuned to be detected by a classifier).
This adds a second layer of checks on top of each of those variations. If you want to check whether something is safe, you'd also need to verify that the answer isn't a confabulation, so that's more passes.
It gets to be a lot quite quickly.
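Here's a minimal sketch of that consistency check, assuming a hypothetical `ask_model` helper that returns one sampled answer per call; a real system would compare answers semantically rather than by exact string and feed a score like this to a trained classifier.

```python
from collections import Counter

# Hypothetical helper: assume it sends the same prompt to the model with
# sampling enabled (temperature > 0) and returns one answer per call.
def ask_model(prompt: str) -> str:
    raise NotImplementedError("plug in the model you want to check")

def consistency_score(prompt: str, n_samples: int = 5) -> float:
    """Ask the same question several times and measure how often the most
    common answer comes back. Low agreement suggests the model is guessing."""
    answers = [ask_model(prompt).strip().lower() for _ in range(n_samples)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / n_samples

# A score near 1.0 means the model answers the same way every run; a score
# near 1/n_samples means every sample disagreed, i.e. it's probably making
# the answer up.
```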
As the tech scales (roughly 80 percent of what's being done on servers today will be possible on smartphones in about two years), those extra passes aren't going to need to be as massive.
This is a problem that will eventually go away, just not for a single pass at a single layer, which is 99% of the instances where people are complaining this is an issue.
Without reading the article, this is the best summary I could come up with:
Mainstream government-tied media keeps hallucinating up facts. Republican, Democrat, doesn't matter; they hallucinate up facts. Time to stop ignoring humans' hallucination problem. At least with AI, they don't have some subversive agenda beneath the surface when they do it. Time to help AI take over the world, bbl.
The AI isn't alive, so it's not hallucinating... We will likely never have true AI until we figure out the Hard Problem of Consciousness. Until we know what makes a human alive, we can't make a machine alive.
AI's real power is its ability to use tools and understand context from existing tools. For a FOSS tool that uses an LLM to do web searches and generate accurate (though not guaranteed) results, try my tool: https://github.com/muntedcrocodile/Sydney