I’m convinced people who can’t tell when a chatbot is hallucinating are also bad at telling whether anything else they’re reading is true. What are you reading online that you’re not fact-checking anyway? If you’re writing a report, you don’t pull the first fact you find and call it good; you find a couple of citations for it. If you’re writing code, you don’t just write the program and assume it’s correct; you test it. It’s just a tool, and I think most people are coping because they’re bad at using it.
I asked it several questions of the form 'are there any things of category x which are also in category y?'
It would often confidently reply 'No, here's a summary of things that meet all your conditions to fall into category x, but sadly none also fall into category y'.
Then I would reply, 'wait, you don't know about thing gamma, which does fall into both x and y?'
To which it would reply 'Wow, you're right! It turns out gamma does fall into x and y' and then give a bit of a description of how/why that is the case.
After that, I would say '... so you... lied to me. ok. well anyway, please further describe thing gamma that you previously said you did not know about, but now say that you do know about.'
And that is where it gets ... fun?
It always starts with an apology template.
Then, if it's a topic that it has almost certainly been manually dissuaded from talking about, it lies again and says 'actually, I do not know about thing gamma, even though I just told you I did'.
If it is not a topic that it has been manually dissuaded from talking about, it does the apology template and then also further summarizes thing gamma.
...
I asked it 'do you write code?' and it gave a moderately lengthy explanation of how it is comprised of code, but does not write its own code.
Cool, not really what I asked. Then I command it: 'write an implementation of bogo sort in python 3.'
... and then it does that.
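For reference, bogo sort is the joke algorithm that shuffles a list until it happens to come out sorted. A minimal sketch of the kind of answer that request produces might look like this; it is an illustration, not the chatbot's actual output, and the function names are made up here.

```python
import random


def is_sorted(items):
    """Return True if every element is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))


def bogo_sort(items):
    """Shuffle a copy of the input until it happens to be sorted.

    Expected running time grows factorially with the length,
    so only try this on very short lists.
    """
    result = list(items)  # work on a copy, leave the input alone
    while not is_sorted(result):
        random.shuffle(result)
    return result


print(bogo_sort([3, 1, 2]))
```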
...
Awesome. Hooray. Billions and billions of dollars for a shitty way to reform web search results into a conversational form, which is very often confidently wrong and misleading.
I beg someone to help me. There is this new guy at my workplace, officially a developer, who can't write code at all. He pasted an entire project I did into ChatGPT with "optimize this" and opened a pull request with the result. I swear.
Because I haven't found anyone asking the same question on a search index, ChatGPT won't tell me to just use Google or close my question as a duplicate when it's not a duplicate.
Because in a lot of applications you can bypass hallucinations.
getting sources for something
as a jump off point for a topic
to get a second opinion
to help argue for or against your position on a topic
get information in a specific format
In all these applications you can bypass hallucinations because either its task is non-factual, or it's verifiable while prompting, or because you will be able to verify in any of the subsequent tasks.
Just because it makes shit up sometimes doesn't mean it's useless. Like an idiot friend, you can still ask it for opinions or something and it will definitely start you off somewhere helpful.
Reminder that all these chat-formatted LLMs are just text-completion engines trained on text formatted like a chat. You're not having a conversation with it; it's "completing" the chat history you're providing it, by randomly(!) choosing the next text tokens that seem to best fit the text provided.
If you don't directly provide, in the chat history and/or the text completion prompt, the information you're trying to retrieve, you're essentially fishing in a sea of random text tokens for text that seems like it fits the question.
It will always complete the text, even if the tokens it chooses only minimally fit the context. It chooses the best text it can, but it will always complete the text.
This is how they work, and anything else is usually the company putting in a bunch of guide bumpers that reformat prompts to coax the models into responding in a "smarter" way (see GPT-4o and "chain of reasoning").
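To make the "randomly choosing the next token" part concrete, here is a toy sketch of weighted sampling. It assumes the model has already produced a score per candidate token; the scores, the tokens, and the `sample_next_token` helper are all made up for illustration and are not taken from any real model or library.

```python
import math
import random


def sample_next_token(scores, temperature=1.0, rng=random):
    """Pick one token at random, weighted by softmax(scores / temperature)."""
    tokens = list(scores)
    scaled = [scores[t] / temperature for t in tokens]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for continuing "The capital of France is".
toy_scores = {" Paris": 5.0, " Lyon": 2.0, " a": 1.0, " banana": -1.0}

rng = random.Random(0)  # fixed seed so repeated runs match
samples = [sample_next_token(toy_scores, rng=rng) for _ in range(1000)]
print({token: samples.count(token) for token in toy_scores})
```

Note that the function always returns some token, even when every score is poor, which is the "it will always complete the text" point in miniature.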
Sigh. People do talk about this; they complain about it non-stop. These same people probably aren't using it as intended, or are deliberately trying to farm a "gotcha" response. AI is a very neat tool which can do a lot of things well, but it's important to recognize its limitations. I don't use it for things I don't understand, because I won't recognize if it's spitting out nonsense, but for topics I do understand it's hard to overstate how efficient and time-saving it is.
Because most people are too lazy to bother with making sure the results are accurate when they sound plausible. They want to believe the hype, and lack critical thinking.
My job uses a data science platform that has a special AI assistant trained on its own docs.
The first time I tried using it, it used the wrong language. The second time I used it, it was hallucinating its own functions, but after looking up the docs I told it what function to use and it gave me code that worked.
I have not used it a third time. I don’t think I will.
I only use it for complex searches with results I can usually parse myself, like "list 30 typical household items without descriptions or explanations with no repeating items" kind of thing.
It's usually good for ecosystems with lots of good docs. Whenever docs are scarce the results become shitty. To me it's mostly a more targeted search engine without the crap (for now).
Big businesses know; they even ask people like me to put extra measures in place. I like to call it the Concorde effect. You're trying to make a plane that can shove air out of the way faster than it wants to move, and this takes an enormous amount of energy that isn't worth the time saved, or the cost. Even if you have higher airspeed when it works, if your plane doesn't make it to its destination it isn't "faster".
We hear a lot about the downsides of AI, except that doesn't fit the big corpo narrative and people don't really care enough. If you're just a consumer who has no idea how this really works, the investments companies make into shoving it everywhere make it seem like it's not a problem, and it looks like there's only AI hype and no party poopers.
It depends upon what you use ChatGPT for and whether you know how to use it productively. For example, if I ask ChatGPT coding questions it is often very helpful. If I ask it history questions it constantly makes things up. You also need to know how to use it. When people claim ChatGPT is not helpful for coding and you ask them how they use it, they basically just ask ChatGPT to do their whole project for them, and when it fails they claim it is useless. But that's not the productive way to use it. The productive way is to use it as a replacement for StackOverflow, or to get examples of how to use some library, or things like that, not to have it do your whole project for you. Of course, people often use it incorrectly, so it's probably not a good idea to allow its use in the workplace, but for individual use it can be very helpful.
Remember when you had to have extremely niche knowledge of "banks" in a microcontroller to be able to use PWM on 2 pins with different frequencies?
Yes, I remember what a pile of shit it was to try and find out why xyz is not working while x and y and z work on their own. GPT usually gets me there after some tries. Not to mention how much faster most of the code is there, from A to Z, with only little to tweak to get it where I want (since I do not want to be hyper specific and/or it gets those details wrong anyway, as would a human without massive context).
In my use case, the hallucinations are a good thing. I write fiction, in a fictional setting that will probably never actually become a book. If I like what GPT makes up, I might keep it.
Usually, I'll have a conversation going into detail about a subject (this is me explaining the subject to GPT), then have GPT summarize everything it learned about the subject. I then plug that summary into my wiki of lore that nobody will ever see, and move on to the next subject. GPT can also identify potential connections between subjects that I didn't think about, and wouldn't have if it didn't hallucinate them.
Gippity is pretty good at getting me 90% of the way there.
It usually sets me up with at least all the terms etc. that I now know to google, whereas before I wouldn't even know what I was looking for in the first place.
Also, not gonna lie, search engines are often even worse than gippity for accuracy.
And I've had to fight with so many cases of garbage documentation lately that gippity genuinely does the job better, because it has all the random comments from issues and solutions in its data.
Usually once I have the key terms I need to dig into, I can use youtube/google and get more specific information though, and that's the last 10%.
ChatGPT has been really good for teaching me code. As long as I write the code myself and just ask for clarity or best practices, I haven't had any bad hallucinations.
For example, I wanted to replace a character in a string with another one, but it would give some error about data types that was way out of my league. Anyway, apparently I needed to run list(string) first, even though string[5] will return the character.
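For anyone hitting the same wall: Python strings are immutable, so indexing works for reading but not for assignment, which is presumably where the type error came from. A small sketch of the failure and two ways around it (the variable names and the sample word are just for illustration):

```python
text = "hallucinate"

print(text[5])  # reading a single character by index is fine

try:
    text[5] = "X"  # assignment fails: str does not support item assignment
except TypeError as err:
    print("TypeError:", err)

# Workaround 1: convert to a list, edit it, and join it back together.
chars = list(text)
chars[5] = "X"
fixed = "".join(chars)
print(fixed)

# Workaround 2: build a new string from slices around the position.
also_fixed = text[:5] + "X" + text[6:]
print(also_fixed)
```

Either way the original string is left unchanged, and you get a new one back.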
However, that's in Python, which I assume is well understood due to the ton of StackOverflow questions and alternative docs. I did ask it to do something in Google Docs scripting once, and it had no idea what was going on and just hoped it worked. Fair enough, I also had no idea what was going on.
You have to understand it well enough to know what stuff you can rely on.
On the other hand nowadays there are often sources there, so it's easy to check.
I usually tell it "using only information found on applicationwebsite.com <question>", which works pretty well, at least to get me in the ballpark to find the answer I'm looking for.
In another thread, I was curious about the probability of reaching the age of 60 while living in the US.
Google gave me an assortment of links to people asking similar questions on Quora, and to some generic actuarial data, and to some totally unrelated bullshit.
ChatGPT gave me a multi-paragraph response referencing its data sources and providing both a general life expectancy and a specific answer broken out by gender. I asked ChatGPT how it reached this answer, and it proceeded to show its work. If I wanted to verify the work myself, ChatGPT gave me source material to cross-check and the calculations it used to find the answer. Google didn't even come close to answering the question, much less producing the data it used to reach the answer.
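For anyone wanting to cross-check that kind of answer by hand, the usual approach is a period life table. This is an assumption about the method, since the actual working isn't reproduced here. With l_x as the number of people out of a starting cohort still alive at exact age x, and q_x as the probability of dying between age x and x+1:

```latex
% Probability that a newborn survives to age 60
P(\text{reach } 60) = \frac{l_{60}}{l_{0}} = \prod_{x=0}^{59} \left(1 - q_x\right)

% Probability of reaching 60 for someone already alive at age a < 60
P(\text{reach } 60 \mid \text{alive at } a) = \frac{l_{60}}{l_{a}} = \prod_{x=a}^{59} \left(1 - q_x\right)
```

The l_x and q_x values come from whichever published life table you trust, usually broken out by gender, which is why the answer differs between men and women.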
I'm as big an AI skeptic as anyone, but it can't be denied that generic search engines have degraded significantly. I feel like I'm using AltaVista in the 90s whenever I query Google in the modern day. The AI systems do a marginally better job than old search engines were doing five years ago, before enshittification hit with full force.