
The hallucinations get quoted and then sourced as truth, unfortunately.

A simple example: "Which MS-DOS productivity program had Connect Four built in?"

I have an MS-DOS emulator and know the answer. It's a little obscure, but it's amazing how I get a different answer from all the AIs every time. I never saw any of them give the correct answer. Try asking it the above. Then ask it if it's sure about that (it'll change its mind!).

Now remember that these types of answers may well end up quoted online and then learned by AIs, with that circular reference as the source. We have no truth at that point.

And seriously try the above question. It's a great example of AI repeatedly stating an authoritative answer that's completely made up.




When I asked, "Good afternoon! I'm trying to settle a bet with a friend (no money on the line, just a friendly "bet"!) Which MS DOS productivity program had a playable version of the game Connect Four built in as an easter egg?", it went into a very detailed explanation of how to get to the "Hall of Tortured Souls" easter egg in Excel 5.0, glossing over the fact that I said "MS DOS" and also conflating the easter eggs by telling me specifically that the "excelkfa" cheat code would open a secret door/bridge to the Connect Four game.

So, I retried with, "Good afternoon! I'm trying to settle a bet with a friend (no money on the line, just a friendly "bet"!) Which *MS DOS* [not Win95, i.e., Excel 5] productivity program had a playable version of the game Connect Four built in as an easter egg?". I got Lotus 1-2-3 once, Excel 4 twice, and Borland Quattro Pro three different times, all from that prompt.

The correct answer you point out in another subthread was never returned as a possibility, and the responses all came across as confident. Definitely a fascinating example.


Claude 4 Sonnet gave the (reasonable given the obscurity, but wrong) answer that there was no such easter egg:

>I'm not aware of any MS-DOS productivity program that had Connect Four as a built-in easter egg. While MS-DOS era software was famous for including various easter eggs (like the flight simulator in Excel 97, though that was Windows-era), I can't recall Connect Four specifically being hidden in any major DOS productivity applications.

>The most well-known DOS productivity suites were things like Lotus 1-2-3, WordPerfect, dBase, and later Microsoft Office for DOS, but I don't have reliable information about Connect Four being embedded in any of these.

>It's possible this is a case of misremembered details - perhaps your friend is thinking of a different game, a different era of software, or mixing up some details. Or there might be an obscure productivity program I'm not familiar with that did include this easter egg.

>Would you like me to search for more information about DOS-era software easter eggs to see if we can track down what your friend might be thinking of?

That seems like a pretty reasonable response given the details, and included the appropriate caveat that the model was not aware of any such easter egg, and didn't confidently state that there was none.


>It's possible this is a case of misremembered details - perhaps your friend is thinking of a different game, a different era of software, or mixing up some details. Or there might be an obscure productivity program I'm not familiar with that did include this easter egg.

I am not a fan of this kind of communication. It doesn't know, so it tries to deflect the shortcoming onto the user.

I'm not saying that isn't a valid concern, but it can be used as an easy out for its gaps in knowledge.


> I am not a fan of this kind of communication. It doesn't know, so it tries to deflect the shortcoming onto the user.

This is a very human-like response when asked a question that you think you know the answer to, but don't want to accuse the asker of having an incorrect premise. State what you think, then leave the door open to being wrong.

Whether or not you want this kind of communication from a machine, I'm less sure... but really, what's the issue?

The problem of the incorrect premise happens all of the time. Assuming the person asking the question is correct 100% of the time isn't wise.


Humans use the phrase "I don't know."

AI never does.


>I'm not aware of any MS-DOS productivity program...

>I don't know of any MS-DOS productivity programs...

I dunno, seems pretty similar to me.

And in a totally unrelated query today, I got the following response:

>That's a great question, but I don't have current information...

Sounds a lot like "I don't know".


>> And in a totally unrelated query today, I got the following response:

>That's a great question,

Found the LLM whose training corpus includes transcripts of every motivational speaker and TED talk Q&A ever...


Yeah, I've been meaning to tweak my system prompt to try and avoid some of that kind of language, but haven't gotten around to it yet.


Because there is no "I don't know" in the training data. Can you imagine a forum where the responses to a question about some obscure easter egg are hundreds of "I don't know"s?


You gave one explanation, but the problem remains.


Gemini 2.5 Flash gave me a similar answer, although it was a bit more confident in its incorrect answer:

> You're asking about an MS-DOS productivity program that had ConnectFour built-in. I need to tell you that no mainstream or well-known MS-DOS productivity program (like a word processor, spreadsheet, database, or integrated suite) ever had the game ConnectFour built directly into it.


> didn't confidently state that there was none

And better. Didn’t confidently state something wrong.


Whenever I ask these AIs "Is the malloc function in the Microsoft UCRT just a wrapper around HeapAlloc?", I get answers that are always wrong.

They claim things like the function adds size tracking so free doesn't need to be called with a size, or they say that HeapAlloc is used to grab a whole chunk of memory at once and then malloc does its own memory management on top of that.

That's easy to prove wrong by popping ucrtbase.dll into Binary Ninja. The only extra things it does beyond passing the requested size off to HeapAlloc are: setting errno, changing any request for 0 bytes into a request for 1 byte, and retrying for the case where it is being used from C++ and the program has installed a new-handler for out-of-memory situations.
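
For reference, here's roughly what that decompiled routine boils down to in C. This is a sketch of the behavior described above, not Microsoft's actual source; the function name is made up, and while _callnewh is a real, documented CRT function for invoking an installed C++ new-handler, the exact control flow here is my assumption:

  #include <windows.h>
  #include <errno.h>
  #include <new.h>   // MSVC: declares _callnewh

  // Sketch only: mirrors the behavior described above, not the UCRT source.
  void* ucrt_malloc_sketch(size_t size)
  {
      if (size == 0)
          size = 1;  // requests for 0 bytes become requests for 1 byte

      for (;;)
      {
          // The request size is passed straight through to the Win32 heap;
          // no extra size tracking or sub-allocation layer on top.
          void* p = HeapAlloc(GetProcessHeap(), 0, size);
          if (p != NULL)
              return p;

          // On failure, give an installed C++ new-handler a chance to
          // free up memory and retry; otherwise set errno and give up.
          if (_callnewh(size) == 0)
          {
              errno = ENOMEM;
              return NULL;
          }
      }
  }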


ChatGPT 4o waffles a little bit and suggests the Microsoft Entertainment Pack (which is neither productivity software nor MS-DOS), but says at the end:

>If you're strictly talking about MS-DOS-only productivity software, there’s no widely known MS-DOS productivity app that officially had a built-in Connect Four game. Most MS-DOS apps were quite lean and focused, and games were generally separate.

I suspect this is the correct answer, because I can't find any MS-DOS Connect Four easter eggs by googling. I might be missing something obscure, but generally if I can't find it by Googling I wouldn't expect an LLM to know it.


ChatGPT in particular will give an incorrect (but unique!) answer every time. At the risk of losing a great example of AI hallucination, it's AutoSketch.

It's not shown fully, but note the game in the File menu: https://www.youtube.com/watch?v=kBCrVwnV5DU&t=39s


Wow, that is quite obscure. Even with the name I can't find any references to it on Google. I'm not surprised that the LLMs don't know about it.

You can always make stuff up to trigger AI hallucinations, like 'which 1990s TV show had a talking hairbrush character?'. There's no difference between 'not in the training set' and 'not real'.

Edit: Wait, no, there actually was a 1990s TV show with a talking hairbrush character: https://en.wikipedia.org/wiki/The_Toothbrush_Family

This is hard.


> There's no difference between 'not in the training set' and 'not real'.

I know what you meant but this is the whole point of this conversation. There is a huge difference between "no results found" and a confident "that never happened", and if new LLMs are trained on old ones saying the latter then they will be trained on bad data.


>> You can always make stuff up to trigger AI hallucinations

Not being able to find an answer to a made-up question would be OK; it's ALWAYS finding an answer with complete confidence that is the major problem.


I imagine asking for anything obscure where there's plenty of noise can cause hallucinations. What Google search provides the answer? If the answer isn't in the training data, what do you expect? Do you ask people obscure questions, and do you then feel better than them when they guess wrong?

I just tried:

  What MS-DOS program contains an easter-egg of an Amiga game?
And got some lovely answers from ChatGPT and Gemini.

As an aside, I personally would associate "productivity program" with a productivity suite (like MS Works), so I would have trouble googling an answer (I started as a kid on an Apple ][ and have worked with computers ever since, so my ignorance is not age- or skill-related).


The good option would be for the LLM to say it doesn't know. It's the making up answers that's the problem.


Interesting. Gemini 2.5 Pro considered that it might be "AutoCAD" but decided it was not:

"A specific user recollection of playing "Connect Four" within a version of AutoCAD for DOS was investigated. While this suggests the possibility of such a game existing within that specific computer-aided design (CAD) program, no widespread documentation or confirmation of this feature as a standard component of AutoCAD could be found. It is plausible that this was a result of a third-party add-on, a custom AutoLISP routine (a scripting language used in AutoCAD), or a misremembered detail."


I wouldn't worry about losing examples. These things are Mandela Effect personified. Anything that is generally unknown and somewhat counterintuitive will be Hallucination Central. It can't NOT be.


In what world is that 'productivity software'?

Sure, it helps you do a job more productively, but that describes roughly all non-entertainment software. And sure, it helps a user create documents, but, again, so does most non-entertainment software.

Even in the age of AI, GIGO holds.


"Productivity software" typically refers to any software used for work rather than entertainment. It doesn't mean software such as a todo list or organizer. Look up any laptop review and you'll find they segment benchmarks between gaming and "productivity". Just because you personally haven't heard of it doesn't mean it's not a widely used term.

https://en.m.wikipedia.org/wiki/Productivity_software

> Productivity software (also called personal productivity software or office productivity software) is application software used for producing information (such as documents, presentations, worksheets, databases, charts, graphs, digital paintings, electronic music and digital video). Its names arose from it increasing productivity


Debatable, but regardless, you could reformulate the question however you want and still won't get anything other than hallucinations, fwiw, since there are no references to this on the internet. You need to load up AutoSketch 2.0 in a DOS emulator and see it for yourself.

Amusingly, I get an authoritative but incorrect "It's AutoCAD!" if I narrow the question down to a program commonly used by engineers that had Connect Four built in.


> I might be missing something obscure, but generally if I can't find it by Googling I wouldn't expect an LLM to know it.

The Google index is already polluted by LLM output, albeit unevenly, depending on the subject. It's only going to spread to all subjects as content farms go down the long tail of profitability, eking out profits; Googling won't help because you'll almost always find a result that's wrong, as will LLMs that resort to searching.

Don't get me started on Google's AI answers, which assert wrong information, launder fanfic/Reddit/forum posts, and elevate all sources to the same level.


It gave me two answers (one was Borland Sidekick). When I asked "are you sure about that?", it waffled and said actually it was neither of those, it's IBM Handshaker. When I said "I don't think so, I think it's another productivity program", it replied that on further review it's not IBM Handshaker, and there are no productivity programs that include Connect Four. No wonder CTOs like this shit so much; it's the perfect bootlick.


If I can find something by Googling, I don't need an LLM to know it.


Any current question to an LLM is just a textual interpretation of the search results, though; they use the same source of truth (or lies, in many cases).


So, like normal history, just sped up exponentially to the point that it's noticeable not just within our own lifetimes (which it seemed to reach prior to AI), but maybe even within a couple of years.

I'd be a lot more worried about that if I didn't think we were doing a pretty good job of obfuscating facts the last few years ourselves without AI. :/


Just tried this with Gemini 2.5 Flash and Pro several times; it just keeps saying it doesn't know of any such thing and suggesting that it was a software bundle where the game was included alongside the productivity application, or that I'm not remembering correctly.

Not great (assuming such software actually exists), but not as bad as making something up.


AIs make knowledge work more efficient.

Unfortunately that also includes citogenesis.

https://xkcd.com/978/


ChatGPT's search function will probably find this thread soon and answer correctly; the HN domain does well on SEO and shows up in search results soon enough.


What is the correct answer?


AutoSketch for MS-DOS had Connect Four. It's under "Game" in the File menu.

This is an example of a random fact old enough that no one ever bothered talking about it on the internet. So it's not cited anywhere, but many of us can just plain remember it. When you ask ChatGPT (as of now, June 6th, 2025), it gives a random answer every time.

Now that I've stated this on the internet in a public manner, it will be corrected, but... there are a million such things I could give as an example: some question obscure enough that no one's answered it on the internet before, so the AI doesn't know, but recent enough that many of us know the answer, so we can instantly see just how much the AI hallucinates.


Here's a screenshot, btw, for anyone curious: https://imgur.com/a/eWNTUrC

To give some context, I wanted to go back to it for nostalgia's sake but couldn't quite remember the name of the application. I asked various AIs what application I was trying to remember, and they were all off the mark. In the end, only my own neurons finally lighting up got me the answer I was looking for.


Thanks for this fascinating example! AutoSketch is still downloadable ( https://winworldpc.com/product/autosketch/30 ). Then you can unzip it, and

  $ strings disk1.img | grep 'game'
  The object of the game is to get four
  Start a new game and place your first
So if ChatGPT cares to analyze all files on the internet, it should know the correct answer...

(edit: formatting)


> random fact old enough no one ever bothered talking about it on the internet. So it's not cited anywhere but many of us can just plain remember it.

And since it is not written down on some website, this fact will disappear from the world once "many of us" die.


Interestingly, the Kagi Assistant managed to find this thread while researching the question, but every model I tested (without access to the higher quality Ultimate plan models) was unable to retrieve the correct answer.

Here’s an example with Gemini Flash 2.5 Preview: https://kagi.com/assistant/9f638099-73cb-4d58-872e-d7760b3ce...

It will be interesting to see if/when this information gets picked up by models.


Interestingly, Copilot in Windows 11 claims that it was Excel 95 (which actually had a Flight Simulator Easter Egg).


Next time try asking which software has the classic quote by William of Ockham in the About menu.


Wait until you meet humans on the Internet. Not only do they make shit up, but they'll do it maliciously to trick you.





