They're reliable already if you change the way you approach them. These probabil...

kubb · 2025-06-19T11:18:02 1750331882

I also think they might never become reliable.

flir · 2025-06-19T11:25:51 1750332351

There is a bar below which they are reliable.

"Write a Python script that adds three numbers together".

Is that bar going up? I think it probably is, although not as fast/far as some believe. I also think that "unreliable" can still be "useful".

diggan · 2025-06-19T11:24:53 1750332293

But what does that mean? If you tell the LLM "Say just 'hi' without any extra words or explanations", do you not get "hi" back from it?

TeMPOraL · 2025-06-19T11:28:53 1750332533

That's literally the wrong way to use LLMs though.

LLMs think in tokens: the fewer they emit, the dumber they are. So asking them to be concise, or to give the answer before the explanation, is extremely counterproductive.

diggan · 2025-06-19T11:32:51 1750332771

I was trying to make a point regarding "reliability", not a point about how to prompt or how to use them for work.

TeMPOraL · 2025-06-19T11:48:06 1750333686

This is relevant. Your example may be simple enough, but for anything more complex, letting the model have its space to think/compute is critical to reliability - if you starve it for compute, you'll get more errors/hallucinations.

diggan · 2025-06-19T12:04:01 1750334641

Yeah I mean I agree with you, but I'm still not sure how it's relevant. I'd also urge people to have unit tests they treat as production code, and proper system prompts, and X and Y, but it's really beyond the original point of "LLMs aren't reliable" which is the context in this sub-tree.

kubb · 2025-06-19T12:49:40 1750337380

Sometimes I get "Hi!", sometimes "Hey!".

diggan · 2025-06-19T12:57:10 1750337830

Which model? I just tried a bunch: ChatGPT, OpenAI's API, Claude, Anthropic's API, and DeepSeek's API with both chat and reasoner. Every single one replied with a single "hi".

throwdbaaway · 2025-06-19T13:47:39 1750340859

o3-mini-2025-01-31 with high reasoning effort replied with "Hi" after 448 reasoning tokens.

gpt-4.5-preview-2025-02-27 replied with "Hi!"

diggan · 2025-06-19T15:20:06 1750346406

> o3-mini-2025-01-31 with high reasoning effort replied with "Hi" after 448 reasoning tokens.

I got "hi", as expected. What is the full system prompt + user message you're using?

https://i.imgur.com/Y923KXB.png

> gpt-4.5-preview-2025-02-27

Same "hi": https://i.imgur.com/VxiIrIy.png

throwdbaaway · 2025-06-20T03:38:36 1750390716

Ah right, my bad. Somehow I thought the prompt was only:

    Say just 'hi'

while the "without any extra words or explanations" part was for the readers of your comment. Perhaps kubb also made a similar mistake.

I used an empty system prompt.
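For anyone who wants to compare the two prompts more systematically than one-off screenshots, here is a rough sketch of how I'd count replies over repeated trials. Everything in it is an assumption for illustration: `ask` is a hypothetical callable that wraps whatever model API you use, and `make_fake_ask` is a canned stand-in so the script runs offline. It is not real model output.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Iterable


def reply_distribution(ask: Callable[[str], str], prompt: str, trials: int = 20) -> Counter:
    """Send the same prompt `trials` times and count each distinct reply."""
    return Counter(ask(prompt) for _ in range(trials))


def exact_match_rate(counts: Counter, expected: str) -> float:
    """Fraction of replies that equal `expected` character for character."""
    total = sum(counts.values())
    return counts[expected] / total if total else 0.0


def lenient_match_rate(counts: Counter, expected: str) -> float:
    """Same as above, but ignoring case, surrounding whitespace, and trailing '!' or '.'."""
    def normalize(text: str) -> str:
        return text.strip().strip("!.").lower()

    total = sum(counts.values())
    hits = sum(n for reply, n in counts.items() if normalize(reply) == normalize(expected))
    return hits / total if total else 0.0


def make_fake_ask(replies: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: cycles through canned replies."""
    canned = cycle(replies)
    return lambda prompt: next(canned)


if __name__ == "__main__":
    # Replace `ask` with a real API wrapper to measure an actual model.
    ask = make_fake_ask(["hi", "Hi!", "Hey!", "hi"])
    counts = reply_distribution(ask, "Say just 'hi'", trials=8)
    print(counts)
    print("exact:", exact_match_rate(counts, "hi"))
    print("lenient:", lenient_match_rate(counts, "hi"))
```

Reporting both rates matters here: "Hi!" fails an exact match against "hi" but passes the lenient one, so whether a model counts as "reliable" depends on which bar you pick, and on sending the full prompt rather than just the first three words.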