If you just say, "here is what the LLM said," then if it turns out to be nonsense you can say something like, "I was just passing along the LLM response, not my own opinion."
But if you take the LLM response and present it as your own, at least there is slightly more ownership over the opinion.
This is kind of splitting hairs but hopefully it makes people actually read the response themselves before posting it.
Taking ownership isn't the worst instinct, to be fair. But that's a slightly different formulation.
"People are responsible for the comments that they post no matter how they wrote them. If you use tools (AI or otherwise) to help you make a comment, that responsibility does not go away"
> At the very least the copy paster should read what the llm says, interpret it, fact check it, then write their own response.
Then write their own response, using an AI to improve the quality of the response? The implication here is that an AI user is going to do some research, when using the AI was their research. To do the "fact check" you suggest would mean doing actual work, and reaching for the AI in the first place is a pretty clear sign that's not something the user is up for.
So, to me, your suggestion is fantasy-level thinking.
I have a client who does this (pastes it into text messages, as if it will help me solve the problem they're asking me to solve) and I'm like, "That's great, I won't be reading it." You have to push back.
That's because "certain" and "know the answer" have wildly different definitions depending on the person, so you need to be more specific about what you actually mean by that. Anything that can be ambiguous will be treated ambiguously.
Anything you've mentioned in the past (like `no nonsense`) that still exists in the context will have a higher probability of being generated than other tokens.
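You can see that effect directly if your API exposes token log probabilities. A minimal sketch, assuming the OpenAI Python SDK and an OpenAI-compatible endpoint (the model name and prompts here are just placeholders):

```python
# Sketch only: inspect per-token log probabilities to see how text already
# in the context skews generation. Assumes the OpenAI Python SDK; the model
# name and prompts are placeholders, not a recommendation.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer in a no nonsense style."},
        {"role": "user", "content": "Describe your answering style in two words."},
    ],
    logprobs=True,      # return log probabilities for each generated token
    top_logprobs=5,     # plus the top alternative tokens at each position
    max_tokens=10,
)

# Tokens from phrases already present in the context (like "no nonsense")
# tend to show up high in these alternatives.
for tok in resp.choices[0].logprobs.content:
    alts = [(alt.token, round(alt.logprob, 2)) for alt in tok.top_logprobs]
    print(tok.token, alts)
```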
> By the numbers, Star Wars is far more grounded as science fiction than Star Trek, but people will insist the former is at best merely "science fantasy." It's really all just vibes.
Worked with it a bit last night! Seems quick. I did run into the same problem I often have with Gemini, where the response says something like "I need to do x" or "I did x" and then nothing actually happens. The agent seems to think it actually finished the task, but it stops partway.
But I'm sure they will sort that out, as I don't have that issue with other Anthropic models.
This is interesting. I’ve had this same issue trying to build an agentic system with the smaller ChatGPT models, almost no matter the prompt (“think aloud” are magic words that help a lot, but it’s still flaky). Most of the time it would either perform the tool call before explaining it (the default) or explain it but then not actually make the call.
I’ve been wondering how Cursor et al solved this problem (having the LLM explain what it will do before doing it is vitally important IMO), but maybe it’s just not a problem with the big models.
Your experience seems to support the idea that smaller models are just generally worse at tool calling (were you using Gemini Flash?) when asked to reason first.
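For what it's worth, the closest thing to a fix I've seen is prompting for the narration and then rejecting bare tool calls in the agent loop. A rough sketch, assuming the OpenAI Python SDK; the model name, tool definition, and prompt wording are illustrative guesses, not anything Cursor actually does:

```python
# Rough sketch: ask the model to explain before each tool call, then retry
# the turn if it returns a tool call with no accompanying explanation.
# Assumes the OpenAI Python SDK; model, tool, and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_search",
        "description": "Search the web for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

SYSTEM = (
    "Think aloud: before every tool call, write one or two sentences "
    "explaining what you are about to do and why."
)

def step(messages, max_retries=2):
    msg = None
    for _ in range(max_retries + 1):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": SYSTEM}] + messages,
            tools=TOOLS,
        )
        msg = resp.choices[0].message
        # Accept the turn if there is no tool call (a plain answer), or if
        # the tool call arrives together with some explanatory text.
        if msg.tool_calls is None or (msg.content and msg.content.strip()):
            return msg
    return msg  # give up and take whatever the last attempt produced
```

The opposite failure (explaining but never making the call) would need its own check, e.g. re-prompting when a turn announces a tool call but msg.tool_calls comes back empty.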
But no one uses LLMs like this. This is the type of simple fact you could just Google and check yourself.
LLMs are useful for providing answers to more complex questions where some reasoning or integration of information is needed.
In these cases I mostly agree with the parent commenter. LLMs often come up with plausibly correct answers, but when you ask them to cite sources they seem to just provide articles vaguely related to what they said. If you're lucky, one of them might directly address what the LLM claimed.
I assume this is because what LLMs say is largely just made up, so when you ask for sources the model has to retroactively find sources to justify what it said, and it often fails and just links something which could plausibly back up its plausibly true claims.
I do, and so does Google. When I googled "When was John Howard elected?" the correct answer came back faster in the AI Overview than I could find the answer in the results. The source the AI Overview links even provides confirmation of the correct answer.
Yeah, but before AI Overviews, Google would have shown the first search result with a text snippet quoted directly from the page and the answer highlighted.
That's just as fast as (or faster than) the AI Overview.
The snippet included in the search result does not include or highlight the relevant fact. I feel like you’re not willing to take simple actions to confirm your assertions.
When I searched, the top result was Wikipedia with the following excerpt: “At the 1974 federal election, Howard was elected as a member of parliament (MP) for the division of Bennelong. He was promoted to cabinet in 1977, and…”
To me this seemed like the relevant detail in the first excerpt.
But after more thought I realize you were probably expecting the date of his election as prime minister, which is fair! That's probably what searchers would be looking for.