> I'm personally not very impressed by the AI tools I've used. Sure they're a neat toy. They do seem to keep getting better. Maybe they'll be good enough one day for me to actually want to use them and feel it's a benefit.
Unless you explain this statement, most people here are likely to dismiss everything you have to say on the topic of AI.
We all know exactly the kinds of bad outputs we've seen from AI.
I just ran a query asking ChatGPT to recommend databases for a specific use case. Out of the seven it recommended, only one was actually appropriate; one suggestion was marginally acceptable; three of the recommendations weren't even databases.
I then asked it to provide a list of the most important battles in North Africa prior to the entry of the United States into World War 2.
It gave me five answers, three of which occurred after the entry of the United States into World War 2.
AI provides extremely plausible answers. Sometimes it will actually generate correct, useful output, but you cannot yet rely on it for correctness.
I'd like to see a side by side comparison with a random human on the street. Maybe with a sample size of 100 or so. How well do you think the humans would do vs whatever outdated model you were playing with here?
There is clearly significant value to this tech and I'm still dumbfounded how strongly some people try to deny it.
Anyone reckon there's a chance that GPT hallucinates because it was trained on online material (e.g. Reddit and other forums)? I'd have to say that on topics I know, GPT is about as hit-or-miss as a random internet comment, especially in that both will give a confidently stated answer whether the answer is factual or not.
Is it possible GPT just thinks[0] that any answer stated confidently is preferable to not giving an answer at all?
Promise I'm not just being snarky, legitimate wonder!
[0]: I know it doesn't actually think, you know what I mean
You're judging a fish by its ability to climb a tree. Being able to recall facts is a nice side effect for LLMs, not their bread and butter. If you need facts, plug some RAG into it.
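For what it's worth, "plug in some RAG" roughly means retrieving relevant documents first and letting the model summarize them, rather than relying on its recall. A minimal sketch, assuming hypothetical `vector_store` and `llm` objects (the names are illustrative, not any particular library's API):

```python
# Minimal retrieval-augmented generation sketch. `vector_store` and `llm`
# are hypothetical stand-ins for whatever search index and model API you use.
def answer_with_rag(question, vector_store, llm, k=5):
    # Retrieve the k documents most similar to the question.
    docs = vector_store.search(question, top_k=k)

    # Put the retrieved text into the prompt as grounding context.
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # The model paraphrases retrieved facts instead of recalling them.
    return llm.complete(prompt)
```

The point is that the facts come from the retrieval step; the LLM is only doing the language part.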
Well, learning and conversing are different things done for different reasons.
I'll agree that GPT-4 has completely replaced Google, Stack Overflow, etc. for me.
The only time I use Google now is for jankier, more human-like situations.
For example, today I had to transfer DC roles from a Windows Server 2012 R2 box to a new Windows Server 2022 one. They only have one DC, and the old DC has a basically unused CA service set up.
ChatGPT would have had me "fix" everything first, whereas I found a forum post with a situation almost identical to mine that helped me cowboy it rather than being overly meticulous.
There is still value to human experience. For now.
fwiw, I agree with them. It has its use cases, but 'hallucinations', or whatever you want to call them, are a huge dealbreaker for everything I'd want to use AI for.
Agreed, but in my opinion the problem is more fundamental than just hallucinations: it involves plain inaccuracy and an inability to reason.
Try asking ChatGPT or Gemini about something complex that you know all about. You’ll likely notice some inaccuracies, or it treating one related subject as more important than another. That’s not even scratching the surface of the weird things they do in the name of “safety”, like refusing to do work, paying lip service to heterodox opinions, or injecting hidden race/gender prompts into submodels.
It’s good at generalist information retrieval to a certain degree. But it’s basically like an overconfident college sophomore majoring in all subjects. Progressing past that point requires a completely different underlying approach to AI because you can’t just model text anymore to reason about new and unknown subjects. It’s not something we can tweak and iterate into in the near term.
This same story has recurred after every single ML advance from DL, to CNN + RNN/LSTM, to transformers.
> Agreed, but in my opinion the problem is more fundamental than just hallucinations: it involves plain inaccuracy and an inability to reason.
> Try asking ChatGPT or Gemini about something complex that you know all about. You’ll likely notice some inaccuracies, or it treating one related subject as more important than another. That’s not even scratching the surface of the weird things they do in the name of “safety”, like refusing to do work, paying lip service to heterodox opinions, or injecting hidden race/gender prompts into submodels.
For sure.
On the other hand, I recently started turning these AI hallucinations into a feature: it's like asking a person who is somewhat smart, but high on some hallucinogenic drug, for their opinion on a topic of your interest. Depending on the topic and your own intellectual openness, the result can be ... interesting and inspiring.
Generally agree with the GP, and am curious what use-cases you've found where AI meaningfully improves your daily work.
I've found two so far: the review summaries on Google Play are generally quite accurate, and much easier than scrolling through dozens of reviews, and the automatic meeting notes from Google Meet are great and mean that I don't have to take notes at a meeting anymore.
It did okay at finding and tabulating a list of local government websites, but had enough of an error rate (~10%) that I would've had to go through the whole list to verify its factualness, which defeats a lot of the time savings of using ChatGPT.
Beyond that: I tried ChatGPT vs. Google Search when I had what turned out to be appendicitis, asking about symptoms, and eventually the 5th or so Google result convinced me to go in. If I had followed ChatGPT's "diagnosis", I would be dead.

I've tried to have ChatGPT write code for me; it works for toy examples, but anything halfway complicated won't compile half the time, and it's very far from having maintainable structure or optimal performance. It basically works well if your idea of coding is copying StackOverflow posts, but that was never how I coded.

I tried getting ChatGPT to write some newspaper articles for me; it created cogent text that didn't say anything. I did some better prompting, telling it to incorporate some specific factual data. It did this well, but looking up the factual data is most of the task in the first place, and its accuracy wasn't high enough to automate that with confidence.
Bard was utter crap at math. ChatGPT is better, but Wolfram Alpha or just a Google Search is better still.
In general, I've found LLMs to be very effective at spewing out crap. To be fair, most of the economy and public discourse involves spewing out crap these days, so to that extent it can automate a lot of people's jobs. But I've already found myself just withdrawing from public discourse as a result - I invest my time in my family and local community, and let the ad bots duke it out (while collecting a fat salary from one of the major beneficiaries of the ad fraud economy).
I recognize your username, so I know you've been around for a while (and are you a xoogler who for a time banged the drum on the benefits of iframes, or am I confusing you with a similar username?), and I'm kind of surprised at your lukewarm take on LLMs.
I agree they hallucinate and write bad code and whatever, but the fact that they work at all is just magical to me. GPT-4 is just an incredibly good, infinitely flexible, natural language interface. I feel like it's so good people don't even realize what it's doing. Like, it never makes a grammatical mistake! You can have totally natural conversations with it. It doesn't use hardcoded algorithms or English grammar references, it just speaks at a native level.
I don't think it needs to be concretely useful yet to be incredible. For anyone who's used Eliza, or talked to NPCs, or programmed a spellchecker or grammar checker, I think it should be obviously incredible already.
I'm not sold on it being a queryable knowledge store of all human information yet, but it's certainly laying the groundwork for the inevitable future of interacting with technology through natural language, as a translation layer.
> GPT-4 is just an incredibly good, infinitely flexible, natural language interface.
An interface from which it's incredibly difficult to get consistent output. As far as I know, we have not found a way to make it do even basic tasks parsed from natural language without an error rate that's prohibitive for most use cases. It's amazing that it can produce pretty believable-looking text, but it's abundantly clear that there's no reasoning behind that text at all.
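To make "difficult to get consistent output" concrete: even for a simple extraction task you typically end up wrapping the model in validate-and-retry scaffolding like the sketch below. This is only an illustration under assumptions; `call_model` is a hypothetical stand-in for whatever API you actually use.

```python
import json

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for an actual LLM API call.
    raise NotImplementedError

def extract_record(text: str, max_retries: int = 3) -> dict:
    """Ask for JSON, then validate and retry, because the raw output
    is not reliably well-formed or schema-conforming."""
    prompt = (
        "Extract the product name and license from the text below. "
        'Reply with JSON only, e.g. {"name": "...", "license": "..."}.\n\n'
        + text
    )
    for _ in range(max_retries):
        reply = call_model(prompt)
        try:
            data = json.loads(reply)
            if {"name", "license"} <= data.keys():
                return data
        except json.JSONDecodeError:
            pass  # malformed output; ask again
    raise ValueError("no valid output after retries")
```

Even with that scaffolding, you've only constrained the shape of the answer, not its correctness.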
The other day I planned out a cloud-to-on-prem migration of an entire environment, from cost analysis to step-by-step checklists. In about 2 hours I had a ~50-page runbook that would have taken me at least a week coming from my own brain and fingertips.
Here is my initial draft chat session. From there I feed it parts of that initial output; it gets something down on the page immediately, and then I revise, both by hand and by feeding portions into new chat sessions, etc.
Good reminder that no social media platform is a monolith. Trying to speak as the voice of a platform typically gets you egg on your face, especially when you're being dismissive towards someone else.
You’ll find people who claim to have doubled their productivity from ChatGPT and people who think it’s useless here.