I am puzzled by all the hot takes stating that ChatGPT (and other similar tools) outputs garbage and is being "discredited" when I have personally found it to be an incredibly useful tool that I have been using daily pretty much since it was first released. The only people being discredited, in my mind, are the ones who rush to be contrarian about something they don't actually seem to understand and haven't taken any time to use.
There seem to be two kinds of people: ones who see the mistakes ChatGPT makes and the ones who don’t.
If I were a scam artist today I would be very happy to find people of the second kind to victimize.
Secondly, I'd be very concerned that the second kind of person is working at the hospital dispensing drugs, or on the beat as a cop, or in any kind of job where mistakes have consequences. They're not just going to make mistakes and let other people's mistakes go by; they're going to deny them and make excuses for them.
If ChatGPT has a core competence, it is that people give it the deference they give to high-status people, who are used to saying whatever comes into their head and having everyone just nod. I think of "The Emperor's New Clothes," where the Emperor pulls off something that not everybody could.
These concerns converge in 419 scams, which use deliberately poor grammar and spelling to filter for the second kind of person. Amazingly, there are people with $10 million to lose who fall for this! I wonder if there is somebody evil, bright, hard-working and ambitious who is RLHF-training a model right now to run romance scams.
I'm acutely aware of ChatGPT's mistakes, have spent lots of time investigating them, and still find immense value in the tool.
While it's true that many less technical people will end up treating it as an oracle, let's not conclude that it's impossible to understand what it is and use it productively despite its flaws.
I’ve frequently been the ‘final assembly programmer’ for projects that were started by some fresher and that looked, to management, 90% done, but were unsound underneath, and that last 10% took 50% or more of the work.
There is nothing I find more fatiguing than pushing a bubble around underneath a rug in a system that eventually has to be carefully taken apart and inspected bit by bit, with a huge amount of work put into figuring out what exactly is wrong with it and fixing it.
I’m afraid the ChatGPT enthusiast is going to walk away from a looming disaster thinking they and ChatGPT are so brilliant, probably never realize the harm they did, and if they do they’ll be contemptuous of the low-status ‘grinds’ who take so long to make things right.
> There is nothing I find more fatiguing than pushing a bubble around underneath a rug in a system that eventually has to be carefully taken apart and inspected bit by bit
Couldn't agree more
> I’m afraid the ChatGPT enthusiast is going to walk away from a looming disaster thinking they and ChatGPT are so brilliant
Also agree.
But none of this detracts from ChatGPT's value or incredible power.
It just means that it will also cause many problems. Maybe once you net everything out the balance sheet is negative. I think you could make that argument for social media, for example. But the point is moot, because it's here, and its influence is only beginning.
> There seem to be two kinds of people: ones who see the mistakes ChatGPT makes and the ones who don’t
Agreed; the divide quickly became apparent once I found myself entering the 'seeing its rough edges' camp.
Once you've _tried_ breaking it and can see the limits of an LLM, you'll still find yourself wishing for a true AI assistant, but knowing what is and isn't possible makes it easier to manage your expectations of new technologies.
I'm still generally an optimist about these new techs because they're still very cool and potentially useful interfaces to existing technologies, but I agree with you that it's too easy to get caught up deferring to it like it's some techie oracle (and that the tendency for people to want to do so is concerning).
I used to think people wouldn't "simply accept" the types of systems in _Minority Report_ or _Psycho-Pass_, to the extent that I found it to be immersion-breaking. But, concerningly, it seems folks are happy and willing to extend that deference (in what's probably a well-studied sociological phenomenon I'm simply unaware of the term for). Scary stuff.
I'm one of the ones who sees ChatGPT's mistakes. It makes lots of them. Some are very dumb.
But the idea that it's a fraud or a scam or whatever label the naysayers choose to put on it is just wrong. It makes mistakes, but it also provides a lot of value when used correctly.
Consider the many use cases where you don't need to rely on its "knowledge" at all - just this morning, I used it to write some marketing emails. I've got a paragraph about the company that I give it, plus some specifics about the particular email, and it knocks out something that's entirely useable in seconds.
There are plenty of tasks like this, where it's creating something based on your input and is immediately and clearly useful. Just because there are use cases where it fails doesn't make it a scam.
I would wager that the recipients of marketing emails do not consider them valuable. The prospect that in the future, these will not even have human creativity flowing into them just adds insult to injury.
I've got a few thousand people who have signed up for them and haven't yet unsubscribed. Whenever I send out a marketing email to that list, a number of people immediately make purchases.
Why would that be the case if people didn't consider them valuable?
When my "genius" boss first heard about ChatGPT etc., he was convinced that this would end up replacing all the developers in our org. I told him that the more likely COA was that it would replace all the middle managers who just collate TPS reports to push to the VPs.
I think of the story from Lem’s Cyberiad about the eight-story-high ‘thinking machine’ that insisted 2+2=5, got angry like the Bing chatbot when it was told it was wrong, then broke loose from its foundations and chased Trurl and Klapaucius into the mountains.
People will learn the hard way there is no market for machines that get the wrong answers. There are plenty of places where people will accept one kind of imperfection or another and that’s fine, but when it comes to an accounting bot that screws up your taxes it is not fine.
(Funny I have seen a few chatbots that claim they are busy as soon as you tell them they are wrong about something and I wonder if that is because they’ve been trained on many examples where things really went south when somebody called out somebody’s mistake.)
>People will learn the hard way there is no market for machines that get the wrong answers.
Eh, there are markets for machines that occasionally get it wrong, as long as the defect rate, or the cost of those defects, is lower than the human equivalent. So I'd say that's a really bad take on the last few hundred years of industrialization.
In accounting there are really two different sets of things going on at once. There are 'the numbers,' which we'd run through a calculator, but then there is the interpretation of the written rules in relation to the numbers you input. If your business is in any way complex, take your numbers to 3 different firms and see if even two come back with anything close to the same number. Hell, in the same damned firm we commonly see that two auditors will come up with different numbers and a supervisor has to look at the rule in question and make a judgement call on what they think the IRS agent would accept.
Now, don't confuse that with me thinking that using ChatGPT is a good idea to do the above. We are not there yet.
I share your reflections on the current state and foreseeable effectiveness of AI, but
> People will learn the hard way there is no market for machines that get the wrong answers.
I fear this is wildly untrue.
To take your example: The wealthy won't use the accounting bot. Everyone else won't have the time/energy/means to recoup whatever it cost them.
In software development one only needs look around to see a world of "markets" - that is, profitable opportunities - for wrong answers (bad designs, useless products, orders of magnitude of inefficiency).
On the one hand, GPT and AI are immediately usable and arguably quite cool to the layman. It's not a perfect system, but it can write Bash scripts and handle your resignation email. Anyone who does copywriting or programming for a living will probably find a couple of uses for it. It's fairly cheap, and it's definitely "worth it" more than analogs like cryptocurrency.
...on the other hand, though, AI is kinda a bubble. It's our nature to put our hopes and dreams (and money) into the most-promising fields of computing; that's fine. But AI truly has limited application - anything that requires accountability, determinism or trust is a bad fit for AI. When you really whittle down the places you can apply AI, it looks a bit like those 4D theaters from the early 2000s - a cool concept for some stuff, but experience-ruining for others. The eager over-application of AI will be the bubble, not the technology itself.
The Bing stuff can get very wild. But I think the reason they aren't making it widely available yet (and won't for a while) is that it's just a test for "ChatGPT that can search the web," the same way that ChatGPT was (is?) a public test.
Sure, it's an unending stream of things, but here are some recent examples from my history:
-Had it extract some gnarly string manipulation logic in a MySQL WHERE clause into a function. It wrote clean, well-commented code, and even explained what a regex did in a comment.
-Had it explain to me how macOS plist files work, how to view them, and how to edit them (roughly the sort of thing sketched in the snippet after this list)
-Asked it what some lines in an ssh config file did and it clearly explained it to me
-Dropped in some XML from a Spring applicationContext file that was doing something I didn't quite understand and it explained it to me. Asked how I would change something about it and it told me how, explained how it worked, and provided example XML
-Gave it a SQL Server CREATE TABLE statement and asked it to create the equivalent CREATE TABLE statement for MySQL 8, which it successfully did
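To give a flavor of the plist one: what it walked me through amounts to something you can also do with Python's stdlib plistlib. This is just my own minimal sketch, not the ChatGPT output, and the path is only an example:

```python
# Reading a macOS property list with Python's standard-library plistlib.
# plistlib.load() handles both the XML and binary plist formats.
import plistlib

path = "/Applications/Safari.app/Contents/Info.plist"  # example path only

with open(path, "rb") as fp:
    data = plistlib.load(fp)  # parsed into plain dicts/lists/strings

for key, value in data.items():
    print(key, "=", value)
```

(plistlib.dump() writes one back out, which covers the editing half.)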
My only complaint has been how sometimes the system has been unavailable. I now pay for the $20/month subscription and that has been resolved.
Thank you for those examples. Converting SQL manipulation into a function is an excellent example. You can do a lot in a few CASE statements, especially with nesting, but translating that into something else is valuable.
I asked it to write some queries in relational algebra and it was close, but invalid in ways that a non-database person wouldn't catch. It kind of looked right.
There exist problems where finding the solution is hard but verifying the solution is easy. Example in the general case: factoring numbers into primes.
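A minimal sketch of that asymmetry (my own toy Python, nothing to do with any LLM): finding the factors takes a search, while checking a claimed factorization is basically one multiplication (a full check would also test each claimed factor for primality):

```python
# Finding factors is a search; verifying a claimed factorization is
# a single multiplication (plus primality checks, omitted here).

def find_factors(n: int) -> list[int]:
    """The hard direction: trial division, slow for large n."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def verify_factors(n: int, factors: list[int]) -> bool:
    """The easy direction: multiply the claim back together."""
    product = 1
    for f in factors:
        product *= f
    return product == n

n = 2**32 + 1                      # = 641 * 6700417
claimed = find_factors(n)          # the slow search
print(verify_factors(n, claimed))  # the instant check -> True
```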
Are you claiming that for all possible problems posed, verifying the correctness of the output of an LLM is of equal or higher difficulty than solving the problem itself? If so, that seems like a claim more arising from emotion than reason.
In the general case, you can have it generate code in a proof based language that "proves" the code is correct against formal specification. Unless you consider math itself to be "bullshit" too.
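To make that concrete, here's my own minimal sketch of what "code plus machine-checked spec" looks like, in Lean 4 (assuming a recent toolchain where the omega tactic is available; this toy is mine, not generated):

```lean
-- A toy function together with its specification. If the file
-- compiles, the proof has been machine-checked: a reviewer only has
-- to trust the checker and read the theorem statement.
def double (n : Nat) : Nat := n + n

theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega  -- linear arithmetic discharges n + n = 2 * n
```

The point isn't this trivial theorem; it's that checking the proof is mechanical even when producing it was the hard part.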
The people who are skeptical that it can be easier to verify correctness than to produce correct work might as well try to prove P=NP (which is the theoretical and formal statement of such).
They also probably never asked people to work on a task either (hint: people can get things wildly wrong, even generally competent ones)
If it is a well-known problem, I might as well find the solution on Stack Overflow, where the actual humans that trained this system discuss it.
As for proof based languages, can it actually do that? Have you tried?
Within the constraints of technical systems, it's pretty obvious. It's not a topic that you can really bullshit about - things work or they don't. It makes mistakes sometimes, and what it says won't always be consistent with other things it says or with what you know to be true. If you ask it if it's sure, it often will correct itself. Worst case, you verify with other sources and/or try it out.
It's not really any worse than asking someone knowledgeable about a subject face-to-face. They know a lot, but they can be mistaken about things. At some point you will tease out the misunderstanding.
The best way to understand the value of the tool is to try it for yourself.
My partner is looking for a new job, and ChatGPT has been an amazing help in writing the cover letters that recruiters can't stop asking for. Previously, she would spend hours molding her previous cover letter to the job description. Now she can just give the job description and a bullet point list of her relevant experiences, and spend 15 minutes on the ChatGPT output to polish it a little bit and correct any potential misstatements.
As they should be! I do not know why they are not already, but as I said recruiters can't seem to stop asking for them. But be assured that I won't mourn them once they are gone, and if ChatGPT gets us there faster then I am grateful for it.
Edit: on second thought, I believe "Editing ChatGPT output" might be a valuable job skill in the future, and maybe a good cover letter would give recruiters some signals in that way.
I tried using ChatGPT to write me a cover letter and targeted resume given my full resume and the job description. It gave me the cover letter and the resume - but it made stuff up to match the job description. I have a master's degree but it gave me a PhD, for example (I'm not sure why it did this, as the job requirements were Masters or PhD; I guess it figured a PhD was better). I'm pretty sure I would have easily gotten an interview for the position based on that cover letter + resume, as it was essentially a perfect fit to the job description, but the interview would've gone badly.
That's what the 15-minute review process afterwards is for: at least for a cover letter, the factual errors are glaring enough that they can be caught easily, and simple enough that they can be edited out in that time. It's not at the stage where I can write my partner a job application bot that scrapes job boards and fires off 50,000 applications per second, but the productivity gain is obvious.
This, to me, is the fallacy in the anti-ChatGPT argument, but at the same time I have to acknowledge the real danger. I think AI paired with a human, as you are describing, can be more effective and efficient than either on their own. However, I've never seen it stay that way in commercial applications. So I'm afraid that the smallest drive towards cost-cutting will lead to removal of the human in the loop, or at least tighten throughput expectations so much that the human intervention is in practice more often missing than not.
Not sure what to do about this. I want ChatGPT as a tool available to me, but I don't want any service provider I interact with to use it because I know they'll fuck it up eventually and I'll get worse service.
> I want ChatGPT as a tool available to me, but I don't want any service provider I interact with to use it because I know they'll fuck it up eventually and I'll get worse service.
This is just straightforward selfishness and narcissism.
No software developer should be so naive to assume they'll always do the right thing with a powerful but incomprehensibly dangerous tool.
You know how many minutes I "spend molding [my] previous cover letter to the job description"? 15 minutes. If you spend more than that, you are clearly doing it wrong.
Not them, but a short list based on my chat history:
- Explained why a Python variable in a closure was saying it was not assigned, after several minutes of googling/SO failed me (needed the nonlocal keyword, different from JS; see the sketch after this list)
- Explained the difference in several slightly-different C++ lists, which I couldn't google because the differences were mostly special chars
- Wrote working PowerShell code to output a list that has every string in $list1 that is not in $list2, on the first try, after I struggled for over an hour to do the same (and Google/SO failed me).
- Explained the difference between two .bat files for the Intel Fortran compiler after Google failed me
- Generated some really cool D&D statblocks
- Unbeknownst to us players, generated a really creative heist dungeon that our DM ran us through (he thanked ChatGPT as his "co-DM" at the end). This one is particularly notable because the man works at a hiring agency; he's not exactly a techie.
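On the nonlocal one, a minimal repro of the gotcha (my own toy example, not the original code):

```python
# Assigning to a name inside a nested function makes Python treat it
# as a new local, so the closure raises UnboundLocalError unless the
# name is declared nonlocal (JS assignment, by contrast, reaches the
# enclosing scope by default).

def make_counter():
    count = 0
    def increment():
        nonlocal count   # remove this line -> UnboundLocalError
        count += 1
        return count
    return increment

counter = make_counter()
print(counter(), counter())  # 1 2
```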
> after several minutes of googling/SO failed me
> and google/SO failed me
This is what I've been noticing as well. There have been many times when I spend 5-10 minutes trying to figure something out with Stack Overflow and Google, not getting anywhere, and then ChatGPT immediately gives me the answer. I'm not sure how anyone who considers Stack Overflow useful would think that ChatGPT is useless.
Do I need to verify the answers? Sure. But I have to do the same thing with answers on Stack Overflow. Even if I'm using official documentation, I need to verify that what I think it's saying is what it's actually saying (particularly when it's as unclear as it often is).
What's even more interesting is that I keep running into people who are finding completely different uses for it.
The critique here is like a lot of other critiques I've seen. They seem to come from people who haven't spent much, if any, time working with it, but have read some articles about how it's not infallible and decided that meant it was junk (though this piece goes the extra mile when it hints that big GPU is behind it all). What's striking to me is the lack of intellectual curiosity on display.
+1 for the DMing. I mostly use it to generate data for my dev/test tables, but the two times I used it for DMing, I gained hours. It also works live, when the players go somewhere really unexpected (it wasn't my campaign).
Tangent:
Google is less and less useful (as is SO for 'old' languages). I look into a mix of Reddit, GitHub issues, GitHub code, and SO depending on the issue. Hopefully this advice helps you if ChatGPT/Copilot can't.
If you treat it as a supercharged search engine over a compressed snapshot of the internet in 2021, then it's quite useful. If there's ever a function I forgot, but I know how to explain what it does in natural language, ChatGPT most of the time can find me what I'm looking for. On some more obscure bugs, or if I'm sorting through a new codebase, ChatGPT can help me out from time to time to understand new topics.
Of course, we shouldn't rely on ChatGPT. It has given me wrong and insecure code before. However, it's a nice tool to have around
>Of course, we shouldn't rely on ChatGPT. It has given me wrong and insecure code before. However, it's a nice tool to have around
You 100% should verify any code generated by ChatGPT - but this goes for any code found off the internet. I have come across bad Stack Overflow code committed to codebases before.
I think what's making these hyperinformed hallucinators useful for coding right now is the fact that fact-checking is fundamentally a big part of our work. There's a bare minimum that must be done to even believe anything has been accomplished at all - that it runs and does what we expected on the most basic input - but then we are also used to writing tests and reading code critically.
I wonder what a learning model that could execute its code in an interpreter and learn from the feedback would look like. I know there are different groups testing things like integrating the ability for LLMs to use tools; I'm wondering how this will pan out in the end.
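No idea how those groups actually wire it up, but the shape of the loop is easy to sketch. In this toy Python, generate_code is a hypothetical stand-in for the model call; only the subprocess plumbing is real:

```python
import subprocess
import sys

def generate_code(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; not a real API."""
    raise NotImplementedError

def run_with_feedback(prompt: str, max_attempts: int = 3) -> str | None:
    """Generate code, execute it, and feed interpreter errors back in."""
    for _ in range(max_attempts):
        code = generate_code(prompt)
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=30,
        )
        if result.returncode == 0:
            return code  # ran cleanly; a real loop would also run tests
        # Append the traceback so the next attempt can try to fix it.
        prompt += f"\nThe previous attempt failed with:\n{result.stderr}"
    return None
```

Executing model-generated code like this obviously wants a sandbox; the bare subprocess call here is just the simplest illustration.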