
It's so strange to me that in a forum full of programmers, people don't seem to understand that you set up systems to detect errors before they cause problems. That's why I find ChatGPT so useful for helping me with programming - I can tell if it makes a mistake because... the code doesn't do what I want it to do. I already have testing and linting set up to catch my own mistakes, and those things also catch AI's mistakes.
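To make that concrete: a minimal sketch of the kind of check I mean, in Python with pytest (the module and function names are made up for illustration). The same test that catches my own mistakes catches the model's.

    # test_discounts.py -- hypothetical example; the test doesn't care
    # whether a human or an LLM wrote apply_discount().
    from discounts import apply_discount

    def test_discount_is_capped():
        # guards against code that lets a discount exceed the price
        assert apply_discount(price=50.0, percent=150) == 0.0

    def test_zero_discount_is_a_noop():
        assert apply_discount(price=50.0, percent=0) == 50.0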


Thank you! I always feel so weird actually using ChatGPT without any major issues while so many people keep claiming how awful it is; it's like people want it 100% perfect or nothing. For me, if it gets me 80% of the way there in 1/10 the time, and then I do the final 20%, that's still a heck of a productivity boost, basically for free.


Yep, I’m with you. I’m a solo dev who never went to college… o1 makes far fewer errors than I do! No chance I’d make it past round one of any sort of coding tournament. But I managed to bootstrap a whole saas company doing all the coding myself, which involved setting up a lot of guard rails to catch my own mistakes before they reached production. And now I can consult with a programming intelligence the likes of which I could never afford to hire if it was a person. It’s amazing.


Is it working?


Not sure what you're referring to exactly. But broadly yes it is working for me - the number of new features I get out to users has sped up greatly, and stability of my product has also gone up.


Are you making money with your saas idea?


Yep, been living off it for nine years now


Congratulations! That is not an easy task. I am just starting the journey.


Famously, the last 10% takes 90% of the time (or 20/80 in some approximations). So even if it gets you 80% of the way in 10% of the time, maybe you don’t end up saving any time, because all the time is in the last 20%.

I’m not saying that LLMs can’t be useful, but I do think it’s a darn shame that we’ve given up on creating tools that deterministically perform a task. We know we make mistakes and take a long time to do things. And so we developed tools to decrease our fallibility to zero, or to allow us to achieve the same output faster. But that technology needs to be reliable; and pushing the envelope of that reliability has been a cornerstone of human innovation since time immemorial. Except here, with the “AI” craze, where we have abandoned that pursuit. As the saying goes, “to err is human”; the 21st-century update will seemingly be, “and it’s okay if technology errs too”. If any other foundational technology had this issue, it would be sitting unused on a shelf.

What if your compiler only generated the right code 99% of the time? Or if your car only started 9 times out of 10? All of these tools can be useful, but when we are so accepting of a lack of reliability, more things go wrong, and potentially at larger and larger scales and magnitudes. When (if some folks are to be believed) AI is writing safety-critical code for an early-warning system, or deciding when to use bombs, or designing and validating drugs, what failure rate is tolerable?


> Famously, the last 10% takes 90% of the time (or 20/80 in some approximations). So even if it gets you 80% of the way in 10% of the time, maybe you don’t end up saving any time, because all the time is in the last 20%.

This does not follow. By your own assumptions, getting you 80% of the way there in 10% of the time would save you 18% of the overall time: if the first 80% typically takes 20% of the total, that 20% shrinks to 2%. An 18% time reduction on a given task is still an incredibly massive optimization that's easily worth $200/month for a professional.


Using the 90/10 split: the first 90% of the work takes 10% of the time, and cutting that to a tenth of itself yields 9% time savings.

160 hours a month * $100/hr programmer * 9% ≈ $1,440 in savings, easily enough to justify $200/month.

Even if it fails 1/10th of the time, that is still ~8%, or roughly $1,300 in savings.
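Spelling that out as a back-of-the-envelope calculation (in Python, with every number above treated as an assumption):

    hours_per_month = 160
    rate = 100                 # $/hr
    first_chunk = 0.10         # the first 90% of the work takes 10% of the time
    llm_speedup = 0.10         # the LLM does that chunk in 10% of its usual time

    saved_fraction = first_chunk * (1 - llm_speedup)       # 0.09
    monthly_savings = hours_per_month * rate * saved_fraction
    print(monthly_savings)                                  # ~1440

    # if it only pays off 9 times out of 10:
    print(monthly_savings * 0.9)                            # ~1296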


Does that count the time you spend on prompt engineering?


It depends what you’re doing.

For tasks where bullshitting or regurgitating common idioms is key, it works rather well and indeed takes you 80% or even close to 100% of the way there. For tasks that require technical precision and genuine originality, it’s hopeless.


I'd love to hear what that is.

So far, given my range of projects, I have seen it struggle with lower-level mobile stuff and hardware (ESP32 + BLE + HID).

For things like web (front/back), DB, video games (web and Unity), it does work pretty well (at least 80% there on average).

And I'm talking about the free version, not this $200/mo one.


Well, that is a very specific set of skills. I bet the C-suite loves it.


> I always feel so weird actually using ChatGPT without any major issues while so many people keep claiming how awful it is;

People around here feel seriously threatened by ML models. It makes no sense, but then, neither does defending the Luddites, and people around here do that, too.


Well now at $200 it's a little farther away from free :P


What do you mean? ChatGPT is free, the Pro version isn't.

I'm talking about the generally available one; I haven't had the chance to try this new version.


I could buy a car for that kind of money!


Of course, but for every thoroughly set up TDD environment, you have a hundred other people just blindly copy pasting LLM output into their code base and trusting the code based on a few quick sanity checks.


You assume that programming software with an existing, well-defined, and correct test suite is all these models will be used for.


>I can tell if it makes a mistake because... the code doesn't do what I want it to do

Sometimes it does what you want it to do, but still creates a bug.

Asked the AI to write some code to get a list of all objects in an S3 bucket. It wrote code that worked, but it did not account for the fact that S3 delivers objects in pages of at most 1,000 items. So if the bucket contained fewer than 1,000 objects (typical when first starting a project), things worked; but once the bucket contained more than 1,000 objects (easy to do on S3 in a short amount of time), you'd have a subtle but important bug.

Someone not already intimately familiar with the inner workings of S3 APIs would not have caught this. It's anyone's guess if it would be caught in a code review, if a code review is even done.
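For what it's worth, the fix is small once you know the API paginates; here's a sketch in Python with boto3 (the bucket name is whatever yours is):

    import boto3

    def list_all_objects(bucket: str) -> list[dict]:
        """Return every object in the bucket, not just the first page of 1,000."""
        s3 = boto3.client("s3")
        # get_paginator handles the continuation-token bookkeeping that the
        # generated code skipped; each page holds at most 1,000 keys.
        paginator = s3.get_paginator("list_objects_v2")
        objects = []
        for page in paginator.paginate(Bucket=bucket):
            objects.extend(page.get("Contents", []))  # "Contents" is absent on empty pages
        return objects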

I don't ask the AI to do anything complicated at all; the most I trust it with is writing console.log statements, which it is pretty good at predicting, but still not perfect.


So the AI wrote a bug; but if humans wouldn’t catch it in code review, then obviously they could have written the same bug. Which shouldn’t be surprising because LLMs didn’t invent the concept of bugs.

I use LLMs maybe a few times a month but I don’t really follow this argument against them.


Code reviewing is not the same thing as writing code. When you're writing code you're supposed to look at the documentation and do some exploration before the final code is pushed.

It would be pretty easy for most code reviewers to miss this type of bug, because they aren't always looking for that kind of bug, and they aren't always looking at the AWS documentation while reviewing the code.

Yes, people could also make the same error, but at least they have a chance of understanding the documentation and its limits, whereas the LLM has no such ability to reason about and understand the consequences.


it also catches MY mistakes, so that saves time


So true, and people seem to gloss over this fact completely. They only talk about correcting the LLM's code while the opposite is much more common for me.



