
> Is there some expectation that these things won't improve?

I definitely expect them to improve. But I also think the point at which they can actually replace a senior programmer is pretty much the exact point at which they can replace any knowledge worker, at which point western society (possibly all society) is in way deeper shit than just me being out of a job.

> This is ego speaking.

It definitely isn't. LLMs are useful for coding now, but they can't really do the whole job without help - at least not for anything non-trivial.



Intellisense-style systems were a huge feature leap when they gained wider language support and reliability. LLMs are yet another step forward for intellisense and the effort of comprehending the code you're altering. I don't think I will ever benefit from code generation in a serious setting (it's excellent for prototyping), simply because it solves the easy problem (write some code) while creating a larger problem (figure out if the code that was generated is correct).

As another senior developer I won't say it's impossible that I'll ever benefit from code generation, but I just think it's a terrible space to try to build a solution - we don't need a solution here - I can already type faster than I can think.

I am keenly interested in seeing if someone can leverage AI for query performance tuning or, within the RDBMS, query planning. That feels like an excellent (if highly specific) domain for an LLM.


> I am keenly interested in seeing if someone can leverage AI for query performance tuning or, within the RDBMS, query planning. That feels like an excellent (if highly specific) domain for an LLM.

Pay the $20 for Claude, copy in the table DDLs along with a query you'd like to tune.

Copy in any similar tuned queries you have and tell it you'd like to tune your query in a similar manner.

Once you've explained what you'd like it to do and provided context, hit enter.

I'd be very surprised if having done this you can't find value in what it generates.
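
To make that concrete, here's roughly what I paste in - a minimal sketch that stitches the DDL, the slow query, and an already-tuned example into one prompt (the file names are placeholders, use whatever you have):

    import pathlib

    # Placeholder file names - swap in your own schema dump, the query you
    # want tuned, and a previously tuned query to use as a style example.
    ddl = pathlib.Path("schema.sql").read_text()
    slow_query = pathlib.Path("slow_query.sql").read_text()
    tuned_example = pathlib.Path("already_tuned_example.sql").read_text()

    prompt = f"""Here is the DDL for the relevant tables:

    {ddl}

    Here is a query I'd like to tune:

    {slow_query}

    Here is a similar query we already tuned, as an example of the kinds of
    rewrites we're comfortable with:

    {tuned_example}

    Please suggest a tuned version of the query, explain which indexes or
    rewrites you're relying on, and call out anything you're unsure about."""

    print(prompt)  # paste the output into Claude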


> I can already type faster than I can think.

But can you write tickets faster than you can implement them? I certainly can.


> But can you write tickets faster than you can implement them? I certainly can.

Personally, I have a tendency at work to delay creating tickets until after I've already written the implementation.

Why? Because tickets in my employer's system are expected to identify which component needs to be changed, and ideally should have some detail about what needs to be changed. But both of those things depend on the design of the change being implemented.

In my experience, any given feature usually has multiple possible designs, and the only way to know if a design is good is to try implementing it and see how clean or messy it ends up. Of course, I can guess in advance which design will turn out well. I have to guess, or else I wouldn't know which design to try implementing first. But often my first attempt runs into unexpected wrinkles and I retreat and try a different design.

Other people will start with a design and brute-force their way to working code, and (in my judgmental opinion) the code often ends up lower-quality because of it.

Sooner or later, perhaps AI will be able to perform that entire process autonomously, better than I can. In the meantime, though, people often talk about using AI like a 'junior engineer', where you think up a design yourself and then delegate the grunt work to the AI. That approach feels flawed to me, because it disconnects the designer from the implementer.


>>> delay creating tickets until after I've already written the implementation. Why? Because tickets in my employer's system are expected to identify which component needs to be changed,

abso-frigging-lutely

To me this is an example of software being a form of literacy - creative work. And yet the process is designed by software illiterates who think novels can be written by pre-planning all the paragraphs.


Depends on the ticket.

If it's "Get us to the moon", it's gonna take me years to write that ticket.

If it was "Make the CTA on the homepage red", it is up for debate whether I needed a ticket at all.


Based on current methods of productivity scoring, I'd try to make 2 tickets for that...


> LLMs are yet another step forward for intellisense

That would be great if the suggested LLM slop weren't actually making it harder to get the much better intellisense suggestions of last year.


If an LLM (or any other tool) makes it so that a team of 8 can get the same results in the same time it used to take a team of 10, then I would count that as "replaced 2 programmers". Even if there's no particular person whose whole job has been replaced, that's not a meaningful practical difference: replacing a significant fraction of every programmer's job has the same outcomes and impacts as replacing a significant fraction of programmers.


Fav anecdote from ages ago:

When hand-held power tools became a thing, the Hollywood set builder’s union was afraid of this exact same thing - people would be replaced by the tools.

Instead, productions built bigger sets (the ceiling was raised) and smaller productions could get in on things (the floor was lowered).

I always took that to mean “people aren’t going to spend less to do the job - they’ll just do a bigger job.”


It's always played out like this in software, by the way. Famously, animation shops hoped to save money on production by switching over to computer rendered cartoons. What happened instead is that a whole new industry took shape, and brought along with it entire cottage industries of support workers. Server farms required IT, renders required more advanced chips, some kinds of animation required entirely new rendering techniques in the software, etc.

A few hundred animators turned into a few thousand computer animators & their new support crew, in most shops. And new, smaller shops took form! But the shops didn't go away, at least not the ones who changed.

It basically boils down to this: some shops will act with haste and purge their experts in order to replace them with LLMs, and others will adopt the LLMs, bring on the new support staff they need, and find a way to synthesize a new process that involves experts and LLMs.

Shops who've abandoned their experts will immediately begin to stagnate and produce more and more mediocre slop (we're seeing it already!) and the shops who metamorphose into the new model you're speculating at will, meanwhile, create a whole new era of process and production. Right now, you really want to be in that second camp - the synthesizers. Eventually the incumbents will have no choice but to buy up those new players in order to coup their process.


And 3D animation still requires hand animation! Nobody starts with 3D animation; the senior animators are doing storyboards and keyframes which they _then_ use as a guide for 3D animation.


The saddle, the stirrup, the horseshoe, the wagon, the plough, and the drawbar all enhanced the productivity of horses and we only ended up employing more of them.

Then the steam engine and internal combustion engine came around and work horses all but disappeared.

There's no economic law that says a new productivity-enhancing programming tool is always a stirrup and never a steam engine.


I think you raise an excellent point, and can use your point to figure out how it could apply in this case.

All the tools that are stirrups were used "by the horse" (you get what I mean); that implies to me that so long as the AI tools are used by the programmers (what we've currently got), they're stirrups.

The steam engines were used by the people "employing the horse" - a la "people don't buy drills, they buy holes" (people don't employ horses, they move stuff) - so that's what to look for to see what's a steam engine.

IMHO, as long as all this is "telling the computer what to do", it's stirrups, because that's what we've been doing. If it becomes something else, then maybe it's a steam engine.

And, to repeat - thank you for this point, it's an excellent one, and provides some good language for talking about it.


Thanks for the warm feedback!

Maybe another interesting case would be secretaries. It used to be very common that even middle management positions at small to medium companies would have personal human secretaries and assistants, but now they're very rare. Maybe some senior executives at large corporations and government agencies still have them, but I have never met one in North America who does.

Below that level it's become the standard that people do their own typing, manage their own appointments and answer their own emails. I think that's mainly because computers made it easy and automated enough that it doesn't take a full time staffer, and computer literacy got widespread enough that anyone could do it themselves without specialized skills.

So if programming got easy enough that you don't need programmers to do the work, then perhaps we could see the profession hollow out. Alternatively we could run out of demand for software but that seems less likely!

(a related article: https://archive.is/cAKmu )


Another anecdote: when mechanical looms became a thing, textile workers were afraid that the new tools would replace them, and they were right.


Oh my, no. Fabrics and things made from fabrics remain largely produced by human workers.

Those textile workers were afraid machines would replace them, but that didn't happen - the work was sent overseas, to countries with cheaper labor. It was completely tucked away from regulation and domestic scrutiny, and so remains to this day a hotbed of human rights abuses.

The phenomenon you're describing wasn't an industry vanishing due to automation. You're describing a moment where a domestic industry vanished because the cost of overhauling the machinery in domestic production facilities was near to the cost of establishing an entirely new production facility in a cheaper, more easily exploitable location.


I think you're talking about people who sew garments, not people who create textile fabrics. In any case, the 19th century British textile workers we all know I'm talking about really did lose their jobs, and those jobs did not return.


430 million people currently work in textiles[0]; how big was the industry before mechanical looms?

[0] https://www.uniformmarket.com/statistics/global-apparel-indu...


How many worked in America and Western Europe and were paid a living wage in their respective countries, and how many of those 430 million people currently work making textiles in America and Western Europe today, and how many are lower paid positions in poorer countries, under worse living conditions? (Like being locked into the sweatshop, and bathroom breaks being regulated?)

Computers aren't going anywhere, so the whole field of programming will continue to grow, but will there still be FAANG salaries to be had?


That's a different topic entirely.


I think this is THE point.


i heard the same shit when people were talking about outsourcing to India after the dotcom bubble burst. programmer salaries would cap out at $60k because of international competition.

if you're afraid of salaries shrinking due to LLMs, then i implore you, get out of software development. it'll help me a lot!


It solely depends on whether more software being built is being constrained by feasibility/cost or a lack of commercial opportunities.

Software is typically not a cost-constrained activity, due to its higher ROI/scale. It's mostly about fixed costs and scaling profits. Unfortunately, given this, my current belief is that on balance AI will destroy many jobs in this industry if it gets to the point where it can do a software job.

Assuming inelastic demand (software demand relative to SWE costs), any cost reductions in inputs (e.g. AI) won't translate to much more demand for software. The same effect that drove SWE prices high without changing demand for software all that much (which explains the 2010s IMO, particularly in places like SV) also works in reverse.


Is there a good reason to assume inelastic demand? In my field - biomedical research - I see huge untapped areas in need of much more and better core programming services. Instead we "rely" way too much on grad students (not even junior SWEs) who know a bit of Python.


Depends on the niche (software is broad), but as a whole, for the large software efforts employing a good chunk of the market - I think so. Two main reasons for thinking this way:

- Software scales; it is generally a function of market size, reach, network effects, etc. Cost is a factor, but not the greatest one. Most software makes its profit "at scale" - engineering is a fixed capital cost. This means that software feasibility is generally inelastic to cost; or rather, if SWEs became cheaper or unnecessary to employ, it wouldn't change the potential profit vs cost equation much, IMV, for many different opportunities. Software is limited more by ideas and potential untapped market opportunities. Yes, it would be much cheaper to build things, but it wouldn't change the feasibility of a lot of project assessments, since the cost of SWEs - at least from what I've seen in assessments - isn't the biggest factor. This effect plays out to varying degrees in a lot of capex: as long as the ROI makes sense it's worth going ahead, especially for larger orgs who have more access to capital. The ROI from scale dwarfs potential cost rises, often making it less of a function of end SWE demand. This effect happens in other engineering disciplines as well to varying degrees - software just has it in spades, until you mix hardware scaling into the mix (e.g. GPUs).

- The previous "golden era", where the inelasticity of software demand w.r.t. cost meant salaries just kept rising. Inelasticity can be good for sellers of a commodity if demand increases. More importantly, demand didn't really decrease for most companies as SWE salaries kept rising - entry requirements were generally relaxing. The good side of inelasticity is potentially reversed by AI, making it a bad thing.

However, small "throwaway" software, which does have an "is it worth it" cost factor, will most probably increase under AI. I just don't think it will offset the reduction demanded by capital holders; nor will it necessarily be done by the same people anyway (democratizing software dev), meaning it isn't a job saver.

In your case I would imagine there is a reason why said software doesn't have SWEs coding it now - it isn't feasible given the likely scale it would have (I assume just your team). AI may make it feasible, but not in a way that helps the OP - it does so by making it feasible for, as you put it, grad students who aren't even junior SWEs. That doesn't help the OP.


And that was good. Workers being replaced by tools is good for society, although temporarily disruptive.


This could very well prove to be the case in software engineering, but it also could very well not; what is the equivalent of "larger sets" in our domain, and is that something that is even preferable to begin with? Should we build larger codebases just because we _can_? I'd say likely not, whereas it did make sense to build larger, more elaborate movie sets just because they could.

Also, a piece missing from this comparison is a set of people who don't believe the new tool will actually have a measurable impact on their domain. I assume few-to-none could argue that power tools would have no impact on their profession.


> Should we build larger codebases just because we _can_?

The history of software production as a profession (as against computer science) is essentially a series of incremental increases in the size and complexity of systems (and teams) that don't fall apart under their own weight. There isn't much evidence we have approached the limit here, so it's a pretty good bet for at least the medium term.

But focusing on system size is perhaps a red herring. There is an almost unfathomably vast pool of potential software systems (or customization of systems) that aren't realized today because they aren't cost effective...


Have you ever worked on a product in production use without a long backlog of features/improvements/tests/refactors/optimizations desired by users, managers, engineers, and everyone else involved in any way with the project?

The demand for software improvements is effectively inexhaustible. It’s not a zero sum game.


This is a good example of what could happen to software development as a whole. In my experience large companies tend more often to buy software rather than make it. AI could drastically change the "make or buy" decision in favour of make, because you need fewer developers to create a perfectly tailored solution that directly fits the needs of the company. So "make" becomes affordable and more attractive.


I work for a large european power utility. We are moving away from buying to in-house development. LLMs have nothing to do with it.


This is a real thing. LLMs are tools, not humans. They truly do bring interesting, bigger problems.

Have people seen some of the recent software being churned out? Hint, it's not all GenAI bubblespit. A lot of it is killer, legitimately good stuff.


That's actually not accurate. See the Jevons paradox: https://en.m.wikipedia.org/wiki/Jevons_paradox. In the short term, LLMs should have the effect of making programmers more productive, which means more customers will end up demanding software that was previously uneconomic to build (this is not theoretical - e.g. I work with some non-profits who would love a comprehensive software solution; they simply can't afford it, or the risk, at present).


yes, this. the backlog of software that needs to be built is fucking enormous.

you know what i'd do if AI made it so i could replace 10 devs with 8? use the 2 newly-freed up developers to work on some of the other 100000 things i need done


I'm casting about for project ideas. What are some things that you think need to be built but haven't?


Every company that builds software has dozens of things they want but can't prioritize because they're too busy building other things.

It's not about a discrete product or project; continuous improvement upon what already exists makes up most of the volume of "what would happen if we had more people".


A consumer-friendly front end to the NOAA/NWS website, and adequate documentation for their APIs, would be a nice start. Weather.com, AccuWeather and Wunderground exist, but they're all buggy and choked with ads. If I want to track daily rainfall in my area vs local water reservoir levels vs local water table readings, I can do that - all the information exists - but I can't conveniently link it together. The Fed has a great website where you can build all sorts of tables and graphs about current and historical economic market data, but environmental/weather data is left out in the cold right now.
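
For what it's worth, the raw data is reachable today without an API key; the missing piece is the friendly, linked-together front end. A rough sketch of pulling a point forecast from api.weather.gov (endpoint paths are from memory and worth checking against the docs; the coordinates and User-Agent are placeholders):

    import requests

    # NWS asks for a descriptive User-Agent with contact info (placeholder here).
    HEADERS = {"User-Agent": "my-weather-frontend (me@example.com)"}

    lat, lon = 39.74, -104.99  # placeholder coordinates

    # /points/{lat},{lon} resolves the coordinates to a forecast grid and
    # returns, among other things, the URL of the forecast for that grid.
    point = requests.get(
        f"https://api.weather.gov/points/{lat},{lon}", headers=HEADERS, timeout=10
    ).json()

    forecast_url = point["properties"]["forecast"]
    forecast = requests.get(forecast_url, headers=HEADERS, timeout=10).json()

    for period in forecast["properties"]["periods"][:4]:
        print(period["name"], "-", period["shortForecast"], period["temperature"])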


some ideas from my own work:

- a good LIMS (Laboratory Information Management System) that incorporates bioinformatics results. LIMS come from a pure lab, benchwork background, and rarely support the inclusion of bioinformatics analyses on samples included in the system. I have yet to see a lab that uses an off-the-shelf LIMS unmodified - they never do what they say they do. (And the number of labs still running on something built on age-old software is... horrific. I know one US lab running some abomination built on Filemaker Pro.)

- Software to manage grants. Who is owed what, what the milestones are and when to send reminders, who's looking after this, who the contact persons are, due diligence on potential partners, etc. I worked for a grant-giving body and they came up with a weird mix of PowerBI and a pile of Excel sheets and PDFs.

- A thing that lets you catalogue Jupyter notebooks and Rstudio projects. I'm drowning in various projects from various data scientists and there's no nice way to centrally catalogue all those file lumps - 'there was this one function in this one project.... let's grep a bit' can be replaced by a central findable, searchable, taggable repository of data science projects.
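
Even a crude index would beat grep while waiting for a real tool. A rough sketch of what I mean, assuming plain .ipynb files on disk (the root directory and the function name at the end are placeholders):

    import json
    import pathlib
    import re

    ROOT = pathlib.Path("~/projects").expanduser()  # placeholder root directory

    index = {}  # function name -> notebooks that define it

    for nb_path in ROOT.rglob("*.ipynb"):
        try:
            nb = json.loads(nb_path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # skip corrupt or non-JSON files
        for cell in nb.get("cells", []):
            if cell.get("cell_type") != "code":
                continue
            source = "".join(cell.get("source", []))
            for name in re.findall(r"^def\s+(\w+)", source, flags=re.MULTILINE):
                index.setdefault(name, []).append(str(nb_path))

    # "there was this one function in this one project" becomes a lookup
    # (made-up function name):
    print(index.get("normalize_counts", "not found"))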


> A thing that lets you catalogue Jupyter notebooks and Rstudio projects. I'm drowning in various projects from various data scientists and there's no nice way to centrally catalogue all those file lumps - 'there was this one function in this one project.... let's grep a bit' can be replaced by a central findable, searchable, taggable repository of data science projects.

Oh... oh my. This extends so far beyond data science for me and I am aching for this. I work in this weird intersection of agriculture/high-performance imaging/ML/aerospace. Among my colleagues we've got this huge volume of Excel sheets, Jupyter notebooks, random Python scripts and C++ micro-tools, and more I'm sure. The ones that "officially" became part of a project were assigned document numbers and archived appropriately (although they're still hard to find). The ones that were one-off analyses for all kinds of things are scattered among OneDrive folders, Zip files in OneDrive folders, random Git repos, and some, I'm sure, only exist on certain peoples' laptops.


Ha - I'm on the other side of the grant application process and used an LLM to make a tool to describe the project, track milestones and sub-contractors, generate a costed project plan, and generate all the other responses that need to be self-consistent in the grant application process.


Ditto for most academic biomedical research: we desperately need more high quality customized code. Instead we have either nothing or Python/R code written by a grad student or postdoc—code that dies a quick death.


> then I would count that as "replaced 2 programmers"

Well then you can count IDEs, static typing, debuggers, version control etc. as replacing programmers too. But I don't think any of those performance enhancers have really reduced the number of programmers needed.

In fact it's a well known paradox that making a job more efficient can increase the number of people doing that job. It's called the Jevons paradox (thanks ChatGPT - probably wouldn't have been able to find that with Google!)

Making people 20% more efficient is very different to entirely replacing them.


I know it's popular to hate on Google, but a link to the Wikipedia article is the first result I get for a Google search of "efficiency paradox".


As a founder, I think that this viewpoint misses the reality of a fixed budget. If I can make my team of 8 as productive as 10 with LLMs then I will. But that doesn’t mean that without LLMs I could afford to hire 2 more engineers. And in fact if LLMs make my startup successful then it could create more jobs in the future.


> I definitely expect them to improve. But I also think the point at which they can actually replace a senior programmer is pretty much the exact point at which they can replace any knowledge worker, at which point western society (possibly all society) is in way deeper shit than just me being out of a job.

Agree with this take. I think the probability that this happens within my next 20 years of work is very low, but are non-zero. I do cultivate skills that are a hedge against this, and if time moves on and the probability of this scenario seems to get larger and larger, I'll work harder on those skills. Things like fixing cars, welding, fabrication, growing potatoes, etc (which I already enjoy as hobbies). As you said, skills that are helpful if shit were to really hit the fan.

I think there are other "knowledge workers" that will get replaced before that point though, and society will go through some sort of upheaval as this happens. My guess is that capital will get even more consolidated, which is sort of unpleasant to think about.


im also expecting this outcome. my thoughts are that once devs are completely replaced by ml then we're going to finally have to adapt society to raise the floor, because we can no longer outcompete automation. im ready to embrace this, but it seems like there is no plan for mass retirement and we're going to need it. if something as complicated as app and system dev is automated, pretty much anything can be.

earnest question because i still consider this a vague, distant future: how did you come up with 20 years?


That's around, give or take a few years, when I plan to retire. Don't care a whole lot what happens after that point :)


I think if you are either young or bound to retire in the next 5-10 years there's less worry. If you are young you can pivot easier (avoid the industry); if you are old you can just retire with enough wealth behind you.

It's the mid-career people, the 35-45 year olds, that I think will be hit hardest by this if it eventuates. Usually at this point in life there's plenty of commitments (family, mortgage, whatever). The career may end before they are able to retire, but they are either too burdened with things, or ageism sets in, making it hard to be adaptable.


I think you can look at any cohort (except for the people already at retirement age) and see a way that they're screwed. I'm 34 and I agree that you're right, but the 22 year old fresh out of college with a CS degree might have just gotten a degree in a field that is going to be turned inside out, in an economy that might be confused as we adapt. It's just too big of a hypothetical change to say who will get the most screwed, imo.

And like the parent poster said - even if you were to avoid the industry, what would you pivot to? Anything else you might go into would be trivial to automate by that point.


I think the place to be is trades and blue-collar work where the experience learnt (the moat) is in the physical realm (not math/theory/process but dexterity and hands-on knowledge), and where iteration/brute force costs too much money or creates too much risk. If I get a building wrong, unlike software I can't just "rebuild" - it costs a lot of money in the real world and carries risks like, say, environmental damage. That means rapid iteration and development just doesn't happen, which limits the exponential growth of AI in that area compared to the digital world. Sure, in a lifetime we may have robots, etc., but it will be a LOT more expensive to get there, it will happen a lot slower, and people can adjust. LLMs being 90% right is just not good enough - the cost of failure is too high and accountability for failure needs to exist.

Those industries also more wisely IMO tend to be unionised, own their own business, and so have the incentive to keep their knowledge tight. Even with automation the customer (their business) and the supplier (them and their staff) are the same so the value transfer of AI will make their job easier but they will keep the value. All good things to slow down progress and keep some economic rent for yourself and your family. Slow change is a less stressful life.

The intellectual fields lose in an AI world long term; the strong and the renter class (capital/owners) win. That's what "Intelligence is a commodity" that many AI heads keep saying actually means. This opens up a lot of future dystopian views/risks that probably aren't worth the benefit IMO to the majority of people that aren't in the above classes of people (i.e. most people).

The problem with software in general is that, in my opinion, it is quite difficult for most people to be a "long term" founder, which means most employment comes from large corps/govts/etc. where the supplier (the SWE) is different from the employer. Most ideas don't make it or last only briefly, and the ones that stick around usually benefit from dealing with scale - something generally only larger corps have with their ideas (there are exceptions in new fields, but then they become the next large corp, and there isn't enough room for everyone to do this).


What if anti-aging technology is developed and you can live up to 1000 years?


I certainly am not in the 0.5% rich enough to be able to afford such a thing. If it does get developed (it won't in my lifetime), there's absolutely no way in hell it will be accessible to anyone but the ultra rich.


one side thing i want to point out. There is a lot of talk about llms not being able to replace a senior engineer... The implication being that they can replace juniors. How exactly do you get senior engineers when you've destroyed the entire market for junior engineers?


This.

When this moment becomes reality - the world economy will change a lot, all jobs and markets will shift.

And there won't be any way to future-proof your skills; they will all be irrelevant.

Right now many like to say "learn how to work with AI, it will be valuable". No, it won't. Because even now it is absolutely easy to work with it; any developer can pick up AI in a week, and it will only become easier and easier.

A better time spent is developing evergreen skills.


> LLMs are useful for coding now

*sort of, sometimes, with simple enough problems with sufficiently little context, for code that can be easily tested, and for which sufficient examples exist in the training data.

I mean hey, two years after being promised AGI was literally here, LLMs are almost as useful as traditional static analysis tools!

I guess you could have them generate comments for you based on the code as long as you're happy to proofread and correct them when they're wrong.

Remember when CPUs were obsolete after three years? GPT has shown zero improvement in its ability to generate novel content since it was first released as GPT2 almost ten years ago! I would know because I spent countless hours playing with that model.


Firstly, GPT-2 was released in 2019. Five years is not "almost ten years".

Secondly, LLMs are objectively useful for coding now. That's not the same thing as saying they are replacements for SWEs. They're a tool, like syntax highlighting or real-time compiler error visibility or even context-aware keyword autocompletion.

Some individuals don't find those things useful, and prefer to develop in a plain text editor that does not have those features, and that's fine.

But all of those features, and LLMs are now on that list, are broadly useful in the sense that they generally improve productivity across the industry. They already right now save enormous amounts of developer time, and to ignore that because you are not one of the people whose time is currently being saved, indicates that you may not be keeping up with understanding the technology of your field.

There's an important difference between a tool being useful for generating novel content, and a tool being useful. I can think of a lot of useful tools that are not useful for generating novel content.


> are broadly useful in the sense that they generally improve productivity across the industry. They already right now save enormous amounts of developer time,

But is that actually a true statement? Are there actual studies to back that up?

AI is hyped to the moon right now. It is really difficult to separate the hype from reality. There are anecdotal reports of AI helping with coding, but there are also anecdotal reports that it gets things almost right but not quite, which often leads to bugs that wouldn't otherwise happen. I think it's unclear whether that is a net win for productivity in software engineering. It would be interesting if there were a robust study about it.


> Are there actual studies to back that up?

I am aware of an equal number of studies about the time saved overall by use of LLMs, and time saved overall by use of syntax highlighting.

In fact, here's a study claiming syntax highlighting in IDEs does not help code comprehension: https://link.springer.com/article/10.1007/s10664-017-9579-0

Shall we therefore conclude that syntax highlighting is not useful, that developers who use syntax highlighting are just part of the IDE hype train, and that anecdotal reports of syntax highlighting being helpful are counterbalanced by anecdotal reports of $IDE having incorrect syntax highlighting on $Esoteric_file_format?

Most of the failures of LLMs with coding that I have seen has been a result of asking too much of the LLM. Writing a hundred context-aware unit tests is something that an LLM is excellent at, and would have taken a developer a long time previously. Asking an LLM to write a novel algorithm to speed up image processing of the output of your electron microscope will go less well.


> Shall we therefore conclude that syntax highlighting is not useful, that developers who use syntax highlighting are just part of the IDE hype train, and that anecdotal reports of syntax highlighting being helpful are counterbalanced by anecdotal reports of $IDE having incorrect syntax highlighitng on $Esoteric_file_format?

Yes. We should conclude that syntax highlighting is not useful in languages that the syntax highlighter does not support. I think basically everyone would agree with this statement.

Similarly an llm that worked 100% of the time and could solve any problem would be pretty useful. (Or at least worked correctly as often as syntax highlighting in situations where it is actually used does)

However, that's not the world we live in. It's a reasonable question to ask whether LLMs are yet good enough that the productivity gained outweighs the productivity lost.


Your stance feels somewhat contradictory. A syntax highlighter is not useful in languages it does not support, therefore an LLM must be able to solve any problem to be useful?

The point I was trying to make was, an LLM is as reliably useful as syntax highlighting, for the tasks that coding LLMs are good at today. Which is not a lot, but enough to speed up junior devs. The issues come when people assume they can solve any problem, and try to use them on tasks to which they are not suited. Much like applying syntax highlighting on an unsupported language, this doesn't work.

Like any tool, there's a learning curve. Once someone learns what does and does not work, it's generally a strict productivity boost.


The problem is that there are no tasks that LLMs are reliably good at. I believe that's what OP is getting at.

I fixed a production issue earlier this year that turned out to be a naive infinite loop - it was trying to load all data from a paginated API endpoint, but there was no logic to update the page number being fetched.

There was a test for it. Alas, the test didn't actually cover this scenario.
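
Reconstructing from memory with made-up names, the shape of it was roughly this - the loop looks plausible and the single-page test passes, but the page number never advances:

    def fetch_all_items(client):
        items, page = [], 1
        while True:
            batch = client.get("/items", params={"page": page})
            if not batch:
                break
            items.extend(batch)
            # Bug: `page` is never incremented, so against a real API with
            # more than one page of data this refetches page 1 forever.
        return items

    def test_fetch_all_items():
        # Stub that returns one page of data, then an empty list, regardless
        # of which page was requested - so the missing increment is never
        # exercised and the test passes.
        class Stub:
            def __init__(self):
                self.responses = [[1, 2, 3], []]
            def get(self, path, params):
                return self.responses.pop(0)

        assert fetch_all_items(Stub()) == [1, 2, 3]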

I mention this because it was committed by a co-worker whose work is historically excellent, but who started using Copilot / ChatGPT. I'm pretty sure it was an LLM-generated function and test, and they were deeply broken.

Mostly they've been working great for this co-worker.

But not reliably.


I understand that, the point I'm making is that reliability is not a requirement for utility. One does not need to be reliable to be reliably useful :)

A very similar example is StackOverflow. If you copy/paste answers verbatim from SO, you will have problems. Some top answers are deeply broken or have obvious bugs. Frequently, SO answers are only related to your question, but do not explicitly answer it.

SO is useful to the industry in the same way LLMs are.


Sure, there is a range. If it works 100% of the time its clearly useful. If it works 0% then it clearly isn't.

LLMs are in the middle. It's unclear which side of the line they are on. Some anecdotes say one thing, some say another. That's why studies would be great. It's also why syntax highlighting is a bad comparison, since that is not in the grey zone.


exactly. many SWEs are currently fighting this fight of "oh it is not good enough bla bla…". on my team (50-ish people) you would not last longer than 3 months if you tried to do your work "manually" like we did before. several have tried, no longer around. I believe SWEs fighting LLMs are doing themselves a huge disservice; they should be full-on embracing them and trying to figure out how to use them more effectively. just like any other tool, it is as good as the user of the tool :)


> Secondly, LLMs are objectively useful for coding now.

No, they're subjectively useful for coding in the view of some people. I find them useless for coding. If they were objectively useful, it would be impossible for me to find them useless because that is what objectivity means.


I do believe you stopped reading my comment at that quote, because I spent the remainder of my comment making the same distinction you did...

It's useless to your coding, but useful to the industry of coding.


> LLM’s never provide code that pass my sniff test

If that statement isn't coming from ego, then where is it coming from? It's provably true that LLM's can generate working code. They've been trained on billions of examples.

Developers seem to focus on the set of cases that LLM's produce code that doesn't work, and use that as evidence that these tools are "useless".


My experience so far has been: if I know what I want well enough to explain it to an LLM then it’s been easier for me to just write the code. Iterating on prompts, reading and understanding the LLM’s code, validating that it works and fixing bugs is still time consuming.

It has been interesting as a rubber duck, exploring a new topic or language, some code golf, but so far not for production code for me.


Okay, but as soon as you need to do the same thing in [programming language you don't know], then it's not easier for you to write the code anymore, even though you understand the problem domain just as well.

Now, understand that most people don't have the same grasp of [your programming language] that you have, so it's probably not easier for them to write it.


I don’t disagree with anything you said and I don’t think anything you said disagrees with my original comment :)

I actually said in my comment that exploring a new language is one area I find LLMs to be interesting.


> It's provably true that LLM's can generate working code

They produce mostly working code, often with odd design decisions that the bot can't fully justify.

The difficult part of coding is cultivating a mental model of the problem + solution space. When you let an LLM write code for you, your mental model falls behind. You can read the new code closely, internalize it, double-check the docs, and keep your mental model up to date (which takes longer than you think), or you can forge ahead, confident that it all more or less looks right. The second option is easier, faster, and very tempting, and it is the reason why various studies have found that code written with LLM assistance introduces more bugs than code written without.

There are plenty of innovations that have made programming a little bit faster (autocomplete, garbage collection, what-have-you), but none of them were a silver bullet. LLMs aren't either. The essential complexity of the work hasn't changed, and a chat bot can't manage it for you. In my (admittedly limited) experience with code assistants, I've found that the faster you move with an LLM, the more time you have to spend debugging afterwards, and the more difficult that process becomes.


there's a lot more involved in senior dev work beyond producing code that works.

if the stakeholders knew how to do what they needed to build and how, then they could use LLMs, but translating complex requirements into code is something that these tools are not even close to cracking.


> there's a lot more involved in senior dev work beyond producing code that works.

Completely agree.

What I don't agree with is statements like these:

> LLM’s never provide code that pass my sniff test

To me, these (false) absolutes about chatbot capabilities are being rehashed so frequently that they derail every conversation about using LLMs for dev work. You'll find similar statements in nearly every thread about LLMs for coding tasks.

It's provably true that LLMs can produce working code. It's also true that an increasingly large portion of coding is being offloaded to LLMs.

In my opinion, developers need to grow out of this attitude that they are John Henry and they'll outpace the mechanical drilling machine. It's a tired conversation.


> It's provably true that LLM's can produce working code.

You've restated this point several times but the reason it's not more convincing to many people is that simply producing code that works is rarely an actual goal on many projects. On larger projects it's much more about producing code that is consistent with the rest of the project, and is easily extensible, and is readable for your teammates, and is easy to debug when something goes wrong, is testable, and so on.

The code working is a necessary condition, but is insufficient to tell if it's a valuable contribution.


The code working is the bare minimum. The code being right for the project and context is the basic expectation. The code being _good_ at solving its intended problem is the desired outcome, which is a combination of tradeoffs between performance, readability, ease of refactoring later, modularity, etc.

LLM's can sometimes provide the bare minimum. And then you have to refactor and massage it all the way to the good bit, but unlike looking up other people's endeavors on something like Stack Overflow, with the LLM's code I have no context why it "thought" that was a good idea. If I ask it, it may parrot something from the relevant training set, or it might be bullshitting completely. The end result? This is _more_ work for a senior dev, not less.

Hence why it has never passed my sniff test. Its code is at best the quality of code even junior developers wouldn't open a PR for yet. Or if they did they'd be asked to explain how and why and quickly learn to not open the code for review before they've properly considered the implications.


> It's provably true that LLM's can produce working code.

This is correct - but it's also true that LLMs can produce flawed code. To me the cost of telling whether code is correct or flawed is larger than the cost of me just writing correct code. This may be an AuDHD thing but I can better comprehend the correctness of a solution if I'm watching (and doing) the making of that solution than if I'm reading it after the fact.


As a developer, while I do embrace intellisense, I don't copy/paste code, because I find typing it out is a fast path to reflection and finding issues early. Copilot seems to be no better than mindlessly copy/pasting from StackOverflow.

From what I've seen of Copilots, while they can produce working code, I've not seen much that they offer beyond the surface level, which is fast enough for me to type myself. I am also deeply perturbed by some interviews I've done recently for senior candidates who are using them and, when asked to disable them for a collaborative coding task, completely fall apart because they depend on the tool rather than on knowledge.

This is not to say I do not see value in AI, LLMs or ML (I very much do). However, I code broadly at the speed of thought, and that's not really something I think will be massively aided by it.

At the same time, I know I am an outlier in my practice relative to lots around me.

While I don't doubt other improvements that may come from LLM in development, the current state of the art feels less like a mechanical drill and more like an electric triangle.


Code is a liability, not an asset. It is a necessary evil to create functional software.

Senior devs know this, and factor code down to the minimum necessary.

Junior devs and LLMs think that writing code is the point and will generate lots of it without worrying about things like leverage, levels of abstraction, future extensibility, etc.


LLMs can be prompted to write code that considers these things.

You can write good code or bad code with LLMs.


The code itself, whether good or bad, is a liability. Just like a car is a liability: in a perfect world you'd teleport yourself to your destination; instead you have to drive. And because of that, roads and gas stations have to be built, you have to take care of the car, etc. It's all a huge pain. The code you write, you will have to document, maintain, extend, refactor, relearn, and a bunch of other activities. So you do your best to only have the bare minimum to take care of. Anything else is just future troubles.


Sure, I don’t dispute any of that. But it's not a given that using LLMs means you’re going to have unnecessary code. They can even help to reduce the amount of code. You just have to be detailed in your prompting about what you do and don’t want, and work through multiple iterations until the result is good.

Of course if you try to one shot something complex with a single line prompt, the result will be bad. This is why humans are still needed and will be for a long time imo.


I'm not sure that's true. An LLM can code because it is trained on existing code.

Empirically, LLMs work best at coding when doing completely "routine" coding tasks: CRUD apps, React components, etc. Because there's lots of examples of that online.

I'm writing a data-driven query compiler and LLM code assistance fails hard, in both blatant and subtle ways. There just isn't enough training data.

Another argument: if an LLM could function like a senior dev, it could learn to program in a new programming language given the language's syntax, docs and API. In practice they cannot. It doesn't matter what you put into the context, LLMs just seem incapable of writing in niche languages.

Which to me says that, at least for now, their capabilities are based more on pattern identification and repetition than they are on reasoning.


Have you tried new languages or niche languages with claude sonnet 3.5? I think if you give it docs with enough examples, it might do ok. Examples are crucial. I’ve seen it do well with CLI flags and arguments when given docs, which is a somewhat similar challenge.

That said, you’re right of course that it will do better when there’s more training data.


> It's provably true that LLM's can produce working code

ChatGPT, even now in late 2024, still hallucinates standard-library types and methods more often than not whenever I ask it to generate code for me. Granted, I don't target the most popular platforms (i.e. React/Node/etc.); I'm currently in a .NET shop, which is a minority platform now, but ChatGPT's poor performance is surprising given the overall volume and quality of .NET content and documentation out there.

My perception is that “applications” work is more likely to be automated-away by LLMs/copilots because so much of it is so similar to everyone else’s, so I agree with those who say LLMs are only as good as there are examples of something online, whereas asking ChatGPT to write something for a less-trodden area, like Haskell or even a Windows driver, is frequently a complete waste of time as whatever it generates is far beyond salvaging.

Beyond hallucinations, my other problem lies in the small context window which means I can’t simply provide all the content it needs for context. Once a project grows past hundreds of KB of significant source I honestly don’t know how us humans are meant to get LLMs to work on them. Please educate me.

I’ll declare I have no first-hand experience with GitHub Copilot and other systems because of the poor experiences I had with ChatGPT. As you’re seemingly saying that this is a solved problem now, can you please provide some details on the projects where LLMs worked well for you? (Such as which model/service, project platform/language, the kinds of prompts, etc?). If not, then I’ll remain skeptical.


> still hallucinates standard-library types and methods more-often-than-not whenever I ask it to generate code for me

Not an argument, unsolicited advice: my guess is you are asking it to do too much work at once. Make much smaller changes. Try to ask for roughly as much as you would put into one git commit (per best practices) - for me that's usually editing a dozen or fewer lines of code.

> Once a project grows past hundreds of KB of significant source I honestly don’t know how us humans are meant to get LLMs to work on them. Please educate me.

https://github.com/Aider-AI/aider

Edit: The author of aider puts the percentage of the code written by LLMs for each release. It's been 70%+. But some problems are still easier to handle yourself. https://github.com/Aider-AI/aider/releases


Thank you for your response - I've asked these questions before in other contexts but never had a reply, so pretty much any online discussion about LLMs feels like I'm surrounded by people role-playing being on LinkedIn.


> It's provably true that LLM's can produce working code

Then why can't I see this magical code that is produced? I mean a real big application with a purpose and multiple dependencies, not yet another ReactJS todo list. I've seen comments like that a hundred times already but not one repository that could be equivalent to what I currently do.

For me the experience of LLMs is of a bad tool that calls functions that are obsolete or do not exist at all - not very earth-shattering.


> if the stakeholders knew how to do what they needed to build and how, then they could use LLMs, but translating complex requirements into code is something that these tools are not even close to cracking.

They don't have to replace you to reduce headcount. They could increase your workload so that where they needed five senior developers, they can make do with maybe three. That's like six of one and half a dozen of the other, because two developers lost a job either way, right?


Yeah. Code that works is a fraction of the aim. You also want code that a good junior can read and debug in the midst of a production issue, is robust against new or updated requirements, has at least as good performance as the competitors, and uses appropriate libraries in a sparse manner. You also need to be able to state when a requirement would loosen the conceptual cohesion of the code, and to push back on requirements that can already be achieved in just as easy a way.


> It's provably true that LLM's can generate working code.

From what I've seen of them, the good ones mostly produce OK code. Not terrible; it usually works.

I like them even at that low-ish bar - I find them to be both a time-saver and a personal motivation assistant - but they're still a thing that needs a real domain expert to spot the mistakes they make.

> Developers seem to focus on the set of cases that LLM's produce code that doesn't work, and use that as evidence that these tools are "useless".

I do find it amusing how many humans turned out to be stuck thinking in boolean terms, dismissing the I in AGI, calling them "useless" because they "can't take my job". Same with the G in AGI, dismissing the breadth of something that speaks 50 languages when humans who speak five or six languages are considered unusually skilled.


> If that statement isn't coming from ego, then where is it coming from? It's provably true that LLM's can generate working code. They've been trained on billions of examples.

I am pro AI and I'm probably even overvaluing the value AI brings. However, for me, this doesn't work in more "esoteric" programming languages or those with stricter rulesets like Rust. LLMs produce fine JS code, since there's no compiler to satisfy, but C++ without undefined behaviour or Rust code that compiles is rare.

There's also no chance of LLMs producing compiling code if you're using a library version with a newer API than the one in the training set.


Working code some subset of the time, for popular languages. It’s not good at Rust nor at other smaller languages. Tried to ask it for some help with Janet and it was hopelessly wrong, even with prompting to try to get it to correct its mistakes.

Even if it did work, working code is barely half the battle.


I'd say ChatGPT at least is a fair bit better at Python than it is at Scala which seems to match your experience.


> It's provably true that LLM's can generate working code.

Yeah for simple examples, especially in web dev. As soon as you step outside those bounds they make mistakes all the time.

As I said, they're still useful, because roughly correct but buggy code is often quite helpful when you're programming. But there's zero chance you can just say "write me a driver for the nRF905 using Embassy and embedded-hal" and get something working. Whereas I, a human, can do that.


The question is, how long would it take you to get as far as this chat does when starting from scratch?

https://chatgpt.com/share/6760c3b3-bae8-8009-8744-c25d5602bf...


How confident are you in the correctness of that code?

Because, one way or another, you're still going to need to become fluent enough in the problem domain & the API that you can fully review the implementation and make sure chatgpt hasn't hallucinated in any weird problems. And chatgpt can't really explain its own work, so if anything seems funny, you're going to have to sleuth it out yourself.

And at that point, it's just 339 lines of code, including imports, comments, and misc formatting. How much time have you really saved?


Yeah now actually check the nRF905 documentation. You'll find it has basically made everything up.


I imagine though they might replace 3 out of 4 senior programmers (keep one around to sanity check the AI).


That's the same figuring a lot of business folks had when considering off-shoring in the early 2000s - those companies ended up hiring twice as many senior programmers to sanity check and correct the code they got back. The same story can be heard from companies that fired their expensive seniors to hire twice as many juniors at a quarter the price.

I think that software development is just an extremely poor market segment for these kinds of tools - we've already got mountains of productivity tools that minimize how much time we need to spend doing the silly rote programming stuff - most of software development is problem solving.


Oof, the times I've heard something like that with X tech.


Heh, UML is going to save us! The business people can just write the requirements and the code will write itself! /s

Given the growth-oriented capitalist society we live in in the west, I'm not all that worried about senior and super-senior engineers being fired. I think a much more likely outcome is that if a business does figure out a good way to turn an LLM into a force-multiplier for senior engineers, they're going to use that to grow faster.

There is a large untapped niche too that this could potentially unlock: projects that aren't currently economically viable due to the current cost of development. I've done a few of these on a volunteer basis for non-profits but can't do it all the time due to time/financial constraints. If LLM tech actually makes me 5x more productive on simple stuff (most of these projects are simple) then it could get viable to start knocking those out more often.



