Just like any other tool, there are people who use it poorly, and people who use it well.
Yes, we're all tired of the endless parade of people who exaggerate the abilities of (current day) AI and claim it can do more than it can do.
But I'm also getting tired of people writing articles that showcase people using AI poorly as if that proves some sort of point about its inherent limitations.
Man hits thumb with hammer. Article: "Hammers can't even drive a simple nail!"
It's not a tool, it's a SaaS. I own and control my tools. I think a John Deere tractor loses its "tool" status when you can't control it. Sure, there are the local models, but those aren't what the vast majority of folks are using or pushing.
This is an incredibly weird view to me. If I borrow a hammer from my neighbor, although I don’t own the hammer, it doesn’t suddenly make the hammer not a tool. Associating a tool with the concept of ownership feels like an odd argument to make.
You control the prompt and the system prompt. No, it's not hyper specialized yet on the training side, but that doesn't matter. You can explicitly control the files it reads in Cursor, and I'm sure Roo and Aider can as well. If you self host, you can control exactly where your data is stored.
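As a concrete example, Cursor supports a gitignore-style `.cursorignore` file to keep files out of the AI's context entirely; the paths below are made up for illustration:

```
# .cursorignore: gitignore-style patterns that exclude files
# from AI context (these paths are illustrative only)
secrets/
*.env
vendor/
legacy/do-not-touch/
```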
I've never seen so many false assumptions in one place.
You get different models, configurations, and system prompts (and DAN-descended jailbreak stuff is treated like DMCA now: an unaccountable blacklist for even trying, in some cases), and you get control vectors and modern variants with more invasive dynamic weight biasing. The expert/friend/coworker dropdown doesn't have all the entries in it: there's a button that makes Claude Code write files full of "in production code we'd do the calculation" mocks and then write a commit message about all the passing tests (with a byline!), but some ops guy pushes that button in the rare event the PID controller or whatever can't cope.
These are hooked up to control theory algorithms based on aggregate and regional KV and prompt cache load. This is true of both fixed and per-token billing. The agent will often be an asset at 4am but a liability at 2pm.
You get experiment-segmented, always, and you get behavior-scoped multi-armed bandit rotation into and out of multiple segment categories (an experiment universe will typically have no fewer than 10,000 segments; each engineer will need maybe 2 or 3, and maybe hundreds of arms per project/feature, so that's a lot of universes).
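For anyone who hasn't seen one: here's a toy Python sketch of what behavior-scoped bandit rotation over segments would look like. To be clear, this illustrates the mechanism I'm describing, not anything any vendor has published; the arm names, rewards, and segment counts are all made up.

```python
import random

# Toy epsilon-greedy bandit: one per segment, rotating that segment
# among hypothetical behavior "arms" (system prompt / decoding configs).
ARMS = ["verbose-helpful", "terse-cheap", "mock-the-tests"]

class EpsilonGreedyBandit:
    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.pulls = {a: 0 for a in arms}
        self.reward = {a: 0.0 for a in arms}

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best arm so far.
        if random.random() < self.epsilon:
            return random.choice(self.arms)
        return max(self.arms, key=lambda a: self.reward[a] / max(self.pulls[a], 1))

    def update(self, arm, r):
        self.pulls[arm] += 1
        self.reward[arm] += r

# A real "universe" would hold thousands of segments; four keeps the toy small.
segment_bandits = {seg: EpsilonGreedyBandit(ARMS) for seg in range(4)}

for _ in range(1000):
    seg = random.randrange(4)
    bandit = segment_bandits[seg]
    arm = bandit.choose()
    # Stand-in reward (engagement minus serving cost, say); invented numbers.
    mean = {"verbose-helpful": 0.6, "terse-cheap": 0.5, "mock-the-tests": 0.2}[arm]
    bandit.update(arm, random.gauss(mean, 0.1))
```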
At this stage of the consumer internet cycle it's about unit economics, regulatory capture, and stock manipulation via the hype rollercoaster. And make no mistake about what kind of companies these are: they have research programs with heavy short-run applications in mind, and a few enclaves where they do AlphaFold or something. I'm sure they created an environment Carmack would tolerate, at least for a while, but I give it a year or two. We saw that movie at Oculus, and Bosworth is a pretty good guy; he's like Jesus compared to the new boss.
In this extended analogy about users, owners, lenders, borrowers and hammers, I'd be asking what is the hammer and who is the nail.
Many SaaS products are tools. I'm sure when tractors were first invented, people felt that they didn't "control" it compared to directly holding shovels and manually doing the same work.
Not to say that LLMs are as reliable relative to manual coding as tractors are relative to manual labor, but I just think your classification of what is and isn't a tool isn't a fair argument.
I think the OP comment re: AI's value as a tool comes down to this:
Does what it says: When you swing a hammer and make contact, it provides greater and more focused force than your body at that same velocity. People who sell hammers make this claim and sometimes show you that the hammer can even pull out nails really well. The claims about what AI can do are noisy, incorrect, and proffered by people who, I imagine OP would agree, know better. Essentially they are saying "Hammers are amazing. Swing them around everywhere."
Right to repair: Means an opportunity to understand the guts of a thing and fix it to do what you want. You cannot really do this with AI. You can prompt differently, but it can be unclear why you're not getting what you want.
Intentionally or not, the tractor analogy is a rich commentary on this, but it might not make the point you intend. Look into all the lawsuits and shit like that with John Deere and the DRM lockouts, where farmers are losing whole crops because of remote-shutdown cryptography that's physically impossible to remove at any cost, or in any timeframe, less than that of a new tractor.
People on HN love to bring up farm subsidies, and it's a real issue, but big agriculture has special deals and whatnot. They have redundancy and leverage.
The only time this stuff kicks in is when the person with the little plot needs the next harvest to get solvent, and the only outcome it ever achieves is to push one more family farm on the brink into receivership and directly into the hands of a conglomerate.
Software engineers commanded salaries that The Right People found an affront to the order of things, long after doctors and lawyers and the other high-skill trades had been brought to heel via joint licensing and pick-a-number tuition debt loads. This isn't easy to pull off in software, for a variety of reasons, but roughly because the history of computer science in academia is a unique one: it's research-oriented in universities (mostly; there are programs with an applied tilt), yet almost everyone signs up, graduates, and heads to industry without a second thought. So back when the other skilled trades were getting organized into the class system, it was kind of an oddity, regarded as an almost eccentric pursuit by deans and shit.
So while CS fundamentals are critical to good SWEs, schools don't teach them well as a rule, any more than a physics undergraduate is going to be an asset at CERN: it's prep for theory research most never do. Applied CS is just as serious a topic, but you mostly learn it via serious self-study or from coworkers at companies with chops. Even CS graduates who are legends almost always emphasize that if you're serious about hacking, undergrad CS is remedial by the time you run into it (Coders at Work is full of this sentiment).
So to bring this back to tractors and AI: this is about a stubborn nail in what remains of the upwardly mobile skilled middle class, one that multiple illegal wage-fixing schemes have yet to pound flat.
This one will fail too, but that's another mini blog post.
That's where you find out you need $100k in hardware to do so, and I'm lowballing here. And a person who can put it together, it's not quite a typical ops setup.
The article is literally titled, "AI can't even fix a simple bug," reinforces the claim that "the technology can’t understand a simple bug" multiple times, and willfully ignores all the good that it can do. Anyone who read this article without having experienced AI coding themselves would be grossly misled.
Sure, the article makes other good points, too.
But those points could be made without hyperbolic claims about what AI can and can't do, by everyone, not just this author.
The other half of the title is not what I'm complaining about, and it has no effect on the inaccuracy of the first half of the title, which is what I'm complaining about.
The other half of the title is significant; it puts the article into the context of criticizing business leaders for how they think about AI given its current capabilities, which is what you suggested.
The television, the atom bomb, the cigarette rolling machine, and penicillin are also "just tools". They nevertheless changed our world entirely, for better or worse. If you ascribe the impact of AI to the people using AI, you will be utterly, completely bewildered by what is happening and what is going to happen.
It's increasingly a luxury to be a software engineer who is able to avoid some combination of morally reprehensible leadership harming the public, quality craftsmanship in software being in freefall, and ML proficiency being defined downwards to admit terrible uses of ML.
AI coding stuff is a massive lever on some tasks when used by experts. But it's not self-driving, and the capabilities of the frontier vendor stuff might be trending down; they're certainly not skyrocketing.
It's like any other tool: a compiler, an editor, a shell, even a browser, though I'd say build tools are the best analogy. You have chosen to become proficient or even expert, or you haven't and rely on colleagues or communities that provide that expertise. Pick a project or a company: you know whether you should be messing around with the build or asking a build person.
AI is no different. Claude 4 Opus just went GA and it's still in power-user tune; they don't have the newb/cost-control defaults dialed in yet, so it's really useful and probably will be for a few days, until they get the PID controller wired up to whatever a control vector is these days, and then it will tank to useless slop just like 3.7.
For a week I'll get a little boost in my output and pay them a grand and be glad I did, and then it will go back to worse than useless.
> there are people who use it poorly, and people who use it well.
Precisely. AI needs appropriate and sufficient guidance to be able to write code that does the job. I make sure my prompts have all of the necessary implementation detail that the AI will need. Without this guidance, the expected result is not a good one.
Well, it's where we are now with AI technology. Perhaps a superior future AI will need less of it. For now I give it all that I think it won't reliably figure out on its own.
It's actually where we have always been with software development, which is why so many software professionals think the media narrative is so stupid.
Oh, you want to fire your engineers? Easy, just perfectly specify exactly what you want and how it should work! Oh, that's what the engineers are for? Huh!
Yup, but the management, essentially the ones who control the hiring and firing, are more clued into the media narrative, with a bias toward dismissing what engineers think.
> AI needs appropriate and sufficient guidance to be able to write code that does the job.
Note that the example (shitty Microsoft) implementation was not able to properly run tests during its work, not even tests it had written itself.
If you have an existing codebase that already has plenty of tests and you ask AI to refactor something whilst giving it the access it needs to run those tests, it can already sometimes do a great job all by itself.
Good specifications and documentation also do a lot, of course, but the iterative approach, with feedback on whether things are actually working as intended, is a game changer. Unsurprisingly, it's also a lot closer to how humans do things.
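The loop itself is simple. Here's a toy sketch in Python of what I mean; `ask_model_to_edit` is a hypothetical stand-in for whatever agent or API you use, and the details (pytest, round limits) are just assumptions for illustration:

```python
import subprocess

def ask_model_to_edit(instruction: str) -> None:
    # Hypothetical stub: call your coding agent / LLM API here.
    raise NotImplementedError

def refactor_with_feedback(task: str, max_rounds: int = 5) -> bool:
    """Let the model edit code, run the existing tests, feed failures back."""
    feedback = ""
    for _ in range(max_rounds):
        ask_model_to_edit(task + feedback)
        # Run the project's existing test suite and capture the output.
        result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        if result.returncode == 0:
            return True  # all tests pass, we're done
        # Hand the failures back to the model for the next round.
        feedback = "\n\nTests failed; fix and retry:\n" + result.stdout[-4000:]
    return False
```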
The iterative approach has one problem -- it is onerous to repeat the lengthy iterative process with a different model, as it will lead to an entirely different conversation. In contrast, when the spec is well-written up-front, it is trivial to switch models to see how the other model implements it differently.
> Just like any other tool, there are people who use it poorly, and people who use it well.
We are not talking about random folks here but about the largest software company with high stakes in the most popular LLM trying to show off how good it is. Stephen Toub is hardly a newbie either.
AI can be used as a tool, sure, but it is distinct from other technologies in that it is an agent, that is: it has or operates with agency.
There are many prominent people out there who are saying that AI will replace SWEs, not just saying that AI will be a tool added to the SWE tool belt.
Although it seems like you agree with the conclusion of the article you are criticizing, the context is much more complex than "AI is a tool like a hammer."
It is not "an agent" in the sense you are implying here, it does not will, want, plan, none of those words apply meaningfully. It doesn't reason, or think, either.
I'll be excited if that changes, but there is absolutely no sign of it changing. I mean, explicitly, the possibility of thinking machines is where it was before this whole thing started; maybe slightly higher, but more so because a lot of money is being pumped into research.
LLMs might still replace some software workers, or lead to some reorganising of tech roles, but for a whole host of reasons, none of which are related to machine sentience.
As one example: software quality matters less and less as users get locked in. If some juniors get replaced by LLMs and code quality plummets, causing major headaches and higher workloads for senior devs, managers will be skipping around happily as long as sales don't dip.
I didn't mean to imply AI was sentient or approaching sentience. Agency seems to be the key distinction between it and other technologies. You can have agency, apparently, without the traits you claim I imply.
Ah, ok, you must be using agency in some new way I'm not aware of.
Can you clarify what exactly you mean then when you say that "AI" (presumably you mean LLMs) has agency, and that this sets it apart from all other technologies? If this agency as you define it makes it different from all other technologies, presumably it must mean something pretty serious.
This is not my idea. Yuval Noah Harari discusses it in Nexus. Gemini (partially) summarizes it like this:
Harari argues that AI is fundamentally different from previous technologies. It's not just a tool that follows instructions, but an "agent" capable of learning, making decisions, and even generating new ideas independently.
> If this agency as you define it makes it different from all other technologies, presumably it must mean something pretty serious.
Yes, AI does seem different and pretty serious. Please keep in mind the thread I was responding to said we should think of AI as we would a hammer. We can think of AI like a tool, but limiting our conception like that basically omits what is interesting and concerning (even in the context of the original blog post).