> I lead an applied AI research team where I work Your paycheck depends on peopl...

KronisLV · on Oct 1, 2024

Snarky tone aside, there are different audiences. For example, I primarily work with web dev and some DevOps and I can tell you that the state of both can be pretty dire. Maybe not as much in my particular case, as in general.

Some examples to illustrate the point: supply chain risks and an ever increasing amount of dependencies (look at your average React project, though this applies to most stacks), overly abstracted frameworks (how many CPU cycles Spring Boot and others burn and how many hoops you have to jump through to get thigns done), patterns that mess up the DBs ability to optimize queries sometimes (EAV, OTLT, trying to create polymorphic foreign keys), inefficient data fetching (sometimes ORMs, sometimes N+1), bad security practices (committed secrets, anyone? bad usage of OAuth2 or OIDC?), overly complex tooling, especially the likes of Kubernetes when you have a DevOps team of one part time dev, overly complex application architectures where you have more more services than developers (not even teams). That's before you even get into the utter mess of long term projects that have been touched by dozens of developers over the years and the whole sector sometimes feeling like wild west, as opposed to "real engineering".

That's why articles like this ring true: http://www.stilldrinking.org/programming-sucks

However, the difference here is that I wouldn't overwhelm anyone who might give me money with rants about this stuff and would navigate around those issues and risks as best I can, to ship something useful at the end of the day. Same with having constructive discussions about any of those aspects in a circle of technical individuals, on how to make things better.

Calling the whole concept a "scam" doesn't do anyone any good, when I already derive value from the LLMs, as do many others. Look at https://www.cursor.com/ for example and consider where we might be in 10-20 years. Not AGI, but maybe good auto-complete, codegen and reasoning about entire codebases, even if they're hundreds of thousands of lines long. Tooling that would make anyone using it more productive than those who don't. Unless the funding dries up and the status quo is restored.

obirunda · on Oct 1, 2024

I think at its core it's not that there isn't value or future value, but currently there is an assertion, maybe some blind faith, that it's inevitable that a future version will deliver a free lunch for society.

I think the testimonies often repeated by coders that use these code completion tools is that "it saved me X amount of time on this one problem I had, therefore it's great value". The issue is that these all fall into a research of n=1 test subjects. It's only useful information for the subject. It appears we don't realize in these moments that when we use those examples, even to ourselves, we are users reviewing a product, as opposed to validating if our workflow is not just different but objectively better.

The truth lies in the aggregate data of the quality and crucially the speed by which fixes and requirements are being implemented at scale across code bases.

Admittedly, a lot of code is being generated, so I don't think I can say everyone hates it, but until someone can do some real research on this, all we have are product reviews.

KronisLV · on Oct 1, 2024

> I think at its core it's not that there isn't value or future value, but currently there is an assertion, maybe some blind faith, that it's inevitable that a future version will deliver a free lunch for society.

To me it seems very much like we're somewhere near the peak of the hype cycle: https://en.wikipedia.org/wiki/Gartner_hype_cycle

Except in the case of "AI" we get new releases that seem somewhat impressive and therefore extend the duration for which the inflated expectations can survive. For what it's worth, stuff like this is impressive https://news.ycombinator.com/item?id=41693087 (I fed my homepage/blog into it and the results were good, both when it came to the generated content and the quality of speech)

> The truth lies in the aggregate data of the quality and crucially the speed by which fixes and requirements are being implemented at scale across code bases.

Honestly? I think we'll never get that, the same way I cannot convincingly answer "How long will implementing functionality X in application Y with the tech stack Z for developer W take?"

We can't even estimate tasks properly and don't have metrics for specific parts of the work (how much creating a front end takes, how much for a back end API, how much for the schema and DB migrations, how much for connecting everything, adding validations, adding audit, fixing bugs etc.) because in practice nobody splits them up in change management systems like Jira so far, nor are any time tracking solutions sophisticated enough to figure those out and also track how much of the total time is just procrastination or attending to other matters (uncomfortable questions would get asked them, way too metrics would be optimized for).

So the best we can hope for is some vague "It helps me with boilerplate and repeatable code which is most of my enterprise CRUD system by X% and as a result something that would take me Y weeks now takes me Z weeks, based on these specific cases." Get enough of those empirical data points and it starts to look like something useful.

I think lots of borderline scams and/or bad products based on overblown products will get funded but in a decade we'll probably have mostly those sticking around that have actual utility.

Retric · on Oct 1, 2024

The top comment of your HN link is exactly the issue at hand

> don't know what I would use a podcast like this for, but the fact that something like this can be created without human intervention in just a few minutes is jaw dropping

AI has recently gotten good at doing stuff that seems like it should be useful, but the limitations aren’t obvious. Self driving cars, LLM’s, Stable Diffusion etc are awesome tech demos as long as you pick the best output.

The issue is the real world cares a lot more about the worst outcomes. Driving better than 99% of people 24/7 for 6 months and then really fucking up is indistinguishable from being a bad driver. Code generation happens to fit really well because of how people test and debug code not because it’s useful unsupervised.

Currently balancing supervision effort vs time saved depends a great deal on the specific domain and very little about how well the AI has been trained, that’s what is going to kill this hype cycle. Investing an extra 100 Billion training the next generation of LLM isn’t going to move the needles that matter.

jcgrillo · on Oct 1, 2024

This is absolutely on point, I don't understand why your post is being down-voted.

jamilton · on Oct 1, 2024

1. It's not actually refuting the point being made which is "it's seems hard to take advantage of LLMs' capabilities", which to me seems like a good point that stands on it's own, regardless of who is saying it.

2. The original post is, to my eyes, not implying anything about training a model to help use other models, so the second part seems irrelevant.

passwordoops · on Oct 1, 2024

I disagree. The comment is striking directly at the claim that people aren't taking "advantage of LLMs' capabilities". This is their capability and no amount of "clear communication" is going to change that

gleenn · on Oct 1, 2024

One of the rules of HN is to assume good intent. People do and say the right thing often even when it's antithetical to their source of income. If there is a substantive way to otherwise then say it, don't immediately write-off people's opinions, especially people in their fields because of a possibility of bias.

jcgrillo · on Oct 1, 2024

OK, but the post this was in response to was bloviating about all kinds of sci-fi stuff like "super-intelligence" and the like. It was the opposite of "antithetical to their source of income", instead it was playing into some techno-futurist faith cult.

roenxi · on Oct 1, 2024

There is a strong argument that super-intelligence is already in the rear-view mirror. My computer is better than me at almost everything at this point; creativity, communication, scientific knowledge, numerical processing, etc. There is a tiny sliver of things that I've spent a life working on where I can consistently outperform a CPU, but it is not at all clear how that could be defensible given the strides AI has made over the last few decades. The typical AI seems more capable than the typical human to me. If that isn't super-intelligence then whatever super-intelligence is can't be far away.

jcgrillo · on Oct 1, 2024

> given the strides AI has made over the last few decades

This is where I lost the plot. The techno-futurists always seem to try to co-opt Moore's law or similar scaling laws and claim it'll somehow take care of whatever scifi bugaboo du jour they're peddling, without acknowledging that Moore's law is specifically about transistor density and has nothing to do with "strides" in "AI".

> whatever super-intelligence is can't be far away.

How do you figure? Or is it just an article of your faith?

It's been the same old, tired argument for seven decades--"These darned computers can count real real fast so therefore any day now they'll be able to think for themselves!" But so far nobody's shown it to be true, and all the people claiming it's right around the corner have been wrong. What makes this time different?

roenxi · on Oct 1, 2024

> How do you figure?

https://epochai.org/blog/training-compute-of-frontier-ai-mod...

https://ourworldindata.org/grapher/test-scores-ai-capabiliti...

We're still seeing exponential upswing in compute and we appear to already be probing around human capacity in the models. Past experience suggests that once AIs are within spitting distance of human ability they will exceed what a human mind can do in short order.

These are not subtle trends.

jcgrillo · on Oct 1, 2024

I'm not sure what the amount of computer time spent on training the models has to do with anything, the article states it "is the best predictor of broad AI capabilities we have" without attempting to defend the claim. The "humies" (to use a dated term) benchmarks are interesting but clearly not super indicative of real world performance--one merely has to interact with one of these LLMs to find their (often severe) limitations, and it's not clear at all that more computer time spent on training will actually make them better.

EDIT: re: the computer time metric, by the same token shouldn't block chains have changed the world by now if computer time is the predictor of success? It makes sense for the industry proponents of LLMs to focus on this metric, because ultimately that's what they sell. Microsoft, NVidia, Google, Amazon, etc all benefit astronomically from computationally intensive fads, be it chatbot parlor tricks or NFTs. And the industry at large does as well--a rising tide lifts all boats. It's not at all obvious any of this is worth something directly, though.

roenxi · on Oct 1, 2024

> I'm not sure what the amount of computer time spent on training the models has to do with anything

Fair enough. What do you think is driving the uptick of AI performance and why don't you think it will be correlated with the amount of compute invested?

The limitations business looks like a red herring. Being flawed and limited doesn't even disqualify an AI from being super-intelligent (whatever that might mean). Humans are remarkably flawed and limited, it takes a lot of setup to get them to a point where they can behave intelligently.

> EDIT: re: the computer time metric, by the same token shouldn't block chains have changed the world by now if computer time is the predictor of success?

That seems like it would be a defensible claim if you wanted to make it. One of the trends I keep an eye on is that log(price) is similar to the trend in log(hash rate) for Bitcoin. I don't think it is relevant though because Bitcoin isn't an AI system.

hsjdhdvsk · on Oct 1, 2024

You forget energy efficiency friend. We are FAR more efficient...

threeseed · on Oct 1, 2024

> The typical AI seems more capable than the typical human to me

Your microwave is more capable than a typical human.

If of course your definition of capabilities is narrowly defined to computations and ignores the huge array of things humans can do that computers are no where close to.

roenxi · on Oct 1, 2024

> the huge array of things humans can do that computers are no where close to.

For example?

achenet · on Oct 1, 2024

anything involving an interface with the physical world.

For example, running a lemonade stand.

You'd need thousands, if not millions of dollars to build a robot machine with a computerized brain capable of doing what a 6 year old child can do - produce lemonade from simple ingredients (lemon, sugar, water) and sell it to consumers.

Same with basically all cooking/food-service and hospitality tasks, an physical therapy type tasks (massage, chiropractor, etc)...

heck, even driving on public roads still doesn't seem to be perfect, despite 10+ years on investment and research by leading tech companies, although there is also a regulatory hurdle here.

roenxi · on Oct 1, 2024

You seem to have shifted the conversation's goalposts there - those are things that computers can do, it just costs a lot.

And, more to the point, they aren't indicative of intelligence. Computers have cleared the intelligence requirements to run a lemonade stand by a large margin - and the other tasks too for that matter.

jcgrillo · on Oct 1, 2024

> those are things that computers can do, it just costs a lot

One could travel between continents in minutes on an ICBM with a reentry vehicle bolted to the front but we don't because it's too expensive. It's a perfectly reasonable constraint to demand that a technology be cost effective. Otherwise it has no practical value.

passwordoops · on Oct 1, 2024

>My computer is better than me at almost everything at this point; creativity, communication, scientific knowledge

By that logic, libraries, art museums and calculators are "super intelligent"

roenxi · on Oct 1, 2024

Are you in the habit of talking to your calculator?