
> Just last week I asked for a script to do image segmentation with a basic UI and Claude just generated that for me in under a minute.

Thing is, what we're seeing is copy-pasting Stack Overflow, just in a fancier way. It sounds like "I asked Google for a nearby restaurant and it found it in like 500ms, my C64 couldn't do that." It sounds impressive (and it is) because it sounds like "it learned about navigating the real world and can now solve everything related to that", but what it actually solved is "a fancy lookup in a GIS database". It's useful, damn sure it is, but once the novelty wears off you start seeing it for what it is instead of what you imagine it is.
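To be concrete, the kind of script that request produces looks roughly like this (my own sketch with OpenCV's Otsu thresholding behind a Gradio UI; not whatever Claude actually emitted, which I obviously haven't seen):

    # A rough guess at the shape of such a script: classic Otsu thresholding
    # behind a minimal Gradio UI. Not the script Claude generated.
    import cv2
    import gradio as gr
    import numpy as np

    def segment(image: np.ndarray) -> np.ndarray:
        # Grayscale, then let Otsu's method pick the threshold for the mask.
        gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
        _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return mask

    # One upload box in, one segmentation mask out.
    gr.Interface(fn=segment, inputs=gr.Image(), outputs=gr.Image()).launch()

Gluing two libraries together like that is exactly the kind of thing Stack Overflow is full of.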

Edit: to drive the point home.

> Claude just generated that

What you think happened is that the AI was "thinking", building an ontology over which it reasoned, and came to the logical conclusion that this script was the right output. What actually happened is that your input correlates to this output according to the trillions of examples it saw. There is no ontology. There is no reasoning. There is nothing. Of course this is still impressive and useful as hell, but the novelty will wear off in time. The limitations are obvious by this point.




I've been following LLMs and AI/ML for a few years now, and not just on a high level.

There is not a single system out there today which can do what Claude can do.

I still see it for what it is: a technology I can use through natural language to get a very diverse set of tasks done, from writing/generating code, to SVGs, to emails, translation, etc.

It's a paradigm shift for the whole world, literally.

We finally have a system which encodes not just basic things but high-level concepts. And often enough, we humans are doing something very similar.

And what limitations are obvious? Tell me? We have not reached any real ceiling yet. We are limited by GPU capacity and by how many architectural experiments a researcher can run. We have plenty of work left to clean up the data sets we use and have. We need to build more infrastructure, better software support, etc.

We have not even reached the phase where we all have local AI/ML chips built in.

We don't even know yet how such a system will behave once every one of us has access to very fast inference like you already get with Groq.


> It's a paradigm shift for the whole world, literally.

That's hyperbolic. I use LLMs daily. They speed up tasks you'd normally use Google for and can extrapolate existing code into other languages. They boost productivity for professionals, but it's not like the discovery of the steam engine or electricity.

> And what limitations are obvious? Tell me? We have not reached any real ceiling yet.

Scaling parameters is the most obvious limitation of the current LLM architecture (transformers). That's why what should have been called GPT-5 is instead named GPT-4.5: it isn't significantly better than the previous model despite having far more parameters, much cleaner training data, and further optimizations.

The low-hanging fruit has already been picked, and the most obvious optimizations have been implemented. As a result, almost all leading LLM companies are now operating at a similar level. There hasn't been a real breakthrough in over two years. And the last huge architectural breakthrough was in 2017 (the paper "Attention Is All You Need").

Scaling at this point yields only diminishing returns. So no, what you're saying isn't accurate: the ceiling is clearly visible now.
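To put numbers on "diminishing returns": with a Chinchilla-style power law for loss versus parameter count (the constants below are made up purely for illustration, not fitted to anything), each 10x in parameters buys you less than the previous 10x did:

    # Toy illustration of diminishing returns from parameter scaling.
    # Loss model: L(N) = E + A / N**alpha (Chinchilla-style form);
    # E, A and alpha here are invented for illustration, not fitted values.
    E, A, alpha = 1.7, 400.0, 0.34

    def loss(n_params: float) -> float:
        return E + A / n_params ** alpha

    prev = loss(1e9)
    for n in (1e10, 1e11, 1e12):            # 10B -> 1T parameters
        cur = loss(n)
        print(f"{n:.0e} params: loss {cur:.3f} (gain {prev - cur:.3f})")
        prev = cur

The printed gains shrink with every step, which is the whole point.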


> ... but it's not like the discovery of the steam engine or electricity.

Completely disagree. People might have googled before, but the human<>computer interface was never in any way as accessible as it is now for a normal human being. Can I use Photoshop? Yes, but I had to learn it. My sisters played around with DALL-E and are now able to do similar things.

It might feel boring to you that technology accessibility trickles down like this, but it changes a lot for a lot of people. The entry barrier to everything got a lot lower. It makes a huge difference to you as a human being whether you have rich parents and good teachers or not. You never had the chance to just get help like this before. Millions of kids struggle because they don't have parents they can ask the questions required for understanding topics in school.

- Steam engine = fundamental for our scaling economy

- Electricity = fundamental for liberating all of us from the daylight hours

- Internet = interconnecting all of us

- LLM/ML/AI = liberating knowledge through accessibility

> There hasn’t been a real breakthrough in over two years.

DeepSeek alone was a real breakthrough.

But let me ask an LLM about this:

- Mixture of Experts (MoE) scaling (rough routing sketch after this list)

- Long-context handling

- Multimodal capabilities

- Tool use & agentic reasoning
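The MoE item in particular is easy to sketch: roughly, a router scores the experts per token and only the best few actually run (a toy numpy version, nothing like a production implementation):

    # Toy sketch of top-k mixture-of-experts routing in numpy.
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_experts, k = 16, 4, 2

    experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # tiny "experts"
    router = rng.standard_normal((d, n_experts))                       # gating layer

    def moe(x: np.ndarray) -> np.ndarray:
        logits = x @ router                     # score each expert for this token
        top = np.argsort(logits)[-k:]           # indices of the k best experts
        weights = np.exp(logits[top])
        weights /= weights.sum()                # softmax over the selected experts
        # Only the chosen experts compute anything -- that's the scaling trick.
        return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

    print(moe(rng.standard_normal(d)).shape)    # (16,)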

Funny enough, your comment comes right before the Claude 4.0 release (again an increase in performance, etc.) and Google I/O.

We don't know if we have found all the "low-hanging fruit". The Meta paper about thinking in latent space came out in February; I would definitely call that low-hanging fruit.

We are limited, very hard, by infrastructure. Every experiment you want to try consumes a lot of it. If you look at the top GPU AI clusters, we don't have that many on the planet: Google, Microsoft/Azure, Nvidia, Baidu, Tesla, xAI, Cerebras. Not that many researchers are able to just work on this.

Google now has its first diffusion-based model active. In 2025! There are so many more approaches, architectures, etc. still left to test. And we are optimizing on every front: cost, speed, precision, etc.


> My sisters played around with DALL-E and are now able to do similar things.

This is in no way, shape, or form similar in any actually productive sense to being skilled at Photoshop. There is absolutely no way these people can mask, crop, tweak color precisely, etc. There are hundreds of these sub-tasks; it's not just "making cool images". No amount of LLMing will make you skilled, and no amount of delegation will let you put these specific questions to the LLM in a skillful way.

There is a very real, fundamental problem here. To be able to ask the right questions you have to have a base of competence that y'all are so happy to throw to the wind. The next generation will not even know what a "mask" is, let alone ask an LLM for details. Educational outcomes are dropping worldwide, and these things are not going to help. They are going to accelerate this bullshit.

> liberating knowledge through accessibility

Because the thing is, availability of knowledge never was the issue. The existence of ridiculous amounts of copyright-free educational material and the hundreds of gigs of books on Project Gutenberg is testament to that.

Even in my youth (the 90s) there were plenty of books and easy-to-access resources to learn, say, calculus. Did I peruse them? Hell no. Did my friends? You bet your ass they were busy wasting time on bullshit as well. Let's just be honest about this.

These problems are not technical and no amount of technology is going to solve them. If anything, it'll make things worse. Good education is everything, focus on that. Drop the AI bullshit, drop the tech bullshit. Read books, solve problems. Focus on good teachers.


I honestly think it's still way too early to say this either way. If your hypothesis that there are no breakthroughs left is right, then it's still a very big deal, but I'd agree with you that it's not steam engine level.

But I don't think "the transformer paper was eight years ago" is strong evidence for that argument at all. First of all, the incremental improvement and commercialization and scaling that has happened in that period of time is already incredibly fast. Faraday had most of the pieces in place for electricity in the 1830s and it took half a century to scale it, including periods where the state of the art began to stagnate before hitting a new breakthrough.

I see no reason to believe it's impossible that we'll see further step-change progressions in AI. Indeed, "Attention is All You Need" itself makes me think it's more likely than not. Out of the infinite space of things to try, they found a fairly simple tweak to apply to existing techniques, and it happened to work extremely well. Certainly a lot more of the solution space has been explored now, but there's still a huge space of things that haven't been tried yet.


LLMs are great at tasks that involve written language. If your task does not involve written language, they suck. That's the main limitation. No matter how hard you push, AI is not a "do-everything machine", which is how it's being hyped.


Written language is very powerful, apparently. After all, an LLM can generate SVGs, Python code to drive Blender, etc.
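Part of why that works is that an SVG is itself just text, so a text model can emit one directly. Something like this (a trivial hand-written example, not model output):

    # An SVG is just text, which is why a language model can produce one directly.
    svg = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
        '<circle cx="50" cy="50" r="40" fill="teal"/>'
        '</svg>'
    )
    with open("circle.svg", "w") as f:
        f.write(svg)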

One demo I saw of LLM tool use: the prompt was "Generate a small snake game", and because the author still had the Blender MCP tool connected, the LLM decided to generate 3D assets through Blender for that game.


Can "everything" be mapped to a written language task (i.e. described)?


> We finally have a system which encodes not just basic things but high level concepts

That's the thing I'm trying to convey: it's in fact not encoding anything you'd recognize, and if it is, it's certainly not "concepts" as you understand them. I'm not saying it cannot correlate text that includes what you call "high-level concepts" or do what you imagine to be useful work in that general direction. Again, I'm not claiming it's not useful, just that it becomes kind of meh once you factor in all the costs and not just the hypothetical, imaginary future productivity gains. AKA building literal nuclear reactors to do something that basically amounts to filling in React templates or whatever BS needs doing.

If it were reasoning, it could start with a small set of bootstrap data and infer/deduce the rest from experience. It cannot. We are not even close, as in there isn't even a theory to get us there, forget about the engineering. It's not a subtle issue: we need to throw literally all the data we have at it to get it to acceptable levels. At some point you have to retrace some steps and rethink some decisions, but I guess I'm a skeptic.

In short, it's a correlation engine which, again, is very useful and will go some way toward improving our lives - I hope - but I'm not holding my breath for anything more. A lot of correlation does not causation make. No reasoning can take place until you establish ontology, causality, and the whole shebang.


I do understand that, but I also think that the current LLMs are the first step toward it.

GPT-3 started proper investment into this topic; there was not enough research being done in this direction before, and now there is. People like Yann LeCun are already analysing different approaches/architectures, but they still use the infrastructure of LLMs (ML/GPUs) and potentially the data.

I never said that the LLM is the breakthrough in consciousness.

But you can also ask an LLM about strategies for thinking. It can tell you a lot of things. We will see whether an LLM will be a fundamental part of AGI or not, but GPUs/ML probably will be.

I also think that the compression an LLM performs leads to concepts through optimization. You can see from the Anthropic paper that an LLM doesn't work in normal language space but in a high-dimensional one, and then "expresses" the output in a language you like.

We also see that real multimodal models are better at a lot of tasks because a lot more context is available to them: estimating what someone said from context, for example.

The necessary infrastructure and power requirements are something I accept too. We can assume - I do - that further progress on a lot of topics will require this type of compute, and it also addresses our data bottleneck: normal CPU architecture is limited by the memory bus.

Also, compared to a lot of other companies, if the richest companies in the world invest in nuclear, I think that is a lot better than anyone else doing it. They have much higher margins and more knowledge, and CO2 is a market differentiator for them too.

I also expect this amount of compute to be the base for fixing real issues we all face, like cancer, or for improving cancer detection and other disease detection. We need to make medicine a lot cheaper, and if someone in Africa can do a cheap X-ray and send it to the cloud to get feedback, that would/could help a lot of people.

Doing complex and massive protein analysis or mRNA research in virtual space also requires GPUs.

All of this happened in a timespan of only a few years. I have not seen anything progress as fast as AI/ML currently does, and as unfortunate as it is, this needs compute.

Even my small in-house image-recognition fine-tuning explodes in compute when you do a handful of parameter optimizations, but the quality is a lot better than what we had before.

And enabling people to have a real natural-language UI is HUGE. It makes so much more accessible, and not just for people with a disability.

Things like "do an ELI5 on topic x", "explain this concept to me", etc. I would have loved that when I was trying to get through the university math curriculum.

All of that is already crazy. But in parallel, what Nvidia and others are currently doing with ML and robotics is also something which requires all of that compute. And the progress there is again breathtaking. The current flood of basic robots standing up and walking around is due to ML.


I mean, you're not even wrong! Almost all of these large models are based on the idea that if you put every representation of the world we can gather into a big pile, you can tease out some kind of meaning. There's not really a cohesive theory for that, and certainly no testable way to prove it's true. It certainly seems like you can make a system that behaves as if it were like that, and I think that's what you're picking up on. But it's actually probably something else, something that falls far short of that.


There is an interesting analogy my Analysis I professor once made: the intersection of all valid examples is also a definition of an object. In many ways this is, at least in my current understanding, how ML systems "think". So yeah, it will take some superposition of examples and kind of try to interpolate between those. But fundamentally it is - at least so far - always an interpolation, not an extrapolation.
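A toy version of that last point (plain numpy curve fitting, nothing to do with how LLMs are actually trained): a model fitted to examples from one range does fine inside it and falls apart outside it.

    # Interpolation vs. extrapolation in miniature: fit a polynomial to sin(x)
    # on [0, 2*pi], then evaluate inside and outside the training range.
    import numpy as np

    x_train = np.linspace(0, 2 * np.pi, 50)
    coeffs = np.polyfit(x_train, np.sin(x_train), 7)      # degree-7 least-squares fit

    inside, outside = np.pi / 3, 3 * np.pi
    print(np.polyval(coeffs, inside), np.sin(inside))      # close to the true value
    print(np.polyval(coeffs, outside), np.sin(outside))    # far from the true value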

Whether we call that "just regurgitating Stack Overflow" or "it thought up the solution to my problem" mostly comes down to semantics.


> There is not a single system out there today which can do what Claude can do.

Of course there is: it's called Gemini 2.5 Pro, and it's also the reason I cancelled my Claude (and earlier, OpenAI) subscriptions (I had quite a few of them to get around limits).


Yeah. It's just fancier techniques than linear regression. Just like the latter takes a set of numbers and produces another set, an LLM takes words and produces another set of words.

The actual techniques are the breakthrough. The results are fun to play with and may be useful on some occasions, but we don't have to put them on a pedestal.
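The analogy in miniature, with toy numbers and a toy bigram table (obviously nothing like what a real model does internally):

    # A regression maps numbers to a number; a language model maps a word
    # sequence to a distribution over next words. Toy versions of both.
    import numpy as np
    from collections import Counter, defaultdict

    # Numbers in, number out.
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.9, 6.2, 8.1])
    slope, intercept = np.polyfit(x, y, 1)
    print(slope * 5.0 + intercept)           # predicted y for x = 5

    # Words in, next-word counts out (a bigram table).
    corpus = "the cat sat on the mat the cat ran".split()
    counts = defaultdict(Counter)
    for word, nxt in zip(corpus, corpus[1:]):
        counts[word][nxt] += 1
    print(counts["the"].most_common())       # [('cat', 2), ('mat', 1)]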


You have the wrong idea of how an LLM works. It's more like a model that iteratively finds associated/relevant blocks. The reasoning is the iterative steps it takes.



