
> We finally have a system which encodes not just basic things but high level concepts

That's the thing I'm trying to convey: it's in fact not encoding anything you'd recognize, and if it is, it's certainly not "concepts" as you understand them. I'm not saying it can't correlate text that includes what you call "high level concepts" or do what you imagine to be useful work in that general direction. Again, I'm not claiming it isn't useful, just that it becomes kind of meh once you factor in all the costs and not just the hypothetical, imaginary future productivity gains. AKA building literal nuclear reactors to do something that basically amounts to filling in React templates or whatever BS needs doing.

If it were reasoning, it could start with a small set of bootstrap data and infer/deduce the rest from experience. It cannot. We are not even close; there isn't even a theory to get us there, never mind the engineering. It's not a subtle issue: we have to throw literally all the data we have at it to get it to acceptable levels. At some point you have to retrace some steps and rethink some decisions, but I guess I'm a skeptic.

In short, it's a correlation engine which, again, is very useful and will - I hope - go some way toward improving our lives, but I'm not holding my breath for anything more. A lot of correlation does not causation make. No reasoning can take place until you establish ontology, causality and the whole shebang.
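As a toy illustration of that correlation point (made-up numbers, not any real system): two variables driven by a shared hidden cause correlate almost perfectly, yet neither causes the other, and staring at the correlation alone won't tell you that.

    # Toy example: a hidden common cause produces strong correlation with no
    # causal link between the two observed variables. Numbers are made up.
    import numpy as np

    rng = np.random.default_rng(0)
    heat = rng.normal(size=10_000)                     # hidden common cause
    ice_cream = heat + 0.3 * rng.normal(size=10_000)   # observed variable 1
    sunburn = heat + 0.3 * rng.normal(size=10_000)     # observed variable 2

    print(np.corrcoef(ice_cream, sunburn)[0, 1])       # ~0.9, yet neither causes the other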




I do understand it, but I also think that current LLMs are the first step toward it.

GPT-3 kicked off proper investment into this topic; before that, not enough research was being done in this direction, and now it is. People like Yann LeCun are already exploring different approaches/architectures, but they still use the infrastructure of LLMs (ML/GPUs) and potentially the data.

I never said that LLMs are the breakthrough in consciousness.

But you can also ask an LLM for strategies for thinking; it can tell you a lot of things. We will see whether an LLM ends up being a fundamental part of AGI or not, but GPU/ML probably will be.

I also think that the compression an LLM performs leads to concepts through optimization. You can see from the Anthropic paper that an LLM doesn't work in normal language space but in a high-dimensional one, and then 'expresses' the output in whatever language you like.
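To make that concrete, here is a minimal sketch (toy vocabulary and dimensions, nothing like a real model): text enters and leaves as tokens, but everything in between operates on high-dimensional vectors.

    # Toy sketch of the idea that the work happens in vector space and language
    # only appears at the input/output boundary. All names and sizes are made up.
    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["cat", "dog", "car", "truck"]
    d_model = 8                                       # real models use thousands of dims
    embed = rng.normal(size=(len(vocab), d_model))    # token -> vector (input boundary)

    def encode(word):
        return embed[vocab.index(word)]

    def decode(vector):
        # project back to language: pick the nearest token by dot product
        return vocab[int(np.argmax(embed @ vector))]

    # whatever happens in between works on vectors, not words
    hidden = 0.5 * (encode("cat") + encode("dog"))    # stand-in for the model's internals
    print(decode(hidden))                             # expressed again as a token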

We also see that genuinely multimodal models are better at a lot of tasks because of all the extra context available to them, e.g. estimating what someone said from context.

The necessary infrastructure and power requirements are something I accept too. We can assume, and I do, that further progress in a lot of areas will require this type of compute, and it also addresses our data bottleneck: normal CPU architectures are limited by the memory bus.

Also, if the richest companies in the world invest in nuclear, I think that's a lot better than most other companies doing it: they have much higher margins and more expertise, and CO2 is a market differentiator for them too.

I also expect this amount of compute to be the basis for tackling real issues we all face, like cancer, or improving the detection of cancer and other illnesses. We need to make medicine a lot cheaper, and if someone in Africa can do a cheap x-ray and send it to the cloud for feedback, that would/could help a lot of people.

Doing complex, massive protein analysis or mRNA research in simulation also requires GPUs.

All of this happened in a timespan of only a few years. I have not seen anything progress as fast as AI/ML currently does, and unfortunate as it is, that needs compute.

Even my small in-house image recognition fine-tuning explodes in compute once you do a handful of parameter optimizations, but the quality is a lot better than what we had before.

And enabling people to have a real natural-language UI is HUGE. It makes so much more accessible, and not just for people with a disability.

Things like 'do an ELI5 on topic x', 'explain this concept to me', etc. I would have loved that when I was trying to get through the university math curriculum.

All of that is already crazy. And in parallel, what Nvidia and others are currently doing with ML and robotics also requires all of that compute, and the progress there is again breathtaking. The current flood of basic robots standing up and walking around is due to ML.


I mean, you're not even wrong! Nearly all of these large models rest on the idea that if you put every representation of the world we can gather into a big pile, you can tease some kind of meaning out of it. There isn't really a cohesive theory for that, and certainly no testable way to prove it's true. It does seem like you can build a system that behaves as if it were like that, and I think that's what you're picking up on. But it's probably something else, and something that falls far short of that.


There is an interesting analogy my Analysis I professor once made: the intersection of all valid examples is also a definition of an object. In many ways this is, at least in my current understanding, how ML systems "think". So yes, it will take some superposition of examples and kind of try to interpolate between them. But fundamentally it is, at least so far, always interpolation, not extrapolation.
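A rough sketch of what I mean (toy data, not a claim about any particular model): fit on examples drawn from one range and the fit holds up inside that range, but falls apart as soon as you ask about points outside it.

    # Toy interpolation-vs-extrapolation demo with made-up data: fit a polynomial
    # to sin(x) samples on [0, pi], then evaluate inside and outside that range.
    import numpy as np

    rng = np.random.default_rng(0)
    x_train = rng.uniform(0, np.pi, 200)
    y_train = np.sin(x_train)

    coeffs = np.polyfit(x_train, y_train, deg=5)      # stand-in for "learning from examples"

    x_inside = np.array([0.5, 1.5, 2.5])              # within the training range
    x_outside = np.array([5.0, 6.0])                  # beyond anything seen

    print(np.polyval(coeffs, x_inside) - np.sin(x_inside))    # errors stay tiny
    print(np.polyval(coeffs, x_outside) - np.sin(x_outside))  # errors blow up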

Whether we call that "just regurgitating Stack Overflow" or "it thought up the solution to my problem" mostly comes down to semantics.



