First we shape our tools, then our tools shape our taste.

That's the interesting part, I think. To the next generation of humans, the smell of ChatGPT text _may_ actually be the smell of good writing. Wouldn't that be a really interesting tragedy of the commons 2.0?


I have way more ideas than I have time to build. Maybe we should talk. What is your preferred stack?


I'd be interested to take you up on this offer. I'm in a similar boat to the OP and looking for things to build for more experience and as a hobby. My preferred tech stack is TypeScript/React (or any framework)/PostgreSQL, or anything in that area. Really anything on the web, or React Native on mobile.


Exactly what I always wanted. I can play naturally but don't know sheet music. Sometimes capturing what I play in notation helps to communicate it to other musicians I am hanging out with...


IMHO, the word agent is quickly becoming meaningless. The amount of agency that sits with the program vs. the user is something that changes gradually.

So we should think about these things in terms of how much agency we are willing to give away in each case, and for what gain [1].

Then the ecosystem question that the paper is trying to solve will actually solve itself, because it is already the case today that in many processes agency has been outsourced almost fully and in others not at all. I posit that this will continue; I just expect a big change in the ratios and types of actions.

[1] https://essays.georgestrakhov.com/artificial-agency-ladder/


An agent, or something that has agency, is just something that takes some action, which could be anything from a thermostat regulating the temperature all the way up to an autonomous entity such as an animal going about its business.

Hugging Face have their own definitions of a few different types of agent/agentic system here:

https://huggingface.co/docs/smolagents/en/conceptual_guides/...

As related to LLMs, it seems most people are using "agent" to refer to systems that use LLMs to achieve some goal - maybe a fairly narrow business objective/function that can be accomplished by using one or more LLMs as a tool to accomplish various parts of the task.
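
In practice that usually looks like a loop along these lines (a hypothetical sketch; call_llm and the tools are placeholders, not any particular library's API):

    # Hypothetical sketch of the common "LLM as agent" pattern: ask the model,
    # in a loop, which tool to call next toward a narrow goal, and feed the
    # tool's output back in.
    def call_llm(prompt):
        raise NotImplementedError    # stand-in for a real model call

    TOOLS = {
        "search": lambda query: f"(search results for: {query})",
        "summarize": lambda text: text[:200],
    }

    def run_agent(goal, max_steps=10):
        history = []
        for _ in range(max_steps):
            prompt = (f"Goal: {goal}\nSteps so far: {history}\n"
                      "Reply with 'tool: argument' or 'DONE'.")
            decision = call_llm(prompt)
            if decision.strip().upper() == "DONE":
                break
            tool, _, arg = decision.partition(":")
            result = TOOLS.get(tool.strip(), lambda a: "unknown tool")(arg.strip())
            history.append((decision, result))
        return history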


> An agent, or something that has agency, is just something that takes some action, which could be anything from a thermostat regulating the temperature all the way up to an autonomous entity such as an animal going about its business.

I have seen "agency" used in a much more specific way than this: An agent is something that has goals expressed as states of a world, and has an internal model of the world, and takes action to fulfill its goals.

Under this definition, a thermostat is not an agent. A robot vacuum cleaner that follows a list of simple heuristics is also not an agent, but a robot vacuum cleaner with a Simultaneous Localization and Mapping algorithm which tries to clean the whole floor with some level of efficiency in its path is an agent.

I think this is a useful definition. It admits a continuum of agency, just like the huggingface link; but it also allows us to distinguish between a kid on a sled, and a rock rolling downhill.

https://www.alignmentforum.org/tag/agent-foundations has some justification and further elaboration.
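
A rough way to see the distinction in code (a hypothetical sketch, not taken from the linked page):

    # Hypothetical sketch: a purely reactive rule vs. something that keeps a
    # world model and has a goal expressed as a state of that world.
    def thermostat(current_temp, setpoint):
        # Reactive: no model, no explicit goal state, just a fixed rule.
        return "heat_on" if current_temp < setpoint else "heat_off"

    class MappingVacuum:
        """Keeps an internal map (world model); its goal is the state
        'every known cell is clean', and it acts to reach that state."""

        def __init__(self):
            self.world = {}          # position -> "clean" | "dirty"
            self.position = (0, 0)

        def observe(self, sensor_readings):
            self.world.update(sensor_readings)   # simplified SLAM-style update

        def choose_action(self):
            dirty = [p for p, state in self.world.items() if state == "dirty"]
            if not dirty:
                return "dock"                    # goal state reached
            return ("move", min(dirty))          # head toward an uncleaned cell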


Hi - have a look at this book if you are interested [1] (Mike Wooldridge, Multi-Agent Systems)

[1] https://amzn.eu/d/6a1KgnL

Here are Mike's credentials: https://www.cs.ox.ac.uk/people/michael.wooldridge/


> IMHO, the word agent is quickly becoming meaningless. The amount of agency that sits with the program vs. the user is something that changes gradually

Yes, the term is becoming ambiguous, but that's because it's abstracting out the part of AI that is most important and activating: the ability to work both independently and per intention/need.

Per the paper: "Key characteristics of agents include autonomy, programmability, reactivity, and proactiveness.[...] high degree of autonomy, making decisions and taking actions independently of human intervention."

Yes, "the ecosystem will evolve," but to understand and anticipate the evolution, one needs a notion of fitness, which is based on agency.

> So we should think about these things in terms of how much agency are we willing to give away in each case

It's unclear there can be any "we" deciding. Given resource-limited development, the ecosystem will evolve according to economic advantage and the capture of value, regardless of our preferences or ethics. (Manufacturing went to China against the wishes of most everyone involved.)

More generally, the value of AI is not just replacing work. It's giving more agency to one person, avoiding the cost and messiness of delegation and coordination. It's gaining the same advantages seen where a smaller team can be much more effective than a larger one.

Right now people are conflating these autonomy/delegation features with the extension features of AI agents (permitting them to interact with databases or web browsers). The extension vendors will continue to claim agency because it's much more alluring, but the distinction will likely become clear in a year or so.


> Manufacturing went to China against the wishes of most everyone involved

Certainly those in China and the executive suites of Western countries wished it, and made it happen. Arguably the western markets wanted it too when they saw the prices dropping and offerings growing.

AI isn't happening in a vacuum. Shareholders and customers are buying it.


I think people keep conflating agency with agents, and that they are actually two entirely different things in real life. Right now agents have no agency - they do not independently come up with new approaches; they're mostly task-oriented.


Thank you, makes sense now


Thank you. A few thoughts.

1. Legal clearly hasn't stopped anyone so far. And probably won't in the future if there is economic value. So I suggest we take this out of the equation for now. Obviously it will be a thing to take care of, but I'm asking theoretical questions.

2. Effort vs. reward. Is it just about this? Or is there something more? I.e. is there a clear plateau, or is it just diminishing returns (in which case a sufficiently low cost of energy solves it)?

3. Robots: yes, on a mechanical level. But robots don't need to be mechanically elegant and uber-precise in order to be useful.


WRT 2:

I do indeed think that it's mostly about effort vs. reward, and that sufficiently low cost of energy, and sufficient time resources, would solve it. But the cost of energy would have to be near zero, and the time allotted would have to be very generous. This is because most mic/webcam/etc. data is of very poor quality -- probably poor enough to actually poison datasets -- so it would need to be mercilessly cleaned up, and then laboriously chunked and sorted. When all's said and done, you're going to devote tremendous labor to cutting >99% of your data, and more labor in categorizing it.

It might be more fruitful to develop a model that creates new training data from internet-derived data. This is a lot more complicated than it may sound, and I don't think I ought to speculate here as to what might be involved, but it still seems more viable than sorting through a Library of Babel of low-quality material.
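
For what it's worth, even just the filtering step looks something like this (a hypothetical sketch; the thresholds and helper functions are made up):

    # Hypothetical sketch of curating raw mic/webcam clips: drop the noisy,
    # the too-short, and the near-duplicate, and expect to keep very little.
    def signal_to_noise(clip): ...          # placeholder quality estimate
    def duration_seconds(clip): ...         # placeholder length check
    def is_near_duplicate(clip, kept): ...  # placeholder dedup check

    def curate(raw_clips, min_snr=15.0, min_len=2.0):
        kept = []
        for clip in raw_clips:
            if signal_to_noise(clip) < min_snr:      # too noisy to be useful
                continue
            if duration_seconds(clip) < min_len:     # too short to carry signal
                continue
            if is_near_duplicate(clip, kept):        # repeats can poison the set
                continue
            kept.append(clip)
        # The surviving fraction is tiny, and the real labour (chunking,
        # labelling, sorting) only starts here.
        return kept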


Hmm... But you don't have to rely on experiments, no? Ilya's original argument was that language is a representation of reality, so even with the noisy data from the internet, with zero feedback loops and experiments, a sufficient amount of compute could allow LLMs to recover the underlying world model to some degree. Wouldn't the same hold true with cameras and robot interactions? Just predict the next frame of reality in the same way you predict the next token of language...

(Actions leading to reactions may or may not be part of the vector we are learning. I mean they should be, but not strictly necessary)

No? What am I missing? Just astronomical compute required? Or something more fundamental?
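
To make the analogy concrete (a hypothetical sketch; the model classes are placeholders, not any particular implementation):

    # Hypothetical sketch: next-token prediction vs. next-frame prediction.
    # Both are "predict the next element of the sequence" objectives; only
    # the data and the loss change.
    import torch.nn.functional as F

    def next_token_loss(language_model, tokens):      # tokens: (batch, seq)
        logits = language_model(tokens[:, :-1])        # predict token t+1 from 1..t
        return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               tokens[:, 1:].reshape(-1))

    def next_frame_loss(video_model, frames):          # frames: (batch, time, C, H, W)
        predicted = video_model(frames[:, :-1])         # predict frame t+1 from 1..t
        return F.mse_loss(predicted, frames[:, 1:])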


Try translating differential equations or musical notation or chemical formulas into English. When we find one language to be useless or inefficient at representing reality, we create another language. Language is just a tool we use to think and transfer info from one chimp brain to another.


Is it either/or? Obviously there is a need for more efficiency. But clearly the data in any webcam stream has meaningful, non-random information about the world. And so the central dogma of deep learning should then mean that if we throw enough compute at it, the underlying useful information/pattern, however minuscule, will be found and compressed.

What am I missing?


But isn't the underlying hypothesis that any data with sufficient underlying pattern is ultimately useful if you throw enough compute at it? His own argument is that even from the nonsense of the internet LLMs could extract general models of the world...



Yes, like that, but less interactive. Clicking to expand every part individually is tedious, especially with no up-front idea of how many levels of detail there are. TFA's example operates on larger blocks of text, too, which seems to work out better.

