> dependent on human curated reward models to score the output to make it usable.
This premise is false: systems that do not depend on human-curated reward models already exist and are currently deployed.
One counterexample is systems that derive their reward from a learned AI scoring function, which lets them bootstrap themselves toward higher and higher levels.
Another is training in simulated or automatically checked environments, for example R1, where whether code compiles is used as a reward signal; here the reward comes from a simulator, not a human.
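To make the compiler-as-reward idea concrete, here is a minimal sketch. The name `compile_reward` and the 0/1 scoring are my own assumptions for illustration, not how R1 or any particular system implements it, and Python's built-in `compile()` stands in for a real compiler toolchain.

```python
def compile_reward(source: str) -> float:
    """Hypothetical reward: 1.0 if the candidate parses and compiles, else 0.0.

    No human judgment is involved; the signal comes entirely from the
    language's own compiler. The code is compiled but never executed.
    """
    try:
        compile(source, "<candidate>", "exec")
    except (SyntaxError, ValueError):
        # SyntaxError: invalid code. ValueError: e.g. null bytes on older Pythons.
        return 0.0
    return 1.0


# Two candidate outputs from a model, scored automatically.
candidates = ["def add(a, b):\n    return a + b\n", "def add(a, b:\n    return a + b\n"]
rewards = [compile_reward(c) for c in candidates]
```

A real setup would presumably go further, for instance by running test cases, but even this binary check is a reward that no human had to curate.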
> So it still depends on human thinking
Since your premise was false, your corollary does not follow from it.
> And there won't be a point when human curated reward models are not needed anymore.
This just repeats your earlier false statement; it is not a new point. Restating a falsehood in different words is probably making you more confident in it, and it can give the impression that you've made a more substantive argument than you really have.