Some argue they were kind of tricked into thinking that; see https://www.interconnects.ai/p/openais-o1-using-search-was-a... and other writing by Lambert, which has turned out to be pretty much on point as far as RL and verifiers are concerned.