I am curious, do you think it will stay this way? AI couldn't code anything 2.5 years ago, why should it stop with glue code?
Yesterday I fed the first 3 questions from the 2024 Putnam Exam [0] into a new open source LLM and it got 3/3. The progress is incredible and I don't expect it to suddenly stop where it is.
I expect it to get better someday, but not really good within the next 10 years. I've been calling BS on fully autonomous driving for a decade, and I still believe it takes AGI to do that, and I think that's still a long way out.
LLMs literally grew out of autocomplete, and aside from having a huge amount of information, they still don't understand, and they certainly don't learn during inference.
The exam was administered on December 7th, 2024, so it is possible this is in the training data, but unlikely.
The open source model worked through each problem step by step (it was R1, so I could read its chain of thought). That could be an elaborate ruse, but that also seems unlikely.
[0] https://kskedlaya.org/putnam-archive/