The rate of improvement in LLMs seems to have stalled since Claude 3.5, which was released about a year ago. I think we've probably gone as far as we can with tweaks to transformer architectures, and to do better we'll need a new academic breakthrough, which could take years.