Yeah, when the author wrote about that initial query on delay-per-unit-length, I was thinking: "This doesn't tell us whether an LLM can apply the concepts, only whether relevant text was included in its training data."
It's a distinction I fear many people will have trouble keeping in mind, faced with the misleading eloquence of LLM output.
I think you're looking for the terms generalization and memorization.
It has been shown that LLMs can generalize; the important question is whether they generalized the concept or merely memorized it.