This tracks, because the entire system reduces to a sophisticated regression analysis. That's why we keep talking about parameters and parameter counts. They're literally talking about the number of parameters that they're weighting during training. Beyond that there are some mathematical choices in how you interrelated the parameters that yields some interesting emergent phenomena, and there are architecture choices to be made there. But the whole thing boils down to regression, and regression is at its heart a development of a canon from a representative variety of examples.
We are warned in statistics to be careful when extrapolating from a regression analysis.
We are warned in statistics to be careful when extrapolating from a regression analysis.