Agreed. Another example comes in the first minute of the "Attention is All You Need" one.
"[Transformers .. replaced...] ...the suspects from the time.. recurrent networks, convolution, GRUs".
GRU has no place being mentioned here. It's hallucinated in effect, though not wrong - just a misdirecting piece of information that isn't in the original source.
GRU gives a Ben Kenobi vibe: it died out about when this paper was published.
But it's also kind of misinforming to state this, because GRUs are a subtype of recurrent networks. It's a small thing, but I don't think an actual professor would mention GRUs here. It's not relevant (GRUs aren't mentioned in the paper itself), and listing both RNNs and GRUs is a bit like saying "Yes, it uses both ice and frozen water".
So while the conversational style gives me podcast-keep-my-attention vibes, I get an uncanny-valley feeling. No single weird decision is going to rock my world, but each one slightly distorts what's important. Yes, a human could list GRUs just the same, and most professors would probably make that mistake or some other one.
But this is professing to be the next all-encompassing thing, and I don't see how you launch it while knowing it produces content like that. At least with humans, you can learn from five humans and take the overall picture - if only one mentions GRUs, you move on. If there's a single AI source, or AI sources that all tend to make the same mistake (e.g. padding a list with an inappropriate item to keep the conversational style going), that's very different.
It goes on right afterwards to explain that the key thing the Transformer does is rely on a mechanism called attention. It makes more sense in that context IMO.
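For anyone skimming the thread, the attention being referred to is the paper's scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. Here's a minimal single-head sketch in NumPy; the function and variable names are mine, just for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Sketch of the paper's scaled dot-product attention for one head:
    softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # how much each query matches each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the keys
    return weights @ V                                   # weighted sum of the values

# Toy example: 3 tokens with dimension 4; self-attention means Q = K = V = x.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4)
```

The point of the mechanism, and why it replaces the sequential RNN bottleneck discussed below, is that every token attends to every other token in one matrix operation instead of step by step.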
I recently listened to a great episode of "This American Life" [1] that talked about this very subject. It was released in June 2023, which might be ancient history in AI terms, but it discusses whether LLMs are just parrots. It's a nice episode intended for general audiences, so it's pretty enjoyable, and experts are interviewed, so it also seems authoritative.
This is the very next sentence, so it is a little odd that "hence the title" comes before, and not after, "...using something called self attention."
My take is that these are nitpicks, though. I can't count the number of podcasts I've listened to where the subject is my area of expertise and I find mistakes or misinterpretations at the margins, while basically 90% or more of the content is accurate.
[Attention is All You Need - 1:07]
> Voice A: How did the "Attention is All You Need" paper address this sequential processing bottleneck of RNNs?
> Voice B: So, instead of going step-by-step like RNNs, they introduced a model called the Transformer - hence the title.
What title? The paper is entitled "Attention is All You Need".
People are fooling themselves. These are stochastic parrots cosplaying as academics.