They just want to reinforce their own bias that OpenAI BAD and DUMB, rationalism GOOD! Really it's their own fault for not understanding enough theory to know what happens when there's a loss of precision in the predicted embedding vector that maps to a token. If enough decimal places get lopped off or nudged, the predicted vector moves slightly away from where it should be and you get a nearby token instead. Instant aphasia. The report said it was a GPU configuration problem, so my guess is that the wrong precision was used on some number of GPUs, but I have no idea how they configure their cluster, so take that with a giant grain of salt.
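To make the mechanism concrete, here's a toy sketch in NumPy of the nearest-token framing above (the vocabulary, dimensions, and noise scales are all made up, and real decoders compute logits rather than distances, but the failure mode is analogous): treat decoding as a nearest-neighbor lookup over an embedding table and see how often small numeric perturbations flip which token wins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "unembedding" table: one row per vocabulary token (purely illustrative).
vocab = ["the", "a", "cat", "dog", "sat"]
dim = 64
token_embeddings = rng.normal(size=(len(vocab), dim)).astype(np.float32)

def nearest_token(vec: np.ndarray) -> str:
    """Greedy decode: return the token whose embedding is closest to vec."""
    dists = np.linalg.norm(token_embeddings - vec, axis=1)
    return vocab[int(np.argmin(dists))]

# A prediction that deliberately sits near the boundary between "cat" and "dog".
clean = 0.52 * token_embeddings[2] + 0.48 * token_embeddings[3]
print("clean prediction decodes to:", nearest_token(clean))

# Simulate numeric error of increasing size and count how often the decoded
# token flips away from the clean answer.
for noise_scale in (0.001, 0.01, 0.1, 0.5):
    trials = 1000
    flips = sum(
        nearest_token(clean + rng.normal(scale=noise_scale, size=dim).astype(np.float32))
        != nearest_token(clean)
        for _ in range(trials)
    )
    print(f"noise scale {noise_scale}: flipped {flips}/{trials} times")
```

Small errors almost never flip the winner, but once the perturbation is comparable to the margin between neighboring tokens, you start getting nearby-but-wrong tokens, which is exactly the "instant aphasia" behavior.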
Agreed, but I think they raise a fair point that some kind of automated testing probably should've existed to look for sudden major changes in prompt -> output behavior between versions, with lots of prompt test cases. Maybe this testing did exist but the problem just didn't surface in it, for some reason?
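Something like the rough sketch below is what I have in mind; everything here (`generate`, the prompt list, the thresholds) is made up, and I have no idea what their actual release gating looks like.

```python
import difflib

def generate(model_version: str, prompt: str) -> str:
    """Hypothetical hook into the serving stack; not a real API."""
    raise NotImplementedError("wire this up to the actual inference endpoint")

# A fixed corpus of prompts covering common use cases (ideally thousands).
PROMPT_SUITE = [
    "Summarize the following paragraph: ...",
    "Translate 'good morning' into Spanish.",
    "Write a haiku about the ocean.",
]

def regression_check(old_version: str, new_version: str,
                     similarity_floor: float = 0.6,
                     max_drift_fraction: float = 0.05) -> bool:
    """Flag a release if too many prompts produce wildly different output.

    Outputs are expected to differ somewhat between versions (sampling,
    legitimate model improvements), so only alarm when a large fraction of
    prompts falls below a rough similarity floor.
    """
    drifted = 0
    for prompt in PROMPT_SUITE:
        old_out = generate(old_version, prompt)
        new_out = generate(new_version, prompt)
        similarity = difflib.SequenceMatcher(None, old_out, new_out).ratio()
        if similarity < similarity_floor:
            drifted += 1
    drift_fraction = drifted / len(PROMPT_SUITE)
    print(f"{drifted}/{len(PROMPT_SUITE)} prompts drifted "
          f"({drift_fraction:.1%}, threshold {max_drift_fraction:.0%})")
    return drift_fraction <= max_drift_fraction
```

A blunt string-similarity check like this wouldn't catch subtle quality regressions, but the gibberish bug described in the report sounds like exactly the kind of gross, widespread drift it should trip on.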