I can't find it anymore, but there is a YouTube video of someone making/training an LLM who showed that switching to lower precision took his small model from bad to lobotomised, and he even speculates that this is what Anthropic does when their servers are overloaded. I notice it too, but have no proof. There seems to be a large opportunity for screwing people over without consequences, though, especially when using these APIs at scale.
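The effect itself is easy to demonstrate in miniature. The sketch below (a toy illustration, not a claim about what any provider actually does) uniformly quantizes the weights of a single random linear layer to fewer and fewer bits and measures how far its output drifts from the full-precision baseline:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(weights, bits):
    """Symmetric per-tensor uniform quantization to the given bit width."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(weights).max() / levels
    return np.round(weights / scale) * scale

# A toy single linear layer standing in for a model.
W = rng.normal(size=(64, 64)).astype(np.float32)
x = rng.normal(size=64).astype(np.float32)

baseline = W @ x
for bits in (8, 4, 2):
    drift = np.abs(quantize(W, bits) @ x - baseline).mean()
    print(f"{bits}-bit weights: mean output drift {drift:.4f}")
```

At 8 bits the drift is tiny; at 2 bits the outputs barely resemble the originals. In a deep network these per-layer errors compound, which is why an aggressive precision drop can tip a model from "bad" to "lobotomised" rather than degrading it gracefully.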