Oh. It really is still that bad. So if the choice is between wrapping the plaintext in layers of security or spinning up a million new server instances to do it via FHE, I know which one everyone will choose.
Accelerators are being developed that claim to get the overhead down to 10x, though I think they will land closer to 100-1000x. That would still be a huge improvement, considering how people use LLMs today for basic tasks like string matching.
Are those accelerators software-only? 10x could let a $4 VPS run server-side checks for backup software (evil clients can't clean backups) and git forges (e.g., don't allow X to push to main).
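
For a rough sense of what those slowdown factors mean on one small box, here's a back-of-envelope sketch. The 1 ms plaintext cost per check is a made-up placeholder, not a measurement, and the function names are just for illustration. It also assumes the cost scales linearly with the slowdown factor and ignores memory and ciphertext size.

```python
# Back-of-envelope: time per server-side check under different FHE slowdowns.
# ASSUMPTION: a plaintext check costs 1 ms. This is a placeholder, not a benchmark.
PLAINTEXT_CHECK_SECONDS = 0.001


def fhe_check_seconds(slowdown, plaintext_seconds=PLAINTEXT_CHECK_SECONDS):
    """Estimated time for one check if FHE is `slowdown` times slower."""
    return plaintext_seconds * slowdown


def checks_per_hour(slowdown, plaintext_seconds=PLAINTEXT_CHECK_SECONDS):
    """How many sequential checks a single core could run in an hour."""
    return 3600 / fhe_check_seconds(slowdown, plaintext_seconds)


if __name__ == "__main__":
    # The three factors mentioned in the thread: 10x, 100x, 1000x.
    for slowdown in (10, 100, 1000):
        per_check = fhe_check_seconds(slowdown)
        per_hour = checks_per_hour(slowdown)
        print(f"{slowdown:>5}x: {per_check:.3f} s per check, ~{per_hour:,.0f} checks/hour")
```

Swap in a measured plaintext cost for your own check to see whether 10x vs 1000x is the difference between "fine on a cheap VPS" and "not happening".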