The spam filter failed to catch it, which is expected since nothing catches 100%, but then the Mail app interpreted the text of the message in the inbox as legitimate and placed it in the priority section, with an excerpt that sounded legit and showed none of the suspicious parts.
This is basically the Achilles heel of LLMs: they’re gullible, and in a context like spam there are many people with a financial incentive to figure out how to exploit that. The rush to deploy them will lead to more of these problems as people start using them on untrusted inputs at scale, and I imagine a ton of money is going to flow to people who say they can limit this.