Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

That would make Grok the only model capable of protecting its real system prompt from leaking?


Well, for this version people have only been trying for a day or so.


Providing a fake system prompt would make such jailbreaking very unlikely to succeed unless the jailbreak prompt explicitly accounts for that particular instruction.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: