Does this sandbox the agents? All I want is a way to keep the agents from writing to and reading from arbitrary places on the filesystem. I want that enforced using operating system primitives rather than a pinky promise with an LLM.
It already worries me that the Cursor agents occasionally try to perform operations with full absolute paths, which they wouldn't be able to know if they were properly sandboxed to the current directory.
I think OpenAI's Codex does this. I'm not sure to what degree, but sandboxing seems to be a priority for that project, possibly to its detriment: the last time I tried it, it was not nearly as good as Claude Code.
Codex-cli does use macOS sandboxing by default. Unfortunately, it causes issues for my workflow because the agent is very restricted in what it is allowed to do (for example, reading and writing the Go build cache), and there is currently no way to configure its command whitelist. I'm looking into using containers to allow the agent more autonomy within its environment.
You can solve this yourself with a little elbow grease with Docker + a devcontainer. I did this and I’m very happy with the results - Claude can do anything it wants, but it can’t push to prod.
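For anyone wondering what that looks like in practice, here is a minimal sketch, not the exact setup described above. It only builds a `docker run` command line that bind-mounts the project directory and nothing else, so host credentials and the rest of the filesystem are simply not visible inside the container. The image name `agent-dev:latest`, the `/workspace` mount point, and the helper name `docker_sandbox_argv` are all assumptions made up for illustration.

```python
import os
import shlex


def docker_sandbox_argv(project_dir, image, command):
    """Build a `docker run` argv exposing only project_dir to the container.

    Hypothetical helper: the image and the /workspace target are assumptions.
    """
    project = os.path.abspath(project_dir)
    return [
        "docker", "run",
        "--rm",                         # discard the container when it exits
        "-it",                          # interactive terminal for the agent
        "-v", f"{project}:/workspace",  # the only host path that is mounted
        "-w", "/workspace",             # start inside the project
        image,
        *command,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(docker_sandbox_argv(".", "agent-dev:latest", ["bash"])))
```

The point of the design is that isolation comes from what is not mounted: no SSH keys, no cloud credentials, no home directory. Network access is still open in this sketch, so it limits filesystem reach rather than everything the agent can do.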
I wrote https://github.com/anoek/sandbox for that exact purpose. It uses overlayfs to protect your system from LLMs making unwanted changes, and it can optionally mask out places you don't want it to be able to read from.
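To illustrate the general overlayfs idea (this is not taken from the linked project, whose implementation may differ considerably), here is a rough sketch that builds the standard Linux `mount -t overlay` command. Writes made through the merged view land in the upper directory, while the lower directory stays untouched. The helper name and the example paths are hypothetical.

```python
def overlay_mount_argv(lower, upper, work, merged):
    """Build the argv for mounting an overlay filesystem on Linux.

    lower:  the real directory to protect (read through, never written)
    upper:  where changes made via the merged view are stored
    work:   scratch directory overlayfs needs, on the same filesystem as upper
    merged: the mount point the agent would actually work in

    Assumption: none of the paths contain commas, which would break the
    comma-separated option string.
    """
    opts = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", opts, merged]


if __name__ == "__main__":
    # Hypothetical paths; actually running the mount normally needs root.
    print(" ".join(overlay_mount_argv(
        "/home/me/project", "/tmp/ovl/upper", "/tmp/ovl/work", "/tmp/ovl/merged")))
```

After a session you can inspect the upper directory to see exactly what was changed and decide whether to copy any of it back.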
You could try sandbox-exec. It's kind of deprecated, but I think it was more or less designed for this exact use case. It's too bad Apple doesn't really support it anymore (although it still works in my limited testing!)
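A minimal sketch of how that could be wired up, assuming the commonly used profile syntax still behaves as it has in the past: the script builds a profile string that denies all file writes except under one directory and passes it to `sandbox-exec -p`. The helper names are made up for illustration, and this only restricts writes; restricting reads needs a much longer profile, because processes have to read system libraries to start at all.

```python
import os


def sbpl_quote(path):
    """Quote a path as a string literal for the sandbox profile language."""
    return '"' + path.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_profile(writable_dir):
    """Profile: allow everything, deny writes, then re-allow writes in one dir."""
    # Resolve symlinks first (on macOS, /tmp is a symlink to /private/tmp).
    root = sbpl_quote(os.path.realpath(writable_dir))
    return "\n".join([
        "(version 1)",
        "(allow default)",
        "(deny file-write*)",
        f"(allow file-write* (subpath {root}))",
    ])


def sandbox_exec_argv(writable_dir, command):
    """Build the argv that runs `command` under the profile (macOS only)."""
    return ["sandbox-exec", "-p", build_profile(writable_dir), *command]


if __name__ == "__main__":
    # Print rather than execute, so the profile can be reviewed first.
    for part in sandbox_exec_argv(".", ["touch", "hello.txt"]):
        print(part)
```

On a Mac you could pass the resulting list to `subprocess.run` and check that a write outside the current directory fails while one inside it succeeds.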
Too bad OSes have such lock-in. Having macOS with great per-folder sandboxing, plus OS-level capabilities that let you avoid the Docker hellscape, would be awesome. Probably not gonna happen until we can one-shot an OS rewrite :)