
That used to be how we did it, but this method performed better on super large codebases. One of the reasons is that grepping is a highly effective way to trace function calls to understand the full impact of a change. It's also great for finding other examples of similar code (for example the same library being used) to ensure consistency of standards.
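To make that concrete, here's a minimal Python sketch of the call-site search a grep-based trace boils down to (the repo path and symbol name are made up for illustration; a real agent would just shell out to grep or ripgrep):

    import re
    from pathlib import Path

    def find_call_sites(repo_root: str, symbol: str) -> list[tuple[str, int, str]]:
        """Return (file, line number, line text) for each apparent call site of `symbol`."""
        pattern = re.compile(rf"\b{re.escape(symbol)}\s*\(")
        hits = []
        for path in Path(repo_root).rglob("*.py"):
            for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
                if pattern.search(line):
                    hits.append((str(path), lineno, line.strip()))
        return hits

    # e.g. find_call_sites("./repo", "charge_customer") surfaces every caller
    # a change to that function could affect.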

The main concern here isn’t really whether the agent needs access to the whole codebase. Personally I feel an agent might need access to all or most of the codebase to make better decisions, see how things have been done before, etc.

The real issue is that containers are being used as a security boundary when it’s well known that they are not one. Containers aren't a sufficient isolation mechanism for multi-tenant or untrusted workloads.

Using them to run your code review agent puts your customers' source code at risk of theft unless you use an actual secure sandbox mechanism to protect your customers' data, which, from reading the article, does not seem to be the case.


If that's the case, isn't a grep tool a lot more tractable than a Linux agent that will end up mostly calling `grep`?

But then you can't say it's powered by AI and get that VC money.

Ah ha.

You shouldn't need the entire codebase, just a covering set for the modified files (you can derive this by parsing the files; see the sketch below). If your PR is atomic, the covering set + diff + business context will probably come in under 300k tokens, which Gemini can handle easily. Gemini is quite good even at 500k, and you can rerun it cheaply with KV caching to get a distribution (tell it to analyze the PR from different perspectives).
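A rough sketch of that derivation for a Python repo, assuming module paths mirror the file layout (real import resolution, with packages, relative imports, and re-exports, needs more care; this is only illustrative):

    import ast
    from pathlib import Path

    def covering_set(repo_root: str, modified_files: list[str]) -> set[str]:
        """Collect the modified files plus the in-repo files they import."""
        root = Path(repo_root)
        covered = set(modified_files)
        for f in modified_files:
            tree = ast.parse(Path(f).read_text())
            for node in ast.walk(tree):
                names = []
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.module:
                    names = [node.module]
                for name in names:
                    candidate = root / (name.replace(".", "/") + ".py")
                    if candidate.exists():
                        covered.add(str(candidate))
        return covered

Feed the contents of those files, the diff, and the business context to the model in one prompt; the covering set bounds what the reviewer actually needs to see.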
