The big manual task we haven't automated is going through documents and determining "is this sensitive enough to warrant information controls?" We may just be stuck with that in the way of things.
Just out of curiosity, why would the LLM need network access for this? I.e. feeding the doc to an LLM and asking "is this sensitive information according to these criteria: [...]" should get you most of the way there, no? You'd probably need a handful of (carefully designed) tool calls and a human in the loop somewhere, but it seems achievable.
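To make the "human in the loop" part concrete, here's a rough sketch of the routing I have in mind. Everything in it is hypothetical: `Verdict`, `triage`, the 0.9 threshold, and `keyword_stub` (a stand-in for whatever LLM call you'd actually make) are names I made up for illustration, not any real API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    sensitive: bool
    confidence: float  # 0.0 to 1.0, as reported by the classifier (assumed)


def triage(doc: str, classify: Callable[[str], Verdict], threshold: float = 0.9) -> str:
    """Route a document: auto-decide only when the classifier is confident."""
    verdict = classify(doc)
    if verdict.confidence < threshold:
        return "human_review"  # anything uncertain goes to a person
    return "restricted" if verdict.sensitive else "unrestricted"


def keyword_stub(doc: str) -> Verdict:
    """Hypothetical stand-in for the LLM call; a real one would send doc + criteria."""
    hits = sum(word in doc.lower() for word in ("ssn", "passport"))
    if hits == 0:
        return Verdict(sensitive=False, confidence=0.6)
    return Verdict(sensitive=True, confidence=0.95 if hits > 1 else 0.7)


if __name__ == "__main__":
    for text in ("SSN and passport number inside", "passport renewal tips", "lunch menu"):
        print(text, "->", triage(text, keyword_stub))
```

The point is only that the classifier is a pluggable function and the threshold decides how much lands on a human's desk; where to set it depends on how much each kind of mistake costs you.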
"Do these documents contain models or descriptions of (list of devices redacted for HN), or personally identifying information?" would be a great question to be able to automate, since it sucks up a lot of time that could be more profitably spent doing other things. There are costs to both Type I and Type II errors, so deterministic filters only get us so far (which isn't very far).
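For anyone wondering why deterministic filters run out of road, here's a toy illustration. The regex and the four sample documents are made up for the example (not our real criteria or data); the filter flags anything shaped like a US SSN and we count both kinds of error against hand labels.

```python
import re

# Hypothetical deterministic filter: flag anything shaped like NNN-NN-NNNN.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def regex_flags(doc: str) -> bool:
    return bool(SSN_LIKE.search(doc))


def error_counts(labeled):
    """Return (false_positives, false_negatives) for (doc, is_sensitive) pairs."""
    fp = sum(1 for doc, sensitive in labeled if regex_flags(doc) and not sensitive)
    fn = sum(1 for doc, sensitive in labeled if not regex_flags(doc) and sensitive)
    return fp, fn


# Made-up samples with hand-assigned labels.
SAMPLES = [
    ("Employee SSN: 123-45-6789", True),
    ("Part number 555-12-3456 is back in stock", False),        # Type I: flagged, harmless
    ("Jane Doe, born 4 March 1980, lives at 12 Elm St", True),  # Type II: PII, not flagged
    ("Meeting moved to Tuesday", False),
]

if __name__ == "__main__":
    print(error_counts(SAMPLES))
```

Tightening the pattern to kill the part-number hit does nothing for the free-text PII it never saw, and loosening it to catch more PII flags more harmless documents.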
Humans of course will screw up at least 1% of the time, at least judged retroactively.
The fun part is, if you have non-trivial inputs, even if you don’t change anything, you’ll likely get a different 1% set of errors each time no matter how perfect your judges.
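A quick simulation of what I mean, under a big simplifying assumption: each document independently gets misjudged with probability 1% on any given pass. Real judges won't be that tidy, and `error_set` and the seeds are just names and values I picked for the sketch.

```python
import random


def error_set(n_docs: int, error_rate: float, seed: int) -> set:
    """Indices of documents misjudged in one review pass (independent errors assumed)."""
    rng = random.Random(seed)
    return {i for i in range(n_docs) if rng.random() < error_rate}


if __name__ == "__main__":
    n = 10_000
    pass_a = error_set(n, 0.01, seed=1)
    pass_b = error_set(n, 0.01, seed=2)
    print("errors in pass A:", len(pass_a))
    print("errors in pass B:", len(pass_b))
    print("misjudged in both passes:", len(pass_a & pass_b))
```

Both passes land near 1% overall, but under this model only about 1% of one pass's mistakes would be expected to show up in the other, so re-running the evaluation mostly hands you a fresh set of errors to argue about.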
10% seems pretty high, but it really all depends on what you’re evaluating. If it’s all weird edge cases….