Jepsen is more like a fuzzer than a unit test suite. It outputs "I did this and I got back this unexpected result". All of those outputs need to be analyzed by hand.
You don't audit a fuzzer by asking "what if it runs the thing wrong?" That's not the point of a fuzzer. The point is to do lots of weird stuff and check whether what the system produces matches what you expect. If the fuzzer flags a result that is actually expected behavior, that's easy to determine, because you have to critically analyze whatever comes out of the tool in the first place.
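To make the fuzzer comparison concrete, here is a minimal sketch in Python. It is not Jepsen's API (Jepsen itself is written in Clojure); `FlakyRegister`, `fuzz`, and the "drops every 7th write" bug are all made up for illustration. The idea is just: throw random operations at a system, compare each read against a trivial model, and hand back whatever disagrees for a human to look at.

```python
import random


class FlakyRegister:
    """Hypothetical system under test: silently drops every 7th write."""

    def __init__(self):
        self.value = 0
        self.writes = 0

    def write(self, v):
        self.writes += 1
        if self.writes % 7 != 0:  # deliberate bug: every 7th write is lost
            self.value = v

    def read(self):
        return self.value


def fuzz(system, steps, seed=0):
    """Run random reads/writes and collect reads that disagree with the model."""
    rng = random.Random(seed)
    expected = 0  # model: a read returns the most recently written value
    anomalies = []
    for step in range(steps):
        if rng.random() < 0.5:
            v = rng.randint(1, 1000)
            system.write(v)
            expected = v
        else:
            got = system.read()
            if got != expected:
                # "I did this and I got back this unexpected result"
                anomalies.append((step, expected, got))
    return anomalies


anomalies = fuzz(FlakyRegister(), steps=200)
for step, expected, got in anomalies[:3]:
    print(f"step {step}: expected {expected}, read {got}")
```

The harness only produces a list of suspicious observations. If the model in `fuzz` were itself wrong, the bogus reports would show up in that same list and get thrown out during the manual review that has to happen anyway.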
It doesn't really matter if the Jepsen test suite is faulty. If it reports something that is (against all odds) not a valid bug, what's the problem? It does not claim to find all problems (and can't), so that's good enough.