How does this compare with a generic tool like Antithesis? I recognize closed so...

aoli-al · 2025-02-22T22:36:05 1740263765

AFAIK, Antithesis uses a hypervisor to achieve deterministic execution. This can be less effective because the hypervisor has no knowledge of language semantics and therefore faces a larger search space. You may check Figures 5 and 6 in our technical report[1], where we compare Fray against RR, a record-and-replay tool that can also be used for concurrency testing at the OS level[2].

[1]: https://arxiv.org/pdf/2501.12618

[2]: https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mo...

vlovich123 · 2025-02-23T17:29:44 1740331784

Antithesis is supposed to be quite a bit faster than rr chaos precisely because it’s a hypervisor, whereas rr intercepts syscalls, which is notoriously slow. So comparing against rr on a per-second basis feels like a bad proxy.

Unlike rr chaos, which I believe uses a random search without any knowledge of past runs, Antithesis is supposed to do a more targeted search through the orderings, using history from previous runs. So rr is similarly a bad proxy for the number of executions needed per bug.

I’m also not sure I see how language semantics can be exploited when you’re interleaving based on different thread orderings. If I understand it correctly, Fray is also slightly more limited than something like Antithesis, which can additionally test I/O failures and different I/O orderings in a distributed setting.
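
For what it's worth, the "larger search space" argument can be made concrete with some back-of-the-envelope counting. The sketch below is not taken from Fray, Antithesis, or rr. It only assumes that a language-aware tool considers context switches at synchronization operations, while a lower-level tool may consider them at many more points. The function name and the step counts are made up for illustration.

```python
from math import comb


def interleavings(steps_a: int, steps_b: int) -> int:
    """Count the orderings of two threads' steps that preserve each
    thread's own program order: choose which of the (a + b) slots
    belong to thread A."""
    return comb(steps_a + steps_b, steps_a)


# Hypothetical workload: each thread runs 20 low-level steps,
# but only 3 of them are synchronization operations.
low_level = interleavings(20, 20)   # every step is a possible switch point
sync_only = interleavings(3, 3)     # switch only at synchronization ops

print(f"low-level schedules: {low_level}")
print(f"sync-only schedules: {sync_only}")
```

Whether this reduction matters in practice depends on how each tool actually chooses its scheduling points and how it searches, which is exactly what the comments above disagree about.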