
This looks like the sort of bug I'd write back when I used mutexes in my I/O routines. These days, I'd use a lock-free state machine to encode something like this:

   NOT_IN_CACHE -> READING -> IN_CACHE
(the real system would need states for cache eviction, and possibly page mutation).

Readers that encounter the READING state would insert a completion handler into a queue, and readers transitioning out of the READING state would wake up all the completion handlers in the queue.
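As a concrete sketch (C11 for brevity, with made-up names; the waiter queue is elided, and in the real version queue insertion would have to be atomic with the state check):

    #include <stdatomic.h>
    #include <stdio.h>

    /* Hypothetical states; a real cache would add eviction, etc. */
    enum page_state { NOT_IN_CACHE, READING, IN_CACHE };

    static _Atomic int state = NOT_IN_CACHE;

    /* Returns 1 if this caller won the right to issue the read.
       Losers see READING (queue a completion handler) or IN_CACHE. */
    int try_begin_read(void) {
        int expected = NOT_IN_CACHE;
        /* One CAS makes NOT_IN_CACHE -> READING atomic, so exactly
           one reader issues the I/O. */
        return atomic_compare_exchange_strong(&state, &expected, READING);
    }

    /* Called by the winning reader when the I/O completes; this is
       also where queued completion handlers would be woken. */
    void finish_read(void) {
        atomic_store(&state, IN_CACHE);
    }

    int main(void) {
        if (try_begin_read())
            finish_read();  /* we won: do the read, then publish */
        printf("state = %d\n", atomic_load(&state)); /* 2 == IN_CACHE */
        return 0;
    }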

I've been working on an open-source library and a simple (manual) proof system that make it easy to verify that the queue manipulation and the state machine manipulation are atomic with respect to each other:

https://docs.rs/atomic-try-update/0.0.2/atomic_try_update/

The higher-level invariants are fairly obvious once you show that the interlocks are correct, and verifying the interlocks is just a matter of a quick skim of the function bodies that implement them for a given data type.

I've been looking for good write-ups of these techniques, but haven't found any.



The existing btrfs code does use a lock-free state machine for this, `eb->bflags`, which loosely mirrors the regular page flags (hence `UPTODATE`, `DIRTY`, etc.).

But Linux kernel APIs like test_bit(), set_bit(), clear_bit(), test_and_set_bit(), etc. operate on a single bit. The advantage is that they can avoid a CAS loop on many platforms; the disadvantage is that you only get atomic transitions one bit at a time. So the `READING -> UPTODATE` transition is more like

    READING -> (READING | UPTODATE) -> UPTODATE
And the `NOT_IN_CACHE -> READING` transition is not atomic at all:

    if (!(bflags & UPTODATE)) // atomic
                              // race could happen here
        bflags |= READING;    // atomic
The whole state machine could be made atomic with CAS, but that would be (slightly) more expensive.
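For illustration, a CAS loop over the whole flags word would close that window (the mask values here are made up; the kernel's actual flags are bit numbers used with test_bit()/set_bit()):

    #include <stdatomic.h>
    #include <stdbool.h>

    #define UPTODATE (1u << 0)  /* hypothetical mask values */
    #define READING  (1u << 1)

    /* Check UPTODATE and set READING in one atomic step, so no
       second reader can sneak in between the test and the set. */
    static bool try_start_reading(_Atomic unsigned *bflags)
    {
        unsigned old = atomic_load(bflags);
        do {
            if (old & (UPTODATE | READING))
                return false;  /* cached, or a read is already in flight */
        } while (!atomic_compare_exchange_weak(bflags, &old, old | READING));
        return true;
    }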



