That not-invented-here locking mechanism was a big shock to me. I'd be very interested to know the rationale behind it: are locking primitives somehow not available in file system code?
Locks are perfectly usable in filesystem code, but test_and_set_bit()/wait_on_bit() have lower overhead, so they get used as an optimization. This function is called on every metadata read, so the improved performance/scalability of raw atomics over locks can probably make a difference on fast storage.
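To make the pattern concrete, here is a rough userspace sketch of the idea using C11 atomics. It is not the filesystem code under discussion: the `demo_*` names and the `DEMO_BIT_BUSY` flag are made up for illustration, and the wait loop just yields the CPU, whereas the kernel's wait_on_bit() sleeps on a wait queue until the bit is cleared and the waiter is woken.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: bit 0 means "someone is already doing the work". */
#define DEMO_BIT_BUSY 0u

/* Userspace analogue of test_and_set_bit(): atomically set the bit
 * and return whether it was already set. */
static bool demo_test_and_set_bit(unsigned bit, atomic_ulong *word)
{
    unsigned long mask = 1UL << bit;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Clear the bit. In the kernel this would be paired with a wakeup
 * for sleeping waiters; here the waiters simply poll. */
static void demo_clear_bit(unsigned bit, atomic_ulong *word)
{
    unsigned long mask = 1UL << bit;
    unsigned long old = atomic_fetch_and(word, ~mask);
    assert(old & mask); /* the bit must have been held */
}

/* Crude stand-in for wait_on_bit(): yield until the bit is clear. */
static void demo_wait_on_bit(unsigned bit, atomic_ulong *word)
{
    unsigned long mask = 1UL << bit;
    while (atomic_load(word) & mask)
        sched_yield();
}

/* Returns true if the caller won the bit and must do the work (and
 * later call demo_clear_bit()). Returns false if another thread held
 * the bit; in that case we waited for it to finish. */
static bool demo_begin_or_wait(atomic_ulong *flags)
{
    if (!demo_test_and_set_bit(DEMO_BIT_BUSY, flags))
        return true;
    demo_wait_on_bit(DEMO_BIT_BUSY, flags);
    return false;
}
```

The uncontended path is a single atomic read-modify-write on a flags word the object already has, with no separate lock structure to allocate or touch, which is where the lower overhead would come from.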