> The biggest reason raid btrfs is not trustable is that it has no mechanism for correctly handling a temporary device loss. It will happily rejoin an array where one of the devices didn’t see all the writes. This gives a 1/N chance of returning corrupt data for nodatacow (due to read-balancing), and for all other data it will return corrupt data according to the probability of collision of the checksum. (The default is still crc32c, so high probability for many workloads.) It apparently has no problem even with joining together a split-brained filesystem (where the two halves got distinct writes) which will happily eat itself.
That is just mind bogglingly inept. (And thanks, I hadn't heard THIS one before).
For nocow mode, there is a bloody simple solution: you just fall back to a cow write if you can't write to every replica. And considering you have to have the cow fallback anyways - maybe the data is compressed, or you just took a snapshot, or the replication level is different - you have to work really hard or be really inept to screw this one up.
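To make the fallback concrete, here is a toy sketch of the idea in Python. It is not bcachefs or btrfs code: `Device`, `Extent`, `ToyFS` and the `needs_cow` flag are made-up names for illustration, and the model assumes a write either reaches an online device or is skipped entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Toy block device: a dict of address -> data plus an online flag."""
    online: bool = True
    blocks: dict = field(default_factory=dict)


@dataclass
class Extent:
    """Toy extent: one address, replicated on the listed device indices."""
    addr: int
    devs: list
    nocow: bool = True
    # Hypothetical stand-in for "compressed / snapshotted / wrong replication level".
    needs_cow: bool = False


class ToyFS:
    def __init__(self, devices):
        self.devices = devices
        self.next_addr = 0

    def cow_write(self, data, nocow=True):
        """Allocate a fresh address and write to every device that is online."""
        live = [i for i, d in enumerate(self.devices) if d.online]
        if not live:
            raise OSError("no writable devices")
        addr = self.next_addr
        self.next_addr += 1
        for i in live:
            self.devices[i].blocks[addr] = data
        return Extent(addr=addr, devs=live, nocow=nocow)

    def write(self, extent, data):
        """Overwrite in place only if every replica is writable; otherwise COW."""
        all_writable = all(self.devices[i].online for i in extent.devs)
        if extent.nocow and not extent.needs_cow and all_writable:
            for i in extent.devs:
                self.devices[i].blocks[extent.addr] = data
            return extent
        # Fallback: the new extent lists only devices that saw the write,
        # so the stale copy on the missing device is never referenced again.
        return self.cow_write(data, nocow=extent.nocow)

    def read(self, extent, pick):
        """Read from one replica; `pick` models read balancing choosing a device."""
        dev = self.devices[extent.devs[pick % len(extent.devs)]]
        return dev.blocks[extent.addr]
```

In this model, if a device drops out and later rejoins, its old copy of the block is still sitting there, but nothing points at it, so read balancing can't hand it back. A real filesystem would also need to re-replicate the degraded extent afterwards, which the sketch leaves out.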
I honestly have no idea how you'd get this wrong in cow mode. The whole point of a cow filesystem is that it makes these sorts of problems go away.
I'm not even going to go through the rest of the list, but suffice it to say - every single broken thing I've ever seen mentioned about btrfs multi device mode is fixed in bcachefs.
Every. Single. One. And it's not like I ever looked at btrfs for a list of things to make sure I got right, but every time someone mentions one of these things, I'll check the code if I don't remember (some of this code I wrote 10 years ago), and I have yet to see someone mention something broken about btrfs multi device mode that bcachefs doesn't get right.
It's honestly mind boggling.