Basically I described how any content-addressable storage system works (including git), with the complication that our content is spread across various archive files, MPQ here but just as applicable to tarballs or anything else.
The difference between my suggestion and how BTRFS dedupes is that the FS does it on a block basis, which might work better for some kinds of files, but for game assets I think doing it on a file basis is good enough if not better.
The difference between my suggestion and how BTRFS dedupes is that the FS does it on a block basis, which might work better for some kinds of files, but for game assets I think doing it on a file basis is good enough if not better.