
you are absolutely on point - i would prefer having a real filesystem with deduplication (not compression), which offers data in a compact form, with good read speed for further processing.

i was already brainstorming about writing a custom purpose-built archive format, which would give me more fine-grained control over how i lay out data and reference it. the thing is that this archive is most likely not final (additional versions will be added) - a plain filesystem makes adding new entries easier, whereas an archive file might have to be rewritten.

if i go the custom archive route, i can in theory write a virtual filesystem to access it read-only as if it were a real filesystem... and if i design it properly, maybe even write to it.

still would prefer to use a btrfs filesystem tbh ^^ will brainstorm a bit more over the next days - thanks for your input!




This is good thinking, but I think you are basically describing a Restic/Borg repository :)

- Deduplication? Check.

- Compact format? Check.

- Good read speed? Yep. (Proportional to backing store.)

- Custom purpose-built? Yeah, that's what backup programs are for.

- Custom data layout? Check. (Rabin/BuzHash content-defined chunking, SHA256 dedupe.)

- Adding additional versions? Yes. ("Incremental backups"; see above.)

- Virtual filesystem access? `restic mount`/`borg mount`.
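
To make the chunking and dedupe points concrete, here is a rough Python sketch of the general idea. It is not how Restic or Borg actually implement it: I'm using a simple gear-style rolling hash in place of Rabin/BuzHash, and `chunks`, `ChunkStore`, and all the size parameters are made-up names and values for illustration only.

```python
import hashlib
import random

# Gear table: 256 pseudo-random 64-bit values (fixed seed so runs are repeatable).
# Assumption: a gear hash stands in here for Rabin/BuzHash fingerprints.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunks(data: bytes, avg_bits: int = 12, min_size: int = 512, max_size: int = 16384):
    """Yield content-defined chunks of `data`.

    A cut is made where the top `avg_bits` bits of the rolling hash are zero,
    so cut points depend on nearby content rather than on absolute offsets.
    """
    shift = 64 - avg_bits
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # After 64 shifts an old byte no longer influences h (64-byte window).
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if size < min_size:
            continue
        if (h >> shift) == 0 or size >= max_size:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]


class ChunkStore:
    """Toy in-memory store: each chunk is kept once, keyed by its SHA-256."""

    def __init__(self):
        self.blobs = {}  # sha256 hex digest -> chunk bytes

    def put(self, data: bytes) -> list:
        """Store `data` and return the list of chunk IDs that describes it."""
        ids = []
        for chunk in chunks(data):
            cid = hashlib.sha256(chunk).hexdigest()
            self.blobs.setdefault(cid, chunk)  # already-known chunks are not stored again
            ids.append(cid)
        return ids

    def get(self, ids: list) -> bytes:
        """Reassemble the original bytes from a list of chunk IDs."""
        return b"".join(self.blobs[cid] for cid in ids)
```

Adding a new version is then just another `put()`: you keep the returned ID list as that version's "snapshot", and only chunks that were not seen before take up new space. Because the cut points follow the content, inserting a few bytes near the start of a file should only change the chunks around the insertion, not every chunk after it.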


thanks for mentioning those 2 projects, will check them out over the holidays and do some experimenting ^^


The same kind of storage design is used by some non-backup software too, e.g. Perkeep and Seafile.



