Side question: can anyone recommend any open-source de-duplication tool(s)? I've realized that I have the same data spread over many drives, but manually going through even a single drive would take a ton of time. I'm wondering if there's something smart enough that you give it the paths to scan and it magically outputs the de-duplicated data to a single coherent place...
Edit: Some corrections. I forgot to mention which OS: GNU/Linux and/or BSDs.
Though that works fine from a script perspective, I'd like a more interactive way of sorting directories etc. Identifying duplicates is just the first step. jdupes helps with linking the files (both soft and hard links come with caveats, though!), but that is mostly to save space, not to help with reorganisation.
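To make the "identifying" step concrete, here is a minimal Python sketch of how it is commonly done: group files by size first, then hash only the files that share a size. This is not how jdupes is implemented, just an illustration; the function names `file_digest` and `find_duplicates` are made up for this example, and the choice of SHA-256 is an assumption.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Map digest -> sorted list of paths, for content seen more than once.

    Hypothetical helper: only reads files, never modifies anything.
    """
    # Pass 1: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files whose size is shared with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_duplicates(["/mnt/drive1", "/mnt/drive2"])` (example paths only) would return the groups of identical files. Note that hard links to the same inode show up as duplicates of each other here, which is one of the caveats mentioned above.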
It seems to me that this is not a trivial problem to solve: de-duplication + reorganization. Maybe I'm wrong. It also seems like the kind of problem where it would be super easy to screw things up if you go with a custom-made script gluing different tools together...
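One way to lower the risk of screwing things up is to separate planning from doing: first compute a list of copy steps, review it, and only then act on it. Below is a rough Python sketch of such a dry run, under my own assumptions: the first occurrence of each content wins, relative paths are preserved under the destination, and name clashes get a digest suffix. The names `sha256_of` and `plan_consolidation` are hypothetical, and nothing here writes to disk.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def plan_consolidation(roots, dest):
    """Return a list of (source, target) copy steps. Dry run: writes nothing."""
    seen = set()    # digests of content that already has a planned target
    taken = set()   # target paths already claimed by some other content
    plan = []
    for root in roots:
        for dirpath, dirs, files in os.walk(root):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                src = os.path.join(dirpath, name)
                if os.path.islink(src) or not os.path.isfile(src):
                    continue
                digest = sha256_of(src)
                if digest in seen:
                    continue  # duplicate content, an earlier copy was kept
                seen.add(digest)
                rel = os.path.relpath(src, root)
                target = os.path.join(dest, rel)
                if target in taken:
                    # Same relative name, different content: keep both by
                    # adding a short digest suffix (assumed naming scheme).
                    stem, ext = os.path.splitext(rel)
                    target = os.path.join(dest, f"{stem}.{digest[:8]}{ext}")
                taken.add(target)
                plan.append((src, target))
    return plan
```

Printing the plan (`for src, dst in plan_consolidation(roots, dest): print(src, "->", dst)`) gives something you can eyeball or diff before copying anything. It still does not solve the interactive reorganisation part, it only makes the scripted route less scary.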