> While hard links are certainly a lesser evil than setuid, and there is little motivation to rid ourselves of them, they do serve to illustrate how a seemingly clever and useful design can have a range of side effects which can weigh heavily against the value that the design tries to bring.
This seems to me to be a bit of throwing out the baby with the bathwater... the problem isn’t links but rather setuid programs changing file permissions in user-writable directories!
I don't see how the security issues described in this article are really tied to hardlinks. If root is doing chmod/chown in a directory that is writable by untrusted users, the same untrusted users can also just remove or rename files. Is there any example that demonstrates an exploit specifically relying on hardlinks?
Apparently a user can create a hardlink to a sensitive root-owned file (like /etc/shadow) in a user-writable directory where they know a privileged process (in this case tmpfiles.d) will come along and chown it to the user, after which that user will own the sensitive file too.
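To make that concrete, here's a rough sketch of the two halves in Python. The paths, the user name and the tmpfiles.d line are all made up for illustration, and it assumes fs.protected_hardlinks is off (as in the CVE):

```python
import os

# --- unprivileged attacker ---
# /srv/example is attacker-writable, and a (made-up) tmpfiles.d entry like
#     Z /srv/example - attacker attacker -
# tells a privileged process to recursively chown it later on.
os.link("/etc/shadow", "/srv/example/shadow")   # allowed: only the permissions
                                                # of /srv/example are checked

# --- later, the privileged process applies the entry (conceptually) ---
# The recursive chown follows the new directory entry to the same inode as
# /etc/shadow, so afterwards the attacker owns /etc/shadow too:
#     os.chown("/srv/example/shadow", attacker_uid, attacker_gid)
```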
Thanks a lot for the clarification! I hadn't quite puzzled this together in my head.
It's really counterintuitive that creating a hardlink is allowed based solely on the permissions of the directory it is created in. I was expecting another permission check based on the directory the file is already sitting in, since that's what gates unlink() and rename().
It also means that if you have write access somewhere on the same filesystem, you can prevent any file you can "see" from being deleted[¹], even if you can't read it. That feels like a possible privacy issue…
[¹] If the user trying to delete the file is aware of the problem, they can truncate it to 0 bytes, but that's not what a plain rm does (because rm is also the tool for removing hardlinks…)
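A tiny self-contained demo of the link-count behaviour and the truncation workaround (made-up filenames, same-user files, so it only shows the mechanics, not the permission aspect):

```python
import os, tempfile

d = tempfile.mkdtemp()
secret = os.path.join(d, "secret.txt")
with open(secret, "w") as f:
    f.write("sensitive contents\n")

keeper = os.path.join(d, "keeper.txt")
os.link(secret, keeper)                 # second directory entry, same inode
print(os.stat(secret).st_nlink)         # 2

os.unlink(secret)                       # what a plain rm does
with open(keeper) as f:
    print(f.read(), end="")             # the data is still fully readable

# The workaround from the footnote: truncate instead of (or before) removing.
with open(keeper, "w"):                 # opening with "w" truncates to 0 bytes
    pass
print(os.stat(keeper).st_size)          # 0

os.unlink(keeper)
os.rmdir(d)
```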
Linux does (since version 3.6) have the ability to prevent users from creating hardlinks to files they don't own. (See man 5 proc under "/proc/sys/fs/protected_hardlinks".) I think FreeBSD has a similar sysctl option.
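If you want to check it on a running system, the knob is just a proc file. A quick Linux-only sketch in Python; the destination path is purely illustrative and would have to live on the same filesystem as the target:

```python
import errno, os

# Current value of the knob (see proc(5)):
with open("/proc/sys/fs/protected_hardlinks") as f:
    print("fs.protected_hardlinks =", f.read().strip())

# With the protection on, an unprivileged attempt to hardlink a file you
# neither own nor have read/write access to is refused with EPERM.
# (If the paths are on different filesystems you get EXDEV instead.)
try:
    os.link("/etc/shadow", "/var/tmp/shadow-link")
except OSError as e:
    print("link refused:", errno.errorcode.get(e.errno, e.errno))
else:
    print("link created (protection is off!)")
    os.unlink("/var/tmp/shadow-link")
```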
The linked article does mention it but warns "If you're not using systemd, the vanilla Linux kernel does not enable these protections by default".
> Couldn’t they introduce the same security feature mentioned for symlinks?
"The tmpfiles.d specification for the Z type more or less implies some kind of recursive chown. The spec heads off one type of vulnerability by saying that symlinks should not be followed; however, hard links are still a problem"
> As in, make it so by default you can’t create a hard link to a file you don’t already have write access to?
From the CVE: "when the fs.protected_hardlinks sysctl is turned off"
A description of that: "When set to “1” hardlinks cannot be created by users if they do not already own the source file, or do not have read/write access to it."
.. which apparently now won't work under systemd either!
IMO, he was wrong on this; it should have been enabled by default, and then the people who need that exceptionally rare legacy stuff could disable it with the same techniques (/proc, initrd) that he currently suggests for enabling it.
You can link to a file that is outside of that directory and thereby get write access to it once its permissions are changed. (The canonical example is linking /etc/passwd into a directory where you expect root to run chown -R you .)
The usual defense is to keep user-writable spaces on separate mount points, where in theory users may be able to link to each other's files, but not to anything important. And then be mindful about whatever dumb script you run that mucks with permissions.
"others' files, but not anything important" reminded me of https://xkcd.com/1200/ - user files are pretty much the only important thing in many scenarios.
I'd be curious to know what use case people have today for hardlinks, ever since symlinks became a thing.
I've been using Linux for more than 20 years and the only case I've found is for rsync incremental backups (--link-dest option), which is great for doing backups to an external USB hard drive and saving space. But that's rather niche.
A standard use case for hardlinks is replacing a file atomically while creating a backup. The steps (sketched in code after the list) are:
1. create, write, close/sync "file.new"
2. hardlink "file" to "file.old" and sync again
3. rename "file.new" to "file"
4. sync to finish
With these steps, regardless of where you are interrupted, you always have a "functioning" copy of the file. This is quite important for tools writing configuration files or state to disk, but also just your plain old office suite preventing total loss of your PhD thesis.
(Yes it's not perfect, sync()ing correctly is hard [you need to sync the files and the directory!], and hard filesystem damage/kernel panics can always break things.)
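A minimal sketch of those four steps in Python, assuming POSIX/Linux semantics; error handling and pre-existing "file.old" corner cases are only lightly handled:

```python
import os

def replace_with_backup(path: str, data: bytes) -> None:
    new, old = path + ".new", path + ".old"

    # 1. create, write, close/sync "file.new"
    fd = os.open(new, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)

    # 2. hardlink "file" to "file.old" so the previous version stays reachable
    if os.path.exists(old):
        os.unlink(old)
    if os.path.exists(path):
        os.link(path, old)

    # 3. rename "file.new" into place; rename(2) replaces "file" atomically
    os.rename(new, path)

    # 4. sync the *directory* so the new entries themselves are durable
    dfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_DIRECTORY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

Between steps 2 and 3, "file" and "file.old" are literally the same inode, so an interruption anywhere in the sequence leaves at least one complete version reachable.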
not so niche. That's a really terrific use for them. Hardlinks are hampered compared to symlinks because they can't be used for directories (which seems just a really silly limitation). It would be awesome to be able to roll up and mirror entire directory trees.
With newer systems this can often be handled at the filesystem level. Both XFS and Btrfs support copy-on-write, so you can get the same effect just by doing "cp --reflink".
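For the curious, a hedged sketch of what `cp --reflink` boils down to on Linux: the FICLONE ioctl asks a CoW-capable filesystem to share extents between two files. The helper name is made up; the ioctl and its constant are the standard Linux ones, but treat this as illustrative:

```python
import fcntl

FICLONE = 0x40049409  # _IOW(0x94, 9, int) from <linux/fs.h>

def reflink_copy(src_path: str, dst_path: str) -> None:
    # Fails (EOPNOTSUPP/EINVAL/EXDEV) if the filesystem can't reflink
    # or the two paths are on different filesystems.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
```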
Sure, that will do a CoW copy, but it's not actually the same effect at all, is it? In other words, writing to the copy will not propagate the writes to the original. (or am I wrong on that)
Besides, won't this walk and copy the entire directory subtree? (I.e., CoW is applied to the files themselves, rather than just sharing the directory entries.)
I'm saying I'd like to just have a hard link to the directory itself; if the directory is effectively a list of inode pointers, I just want to add another pointer to that list. Then, if I create or modify a file in some deeply nested directory under either copy, it will instantly be available in both (or more) locations, without the metadata having to be changed on disk in more than one place. This would be interesting for certain use cases.
I mentioned CoW in the context of the OP's incremental backups, where reflinks are useful. In that case you do want the "copy will not propagate the writes to the original" behaviour.
Sometimes multiple writable entries for a single object are desirable.
As an advanced feature there's plenty of room for abuse and footgun problems, but it's still occasionally, maybe even rarely, the correct tool for the job.
What's more important are multiple readable entries for an object! And for those objects not to disappear unless the ref count on them is zero. The quintessential example is a library... By linking it into a directory the program has access to, you expose the library to the program. By using a hard link, you protect that library from being erased by another user.
Does multi-user POSIX really get much use still? And should it? The model is how old now, and we're still finding vulnerabilities that are more or less by design. Computers are so cheap that almost everyone has one in their pocket, and most people in the first world own 2-3. Multi-user operating systems just don't seem relevant anymore.
Yes. One use case is application sandboxing; the users aren't separate people, but separate programs. E.g., on Android, each app is a "user", and Linux filesystem permissions are used to control what the apps can do. But there are also a few instances where you still see the different-users-ssh-into-the-same-server model; for example, that server might be controlling a cluster and have users ssh'ing in to do otherwise-slow computations.
Using users to sandbox processes is a hack that has never worked well and is only successful when combined with lots of other sandbox technologies such as jails, SELinux or AppArmor.
> One particularly scary example is the implementation of hard links on HFS+. To keep track of hard links, HFS+ creates a separate file for each hard link inside a hidden directory at the root level of the volume. Hidden directories are kind of creepy to begin with, but the real scare comes when you remember that Time Machine is implemented using hard links to avoid unnecessary data duplication.
HFS+ is still only a somewhat extended HFS, and the filesystem itself does not support hardlinks to anything. Hardlink support is simulated in upper layers, which, while questionable design on its own, somewhat alleviates the usual problems with directory hardlinks.
There is a difference between “allows hardlinks to directories to exist” and “allows the user to straightforwardly create them”. As for existence, most Unix filesystems have exactly zero issue with that, while on systems that allow you to link(2) to a directory you have to be root to do it.
Hardlinks to directories not only break a bunch of userspace expectations, but more importantly can create structures that cannot be removed (unlink(2) does not work on directories, and rmdir(2) only works for directories with st_nlink==2). So even on systems that allow creating them, it is restricted to root.
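You can't demonstrate an actual directory hardlink on Linux, but the two syscall restrictions that make such a structure un-removable are easy to see (Python used as a thin wrapper over the syscalls, Linux behaviour assumed):

```python
import os, tempfile

d = tempfile.mkdtemp()
sub = os.path.join(d, "sub")
os.mkdir(sub)
open(os.path.join(sub, "f"), "w").close()

try:
    os.unlink(sub)                 # unlink(2) refuses directories outright
except OSError as e:
    print("unlink:", e)            # EISDIR on Linux (EPERM on some systems)

try:
    os.rmdir(sub)                  # rmdir(2) insists on an empty directory
except OSError as e:
    print("rmdir:", e)             # ENOTEMPTY

# Only once the directory is genuinely empty does rmdir(2) succeed:
os.unlink(os.path.join(sub, "f"))
os.rmdir(sub)
os.rmdir(d)
```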
> macOS allows it as of 10.5, but it is not exposed to the user.
Depends on what you mean by "not exposed to the user": while ln(1) can't create directory hardlinks, it works fine via link(2) (on HFS+), with the limitation that hardlinked directories can't be siblings.
And of course directory hardlinks are fucking terrifying because `unlink(1)` will not work, and `rm(1)` will recursively remove directory contents, so you need to go through `unlink(2)` in C.
Not so much hardlinks, but symlinks are a blight on the POSIX filesystem design. They have caused endless pain and suffering and so many, many CVEs. They need to be eliminated.