The essential problem is that there is no thread-safe way to implement this while maintaining backwards compatibility -- applications can alter the environment block by changing the environ global pointer, applications can also alter the environment block by replacing individual pointers in the environ array, applications can also alter the environment block by altering the strings pointed to by the individual members of the environ array, applications can also alter the environment block by using setenv/putenv/etc.
Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.
> Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.
by that logic mutex themselves are pointless because nothing ever forces you to use them, even in memory-safe languages you can still access /dev/mem and change bytes? It's stil a useful thing to have.
The difference is that modifying the environ pointer is explicitly supported behaviour in the standard, poking through /dev/mem is not.
Although I guess a middle ground solution wouldn't be too bad either - most programs don't modify environ directly, so POSIX could offer thread safety for the functions and make multithreading through "environ" UB. This is already kind of explained in the standard:
> The difference is that modifying the environ pointer is explicitly supported behaviour in the standard
they just have to fix the standard. e.g. in my country they manage to improve for instance the standard for electrical plugs every three years, there is NO REASON posix cannot do the same
I think the memory leak solution (copy over the env variables to a new location in memory every time you call setenv and keep the old pointers alive) will cause the fewest crashes.
I would personally go for the aggressive approach (release a new major version of libc that detects multithreaded environments and intentionally crashes out when calling setenv() so people actually notice and fix their broken programs) but I suspect not many people will agree with me on that.
The API is not necessarily bad (it's just very 80s UNIX), but the lack of enforcement of thread-safety causing all kinds of bugs and crashes.
Yeah, but the program may not be broken until you, the glibc maintainer, calls `raise(SIGSEGV)`.
Most programs using setenv call it before starting any threads. That is not broken.
Detecting the linkage of thread support and crashing that program on purpose is, frankly, a pathological way to fix a non-broken program.
Besides which, your proposal won't work anyway, because this remains a potential problem in single threaded programs anyway: a program calling getenv, storing the result, and then calling setenv on the same variable and using the previously stored result will break anyway.
In summary, your proposal is broken in two different ways: 1) it breaks well-defined programs, and 2) it fails to break broken programs.
I wouldn't implement it during linkage, obviously single threaded putenv/setenv calls should still be permitted as part of initialisation routines.
Count the number if children in /proc/self/task for all I care, the detection needs to happen during runtime.
You're right that putenv/setenv are also horribly broken in other ways, and doing multi thread detection doesn't prevent those problems. In a perfect world we would just kill off these two functions all together, replacing them with either crashes or no-ops, but that'd be an even harder sell.
> I wouldn't implement it during linkage, obviously single threaded putenv/setenv calls should still be permitted as part of initialisation routines. Count the number if children in /proc/self/task for all I care, the detection needs to happen during runtime.
That still breaks well-defined, non-broken programs which don't call getenv/setenv in racing ways. There is no way for you do a conditional-upon-threads mechanism without false positives.
> You're right that putenv/setenv are also horribly broken in other ways, and doing multi thread detection doesn't prevent those problems. In a perfect world we would just kill off these two functions all together, replacing them with either crashes or no-ops, but that'd be an even harder sell.
But you don't need to in order to meet your original goal - breaking programs which do call setenv/getenv in the wrong order. Proposing to remove them altogether doesn't fulfill the goal of finding the breakages immediately and introduces breakages in existing programs.
My alternative: use LD_PRELOAD and provide alternative setenv/getenv functions which raise SIGSEGV when setenv is called on a variable more than once, and when getenv is called on a variable that was already setted once. It requires nothing more than a counter for each of setenv/getenv per variable.
That finds programs which actually are broken, with no false positives, and ignores threads altogether because they don't matter under the counter system[1].
Best of all, you can implement this in an afternoon, without needing to modify glibc, and then test it with every single executable on your system to see which ones break.[2]
[1] Since the caller knows they are not thread-safe anyway, we aren't looking for the error where the caller calls setenv concurrently in different threads. That's a different problem.
[2] I would wager good money that few, if any systems, will break under this test.
It solves hard crashes during DNS lookups by wasting a few kilobytes of RAM. Seems like a fine solution to me. The memory leak only occurs in circumstances where the program would've crashed or started messing with random memory anyway.
A proper solution would be to either nuke put/setenv() in the C standard library or redesign the *env() calls entirely, but that would break existing programs.
A proper solution would be to implement your own put/setenv in the program that needs special behavior. That solves everything and does not affect any other programs.
There are circumstances where it's a perfectly valid solution. For example, suppose you're trying to acquire the lock on something to destroy it. It can be the lesser of two evils just to leak that memory instead of just waiting forever/a very long time to acquire that lock. You just need to ensure that you aren't leaking memory too quickly for whatever your constraints are. For example, most programs wouldn't care about a 1kb/day leak of memory because that would take a very long time to actually become noticeable. Furthermore, there's pretty much always some degree of memory growth just due to heap fragmentation (at least if you're using a language like C which can't do memory compaction via GC).
All of this is false, including the last statement. Proof by existence: Solaris/Illumos has a thread-safe (leaking, though the leaks are hidden from memory debuggers), lock-less (whenever you don't write to `environ`) `getenv()`, and thread-safe, locking `setenv()`/`putenv()`/`unsetenv()`: https://src.illumos.org/source/xref/illumos-gate/usr/src/lib...
Yes, it can be done and has been done. glibc has no excuse.
Another problem is that it is hard to reason about security in a C program if all environment variables could change at all times.
If we are talking about the C application deciding when it wants to rescan the environment that is something different, but if your environment can change potentially before and after you check it this opens you up for a heap of new attacks.
you're right; in addition to that though, I'd like to highlight that the use of some form of locking "inside" set/getenv would gain you nothing at all. That is not because of setenv, but because of getenv. The latter returns you a _pointer_. Whether you call that a reference leak, an ownership breakage ... it's not "yours" and when you have it, you don't "hold" it even if getenv internally were to lock whatever the underlaying data structure might be.
_That_ is the issue. You can only solve that if you change the interface. Make a new one, getenv_r(), have it _copy_ the env var value into a user-provided, user-owned buffer. In that case, you can then assure the returned value is both point-in-time correct and immutable. You can never achieve that with getenv() because if you copy/make the returned pointer owned, the owner needs to free it. which is a break from the current behaviour and so not backwards-compatible ... and hence out of the question.
Lamenting about how broken the interfaces might be and then insisting that the implementation should be fixed is ... "conveniently shortsighted". Not saying this isn't worth fixing, but fix it the right way in the right place.
Wrong. As I've pointed out several times in this thread and in other recent threads about getenv(), Solaris/Illumos has an implementation that is lock-less to read (except when you change `_environ`, then it takes a lock at most once until the next time you change `_environ`). It's made safe by "leaking", and by locking in the functions that write. It's only unsafe if you replace the value of `_environ` repeatedly and free the old settings (which I've never seen any code do, and which if you do then you get what you deserve).
Make getenv copy to an OS-provided buffer. Free at program exit, like any other memory leak. There's an obvious drawback, but it's not changing the function signature.
The only argument I see is that an API can be misused. There are ergonomics debates we can have about it, but a user intentionally abusing an API wanting to do something that's completely wrong is hardly indication the API is broken.
Exposing global mutable access to something and providing no thread safe version counts as broken in my book. Your book may differ.
A good API would probably only allow constant access. If mutation is for some reason deemed necessary then it should be through a separate set API and the return results of get should be guaranteed safe.
> Please show me from which book you got the idea that env variables are expected to change throughout the lifetime of a process.
POSIX specifies two functions that alter environment variables. It could've specified that env variables are supposed to be mapped into a read-only memory page where available to indicate that they shouldn't be altered, but it didn't, and instead provided an explicit read/write system.
It seems they are arguing that setenv should not exist in the first place: the fact it exists suggests it can and should be used, and thus not be a footgun.
> can and should be used, and thus not be a footgun
It can and should be used in the cases where it makes sense, with the restrictions that are documented. It's an API that is fundamentally not thread-safe, you can not use it "safely" (in the modern sense of using it after a lobotomy, in any way that the compiler allows) in a multi-threaded context.
There are other such APIs, and if those APIs were removed it would hurt a lot of old software that is running perfectly fine.
> the fact it exists suggests it can and should be used
I think most people argue about that. Just because it exists doesn't mean it should be used imho.
I've used it exactly once, and that was a school exercise where I had to write a Posix shell (most of a posix shell actually), including built-ins. I do not see another use case tbh.
As with most things programming related, “it depends”.
Functional programming has really, truly done untold and massive damage to the industry. Fortunately, you are free to unpoison your mind. I just hope people eventually do.
Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.