DangerousDoctor's comments

DangerousDoctor · on Nov 20, 2023

The essential problem is that there is no thread-safe way to implement this while maintaining backwards compatibility -- applications can alter the environment block by changing the environ global pointer, applications can also alter the environment block by replacing individual pointers in the environ array, applications can also alter the environment block by altering the strings pointed to by the individual members of the environ array, applications can also alter the environment block by using setenv/putenv/etc.

Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.

jcelerier · on Nov 20, 2023

> Inserting a mutex into the setenv/getenv/etc. functions is pointless because applications are explicitly allowed to modify the environ pointer and array directly without any locking.

by that logic mutex themselves are pointless because nothing ever forces you to use them, even in memory-safe languages you can still access /dev/mem and change bytes? It's stil a useful thing to have.

PhilipRoman · on Nov 20, 2023

The difference is that modifying the environ pointer is explicitly supported behaviour in the standard, poking through /dev/mem is not.

Although I guess a middle ground solution wouldn't be too bad either - most programs don't modify environ directly, so POSIX could offer thread safety for the functions and make multithreading through "environ" UB. This is already kind of explained in the standard:

https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...

jcelerier · on Nov 20, 2023

> The difference is that modifying the environ pointer is explicitly supported behaviour in the standard

they just have to fix the standard. e.g. in my country they manage to improve for instance the standard for electrical plugs every three years, there is NO REASON posix cannot do the same

jeroenhd · on Nov 20, 2023

I think the memory leak solution (copy over the env variables to a new location in memory every time you call setenv and keep the old pointers alive) will cause the fewest crashes.

I would personally go for the aggressive approach (release a new major version of libc that detects multithreaded environments and intentionally crashes out when calling setenv() so people actually notice and fix their broken programs) but I suspect not many people will agree with me on that.

The API is not necessarily bad (it's just very 80s UNIX), but the lack of enforcement of thread-safety causing all kinds of bugs and crashes.

lelanthran · on Nov 20, 2023

Yeah, but the program may not be broken until you, the glibc maintainer, calls `raise(SIGSEGV)`.

Most programs using setenv call it before starting any threads. That is not broken.

Detecting the linkage of thread support and crashing that program on purpose is, frankly, a pathological way to fix a non-broken program.

Besides which, your proposal won't work anyway, because this remains a potential problem in single threaded programs anyway: a program calling getenv, storing the result, and then calling setenv on the same variable and using the previously stored result will break anyway.

In summary, your proposal is broken in two different ways: 1) it breaks well-defined programs, and 2) it fails to break broken programs.

jeroenhd · on Nov 20, 2023

I wouldn't implement it during linkage, obviously single threaded putenv/setenv calls should still be permitted as part of initialisation routines. Count the number if children in /proc/self/task for all I care, the detection needs to happen during runtime.

You're right that putenv/setenv are also horribly broken in other ways, and doing multi thread detection doesn't prevent those problems. In a perfect world we would just kill off these two functions all together, replacing them with either crashes or no-ops, but that'd be an even harder sell.

lelanthran · on Nov 20, 2023

> I wouldn't implement it during linkage, obviously single threaded putenv/setenv calls should still be permitted as part of initialisation routines. Count the number if children in /proc/self/task for all I care, the detection needs to happen during runtime.

That still breaks well-defined, non-broken programs which don't call getenv/setenv in racing ways. There is no way for you do a conditional-upon-threads mechanism without false positives.

> You're right that putenv/setenv are also horribly broken in other ways, and doing multi thread detection doesn't prevent those problems. In a perfect world we would just kill off these two functions all together, replacing them with either crashes or no-ops, but that'd be an even harder sell.

But you don't need to in order to meet your original goal - breaking programs which do call setenv/getenv in the wrong order. Proposing to remove them altogether doesn't fulfill the goal of finding the breakages immediately and introduces breakages in existing programs.

My alternative: use LD_PRELOAD and provide alternative setenv/getenv functions which raise SIGSEGV when setenv is called on a variable more than once, and when getenv is called on a variable that was already setted once. It requires nothing more than a counter for each of setenv/getenv per variable.

That finds programs which actually are broken, with no false positives, and ignores threads altogether because they don't matter under the counter system[1].

Best of all, you can implement this in an afternoon, without needing to modify glibc, and then test it with every single executable on your system to see which ones break.[2]

[1] Since the caller knows they are not thread-safe anyway, we aren't looking for the error where the caller calls setenv concurrently in different threads. That's a different problem.

[2] I would wager good money that few, if any systems, will break under this test.

fch42 · on Nov 20, 2023

Leaking memory is not a "solution". Ever. Maybe for a commercial problem. But not for one in systems API design.

If you copy, provide a new interface. It's time-honoured and proven in Unix to give *_r() ones in such a case.

jeroenhd · on Nov 20, 2023

It solves hard crashes during DNS lookups by wasting a few kilobytes of RAM. Seems like a fine solution to me. The memory leak only occurs in circumstances where the program would've crashed or started messing with random memory anyway.

A proper solution would be to either nuke put/setenv() in the C standard library or redesign the *env() calls entirely, but that would break existing programs.

JohnFen · on Nov 21, 2023

A proper solution would be to implement your own put/setenv in the program that needs special behavior. That solves everything and does not affect any other programs.

slaymaker1907 · on Nov 20, 2023

There are circumstances where it's a perfectly valid solution. For example, suppose you're trying to acquire the lock on something to destroy it. It can be the lesser of two evils just to leak that memory instead of just waiting forever/a very long time to acquire that lock. You just need to ensure that you aren't leaking memory too quickly for whatever your constraints are. For example, most programs wouldn't care about a 1kb/day leak of memory because that would take a very long time to actually become noticeable. Furthermore, there's pretty much always some degree of memory growth just due to heap fragmentation (at least if you're using a language like C which can't do memory compaction via GC).

cryptonector · on Nov 20, 2023

All of this is false, including the last statement. Proof by existence: Solaris/Illumos has a thread-safe (leaking, though the leaks are hidden from memory debuggers), lock-less (whenever you don't write to `environ`) `getenv()`, and thread-safe, locking `setenv()`/`putenv()`/`unsetenv()`: https://src.illumos.org/source/xref/illumos-gate/usr/src/lib...

Yes, it can be done and has been done. glibc has no excuse.

atoav · on Nov 20, 2023

Another problem is that it is hard to reason about security in a C program if all environment variables could change at all times.

If we are talking about the C application deciding when it wants to rescan the environment that is something different, but if your environment can change potentially before and after you check it this opens you up for a heap of new attacks.

fch42 · on Nov 20, 2023

you're right; in addition to that though, I'd like to highlight that the use of some form of locking "inside" set/getenv would gain you nothing at all. That is not because of setenv, but because of getenv. The latter returns you a _pointer_. Whether you call that a reference leak, an ownership breakage ... it's not "yours" and when you have it, you don't "hold" it even if getenv internally were to lock whatever the underlaying data structure might be.

_That_ is the issue. You can only solve that if you change the interface. Make a new one, getenv_r(), have it _copy_ the env var value into a user-provided, user-owned buffer. In that case, you can then assure the returned value is both point-in-time correct and immutable. You can never achieve that with getenv() because if you copy/make the returned pointer owned, the owner needs to free it. which is a break from the current behaviour and so not backwards-compatible ... and hence out of the question.

Lamenting about how broken the interfaces might be and then insisting that the implementation should be fixed is ... "conveniently shortsighted". Not saying this isn't worth fixing, but fix it the right way in the right place.

cryptonector · on Nov 20, 2023

> you're right;

Wrong. As I've pointed out several times in this thread and in other recent threads about getenv(), Solaris/Illumos has an implementation that is lock-less to read (except when you change `_environ`, then it takes a lock at most once until the next time you change `_environ`). It's made safe by "leaking", and by locking in the functions that write. It's only unsafe if you replace the value of `_environ` repeatedly and free the old settings (which I've never seen any code do, and which if you do then you get what you deserve).

SAI_Peregrinus · on Nov 20, 2023

Make getenv copy to an OS-provided buffer. Free at program exit, like any other memory leak. There's an obvious drawback, but it's not changing the function signature.

forrestthewoods · on Nov 20, 2023

Yup. It's a bad, broken, and unfixable API.

The world really needs a new C standard library that doesn't suck.

rewmie · on Nov 20, 2023

> Yup. It's a bad, broken, and unfixable API.

Is it, though?

The only argument I see is that an API can be misused. There are ergonomics debates we can have about it, but a user intentionally abusing an API wanting to do something that's completely wrong is hardly indication the API is broken.

forrestthewoods · on Nov 20, 2023

Exposing global mutable access to something and providing no thread safe version counts as broken in my book. Your book may differ.

A good API would probably only allow constant access. If mutation is for some reason deemed necessary then it should be through a separate set API and the return results of get should be guaranteed safe.

rewmie · on Nov 20, 2023

> Exposing global mutable access to something and providing no thread safe version counts as broken in my book. Your book may differ.

Please show me from which book you got the idea that env variables are expected to change throughout the lifetime of a process.

jeroenhd · on Nov 20, 2023

> Please show me from which book you got the idea that env variables are expected to change throughout the lifetime of a process.

POSIX specifies two functions that alter environment variables. It could've specified that env variables are supposed to be mapped into a read-only memory page where available to indicate that they shouldn't be altered, but it didn't, and instead provided an explicit read/write system.

rewmie · on Nov 20, 2023

> POSIX specifies two functions that alter environment variables.

The same POSIX spec you're citing also states in no ambiguous terms that setenv is not thread-safe.

It's pointless to quote a section of a spec to try to justify failing to comply with the very same section of the very same spec.

jeroenhd · on Nov 20, 2023

I'm not saying it doesn't. I'm just saying the spec indicates that environment variables can change during runtime.

The spec is the problem in my opinion, you can't implement it in a way that doesn't introduce footguns.

necovek · on Nov 20, 2023

It seems they are arguing that setenv should not exist in the first place: the fact it exists suggests it can and should be used, and thus not be a footgun.

jstimpfle · on Nov 20, 2023

> can and should be used, and thus not be a footgun

It can and should be used in the cases where it makes sense, with the restrictions that are documented. It's an API that is fundamentally not thread-safe, you can not use it "safely" (in the modern sense of using it after a lobotomy, in any way that the compiler allows) in a multi-threaded context.

There are other such APIs, and if those APIs were removed it would hurt a lot of old software that is running perfectly fine.

orwin · on Nov 20, 2023

> the fact it exists suggests it can and should be used

I think most people argue about that. Just because it exists doesn't mean it should be used imho.

I've used it exactly once, and that was a school exercise where I had to write a Posix shell (most of a posix shell actually), including built-ins. I do not see another use case tbh.

wredue · on Nov 20, 2023

Your book is wrong.

As with most things programming related, “it depends”.

Functional programming has really, truly done untold and massive damage to the industry. Fortunately, you are free to unpoison your mind. I just hope people eventually do.

eviks · on Nov 20, 2023

Or transition to post-C

xxs · on Nov 20, 2023

Java has System.getenv()... and it's an unmodifiable Map<String, String>. The real culprit it's the attempt to modify it, not C.