I have a different perspective: the underlying problem is calling setenv(). As far as I'm concerned, the environment is a read-only input parameter set on process creation like argv. It's not a mechanism for exchanging information within a process, as used here with SSL_CERT_FILE.
And remember that the exec* family of calls has a version with an envp argument, which is what should be used if a child process is to be started with a different environment — build a completely new structure, don't touch the existing one. Same for posix_spawn.
And, lastly, compatibility with ancient systems strikes again: the environment is also accessible through this:
I think there's a narrow window, at least in some programming languages, when environment variables can be set at the start of a process. But since they're global shared state, they need to be written at most once (zero or one times) and read many times. No libraries should set them. No frameworks should set them. Only application authors should, and it should be dead obvious to the entire team what the last responsible moment is to write an environment variable.
I am fairly certain that somewhere inside the polyhedron that satisfies those constraints there is a large subset that could be statically analyzed and proven sound. But I'm less certain whether Rust could express it cleanly.
Your process can be started in a paused state by a debugger, have new libraries and threads injected into it, and then resumed before a single instruction of your own binary has been executed... and debuggers are far from the only thing that will inject code into your processes. If you're willing to handwave that, pre-main constructors, etc. away, you can write something like this easily enough:
use std::sync::atomic::{AtomicBool, Ordering};

static CREATED: AtomicBool = AtomicBool::new(false);

struct BeforeEnvFreeze(());
struct AfterEnvFreeze(());

impl BeforeEnvFreeze {
    pub fn new() -> Self {
        // Singleton check: refuse to hand out a second token.
        assert!(!CREATED.swap(true, Ordering::SeqCst));
        Self(())
    }
    pub fn freeze(self) -> AfterEnvFreeze { AfterEnvFreeze(()) }
    pub fn set_env(&self, key: &str, value: &str) {
        // Sound only under the handwave above: no other threads exist yet.
        // (set_var is an unsafe fn as of the 2024 edition.)
        unsafe { std::env::set_var(key, value) };
    }
}

impl AfterEnvFreeze {
    pub fn spawn_thread<F: FnOnce() + Send + 'static>(&self, f: F) -> std::thread::JoinHandle<()> {
        std::thread::spawn(f)
    }
}

fn main() {
    let a = BeforeEnvFreeze::new();
    a.set_env("FOO", "1");
    a.set_env("BAR", "2");
    // a.spawn_thread(...); // not available before the freeze
    let b = a.freeze(); // consumes `a`
    b.spawn_thread(|| {}).join().unwrap();
    // a.set_env(...); // not available after the freeze
}
Exercises left to the reader:
• Banning access to the relevant bits of Rust's stdlib, libc, etc. as a means of escaping this "safe" abstraction
• Conning your lead developer into accepting your handwave
• Setting up the appropriate VCS alerts so you have a chance to NAK "helpful" "utility" pull requests that undermine your "protections"
And of course, this all remains a hackaround for POSIX design flaws - your engineering time might be better spent ensuring or enforcing your libc is "fixed" via intentional memory leaks per e.g. https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f... , which may ≈fix more than your Rust programs.
I agree that libraries certainly should not. But why would writing ever be the right choice, even for applications? Doesn't it make far more sense to use env to create some better-typed global configuration object, filling any gaps with defaults, and then use that?
I'd go further and say env should always be read-only and libraries should never even read env vars.
> I think there's a narrow window, at least in some programming languages, when environment variables can be set at the start of a process.
I mean, based on this issue I would say the only safe time is "at the start of the program, before any new threads may have been created".
But again, as others have said, there's no good reason I'm aware of to set environment variables in your own process, and when you spawn a new process you can give it its own environment with any changes you want.
When using C++ I wanted programs to have a function that was called before main() and set up things that got sealed afterwards: parsing command-line arguments, reading environment variables, loading runtime libraries, and maybe looking at the local directory. But I'm not sure it would be a useful and meaningful distinction unless you restructure far too many things.
I remember that on the Fuchsia kernel programs needed to drop capabilities at some point, but the shift needed might be a hard sell given things already "work fine".
Everyone thinks they can be the first to do something, and that there is surely nothing that will happen before them. Unfortunately, everyone save for one is mistaken. Sometimes that chosen one is not even consistent.
This is one of the problems with Singletons. Especially if they end up interacting or being composed.
In Java you’d have the static initializers run before the main method starts. And in some languages that spreads to the imports which is usually where you get into these chicken and egg problems.
One of the solutions here is make the entry point small, and make 100% of bootstrapping explicit.
Which is to say: move everything into the main method.
I’ve seen that work. On the last project it got a little big, and I went in to straighten out some bits and reduce it. But at the end anyone could read for themselves the initialization sequence, without needing any esoteric knowledge.
I know I can fool around with crt0, but I'm not sure how much you can really use that if you plan to use libraries that may depend on global `static` things that get created as they are linked in before `main` starts.
Maybe it's possible, but if I need to review every library (and hope they don't break my assumptions later), I think I've lost the ability to build this separation in a practical way.
You needn't go "hacky" for this; constructors for global/static variables are called before main(). And the underlying linker support is usually "trivially exposed" (via the constructor attribute in gcc/clang, say).
This (obviously?) isn't "110%" perfect as the order of the constructor calls for several such objects may not be well-defined, and were they to create threads (who am I to suggest being reasonable ...) you end up with chicken-egg situations again.
JavaScript only just got top-level await. So what I saw happen is that files that do their own background tasks start those either in their constructor or, in the case of static functions, lazily.
There was one place and only one place where we violated that, and it was in code I worked on. It was a low level module used everywhere else for bootstrapping, and so we collectively decided to do something sneaky in order to avoid making the entire code base async.
And while I find that most of the time people can handle making one special case for a rule, it was a complicated system and even “we” screwed it up occasionally for a good long while.
The problem was we needed to make a Consul call at startup, and the library didn't have a synchronous way to make that call. So all bootstrapping code had to call a function and await it before loading other things that used that module. In the end we had about a dozen entry points (services, dev and diagnostic tools). And I always got blamed, because nobody seemed to remember we decided this together.
I hate singletons. And I ended up with one of only two in the whole project, and that hatred still wasn’t enough to prevent hitting the classical problems with singletons.
That does happen. Still, there is a reason many avoid it. Probably every significant project has places where they do that. But if it isn't in main it is always a little "magic", and that makes it hard to understand how the program works (or worse, the program randomly doesn't work because something is used before it is initialized).
> When using C++ I wanted programs to have a function that was called before main() and set up things that got sealed afterwards: parsing command-line arguments, reading environment variables, loading runtime libraries, and maybe looking at the local directory. But I'm not sure it would be a useful and meaningful distinction unless you restructure far too many things
If you're only reading environment variables you have no problem, though. It's only if you try to change them that it causes issues.
For setting, "only set environment variables in the Bash script that starts your program" might be a good rule.
The "cross platform" way of setting the environment is to set it "from outside" of the program - meaning, through the executor, whether that's the shell or the container runtime or even the kernel commandline if you insist to rewrite init in rust/go/zig/...
It can be as-easy-as spawning your process via "env -i VAR1=... ... myprogram ..." - and given this also clears the dangers of env-insertion exploits, it's good practice.
(The argument that the horses have long bolted with respect to "just do the right thing, OK?!" here holds some water. I'm of the generation, though, where people on the internet could still tell each other they were wrong, and I assert that here: you're wrong if you believe a non-thread-safe unix interface is a bug, no matter what kind of restrictions around its use that implies. You're still wrong if you assume the existence of such restrictions is a bug.)
Some of the docker containers I made ended up having a bash shell as the entry point and I moved most of the environment variable init out of the code and into the script. But in dev sandbox some of that code runs without the script, so it was still a headache.
>Note that Java, and the JVM, doesn't allow changing environment variables. It was the right choice, even if painful at times.
Not sure why it would be considered painful. As for using setenv to modify your own variables: setenv is by definition thread-unsafe, so unless you're running a single-threaded application it would never make sense to call it.
Java does support running child processes with a designated env space (ProcessBuilder.environment is a modifiable map, copied from the current process), so inability to modify its own doesn't matter.
Personally I have never needed to change env variables. I consider them the same as the command line parameters.
> Java doesn't even allow to change the working directory also due to potential multi-threading problems.
Linux and macOS both support per-thread working directory, although sadly through incompatible APIs.
Also, AFAIK, the Linux API can't restore the link between the process CWD and thread CWD once broken – you can change your thread's CWD back to the process CWD, but that thread won't pick up any future changes to the process CWD. By contrast, macOS has an API call to restore that link.
That would be so much wasted engineering effort. The actual solution is simple: read what you need from env, and pass it as parameters to the functions you want to call. The values you have read can be changed locally... and if you really, really want, start a child process with a modified env.
If you really wish, you can change the bootstrap path and allow changing env() for whatever reason you want (likely via copy-on-write). If you don't wish to do that, feel free to spawn a child process with whatever env you desire, then redirect/join its stdin/stdout/stderr (0/1/2).
Those are trivial things, around 100 lines of code, and they have been available since System.getenv() came back (it used to be deprecated and non-functional prior to Java 1.5, i.e. 2004).
You can’t convince me that there is EVER a reason to call setenv() after program init as part of a regular program, outside needing to hack around something specific.
Environmental variables are not a replacement for your config. It’s not a place to store your variables.
Even if the env var API is fully concurrent, it is not convention to write code that expects an env var to change. There isn’t even a mechanism for it. You’d have to write something to poll for changes and that should feel wrong.
> You can’t convince me that there is EVER a reason to call setenv() after program init as part of a regular program, outside needing to hack around something specific.
The most common use I see for this is people setting an env in the current process before forking off a separate process; presumably because they don't realize that you can pass a new environment to new processes.
I wonder what bugs you'd find if you injected a library that overrides setenv() with a crash or error message into various programs. Might be a way to track down these kinds of random irreproducible bugs.
Given how old most UNIX APIs are, and that when I do man fork I get a pointer to execve(), which provides the feature, I guess not knowing is a typical case of google-copy-paste programming.
As a really old school UNIX guy I'd agree with this. Programmatic manipulation of the environment is an 'attractive nuisance' in that I feel anything you might be trying to achieve by using the environment as a string scratch pad of things that are different for different threads, can be coded in a much safer way.
I'd be happy to have you copy the immutable read-only environment vector of strings into your space and then treat that as the source of such things.
I think it would be interesting to build all the packages with a stdlib that dumps core on any call to setenv() or unsetenv(). That would give one an idea of the scope of the problem.
Environment variables are a gigantic, decades-old hack that nobody should be using... but instead everyone has rejected file-based configuration management and everyone is abusing environment variables to inject config into "immutable" docker containers...
> The setenv() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe.
With the increased use of PIE, thunks for both security and due to ARM + the difference between glibc and musl, plus busybox and you have a huge mess.
I would encourage you to play around with ghidra, just to see what return oriented programming and ARM limits does.
Compilers have been good at hiding those changes from us, but the non-reentrant nature will cause you issues even without threads.
Hint, these thunks can get inserted in the MI lowering stage or in the linker.
But setenv() is owned by POSIX, with only getenv() being deferred to the C/C++ standards.
Perhaps someone could submit a proposal on how to make it reentrant to the Open Group. But it wasn't really intended for maintaining mutable state so it may be a hard sell.
That applies mostly to databases using the filesystem.
For configuration files, the write-fsync-move strategy works fine. Generally you don't need fsync, since most people don't use the file system settings that allow data writes to be reordered with the metadata rename.
We use env vars on cloud machines to hold various metadata about the machines. They can be queried by any program and are extremely useful; too useful to be considered a hack. People just misuse them.
It's funny how any hack, no matter how big, somehow becomes a commonplace everyday "solution" once it's needed to work around some quirk of whatever technology is fashionable at the time.
That (managing the env "from the outside") is and always has been the "supposed" way of using it.
Modifying _your own_ environment _at runtime_ is not. The corresponding functions - setenv/getenv - and state - envp/environ - have in the UNIX standards "always" (since threads exist, really) been marked non-MT. "way back when" people were happy to accept that stated restrictions on use don't make bugs. Today, general sense of overentitlement makes (some) people say "but since whatever-trickery can remove this restriction... you're wrong and I'm entitled to my bugfix". I agree the damage is done, though.
Even then, you could maintain a separate copy of the environment that you control and freely mutate. Basically, during startup, you create a copy of the env you received. Any setenv primitive you expose to users will modify this copy (that you can sync properly yourself). When you want to launch a process, you explicitly provide the internal copy of the env to that process, you don't rely on libc providing its own copy.
Of course, this means you won't see any changes to env vars from libraries you may use that call setenv(), but you also shouldn't need, or want, that in a shell.
I still think having a proper synchronous thread safe setenv()/getenv() in libc is the better choice.
If you're writing a shell, you can spend the 15 minutes to write a custom mutable data structure for your envvars; no need to significantly worsen the entire ecosystem to reduce the size of shells by a couple dozen lines (or, rather, move those lines into libc..)
I’ve written a lot of subprocess runners and environmental variables passed to a sub-process is just data at that point and you store it in your own variable like you would store someone’s name or someone’s age.
The underlying problem isn't just setenv, because the string returned by getenv can be invalidated by another call to getenv. ISO C says:
"The getenv function returns a pointer to a string associated with the matched list member. The string pointed to shall not be modified by the program, but can be overwritten by a subsequent call to the getenv function."
In a single threaded virtual machine, you can immediately duplicate the string returned by getenv and stop using it, right there.
Under threads, getenv is not required to be safe.
I think that with some care, it may be; an environment implementation could guarantee that a non-mutating operation like getenv doesn't invalidate any previously returned strings.
I think POSIX does that. It allows getenv to reallocate the environ array, but not the strings themselves:
"Applications can change the entire environment in a single operation by assigning the environ variable to point to an array of character pointers to the new environment strings. After assigning a new value to environ, applications should not rely on the new environment strings remaining part of the environment, as a call to getenv(), secure_getenv(), putenv(), setenv(), unsetenv(), or any function that is dependent on an environment variable may, on noticing that environ has changed, copy the environment strings to a new array and assign environ to point to it."
environ is documented together with the exec family of functions; that's where this is found.
So whereas there are things not to like about environ, it can be the basis for thread safety of getenv in an application that doesn't mutate the environment.
Mutating argv is fine for how it is usually done. That is, to permute the arguments in a getopt() call so that all nonoptions are at the end.
It is fine because it is usually done during the initialization phase, before starting any other thread. setenv() can be used here too, though I prefer to avoid doing that in any case. I also prefer not to touch argv, but since that's how GNU getopt() works, I just go with it.
Once the program is running and has started its threads, I consider setenv() is a big no no. The Rust documentation agrees with me: "In multi-threaded programs on other operating systems, the only safe option is to not use set_var or remove_var at all.". Note: here, "other operating systems" means "not Windows".
It may work for top, but not ps among others. The only reliable way is clobbering argv. That's just the way it is. In my opinion, glibc should finally provide setproctitle(), so programs like postgresql or chrome (https://source.chromium.org/chromium/chromium/src/+/main:bas...) don't have to resort to argv hacks.
Yes, and if there were "setargv()" or "getargv()" functions, they'd have the same issues ;) … but argv is a function parameter to main()¹, and only that.
¹ or technically whatever your ELF entry point is, _start in crt0 or your poison of choice.
> but argv is a function parameter to main()¹, and only that.
> ¹ or technically whatever your ELF entry point is, _start in crt0 or your poison of choice.
Once you include the footnote, at least on linux/macos (not sure about Windows), you could take the same perspective with regards to envp and the auxiliary array. It's libc that decided to store a pointer to these before calling your `main`, not the abi. At the time of the ELF entry point these are all effectively stack local variables.
I mean, yes, we're in "violent agreement" there. It's nice that libc squirrels away a copy and gives you a `getenv()` function with a string lookup, but… setenv… that was just a horrible idea. It's not really wrong to view it as a tool that allows you to muck around with main()'s local variables. Which to me sounds like one should take a shower after using it ;D
(Ed.: the man page should say "you are required to take a shower after writing code that uses setenv(), both to get off the dirt, but also to give you time to think about what you are doing" :D)
Thing is, the (history of the) UNIX APIs (call 'em "libc" if you like) is littered with the undead corpses of horrible ideas. Who thought global file write offsets were great? Append-only writes? Global working directories? The ability to write the password db via putpwent()? Modifying your own envp or argv? Why have a horribly-scaling hack like fcntl-based file locking even in the standard?
"Today", were one to start from scratch, the userspace API of even unix-ish operating systems would be done much differently. After all, systems designers and implementors are intelligent people who learn, and there's 50y+ of history to learn from. But the warts are there, and sometimes you have to "program around" them.
You can see what that would look like, done by the UNIX authors themselves, by looking into Inferno and the Limbo standard library.
It is kind of ironic how so many stick with UNIX and C ideas as religious ideals of ultimate OS and systems-programming design, while the authors themselves moved on, creating Plan 9 and Inferno, Alef and Limbo.
Append-only writes are actually amazing: having several processes write into the same file with their writes interleaved instead of destroying each other is almost impossible to re-create in user space.
And I still don't understand why processes "modifying their own envp or argv" are met with such revulsion in this comment thread except from the "I dislike that on ideological grounds" reason. Now, the ability to modify envp and/or argv of other processes while those are running, yes, that's a horrible idea. But modifying your own internal process state?
Oh, and fcntl file locks are horrible for historical reasons: basically, when POSIX (or its predecessor?) was trying to decide on a portable interface, the representative of one of the vendors cobbled together this API and its implementation in a week or two, and then showed up to the meeting with it. To his surprise, instead of arguing, everyone else basically said "eh, looks fine", and that was it; we now have the broken "why on earth does close()/fork()/exec() interact with locks like that" behaviour.
I had to smirk at the sarcasm (intended or no).
I merely included "processes modifying their env" amongst all these historical warts. I consider doing so as inevitably necessary as append writes, the advantages of which you aptly described. That's my opinion, underpinned by the history of those interfaces. I hope we can agree that the breakage is by-and-large in an (old, historical) interface that allows braindead usage, not in either the implementor or the user ?
On Linux, a privileged process can change the memory address which the kernel (/proc filesystem) reads argv/etc from... prctl(PR_SET_MM) with the PR_SET_MM_ARG_START/PR_SET_MM_ARG_END arguments. Likewise, with PR_SET_MM_ENV_START/PR_SET_MM_ENV_END.
This shouldn't cause the kind of race conditions we are talking about here, since it isn't changing a single arg, it is changing the whole argv all at once. However, the fact that PR_SET_MM_ARG_START/PR_SET_MM_ARG_END are two separate prctl syscalls potentially introduces a different race condition. If Linux would only provide a prctl to set both at once, that would fix that. The reason it was done this way, is the API was originally designed for checkpoint-restore, in which case the process will be effectively suspended while these calls are made.
> And remember that the exec* family of calls has a version with an envp argument, which is what should be used if a child process is to be started with a different environment — build a completely new structure, don't touch the existing one. Same for posix_spawn.
> And, lastly, compatibility with ancient systems strikes again: the environment is also accessible through this:
Which is, of course, best described as bullshit.