Which programming languages? When using C++ I wanted programs to have a function...

saagarjha · 2025-01-23T09:02:44 1737622964

Everyone thinks they can be the first to do something, and that there is surely nothing that will happen before them. Unfortunately everyone save for one is mistaken. Sometimes that chosen one is not even consistent.

hinkley · 2025-01-23T18:41:59 1737657719

This is one of the problems with Singletons. Especially if they end up interacting or being composed.

In Java you’d have the static initializers run before the main method starts. And in some languages that spreads to the imports which is usually where you get into these chicken and egg problems.

One of the solutions here is to make the entry point small, and make 100% of bootstrapping explicit.

Which is to say: move everything into the main method.

I’ve seen that work. On the last project it got a little big, and I went in to straighten out some bits and reduce it. But at the end anyone could read for themselves the initialization sequence, without needing any esoteric knowledge.
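
A minimal C++ sketch of what "100% of bootstrapping explicit" could look like. Everything named here (`Config`, `Logger`, `App`, `bootstrap`, the `APP_LOG_LEVEL` variable) is hypothetical and only illustrates the shape, not any particular project:

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical subsystems; the names are illustrative only.
struct Config { std::string log_level; };
struct Logger { std::string level; };

struct App {
    Config config;
    Logger logger;
    std::vector<std::string> steps;  // records the order in which things were set up
};

Config load_config() {
    // APP_LOG_LEVEL is an assumed variable name; the environment is only read here.
    const char* level = std::getenv("APP_LOG_LEVEL");
    return Config{level != nullptr ? level : "info"};
}

Logger make_logger(const Config& config) {
    return Logger{config.log_level};
}

// The whole initialization sequence, top to bottom, with no hidden static state.
App bootstrap() {
    App app;
    app.config = load_config();
    app.steps.push_back("config");
    app.logger = make_logger(app.config);
    app.steps.push_back("logger");
    return app;
}

// main() would call bootstrap() first and only then start the business logic.
```

The point of the design is that each dependency is passed in as an argument, so the order can be read off the body of `bootstrap()`.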

wizzwizz4 · 2025-01-23T16:53:43 1737651223

If everyone is responsible for maintaining the illusion that someone else is first, who's actually first is largely irrelevant.

friendzis · 2025-01-23T09:10:39 1737623439

`main` is the default entrypoint; with one simple argument to the linker you can change the entrypoint symbol to whatever you wish.

You can add a `premain` function that calls `main` and set it as the entrypoint, or you can implement pre-start logic in `main` and call the main loop later.

This is how any sane program is written anyway: set up environment -> continue with business logic

dietr1ch · 2025-01-24T13:33:18 1737725598

I know I can fool around with crt0, but I'm not sure how much you can really use that if you plan to use libraries that may depend on global `static` things that get created as they are linked in before `main` starts.

Maybe it's possible, but if I need to review every library (and hope they don't break my assumptions later) I think I've lost on building this separation in a practical way.

account42 · 2025-01-23T11:25:20 1737631520

main() is not the entry point, some platform specific CRT is.

skissane · 2025-01-23T20:02:10 1737662530

Also, the CRT will call any functions you declare `__attribute__((constructor))` before it calls `main()`
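
A small sketch of that attribute, assuming GCC or Clang (it is a compiler extension, not standard C++). The names `g_ready`, `before_main` and `ready_before_main` are made up for the example:

```cpp
// Plain int with a constant initializer, so it needs no dynamic initialization
// that could run after the constructor function and overwrite its value.
static int g_ready = 0;

// GCC/Clang extension: the runtime calls this before it calls main().
__attribute__((constructor))
static void before_main() {
    g_ready = 1;
}

// By the time anything called from main() asks, the flag is already set.
bool ready_before_main() {
    return g_ready == 1;
}
```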

fch42 · 2025-01-23T10:21:57 1737627717

You needn't go "hacky" for this; constructors for global/static variables are called before main(). But then, the underlying linker support is usually "trivially exposed" (using the constructor attribute in gcc/clang, say).

This (obviously?) isn't "110%" perfect as the order of the constructor calls for several such objects may not be well-defined, and were they to create threads (who am I to suggest being reasonable ...) you end up with chicken-egg situations again.
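
One common way around the ordering part is the "construct on first use" idiom, which is a different technique from the constructor attribute: put the object in a function-local static so it is built the first time anyone asks for it. A sketch with made-up names (`Registry`, `Client`, `g_client`):

```cpp
#include <string>

struct Registry {
    std::string name;
    Registry() : name("registry") {}
};

// Function-local static: constructed on the first call, and since C++11
// that initialization is guaranteed to be thread-safe.
Registry& registry() {
    static Registry instance;
    return instance;
}

struct Client {
    std::string seen;
    // Uses the registry during its own construction.
    Client() : seen(registry().name) {}
};

// A namespace-scope object that depends on the registry. Because the registry
// is created on demand, it exists no matter which global is constructed first.
Client g_client;
```

This only addresses construction order. It does nothing for destruction order, or for constructors that start threads.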

hinkley · 2025-01-23T18:59:59 1737658799

JavaScript only just got top level async. So what I saw happen is that files that do their own background tasks start those either in their constructor or lazily in the case of static functions.

There was one place and only one place where we violated that, and it was in code I worked on. It was a low level module used everywhere else for bootstrapping, and so we collectively decided to do something sneaky in order to avoid making the entire code base async.

And while I find that most of the time people can handle making one special case for a rule, it was a complicated system and even “we” screwed it up occasionally for a good long while.

The problem was we needed to make a consul call at startup and the library didn’t have a synchronous way to make that call. So all bootstrapping code had to call a function and await it, before loading other things that used that module. At the end we had about a dozen entry points (services, dev and diagnostic tools). And I always got blamed because nobody seemed to remember we decided this together.

I hate singletons. And I ended up with one of only two in the whole project, and that hatred still wasn’t enough to prevent hitting the classical problems with singletons.

bluGill · 2025-01-23T16:02:23 1737648143

What is wrong with main setting those things first and then starting your main program? That is what everyone else does.

hinkley · 2025-01-23T18:44:15 1737657855

Not “everyone” does that. You have individual files doing their own initialization when they get loaded. Including loading other files or modules.

They might do it for testing purposes.

bluGill · 2025-01-23T23:17:43 1737674263

That does happen. Still, there is a reason many avoid it. Probably every significant project has places where they do that. Still, if it isn't in main it is always a little "magic", and that makes it hard to understand how the program works (or worse, it randomly doesn't work because something is used before it is initialized).

robertlagrant · 2025-01-24T12:47:42 1737722862

> When using C++ I wanted programs to have a function that was called before main() and set up things that got sealed afterwards, like parsing command-line-arguments, the environment variables, loading runtime libraries, and maybe look at the local directory, but I'm not sure if it'll be a useful and meaningful distinction unless you restructure way too many things

If you're only reading environment variables you have no problem, though. It's only if you try to change them that it causes issues.

For setting, "only set environment variables in the Bash script that starts your program" might be a good rule.
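
On the read-only side, a rough sketch of the "read once, then sealed" idea from the quote, in C++. `EnvSnapshot` is a hypothetical helper, not an existing library class; it copies the variables you name at startup and exposes only const access afterwards:

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: copies selected environment variables once,
// then only ever hands out the copies.
class EnvSnapshot {
public:
    explicit EnvSnapshot(const std::vector<std::string>& names) {
        for (const auto& name : names) {
            // std::getenv returns nullptr when the variable is not set.
            if (const char* value = std::getenv(name.c_str())) {
                values_[name] = value;
            }
        }
    }

    // Returns the captured value, or `fallback` if the variable was unset
    // (or was not in the list of names) when the snapshot was taken.
    std::string get(const std::string& name, const std::string& fallback) const {
        auto it = values_.find(name);
        return it == values_.end() ? fallback : it->second;
    }

private:
    std::map<std::string, std::string> values_;
};
```

Taking the snapshot early in `main`, before any threads exist, means later code never has to touch `getenv` at all.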

GoblinSlayer · 2025-01-23T09:04:40 1737623080

grpc reads some configuration from environment; environment has portability problems too, so it's useful to set it to cross platform shape.

fch42 · 2025-01-23T10:32:10 1737628330

The "cross platform" way of setting the environment is to set it "from outside" of the program - meaning, through the executor, whether that's the shell or the container runtime or even the kernel commandline if you insist on rewriting init in rust/go/zig/... It can be as-easy-as spawning your process via "env -i VAR1=... ... myprogram ..." - and given this also clears the dangers of env-insertion exploits, it's good practice.

(the argument that the horses have long bolted with respect to "just do the right thing ok?!" here holds some water. I'm of the generation though where people on the internet could still tell each other they were wrong, and I assert that here; you're wrong if you believe a non-threadsafe unix interface is a bug. No matter what kind of restrictions around its use that means. You're still wrong if you assume the existence of such restrictions is a bug)

gpderetta · 2025-01-23T12:38:38 1737635918

At the limit a program can execve itself with the new env.
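
A POSIX-only sketch of that re-exec, in C++. `make_envp` and `reexec_with_env` are hypothetical helper names, and the caller is assumed to supply a usable path to its own executable. A real program would also need some marker (a variable in the new environment, say) to notice it has already re-executed and avoid looping:

```cpp
#include <unistd.h>  // execve (POSIX)
#include <string>
#include <vector>

// Build the null-terminated char* array that execve expects from "NAME=value" strings.
// The pointers refer into `vars`, so `vars` must outlive the returned array.
std::vector<char*> make_envp(std::vector<std::string>& vars) {
    std::vector<char*> envp;
    for (auto& v : vars) {
        envp.push_back(&v[0]);
    }
    envp.push_back(nullptr);
    return envp;
}

// Replace the current process image with `path`, keeping argv but passing
// a fresh environment. Returns (with -1) only if execve fails.
int reexec_with_env(const char* path, char* const argv[],
                    std::vector<std::string> vars) {
    std::vector<char*> envp = make_envp(vars);
    return execve(path, argv, envp.data());
}
```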

GoblinSlayer · 2025-01-24T15:15:10 1737731710

It's not cross platform. Does Java provide such an interface?

fch42 · 2025-01-23T13:41:46 1737639706

Indeed, and from my point of view, that's perfectly ok.

hinkley · 2025-01-23T19:02:16 1737658936

Some of the docker containers I made ended up having a bash shell as the entry point and I moved most of the environment variable init out of the code and into the script. But in dev sandbox some of that code runs without the script, so it was still a headache.

hinkley · 2025-01-23T19:03:05 1737658985

Poor choice of phrasing.

I ended up implying some extra support when all I meant was “one could”.