> C program will run directly on the OS. Adding a layer of runtime means reduced performance.
A C program that isn't compiled for an embedded (freestanding) target has a runtime layer of its own. The OS doesn't call your main; it jumps to the entry point of the platform's C runtime, which initializes all the bloat and indirection that comes with C and only then calls main. In the same way, your compiled C program doesn't just execute the native sqrt instruction: it goes through a wrapper that maintains all the state C expects, like the globally visible errno value that no one ever checks but some long-dead 1980s UNIX guru insisted on. C is portable because, just like Python and C#, it is specified on top of an abstract machine with properties not actually present in hardware. If you really set out to optimize a C program at the low level, you immediately run into all the nonsense the C standard is up to.