Of course it's faster: any program can be compiled into something "faster" by making the entire thing do nothing. What I mean is, why isn't uninitialized memory properly defined to be an arbitrary fixed string of bytes? In the example of the post, the compiler would look into `always_returns_true` and either say "okay, is `x < 150`? Well, I have no idea what `x` is, so I can't tell for sure", OR "Ah, for any value of `x` this expression is true, so let's replace it with `true`". There would be no 257th value of "uninitialized"; the value of `x` would definitely be a single value in the allowed range of that type, it would just be indeterminable.
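To ground the "for any value of `x` this expression is true" reading, here is a throwaway C sketch that just checks the claim exhaustively (the program itself is mine, not from the post):

```c
#include <stdio.h>

/* Exhaustively confirm that `x < 150 || x > 120` holds for every 8-bit
   value, which is what a "pick some fixed unknown byte" model of
   uninitialized memory would rely on. */
int main(void) {
    for (int x = 0; x <= 255; x++) {
        if (!(x < 150 || x > 120)) {
            printf("counterexample: %d\n", x);
            return 1;
        }
    }
    puts("holds for all 256 values");
    return 0;
}
```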
To be clear, the compiler is not forbidden from optimizing `always_returns_true` to unconditionally return true. After all, undefined behavior can cause anything to happen, and that includes returning true. Normally LLVM would perform exactly that optimization. But in this case `always_returns_true` is inlined first, so it becomes something like `undef < 150 || undef > 120`; LLVM propagates the `undef` outward through the expression until the whole condition is `undef`, and then it arbitrarily picks that the condition should be false.
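For reference, the scenario being discussed looks roughly like the following C sketch (reconstructed here; the post's version may differ in language and details):

```c
#include <stdbool.h>
#include <stdio.h>

/* 0..149 satisfies x < 150 and 121..255 satisfies x > 120,
   so together they cover every possible 8-bit value. */
static bool always_returns_true(unsigned char x) {
    return x < 150 || x > 120;
}

int main(void) {
    unsigned char x;              /* deliberately left uninitialized: reading it is UB */
    if (always_returns_true(x))   /* after inlining, the comparisons see undef */
        puts("true");
    else
        puts("false");            /* an optimizer may legally take this branch */
    return 0;
}
```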
But `always_returns_true` is not an example of how this particular undefined behavior can be useful as an optimization, merely an example of how it can be dangerous. For some examples of how it can be useful:
- Basically, it helps to be able to replace "cond ? some_value : undef" with "some_value", especially when the code has been transformed to SSA form (see the sketch after this list).
- Some architectures literally have a 257th possible value, like Itanium [1] [2]. On Itanium, every register can be set to "Not a Thing", i.e. uninitialized, and the CPU will trap if you try to store such a value to memory. Ironically, NaT was created in order to make the CPU's behavior more defined in a certain case, or at least more predictable... argh, I'm too tired to explain it properly; look at section 2.3.1 of [2] for a somewhat confusing explanation.
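To make the `cond ? some_value : undef` point concrete, a rough C sketch (the names are made up for illustration): only one branch writes the variable, so after SSA conversion the merge point sees `some_value` on one path and an uninitialized value on the other.

```c
int pick(int cond, int some_value) {
    int result;              /* uninitialized on the cond == 0 path */
    if (cond)
        result = some_value;
    /* In SSA this is phi(some_value, undef). Allowing the undef arm to be
       "chosen" equal to some_value lets the whole thing fold down to
       `return some_value;`. */
    return result;
}
```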
I _think_ I understand the rationale behind `undef` in LLVM, but I still think it's a bad one, since it can, and does, lead to very surprising behaviour.
> LLVM propagates the `undef` outward through the expression until the whole condition is `undef`, and then it arbitrarily picks that the condition should be false.
I think this illustrates what I find counter-intuitive about this whole mess; any function of `undef` shouldn't itself be `undef`. `undef < undef` is false in my head. `undef < 150` is just unknown, not undefined, since we don't know what `undef` is.
> Basically, it helps to be able to replace "cond ? some_value : undef" with "some_value"
This feels really contrived; in what setting would this actually be useful?
> This feels really contrived; in what setting would this actually be useful?
The text file I linked to explains it in more detail, so I'll defer to that.
> I think this illustrates what I find counter-intuitive about this whole mess; any function of `undef` shouldn't itself be `undef`. `undef < undef` is false in my head. `undef < 150` is just unknown, not undefined, since we don't know what `undef` is.
Except that for `undef < undef`, what if they're two different undefs? You would have to track each potentially uninitialized value separately. And then the optimizer would want to strategically choose values for the different undefs – e.g. "we want to merge 'cond ? some_value : undef123' with 'some_value', so let's set undef123 equal to some_value... except that could negatively impact this other section of code that uses it". It's certainly possible, but it would make the optimizer's job somewhat harder.
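A small C sketch of the "two different undefs" point (purely illustrative):

```c
#include <stdbool.h>

/* Two separate indeterminate values: nothing forces them to be equal,
   so folding "undef < undef" to false would require proving that both
   reads observe the same value, which is exactly the per-value tracking
   described above. */
bool lt_of_two_undefs(void) {
    int a;          /* indeterminate */
    int b;          /* a different indeterminate value */
    return a < b;
}
```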
Since it's too late to edit my comment, I'm replying to note I was a bit off. `undef` actually doesn't propagate all the way to the conditional; instead, LLVM replaces `undef < 150` with `false`, and the same for `undef > 120`, and then `false || false` is simplified to `false`. In C, comparing indeterminate values is undefined behavior, so LLVM would be allowed to replace `undef < 150` with `undef`, but it seems that LLVM itself has slightly stronger semantics.
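Spelled out as code, the folding described above looks roughly like this (a sketch of the result, not LLVM's actual pass pipeline):

```c
#include <stdbool.h>

/* The inlined condition with x indeterminate. Per the correction above,
   each comparison is folded to false on its own (as if a different value
   of x were picked for each use), and then false || false becomes false. */
bool inlined_condition(void) {
    unsigned char x;        /* uninitialized */
    bool c1 = x < 150;      /* folded to false */
    bool c2 = x > 120;      /* folded to false */
    return c1 || c2;        /* false || false -> false */
}
```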
It's like you suggested in your parent comment: uninitialized memory is similar to I/O data, which can change at any moment without warning. That is, it can be 200 at the `x < 150` check and moments later 100 at the `x > 120` check.
> In the example of the post, the compiler would look into `always_returns_true`
You can't look at each function in isolation. One important optimization all modern compilers use is inlining: short functions (like this `always_returns_true`) or functions that are only used once (like this `always_returns_true`) have their body inserted directly into the caller function, which allows further optimizations like constant propagation.
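A tiny sketch of inlining enabling constant propagation (hypothetical helper, unrelated to the post's code):

```c
/* After `twice` is inlined into `caller`, the compiler sees `21 * 2`
   and constant-folds the whole call down to 42. */
static int twice(int n) { return n * 2; }

int caller(void) {
    return twice(21);
}
```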
Ah, I knew my analogy would bite me :)
I didn't mean IO as in "can change at any moment", but as in "read from a file but you have no idea what it is".
You definitely _can_ look at each function in isolation (in this example it's even sufficient to get the "best" possible version of that function), but I do know that you'd usually do an inlining pass, and further optimization passes afterwards. I don't see how that changes anything, though. If the function were inlined you'd get the same expression, and still you'd be unable to tell anything about `x`, except that it would have a definite value that you cannot observe. Again, you could argue that no matter the value, the expression would be `true`, so you could replace it.