Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Through value tracking. It's actually LLVM that does this, GCC probably does it as well, so in theory explicit bounds checks in regular C code would also be removed by the compiler.

How it works exactly I don't know, and apparently it's so complex that it requires over 9000 lines of C++ to express:

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Anal...



> How it works exactly I don't know

The idea is pretty simple. You can build a list of known facts based on control flow, explicit __builtin_assumes, and undefined behavior relations. For example, if you've got this code:

  if (x < N) {
    // In this block, we know that x < N
  } else {
    // ... and in this block we know that x >= N!
  }
And on top of that, we can do some basic algebra. If we know that x < N and N < 5, then we can infer that x < 5. So if we see a comparison x < 5, we can then rewrite that to true.

> and apparently it's so complex that it requires over 9000 lines of C++ to express

The two main reasons for that is that a) there is a lot of rules covering cases like "we know the result of count_leading_zeroes can be no more than the number of bits in an integer" and so forth, and b) this is doing a lot more logic than just tracking integer comparisons: there's tracking known-bits of integers, maximum possible value, floating-point comparisons, pointer object references.


> And on top of that, we can do some basic algebra. If we know that x < N and N < 5, then we can infer that x < 5. So if we see a comparison x < 5, we can then rewrite that to true.

Even better: if `x` and `N` are integers, then we can infer that `x` < 4. :)


I think you mean x <= 4, right?


No, x < 4, within integers. If N < 5, then N <= 4. If x < N, then x < 4.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: