Pro jit hacker here. Super cool seeing folks build stuff like this.
Thoughts:
- Great to see an SSA pipeline, but that implies this JIT is not ideal for baseline JITing. That's fine. I would just explicitly call out that although it's light, it is an optimizer.
- Which reminds me - a JIT-based VM is defined most by how powerful its top JIT is, not by how light it is. Maybe since MIR is an optimizer, it's meant to eventually not be light? It's generally a bad idea to set out to build a JIT and then aim in between baseline and full opt. These middle light JITs are best built after you've evolved your top and bottom tiers (sorta like what happened in V8).
- JIT IRs need to be given space to evolve to point directly at the user heap, which precludes any kind of reliable serialization. You can dump IR for debugging, but you can't parse IR. This MIR thing is serializable, which would make it an awesome AOT IR. But the serialization code will get in the way once folks try to implement JIT-based optimizations (anything speculative, for example). Best to kill the serialization now, before it becomes a problem.
- If the first front end is C11, then you risk biasing the JIT to be very C-oriented. Languages that need JITs usually also need weird stuff that C doesn't need. It's super important to test a weirder language. Maybe Lua, since it's weird but simple.
Anyway, this is cool shit. Nice to see people building jits!
“Using SSA simplifies the optimization implementations, but building and destroying this form is expensive. Using SSA for the MIR generator's short-pass pipeline makes little sense to me, so I don't use it.”
OK, so here's the deal with SSA and the running time of your compiler.
- You can convert code to SSA in something like O(N log N) or even close to O(N) if you get fancy enough. The conversion step is not very expensive to run, though it is annoying to implement. Converting to SSA is fine, that's not the problem.
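To make the "not very expensive" part concrete, here's a toy sketch (my own illustration, nothing to do with MIR's actual API) of the renaming half of SSA construction on straight-line code - it's one linear pass. The hard part this skips is Phi placement at join points, which needs dominance frontiers, but even with that the whole conversion stays near O(N log N):

```python
# Toy SSA renaming for a single basic block of three-address code.
# Each instruction is (dest, op, lhs, rhs); names not yet defined in
# the block are treated as live-in and left alone.
def to_ssa(block):
    version = {}  # latest SSA version number of each local variable
    out = []
    for dst, op, lhs, rhs in block:
        # read operands at their current SSA version
        lhs = f"{lhs}{version[lhs]}" if lhs in version else lhs
        rhs = f"{rhs}{version[rhs]}" if rhs in version else rhs
        # every definition gets a fresh name
        version[dst] = version.get(dst, -1) + 1
        out.append((f"{dst}{version[dst]}", op, lhs, rhs))
    return out

code = [("x", "+", "a", "b"),
        ("x", "*", "x", "c"),   # redefinition of x becomes x1
        ("y", "-", "x", "a")]
for ins in to_ssa(code):
    print(ins)
```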
- Optimizations implemented on top of SSA are cheap to run. That's sort of the point of SSA. So, that's not the problem.
- SSA optimizations achieve goodness when you have multiple of those optimizations running in something like a fixpoint. It's pointless to say "I have SSA" and then just implement one of those optimizations. I don't think MIR does that; like most SSA optimizers, it has multiple optimizations. The problem sort of starts here: SSA sort of implies that you're building an optimizer with multiple optimizations. Even if each opt is cheap, having SSA usually implies that you're going to pile on a bunch of them. So, you'll have cost either from death-by-a-thousand-cuts, or you'll eventually add an optimization that's superlinear.
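To illustrate the fixpoint point with a toy example (hypothetical IR, not MIR's): constant folding creates dead code, dead code elimination exposes more folding, so you rerun both until nothing changes. Each pass is linear, but the loop multiplies the cost:

```python
# Instructions are (dest, op, args) over SSA names.
def const_fold(code):
    env = {d: args[0] for d, op, args in code if op == "const"}
    out, changed = [], False
    for d, op, args in code:
        if op == "add" and all(a in env for a in args):
            out.append((d, "const", (env[args[0]] + env[args[1]],)))
            changed = True
        else:
            out.append((d, op, args))
    return out, changed

def dce(code, live_out):
    # Conservative liveness: anything referenced by a surviving
    # non-const instruction, plus the block's live-outs.
    used = set(live_out)
    for d, op, args in code:
        if op != "const":
            used.update(args)
    out = [ins for ins in code if ins[0] in used]
    return out, len(out) != len(code)

def optimize(code, live_out):
    changed = True
    while changed:  # fixpoint: rerun until neither pass improves anything
        code, c1 = const_fold(code)
        code, c2 = dce(code, live_out)
        changed = c1 or c2
    return code

prog = [("a", "const", (2,)),
        ("b", "const", (3,)),
        ("c", "add", ("a", "b")),    # folds to const 5
        ("d", "add", ("c", "c"))]    # then folds to const 10
print(optimize(prog, live_out={"d"}))
```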
- After you run SSA optimizations, you'll have nontrivial global data flow - as in, there will be data flows between basic blocks that weren't there when you started, and if you do a lot of different SSA optimizations, then those data flows will have a very complex shape. That's caused by the fact that SSA is really good at enabling compilers to "move" operations to the place in control flow where they make the most sense. Sometimes operations get moved very far, leading to values that are live for a long time (live across many instructions). This is where the problem gets significantly worse.
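A toy example of that "moving" (again my own sketch, not MIR code): hoisting a loop-invariant operation out of a loop is easy in SSA precisely because nothing redefines a value - but the hoisted value is now live across the entire loop body:

```python
# Minimal loop-invariant code motion over (dest, op, args) instructions.
def hoist_invariant(loop_body, loop_vars):
    """loop_vars: names (re)defined inside the loop, e.g. the induction
    variable. Anything whose args don't touch them is invariant."""
    hoisted, body = [], []
    for dest, op, args in loop_body:
        if all(a not in loop_vars for a in args):
            hoisted.append((dest, op, args))  # moved above the loop
        else:
            body.append((dest, op, args))
            loop_vars = loop_vars | {dest}    # its result varies per trip
    return hoisted, body

# s1 = s0 + (a * b) inside a loop over i; a and b are defined before it.
loop = [("t", "mul", ("a", "b")),
        ("s1", "add", ("s0", "t"))]
print(hoist_invariant(loop, {"i", "s0"}))
# "t" moves out, yet is still used inside - so it's now live
# across every instruction of the loop.
```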
- Even without SSA optimizations, SSA form adds complexity to the data flow graph because of Phi nodes. You almost always want the Phi and its inputs to use the same register in the end, but that's not a given. This adds to the problem even more.
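To make the Phi problem concrete (assumed IR, not MIR's): lowering out of SSA turns each Phi into a move at the end of every predecessor block. If the register allocator can't coalesce the Phi with its inputs, all of those moves survive as real instructions:

```python
def lower_phi(phi_dest, incoming):
    """incoming maps predecessor block -> SSA name flowing in.
    Returns the move to append at the end of each predecessor."""
    return {pred: f"{phi_dest} = {src}" for pred, src in incoming.items()}

# x2 = phi(x0 from B1, x1 from B2)
moves = lower_phi("x2", {"B1": "x0", "B2": "x1"})
print(moves)  # {'B1': 'x2 = x0', 'B2': 'x2 = x1'}
# If x0, x1, and x2 all end up in the same register, both moves are
# no-ops and can be deleted - that's what Phi coalescing is for.
```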
- Probably the reason why you went for SSA was perf. But to get perf from SSA-optimized code, you need to have some way of detangling that data flow graph. You'll want to coalesce those Phis, for example - otherwise SSA will actually pessimise your code by introducing lots of redundant move instructions. Then you'll also want to split ranges (make it so a variable can have one register in one part of a function and a different register in another part), since SSA optimizations are notorious for creating variables that live for a long time and interfere with everything. And then you'll definitely need a decent register allocator. No matter how you implement them, the combo of coalescing, splitting, and register allocation will be superlinear. It's going to be a major bottleneck in your compiler, to the point that you'll wish you had a baseline JIT.
Those things I mentioned - coalesce, split, regalloc - are expensive enough that you won't want them in a JIT that is the bottom tier. But they're great to have in any compiler that isn't the bottom tier. So, it's a good idea to have some non-SSA baseline (or template, or copy-and-patch, or whatever you want to call it) JIT, or a decent interpreter, as your bottom tier.
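To sketch where the cost concentrates (a minimal assumed example, nowhere near a production allocator): register allocation is coloring an interference graph, and the long-lived temps that SSA optimizations create interfere with everything they overlap, forcing spills once the colors run out:

```python
# Greedy graph coloring, highest-degree first. Real allocators
# interleave this with coalescing and live-range splitting, which is
# where the superlinear cost lives.
def color(interference, k):
    """interference: {var: set of vars live at the same time}; k registers."""
    assignment = {}
    for v in sorted(interference, key=lambda v: -len(interference[v])):
        taken = {assignment[n] for n in interference[v] if n in assignment}
        free = [r for r in range(k) if r not in taken]
        assignment[v] = free[0] if free else "spill"
    return assignment

# A long-lived temp "t" (think: a hoisted loop invariant) interferes
# with everything, so with 2 registers something must spill.
graph = {"t": {"a", "b", "c"},
         "a": {"t", "b"},
         "b": {"t", "a"},
         "c": {"t"}}
print(color(graph, k=2))
```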