Linear scan register allocation on SSA

kragen · 2025-08-21T06:59:33 1755759573

Oh man, block-argument-SSA makes so much more sense to me than Φ-SSA. That alone is going to change my life. Thank you, Max!

A lot of this process is the same as graph-coloring register allocation on SSA, of which I found an extensive explanation in Appel's textbook. I think it's maybe time for me to go back to it and do the exercises so I really understand it. The book unfortunately predates the linear allocator age.

The link to https://brrt-to-the-future.blogspot.com/2019/03/reverse-line..., which is a learner's introduction to the LuaJIT reverse linear scan allocator, also seems valuable.

xorvoid · 2025-08-22T02:16:10 1755828970

Because block-argument is actually the “right way”. The phi-node is very un-natural.. in the same sort of way that pi should have been tau.

Also, notice the connection here between Phi nodes and Continuation Passing Style (CPS). It because obvious with the block-arg form because it’s just the same thing. Jumps to blocks are just calls that don’t return.

kragen · 2025-08-22T06:33:16 1755844396

Yes, I'd never understood what people meant when they said that SSA was the same thing as CPS, but with block-argument SSA it's clear.

o11c · 2025-08-20T22:44:14 1755729854

> In fact, you might even consider not allocating a register greedily. What might that look like? I have no idea.

One case I'm aware of: if your ISA supports arbitrary memory operands like x86, rarely-used variables can be operated-on entirely on the stack. Historically this was something ICC did better than GCC, though it became much less relevant with the shift to 64-bit bringing more registers.

fanf2 · 2025-08-21T01:35:21 1755740121

Most x86 instructions only support one memory operand so you can’t completely avoid register allocation. It isn’t a full-on hardcore CISC like 68k or VAX.

fuhsnn · 2025-08-21T02:58:33 1755745113

x86 memory operands are also nice for binary size, my amateur compiler sometimes produces smaller objects than peers despite having no load-store elimination, all by greedily opting for memory operands during instruction selection; kinda amusing 'cause I'm now worrying my future optimization backend might actually cause a regression in size.

mtklein · 2025-08-21T02:20:18 1755742818

I found this especially nice working with large SIMD constants.

fuhsnn · 2025-08-17T14:42:26 1755441746

Another great overview of Go compiler's register allocation: https://developers.redhat.com/articles/2024/09/24/go-compile...