I am not sure in what cases that's a major win. In generational/copying GCs, an allocation is a pointer bump, and then the object dies trivially (no references from tenured objects) in the young-gen collection.
When you run out of the local arena, there will be a CAS involved, of course.
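To make the "pointer bump plus occasional CAS" point concrete, here is a minimal sketch of that allocation scheme. It is a toy model, not how any particular VM implements it: the class `BumpAllocator`, the `ARENA_SIZE` constant, and the use of plain `long` offsets instead of real addresses are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a young-generation allocator; "addresses" are plain offsets.
final class BumpAllocator {
    static final long ARENA_SIZE = 1024;                     // assumed per-thread arena size
    private final AtomicLong sharedTop = new AtomicLong(0);  // shared heap frontier
    private final long heapLimit;

    // Per-thread arena covering the offsets [cursor, end).
    private static final class Arena { long cursor; long end; }
    private final ThreadLocal<Arena> arena = ThreadLocal.withInitial(Arena::new);

    BumpAllocator(long heapLimit) { this.heapLimit = heapLimit; }

    /** Returns the offset of the new object, or -1 if the request cannot be served. */
    long allocate(long size) {
        if (size > ARENA_SIZE) return -1;   // large objects are out of scope for this sketch
        Arena a = arena.get();
        if (a.end - a.cursor < size) {      // slow path: local arena exhausted
            if (!refill(a)) return -1;      // a real VM would trigger a collection here
        }
        long result = a.cursor;             // fast path: just bump the local pointer
        a.cursor += size;
        return result;
    }

    // Claims a fresh arena from the shared heap; this is where the CAS happens.
    private boolean refill(Arena a) {
        while (true) {
            long old = sharedTop.get();
            long next = old + ARENA_SIZE;
            if (next > heapLimit) return false;
            if (sharedTop.compareAndSet(old, next)) {
                a.cursor = old;
                a.end = next;
                return true;
            }
        }
    }
}
```

The fast path touches only thread-local state, so the contended atomic operation is paid once per arena rather than once per object.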
Removing the GC allocation is a win because it allows you to promote the value’s fields to registers.
I agree that the difference between GC allocation and stack allocation on its own is small. But it’s not really about turning the values into stack allocations; it’s about telling the compiler that it’s dealing with an unaliased private copy of the data, which unlocks a ton of downstream opts.
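As a rough illustration of what "promoting fields to registers" means, the sketch below shows a loop written with small temporary objects next to a hand-written version of what a compiler could reduce it to once it knows the objects never escape. The names `Vec2`, `sumWithObjects`, and `sumScalarized` are hypothetical, and the second method is written by hand; it is not the actual output of any JIT.

```java
final class ScalarReplacementSketch {
    // Small immutable value that never escapes the loop below.
    static final class Vec2 {
        final long x, y;
        Vec2(long x, long y) { this.x = x; this.y = y; }
        Vec2 plus(Vec2 o) { return new Vec2(x + o.x, y + o.y); }
    }

    // As written by the programmer: two allocations per iteration.
    static long sumWithObjects(long[] xs, long[] ys) {
        Vec2 acc = new Vec2(0, 0);
        for (int i = 0; i < xs.length; i++) {
            acc = acc.plus(new Vec2(xs[i], ys[i]));
        }
        return acc.x + acc.y;
    }

    // What the loop can become once the fields are treated as plain locals:
    // no allocations, and accX/accY are free to live in registers.
    static long sumScalarized(long[] xs, long[] ys) {
        long accX = 0, accY = 0;
        for (int i = 0; i < xs.length; i++) {
            accX += xs[i];
            accY += ys[i];
        }
        return accX + accY;
    }
}
```

Once the loop is in the second form, ordinary optimizations such as vectorization or strength reduction have something to work with, which is the "downstream" part of the argument.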
Allocation elision has been tried (escape analysis), with enough inlining, to no noticeable benefit (I don't have the source/quote, but it was over 10 years ago). Escape analysis already provides the same guarantee: it proves that the object allocation etc. can be optimized.
Java does lack value types, but they are only useful in (large) arrays. A lot of such code has 'degraded' to direct byte buffers and, in some cases, straight unsafe, similar to void* in C.
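For readers who haven't seen the direct-byte-buffer workaround, a minimal sketch follows. It packs (x, y) pairs of doubles into one off-heap buffer instead of allocating an object per element. The class `PointBuffer` and its layout are assumptions for illustration; only the `java.nio.ByteBuffer` calls are real API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical flat "array of structs": each element is two doubles stored back to back.
final class PointBuffer {
    private static final int STRIDE = 2 * Double.BYTES;  // bytes per (x, y) element
    private final ByteBuffer buf;

    PointBuffer(int count) {
        // Off-heap storage; native byte order avoids byte swapping on access.
        buf = ByteBuffer.allocateDirect(count * STRIDE).order(ByteOrder.nativeOrder());
    }

    void set(int i, double x, double y) {
        buf.putDouble(i * STRIDE, x);                 // absolute put, bounds-checked
        buf.putDouble(i * STRIDE + Double.BYTES, y);
    }

    double x(int i) { return buf.getDouble(i * STRIDE); }
    double y(int i) { return buf.getDouble(i * STRIDE + Double.BYTES); }
}
```

The data is contiguous and has no per-element headers, but the type information is gone: the code is doing its own offset arithmetic, which is the "void*" flavor mentioned above.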
I implemented the original allocation elision in JavaScriptCore and then oversaw the development of the pass that superseded mine.
Huge speedup, but very hit or miss. Out of a large suite of benchmarks, 90% of the tests saw no change and 10% saw improvements of 3x or so. Note each of these tests was itself a large program; these weren’t some BS microbenchmarks.
Maybe someone from V8 can comment but my understanding is they had a similar experience.
So it’s possible that someone measured this as perf-neutral if they used too small a benchmark suite.