No inline assembly is required to vastly improve Go's performance in many such benchmarks. Pool and arena allocation are commonly used techniques, and whether or not they are "idiomatic" isn't even up for discussion any more, since the stdlib itself includes the former (sync.Pool).
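
For anyone who hasn't used it, here is a minimal sketch of what pool allocation with sync.Pool can look like. The `greet` function and the `bufPool` variable are hypothetical names made up for illustration, not taken from any particular benchmark; the point is only that a hot path can reuse a buffer instead of allocating a new one on every call.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values so that a hot path
// does not have to allocate a fresh buffer on every call.
var bufPool = sync.Pool{
	// New is called only when the pool has nothing to hand out.
	New: func() any { return new(bytes.Buffer) },
}

// greet is a hypothetical hot-path function, used only for illustration.
func greet(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold data from an earlier use
	defer bufPool.Put(buf) // return the buffer to the pool when done

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result outlives the Put
}

func main() {
	fmt.Println(greet("gopher"))
}
```

Note that sync.Pool is a cache rather than a guaranteed free list: the runtime may drop pooled objects at any time, so it should help with allocation pressure but how much it helps in a given benchmark has to be measured.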