The Java and golang numbers are not apples to apples. The Java solution uses standard library code, while golang's calls out to native code (a wrapper around a C library) as far as I can tell. Same with Python.
Can't comment on that, but why are these people putting spaces after commas (correct) and not on either side of the assignment sign = ? It's so weird to read a=b
That's a very straightforward translation of C code using pcre to Python using ctypes. It doesn't look great because the ugliness isn't wrapped in a library. But Python ctypes definitely isn't hard to use. It is unsafe as hell though.
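To give a feel for what that looks like, here is a minimal ctypes sketch. It is not the benchmark's pcre code: it calls the C library's strlen instead, and it assumes a POSIX system where ctypes.CDLL(None) exposes the C library already loaded into the process. The c_strlen helper is a made-up name for illustration.

```python
import ctypes

# Assumption: POSIX. CDLL(None) gives access to symbols already loaded into
# this process, which includes the C library. This is NOT the pcre binding
# used in the benchmark, just the same calling pattern on a simpler function.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s).
# Skipping these declarations is where ctypes gets unsafe, because it will
# then guess argument and return types and pass whatever it is given.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical helper: length of a NUL-terminated byte string via C strlen."""
    return libc.strlen(data)
```

The call itself is one line. All the risk sits in the declarations: get a type or a buffer lifetime wrong and you can take the interpreter down rather than get a Python exception.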
It's interesting that Ruby is as slow as it is, given that its regex engine (Onigmo) is written in C. I wonder where the bottleneck is compared to the other languages.
Seems to be mainly a test of compiling regexes, rather than executing them. Which is a bit pointless because most applications pre-compile their regexes ... and it would then actually be penalising languages that put more effort into optimisation at compile time and hence perform better in the real world at execution.
Why would you even say this? It is certainly not a test of compiling regexes. Go and profile any one of those programs. You'll see that the majority of time is spent executing the search, not the compilation.
Look at the runtimes being mentioned here. We're talking on the order of seconds. The benchmark itself calls for 15 pretty simple regexes to be compiled. Regex compilation of 1ms or longer would be considered slow I think. So that's 15ms. That's nearly nothing compared to the full runtime of the benchmark. Even if you assumed each regex took 10ms on average to compile, you're still only looking at 150ms total.
I say this as the author of Rust's regex engine and as someone who has spent a non-trivial amount of time looking at and thinking about this particular benchmark.
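If you want to check the compile-versus-search split yourself, a rough sketch along these lines works. The patterns and the input text here are stand-ins I made up, not the benchmark's actual data, and time_compile_and_search is a hypothetical helper name.

```python
import re
import time

# Assumption: stand-in patterns and input, not the benchmark's real data.
PATTERNS = [
    "agggtaaa|tttaccct",
    "[cgt]gggtaaa|tttaccc[acg]",
    "a[act]ggtaaa|tttacc[agt]t",
]
TEXT = "acggtaaatttaccctagggtaaa" * 50_000


def time_compile_and_search(patterns, text):
    """Return (compile_seconds, search_seconds, match_counts)."""
    re.purge()  # empty re's internal cache so compilation is really measured

    t0 = time.perf_counter()
    compiled = [re.compile(p) for p in patterns]
    t1 = time.perf_counter()

    # Count non-overlapping matches of each pattern over the whole input.
    counts = [len(rx.findall(text)) for rx in compiled]
    t2 = time.perf_counter()

    return t1 - t0, t2 - t1, counts
```

Calling time_compile_and_search(PATTERNS, TEXT) and comparing the first two values gives the split for Python's re module on this toy input. For the real programs, a profiler is the better tool.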
The top times for the mentioned languages are:
Rust: 0.78s
Ruby: 12.33s
Java: 5.34s
Go: 3.80s
Python: 1.34s
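Putting the compile-time estimate from above next to these numbers, here is a small sketch of the arithmetic. The 15 regexes and the 1ms and 10ms per-regex figures are the estimates quoted earlier in the thread, not measurements, and the function names are just for illustration.

```python
# Top times quoted above, in seconds.
TOP_TIMES_S = {"Rust": 0.78, "Ruby": 12.33, "Java": 5.34, "Go": 3.80, "Python": 1.34}


def compile_share(runtime_s, n_regexes=15, compile_ms=1.0):
    """Fraction of runtime spent compiling, under an assumed per-regex cost."""
    return (n_regexes * compile_ms / 1000.0) / runtime_s


def slowdown_vs_fastest(times):
    """Each runtime divided by the fastest runtime in the table."""
    fastest = min(times.values())
    return {lang: t / fastest for lang, t in times.items()}


if __name__ == "__main__":
    ratios = slowdown_vs_fastest(TOP_TIMES_S)
    for lang, runtime in TOP_TIMES_S.items():
        share_1ms = compile_share(runtime)
        share_10ms = compile_share(runtime, compile_ms=10.0)
        print(f"{lang}: {ratios[lang]:.1f}x, compile share {share_1ms:.1%} to {share_10ms:.1%}")
```

Even for the fastest entry, 15ms of compilation is about 2% of the 0.78s runtime, and for every other entry the share is smaller still.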