Hacker News

The JVM is rock solid, and recent improvements in garbage collection have reduced tail latencies dramatically.

But I find that most people who say that it has "great performance" have not built a parallel implementation in JS, Go, Rust, or even .NET Core. I'm omitting memory unsafe languages by default and Python here, because I think that's the domain Java competes in, and Python lacks the investment these other languages have in performance.

The lack of value types and the amount of pointer chasing that JVM languages do as a result, the way generics are implemented via type erasure (which the JIT then has to re-optimize), and so on usually mean that CPU and memory usage for the same throughput is much higher than a competing implementation in a different language. And on older JVMs, tail latency will be orders - plural - of magnitude worse.
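To make the erasure and boxing points concrete, here is a minimal sketch. The class and method names are made up for illustration and are not taken from any benchmark. It shows that differently parameterized lists share one runtime class, and that a list of integers holds references to Integer objects rather than the values themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names are illustrative only.
class ErasureDemo {
    // Generic type arguments are erased at compile time, so both lists
    // are instances of the same runtime class (java.util.ArrayList).
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    // A List<Integer> cannot store int values directly: add() autoboxes,
    // so the element read back is a reference to an Integer object.
    static boolean elementIsBoxed() {
        List<Integer> ints = new ArrayList<>();
        ints.add(42); // compiles to ints.add(Integer.valueOf(42))
        Object element = ints.get(0);
        return element instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());
        System.out.println("element is boxed:   " + elementIsBoxed());
    }
}
```

Each element access on the boxed list is one more pointer to follow, which is the pointer chasing I mean.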

It is absolutely true though that for most workloads that efficiency isn't necessary and the ability to reuse that ecosystem reduces time and cost to develop. But it's just definitely not true that Java has "great performance" and I don't think that's ever really been true.



TechEmpower Web Framework Benchmarks would like to disagree with you.


I don't think those are particularly realistic workloads, as they don't involve substantial amounts of working with in-memory data. Which of the TechEmpower benchmarks uses an ORM?

Before I finished drafting my comment I did have a sentence like this, which I removed, "Barring obscene amounts of optimization", so yes, some web servers like netty and jetty have gotten to a level of good performance in terms of handling plain requests.

But most line-of-business backends are not using plain jetty/netty, and they aren't just responding to every request with the same "SELECT" query to a backend database. They're doing computation, and they're storing intermediate data in collection classes like ArrayList, TreeMap, etc.

And then of course due to business requirements, they often have to implement some in-process caching, and suddenly the lightweight Java application is a bloated multi-gigabyte CPU consuming monster.

I just don't see that happening often with more memory and cache friendly languages like Go, Rust, Swift, or even JavaScript on V8/Node.js.


JavaScript is hardly cache friendly.

As for Go, that's a language which doubles up the size of every pointer and doesn't even use a moving GC, and last time I looked, the quality of machine code it generated was atrocious. It's not really cache/hardware friendly to do those things. Value types are, I suspect, being overestimated here: when Java gets them, I expect disappointment when they don't magically make everything twice as fast.


> Which of the TechEmpower benchmarks uses an ORM?

There are two: single query and multiple query.


Sorry, I should have been more precise. I am very, very familiar with the TechEmpower benchmarks and I first learned Java around SE 5, right after they switched from 1.x numbering. Please don't mistake me for someone who just learned about Go or Rust and is evangelizing them because I think they're the cool new thing.

Which of the Java implementations for the TechEmpower benchmarks use an ORM? Are they representative of the kind of code you would write? I think that the TechEmpower benchmarks suffer from many of the same problems the language benchmarks game benchmarks do - micro-optimization, unrealistic workloads.

My experience tells me that you can get any sufficient level of performance in almost any language, but that you are going to pay more in opex for some languages than others, particularly in memory usage. It takes more compute spend for a workload written in Java than one written in Go, all other things being equal. That's not to say Java is a bad language, but it does lack many features - some of them omitted as intentional design decisions - and that makes it less cost effective to operate systems built on Java. However, we know that a significant cost is the cost to develop, so it's hard for me to say Java is a bad language for that reason either.

And memory usage is generally a good predictor of density in terms of scheduling workloads, be it Tomcat servers (back in the day) or VMs or containers these days. I also think that Java suffers, performance wise, from boxing values and pointer chasing / poor cache locality. The default container implementations are just, well, it would be polite to simply say that they're as good as the language allows.

However, data is better than claimed experience, no?

I just opened up the raw benchmark stats[1] for the database updates route. It's one that my favorite languages don't do well in, but I was curious about the operational overhead of running them in terms of memory usage, something I've mentioned quite a lot up above.

I looked at a vertx-postgresql implementation of the TechEmpower "updates" test. This is a high performing implementation without an ORM[2].

I also looked at quarkus + reactive routes + hibernate, which appears to use Hibernate as its ORM and so is applicable to the original post[3].

And lastly, I looked at actix diesel, another implementation that uses an ORM[4].

    java quarkus-hibernate:  3.9GiB memory (peak, start of test)
    java quarkus-hibernate:  3.1GiB memory (lowest value, near end of test)
    java vertx-postgres:     2.35GiB memory (consistent)
    rust actix-diesel:       1.2GiB memory

Standard deviation was:

    java quarkus-hibernate:  221.4MiB
    java vertx-postgres:       1.9MiB
    rust actix-diesel:         0.5MiB

I included the steady state for quarkus because its memory usage (perhaps due to a config flag starting it with a 4GiB heap?) started out extremely high and decreased over the course of the run. That likely affects the standard deviation, which I included to highlight that I didn't try to cherry-pick results.

Perhaps the funniest thing I found while digging into it, again due to the absurdity of Java's design decisions, is that to make sure "Integer" objects are efficient, the Java benchmarks use the command line parameter "-Djava.lang.Integer.IntegerCache.high=10000". This tells you that if the benchmark used a wider range of random values[5], performance would degrade. Have you ever heard of a language requiring an integer cache? It's absurd to me that Java, rather than implement value types, requires Integers to be interned for performance.

Are there any other languages in the TechEmpower benchmark or the Debian benchmark game (formerly went by another name) that require setting an "IntegerCache" to optimize... allocating integers? I mean, come on. You can't tell me this is a language that was designed for performance when integers can't be directly stored in generic collections and instead have to be autoboxed, and a cache is needed to intern them!
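For anyone who hasn't run into the cache, here is a small sketch of the behavior. The class and method names are hypothetical, and the only flag referenced is the one quoted above from the benchmark configs. It boxes the same int twice and checks whether both boxes are the same object.

```java
// Hypothetical demo class; all names are illustrative only.
class IntegerCacheDemo {
    // Returns true when autoboxing the same int twice yields the same object.
    static boolean boxesToSameObject(int value) {
        Integer a = value; // autoboxing compiles to Integer.valueOf(value)
        Integer b = value;
        return a == b;     // deliberate reference comparison, not equals()
    }

    // equals() compares values, so it holds regardless of caching.
    static boolean boxesToEqualValue(int value) {
        Integer a = value;
        Integer b = value;
        return a.equals(b);
    }

    public static void main(String[] args) {
        // The language spec guarantees shared boxes for -128..127.
        System.out.println("127 shared:   " + boxesToSameObject(127));
        // Outside that range the result depends on the JVM and its flags.
        System.out.println("10000 shared: " + boxesToSameObject(10000));
        System.out.println("10000 equal:  " + boxesToEqualValue(10000));
    }
}
```

Run it once with default settings and once with -Djava.lang.Integer.IntegerCache.high=10000 and compare the second line. I would expect it to flip when the cache is raised, but treat that as an assumption to verify on your own JVM.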

I will say one final thing: cost to operate/memory efficiency is just one metric for measuring languages. I think that Java is actually a pretty bad language for a lot of reasons, but path dependence has produced an extremely rich ecosystem that gives developers a lot of flexibility and a lot of tools to use when writing it. I think Kotlin, Scala, and even Clojure are by far more pleasurable languages to write in, though the JVM still holds them back for all the reasons above.

[1] Raw results from https://tfb-status.techempower.com/unzip/results.2021-01-13-...

[2] You can see they have simply hardcoded the SQL. See: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

[3] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

[4] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

[5] The update benchmark only requires random numbers between 1 and 10,000. Performance of Java apps would degrade if they were asked to use boxed integers greater than 10,000, which is possibly the most absurd statement I have ever made about any programming language. See: https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Proj...


> Are there any other languages in the TechEmpower benchmark or the Debian benchmark game (formerly went by another name) that require setting an "IntegerCache" to optimize... allocating integers?

afaict Java programs shown on the benchmarks game website do not.


> I think that the TechEmpower benchmarks suffer from many of the same problems the language benchmarks game benchmarks do - micro-optimization, unrealistic workloads. ... It takes more compute spend for a workload written in Java than one written in Go, all other things being equal.

Well which is it, then? You say Java is slower, the benchmarks say otherwise. What other benchmark would you accept?

I hate autoboxing as much as the next numerical processor, and I avoid JPA whenever I can, but that doesn't change that Java is plenty fast on a variety of workloads. It's typically only beaten by the hardest of the hardcore Rust and C++ implementations.


I think that if all you're doing is serving the results of plain SQL queries, which is what the TechEmpower benchmarks are, then it's really hard to pick a bad language. Almost every language is capable of tens of thousands of requests per second. Even Ruby, a language we haven't brought up and is dreadfully slow, will do thousands of requests per second with Rails. Beautiful language, abysmal performance (relative to what's possible).

Once you're doing non-trivial things in Java (I've outlined what those things are in my previous comments, and they primarily revolve around memory), wall clock CPU time correspondingly increases as your program spends more time chasing pointers on the heap and pays for poor cache locality, the lack of value types, poor monomorphization of generics (until the JIT kicks in), and so on. These things all add up.

I'm not saying it's impossible for Java to be fast, after all, if you just store everything in a "private final double[]" like most of the Benchmarks Game implementations do, sure, the JIT will do wonders for you. But that isn't real world Java, is it?
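As a rough sketch of the difference I mean (hypothetical class and method names, no timings claimed), compare summing a primitive array with summing the boxed list that typical application code ends up with:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names are illustrative only.
class StorageDemo {
    // Primitive array: the doubles are stored contiguously in the array itself.
    static double sumPrimitive(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Boxed list: each element is a reference to a separate Double object,
    // which has to be dereferenced and unboxed on every read.
    static double sumBoxed(List<Double> values) {
        double total = 0.0;
        for (Double v : values) {
            total += v; // unboxing via doubleValue()
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000;
        double[] primitive = new double[n];
        List<Double> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            primitive[i] = i * 0.5;
            boxed.add(i * 0.5); // autoboxes to Double
        }
        System.out.println("primitive sum: " + sumPrimitive(primitive));
        System.out.println("boxed sum:     " + sumBoxed(boxed));
    }
}
```

Both methods compute the same result; the difference is only in memory layout. If you want real numbers rather than my claims, measure it with a proper harness such as JMH on your own workload.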

Real-world Java web servers do more than just respond to epoll_wait(2) events on a loop by sending some bytes to a database, getting them back, and sending them straight back to the client. There's usually more serialization, more authentication, more logging, more metric exporting, more middleware doing one thing or another.

One last thing: GraalVM is the most exciting thing to happen to Java performance since NIO and the newer garbage collectors aiming for sub-ms stop the world times. Quarkus, which I googled over the course of writing my comments, is by far the most interesting new tool I saw for shipping Java in production efficiently by building on GraalVM to deliver web servers in megabytes, not gigabytes of resident memory: https://quarkus.io/

It's a shame Quarkus in the benchmark I saw used so much more memory. It looks like it should be possible to fix that.


> Even Ruby, a language we haven't brought up and is dreadfully slow, will do thousands of requests per second with Rails. Beautiful language, abysmal performance (relative to what's possible).

It's precisely this terrible performance that makes ActiveRecord so much easier to program and better to use than Hibernate (or EntityFramework). I've written over a dozen Rails apps over the past 15 years, and its "abysmal" performance has never been a problem for me. For my problem space(s), I'd make that tradeoff every day of the week, and twice on Sunday.


I agree with the enthusiasm for Graal. Related to it are efforts like Project Valhalla and Project Panama, which should do a lot to address the current drags on performance, for example by adding value types.



