If you're writing a database engine and you accidentally wrap each table cell in a Java object, you'll very quickly hit a GC performance wall, particularly with a low-pause GC. So we try to write Java without objects. Guess how easy that is. ;)
Yeah I deal with that a lot with my search engine index too. Honestly it's not that bad once you get used to it.
You can get away with referencing the data through mutable, reusable pointer objects that point into memory-mapped areas but still expose a reasonably comfortable higher-level interface. That gets rid of the object churn without giving up a sane API.
Can you give an example of how to do that? I've only seen IO and parsing libraries that copy byte arrays around, and you can't cast byte arrays to arbitrary objects, which means you usually have to manipulate the bytes directly (unless you're lucky enough to work only with strings). At that point I imagine it would be far easier to just use C++ or Rust.
Yeah, this sort of work isn't Java's strong suit. A lot of it is sort of like programming old-school C with oven mitts. It'll get a bit better with the Foreign Memory API, which is in the JEP pipeline.
But a very bare-bones example might look something like this:
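import java.nio.ByteBuffer;

// Byte layout of one fixed-size entry: int foo, long bar, byte baz.
class MdrLayout {
    static final int FOO_OFFSET = 0;   // 4-byte int
    static final int BAR_OFFSET = 4;   // 8-byte long
    static final int BAZ_OFFSET = 12;  // single byte
    static final int ENTRY_SIZE = 13;  // total bytes per entry
}

// Reusable "pointer": re-aim it with movePointer() instead of allocating one object per entry.
class MyDataRecord {
    int idx = Integer.MIN_VALUE;
    ByteBuffer buffer = null;

    MyDataRecord() { }

    void movePointer(ByteBuffer buffer, int idx) {
        this.buffer = buffer;
        this.idx = idx;
    }

    // Absolute reads: they do not touch the buffer's position.
    int foo() { return buffer.getInt(MdrLayout.ENTRY_SIZE * idx + MdrLayout.FOO_OFFSET); }
    long bar() { return buffer.getLong(MdrLayout.ENTRY_SIZE * idx + MdrLayout.BAR_OFFSET); }
    byte baz() { return buffer.get(MdrLayout.ENTRY_SIZE * idx + MdrLayout.BAZ_OFFSET); }

    // may have operators like reset(), next() or skip() as well
}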
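For what it's worth, here is roughly how a record like the one above gets used in a loop. This is only a minimal sketch under a few assumptions: the names RecordCursorDemo, RecordCursor, writeSample and sumBar are made up for illustration, and the data lives in a direct ByteBuffer filled with sample values rather than in a real memory-mapped file. The point it shows is that a single cursor object is allocated once and re-aimed for every entry, so the scan itself creates no per-record garbage.

```java
import java.nio.ByteBuffer;

public class RecordCursorDemo {
    // Same layout as the example above: int foo @0, long bar @4, byte baz @12.
    static final int FOO_OFFSET = 0;
    static final int BAR_OFFSET = 4;
    static final int BAZ_OFFSET = 12;
    static final int ENTRY_SIZE = 13;

    /** Hypothetical reusable cursor; one instance serves every entry. */
    static final class RecordCursor {
        private ByteBuffer buffer;
        private int base; // byte offset of the current entry

        RecordCursor moveTo(ByteBuffer buffer, int idx) {
            this.buffer = buffer;
            this.base = ENTRY_SIZE * idx;
            return this;
        }

        int foo() { return buffer.getInt(base + FOO_OFFSET); }
        long bar() { return buffer.getLong(base + BAR_OFFSET); }
        byte baz() { return buffer.get(base + BAZ_OFFSET); }
    }

    /** Fills an off-heap buffer with sample entries (stand-in for a mapped file). */
    static ByteBuffer writeSample(int count) {
        ByteBuffer buf = ByteBuffer.allocateDirect(count * ENTRY_SIZE);
        for (int i = 0; i < count; i++) {
            int base = i * ENTRY_SIZE;
            buf.putInt(base + FOO_OFFSET, i);
            buf.putLong(base + BAR_OFFSET, 10L * i);
            buf.put(base + BAZ_OFFSET, (byte) (i % 2));
        }
        return buf;
    }

    /** Scans all entries with a single cursor; no allocation inside the loop. */
    static long sumBar(ByteBuffer buf, int count) {
        RecordCursor cursor = new RecordCursor();
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += cursor.moveTo(buf, i).bar();
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buf = writeSample(1000);
        System.out.println(sumBar(buf, 1000));
    }
}
```

With an actual file you would presumably get the buffer from FileChannel.map() instead of allocateDirect(); the cursor code stays the same, since a MappedByteBuffer is just a ByteBuffer.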
If you do stuff like that, use a profiler to identify and fix your real performance bottlenecks, as opposed to applying premature optimization blindly. The same goes for GC tuning: it has gotten easier over the years, but there are still lots of tradeoffs.
There are plenty of fast databases and other middleware written in Java. The JVM is a popular platform for that kind of thing for good reason. Writing good software is, of course, a bit of a skill. Benchmarks like this are kind of pointless. Doing an expensive thing in a loop is slow. Well duh. Don't do that.