Hacker News new | past | comments | ask | show | jobs | submit login

I recently implemented my own expression evaluator in java for in-memory data frames, and once you think about doing that deeply, you very quickly understand the need for bytecode. If you directly evaluate the expression using a tree representation, you basically have to do a whole lot of branching (either via switch statements or polymorphism) for every single line of useful operation. Yes, the branch predictor kicks in and it means that your code wouldn’t be as slow as if it didn’t, but it is still measurably slower than if you converted the expression into bytecode once and just ran that on all rows instead.



Bytecode will only impact packing, so more efficient ram, cache and cpu wise. But I don't understand how it would help with branching? You still have to make the same decisions? As in the bytecode executor still needs to do differnt things based on the op code, its not in hardware.


Consider evaluating an expression like "col1 + col2 == col3"

A tree based evaluator has to do something like -

    for (int i = 0; i < df.size(); i++) {
        // switch on expression type (column, binary operator, unary operator etc)
        // switch on expression operator type (add, subtract, equals etc)
        // switch on column type (int, long, float etc)
        // actual evaluation
    }
whereas a bytecode based evaluator is able to run something equivalent to -

    for (int i = 0; i < df.size(); i++) {
        result[i] = df.getLong(column1Index, i) + df.getLong(column2Index, i)) == df.getLong(column3Index, i)
    }


You still have to branch on opcode selection which you're omitting in your translation. Branching on "expression type" and column type in your first example is also unnecessary. Bytecodes have different opcodes to operate on different types and different arities, so in both cases you have only one unpredictable branch for each operation/opcode.

The main benefit of bytecodes is then cache friendliness, by removing all of those unpredictable loads and stores to obtain the expression details (opcode, arguments, etc.). All of those are inlined into the bytecode array which is already in cache.


> Branching on "expression type" and column type in your first example is also unnecessary

It’s not?


It is unnecessary, only branching on expression operator type is strictly necessary. You're trying to abstract too much in your AST which yields an inefficient interpreter. I can invent an inefficient bytecode too, but that wouldn't be a correct evaluation of how fast bytecode interpretation could be when done right.


There is threaded bytecode as well, which uses direct jumping vs a switch for dispatch. This can improve branch prediction, though it is a debated topic and may not offer much improvement for modern processors.


Do you have perhaps some links/references on that?

I have once tried benchmarking it by writing a tiny VM interpreter and a corresponding threaded one with direct jumps in Zig (which can force inline a call, so I could do efficient direct jumps) and I have - to me surprisingly- found that the naive while-switch loop was faster, even though the resulting assembly of the second approach seemed right.

I wasn’t sure if I saw it only due to my tiny language and dumb example program, or if it’s something deeper. E.g. the JVM does use direct threaded code for their interpreter.


How does it know where to jump to?


The jump target is compiled into the bytecode, so rather than return to the big switch statement, it jumps straight to the next opcode's implementation. The process is called "direct threading". These days a decent switch-based interpreter should fit in cache, so I'm not sure direct threading is much of a win anymore.


You should look at https://janino-compiler.github.io/janino/ it can compile Java into class files in memory that can be directly executed.


I found this by looking at spark source code to figure out what they were doing under the hood to deal solve this problem. :)




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: