Understanding How Graal Works – A Java JIT Compiler Written in Java (chrisseaton.com)
174 points by mpweiher on Nov 4, 2017 | hide | past | favorite | 14 comments


Graal, Truffle, SubstrateVM, Sulong etc are really fantastic projects and I've been following them for some time. This talk is also very good, so thanks to Chris for uploading his notes.

I've done a few experiments with manipulating Graal graphs myself.

I do get a sense of slight confusion in this talk over whether Graal is easy to understand or not. It's clearly easier than C2, but that is not a high bar to reach, and it is not really related to what language it's written in. I'd say Graal is still harder to understand than LLVM, despite LLVM being written in C++, mostly because the LLVM guys really do believe in documentation and building an open source community. The documentation for Graal is sadly near non-existent. There are academic papers, usually undated, of course, as is normal for such papers, so it can be hard to know how fresh the information is. There are talks and blog posts, which are helpful but are often kind of repetitive if you've seen earlier talks.

What there isn't is anything like this:

http://llvm.org/docs/

LLVM has detailed tutorials, design overviews, descriptions of each pass and so on. This is one reason why LLVM tends to attract a lot of compiler research and has such a thriving community. Graal is good, but it's mostly done by Oracle and researchers from JKU Linz. I wouldn't say it has much of a community (check out the traffic levels on graal-dev, for example). It doesn't even have a website or a public issue tracker, which is a downgrade in understandability from the existing Hotspot compilers!

Graal could give LLVM serious competition as a testbed for compiler research, but in its current state trying to work with it is just painfully slow. Even very basic questions like "how do I swap out this node in the graph" or "how do I insert a node in between two others" or "how do I enable assertions that will check if I mess up the graph" require you to learn by reading the existing code, which is hardly commented (most classes have no javadocs).

I used to think that Graal was like this because it came out of university research, so the developers were judged by metrics like how many papers they published. But LLVM has the same origins, so I guess it's really just a matter of project culture. That may prove hard to change.


> Graal, Truffle, SubstrateVM, Sulong

Is there a succinct explanation as to what those four projects are and how they would be combined in the hypothetical implementation you mention?

The idea is to replace all of the JDK with a construction of those four, right? Would it still need a traditional JVM to execute the resulting toolchain, or would it replace the JVM fully?


Graal is a native-code compiler for Java, implemented in Java. It can be used as a JIT or an AOT compiler.

Truffle is a framework for implementing languages in Java, implemented in Java. Truffle can use Graal to automatically produce a JIT for languages implemented in it.
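To make the shape of that concrete: Truffle interpreters are ASTs of node objects, each with an execute method, which Graal can then partially evaluate into compiled code. Here is a dependency-free sketch of that interpreter style — note the class names here are invented for illustration; the real framework lives in com.oracle.truffle.api and looks somewhat different.

```java
// Sketch of the AST-interpreter style Truffle encourages: the guest
// language is a tree of node objects, each with an execute() method.
// (Names are illustrative, not the actual Truffle API.)
abstract class ExprNode {
    abstract long execute();
}

class LiteralNode extends ExprNode {
    private final long value;
    LiteralNode(long value) { this.value = value; }
    @Override long execute() { return value; }
}

class AddNode extends ExprNode {
    private final ExprNode left, right;
    AddNode(ExprNode left, ExprNode right) { this.left = left; this.right = right; }
    @Override long execute() { return left.execute() + right.execute(); }
}

public class MiniTruffleStyle {
    public static void main(String[] args) {
        // Represents the expression (1 + 2) + 40
        ExprNode program = new AddNode(
                new AddNode(new LiteralNode(1), new LiteralNode(2)),
                new LiteralNode(40));
        System.out.println(program.execute()); // prints 43
    }
}
```

The point of the framework is that you write only this interpreter; the JIT comes for free by partially evaluating execute() calls against a concrete tree.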

SubstrateVM is a JVM and AOT compiler using Graal, implemented in Java. With SubstrateVM you can take a Java program and produce a single, statically-linked executable with no dependency on the JVM.

Sulong is an interpreter for LLVM bitcode (so C, Fortran, Rust, etc programs) using Truffle, implemented in Java.

So together you can start to see how it's a system for 'one VM to rule them all' - all languages running with high performance in a single system.

It would be possible to write a Java bytecode interpreter, in Java, using Truffle, which would be JIT compiled using Graal, and to AOT compile that interpreter to a binary using SubstrateVM, which would give you a complete high-performance JVM implemented only in Java, yes.

Today enough of these components are available to produce a standalone high-performance Ruby or JavaScript VM, written entirely in Java, that has no dependency on the JVM.


That was an interesting skim-read, good stuff!

Let's not forget though that this isn't the first Java JIT written in Java. I believe the Jikes RVM has that honour.


JavaInJava (Sun) and Jalapeño (IBM) are both from around 1997. You’d have to dig up some bodies to know which was first.

https://www.researchgate.net/publication/2781745_Implementin...

http://research.ibm.com/people/j/jdchoi/jalapeno.pdf

Actually, reading through the papers, JavaInJava didn't implement a JIT and Jalapeño did.


I think the end goal is to implement the JVM as a bytecode interpreter on top of Truffle/Graal and export it as a standalone executable via SVM.

This also means that it will be easier to make VMs with performance comparable to Java's.

One could, for example, design a functional bytecode and an SVM-based bytecode interpreter for it, then write a language that compiles down to it, to get a compiled language with a JITting (functional) VM!
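The interpreter half of that idea is small enough to sketch. Below is a toy stack machine with three invented opcodes — nothing to do with any real bytecode format — just to show the kind of dispatch loop that Truffle/Graal would partially evaluate:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A hypothetical three-instruction stack bytecode and its interpreter.
// The opcodes are invented for illustration only.
public class MiniBytecodeVM {
    static final int PUSH = 0, ADD = 1, MUL = 2;

    static long run(int[] code) {
        Deque<Long> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.length) {
            switch (code[pc++]) {
                case PUSH: stack.push((long) code[pc++]); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                default:   throw new IllegalStateException("bad opcode");
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 7
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 7, MUL };
        System.out.println(run(program)); // prints 35
    }
}
```

Compiled through SVM, a loop like this becomes a standalone native VM for the invented bytecode, with Graal acting as its JIT.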


You could just write an SVM-based interpreter directly too.

I wonder what difference these two approaches would make.

I'm guessing writing an interpreter directly would be better.


How does the JVMCI handle function inlining?


As well as being given a method to compile you can then also request the bytecode and metadata of any other method, for example to inline it.


Okay, then how are stack traces and de-opt events handled?

These are the _really_ hard problems with JIT compilers. Thanks to inlining, a single IP location in a block of ASM may correspond to 3+ Java “methods” being executed at once.


When you install code you need to provide a frame map and metadata that maps registers, stack locations and IPs to locals, objects and bytecode.
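The core of that metadata is a table keyed by native-code offset. Here is a minimal sketch, assuming a much-simplified world where a record holds just a bytecode index and a method name (the real Graal structures, such as FrameState, track registers, stack slots and full inlining chains):

```java
import java.util.Map;
import java.util.TreeMap;

// Toy sketch of deopt/stack-trace metadata: a table from native
// instruction offsets back to bytecode positions, so the VM can
// rebuild interpreter frames. Field names here are invented.
public class FrameMapSketch {
    record FrameInfo(int bytecodeIndex, String methodName) {}

    // Keyed by native-code offset; floorEntry finds the covering record.
    private final TreeMap<Integer, FrameInfo> table = new TreeMap<>();

    void record(int nativeOffset, int bci, String method) {
        table.put(nativeOffset, new FrameInfo(bci, method));
    }

    FrameInfo lookup(int nativeIp) {
        Map.Entry<Integer, FrameInfo> e = table.floorEntry(nativeIp);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        FrameMapSketch map = new FrameMapSketch();
        map.record(0x00, 0, "outer");  // prologue of the compiled method
        map.record(0x14, 7, "inner");  // code for an inlined callee starts here
        System.out.println(map.lookup(0x18).methodName()); // prints inner
    }
}
```

Given an IP captured from a stack walk, the lookup recovers which (possibly inlined) method and bytecode index the machine code corresponds to.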

Consider following the advice I put in the document. Open it up in Eclipse and explore using standard Java tools and discover for yourself. That was the whole point.


Now I'm at a desk, I can say start looking at the FrameState class.

https://github.com/graalvm/graal/blob/cd655e1df4f388c78364f0...

And that's the beauty of it! If you were interested in frame states, typing FrameState into Eclipse will have found this, and then start looking at what calls these methods.


It’s nice to see Java gaining compiler macros.


Take a look at bytebuddy [0] and Janino [1]
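Even without those libraries, the stock JDK can compile and load code at runtime via javax.tools. A minimal sketch (error handling deliberately thin, class and method names invented for the example):

```java
import javax.tools.*;
import java.lang.reflect.Method;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Compile a class from a String at runtime with the JDK's own
// compiler API, load it, and call a method on it reflectively.
public class RuntimeCompile {
    public static void main(String[] args) throws Exception {
        String src = "public class Hello { public static int answer() { return 42; } }";

        Path dir = Files.createTempDirectory("gen");
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Hello.java"), JavaFileObject.Kind.SOURCE) {
            @Override public CharSequence getCharContent(boolean ignore) { return src; }
        };
        compiler.getTask(null, null, null,
                List.of("-d", dir.toString()), null, List.of(file)).call();

        URLClassLoader loader = new URLClassLoader(new URL[]{ dir.toUri().toURL() });
        Method m = loader.loadClass("Hello").getMethod("answer");
        System.out.println(m.invoke(null)); // prints 42
    }
}
```

Janino does essentially this with a faster, smaller compiler, and ByteBuddy skips source entirely by generating bytecode directly.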

[0] http://bytebuddy.net/#/

[1] http://janino-compiler.github.io/janino/



