RethinkDB Driver for Java now available

PedroBatista · on Dec 9, 2015

Finally! ( congrats :) ) When we decided to use RethinkDB I had to dig up old driver source code whose protocol was 50-50 compatible with newer versions of ReDB; sometimes if I wrote too much data in bulk it would simply crash the JVM lol. Now I can sleep better knowing my chances of getting fired are 2% less.

Now regarding the awfulness of JSON objects to/from Java Objects, any good news? (I know it's mostly a Java the lang issue)

RyanZAG · on Dec 9, 2015

Do you mean converting from JSON to POJO? Why do you consider that awful? There are very good libraries (jackson, gson) for handling that and they work really well - especially if you're going to be using those POJOs for HTTP REST too.

VeilEm · on Dec 9, 2015

Are you forced to use JSON with RethinkDB? Like is the mechanism RethinkDB -> JSON String -> to POJO and POJO -> JSON string -> RethinkDB with each database interaction?

That seems crazy inefficient.

RyanZAG · on Dec 9, 2015

Isn't that kind of like arguing that serializing an HTTP reply into text for the browser is crazy inefficient?

At any rate, JSON is the transfer format used by RethinkDB so at some point it has to be JSON for RethinkDB to read it. You could always make your own data language and integrate it all yourself, but ultimately you can't transfer your POJO directly and will always have to transform it into something. JSON is pretty decent and very compatible.

So to answer your question: (RethinkDB -> JSON String) is a completely required part of the database communication. The additional POJO step is fairly efficient when using Jackson and unlikely to be a large part of your CPU usage - the simple and required byte copy may use more CPU on large POJOs. You could use some type of no-copy read directly from the network buffer, but then a complex database like RethinkDB would not be the right choice - you'd probably use local memory-mapped files in C.

In short: try using the driver and pojo conversion in a real world project and profile it. I think you'd be surprised how little overhead it really is. Computers are very good at this stuff, and the JVM's JIT is optimized for this stuff.

coffeemug · on Dec 9, 2015

Slava @ RethinkDB here.

To add to other people's comments, this is really important in the context of microservices/polyglot persistence. For example, a typical app might be composed of Java and Node code, and an object stored from the Java application needs to be read from Node (and vice versa). The only way to do that is to have a standardized intermediate communication format, and after trying a bunch of different formats (Protocol Buffers, custom serialization methods, etc.) we settled on JSON because it has a perfect balance of portability and efficiency.

javajosh · on Dec 9, 2015

Literally every server (db and otherwise) sees the universe as a sequence of strings that need to be parsed. Native drivers have to encode/decode strings; it's just that they have full control over the format and can, at their convenience, build it as efficiently (or inefficiently) as they like.

The things that reduce this overhead (but don't eliminate it) are things like Protocol Buffers. EDIT: And others. See https://en.wikipedia.org/wiki/Comparison_of_data_serializati...

coffeemug · on Dec 9, 2015

The official driver has a few helper methods to create JSON objects, so this hopefully shouldn't be too painful until the language adds native syntax for JSON.

mring33621 · on Dec 9, 2015

Wouldn't Nashorn in Java 8 count as 'native syntax'? Use javascript in the JVM to munge your JSON.

krat0sprakhar · on Dec 9, 2015

I recently used the RethinkDB client for Clojure[0] and frankly I've found it a bit too unintuitive to use. You need to rely on a "query-builder" to write even simple queries. The documentation is also quite incomplete, forcing you to dive into the codebase to see how to use a feature. I should point out that I'm not taking digs at the library author - much props to him for writing & open-sourcing it.

My question to the more experienced Clojure devs in the house - now that the official java client is out, should I use this from my Clojure app or keep using the other one?

[0] - https://github.com/apa512/clj-rethinkdb

coffeemug · on Dec 9, 2015

Slava @ RethinkDB here. I've never programmed in Clojure (although I spent years programming in Common Lisp), so everything below is what I overheard from the engineers that were looking into this.

The official Java driver can be used from a variety of JVM languages and apparently results in a pretty good experience. Segphault is writing a blog post on this, so you'll see lots of examples soon of what the Java driver looks like in various JVM languages.

Apparently the only exception to this is Clojure -- you can use the Java driver from Clojure, but idiomatically the experience isn't as good as using native Clojure libraries. We're going to play around with this and figure out what to do about it. One possibility is that someone in the community might write a Clojure wrapper on top of the official Java driver -- this would be way easier than writing a Clojure driver from scratch, and would provide idiomatically appropriate interfaces.

I know Aphyr is using clj-rethinkdb for his Jepsen analysis of RethinkDB, and it's working out pretty well, and I know of a few production deployments based on clj-rethinkdb. So another possibility is to invest a little more time/effort into clj-rethinkdb to bring it up to par with official drivers.

We should be able to improve the state of affairs in the next few months.

krat0sprakhar · on Dec 9, 2015

Hi Slava! Thanks for the answer.

I'm aware of Aphyr's work with RethinkDB, that's how I found out about clj-rethinkdb in the first place. Aphyr is a seasoned Clojure programmer, so I'm sure using any library wouldn't be too much work for him. I'm still new to Clojure and compared with the Go/JavaScript versions of the client, I found it a bit hard to understand (the lack of documentation also doesn't help). There's a high possibility that my experience is due to shortcomings in my understanding of clj and not the client itself. In any case, I look forward to seeing what RethinkDB / the community does to improve the client.

PS: You'd be happy to know that I jumped on the Clojure ship after reading your blogpost[0] about lisp. So thanks for that :)

[0] - http://www.defmacro.org/ramblings/lisp.html

coffeemug · on Dec 9, 2015

Cool! I think you've identified the problem correctly (using the clj driver is harder for a beginner than some of the other drivers). The solution might involve better docs, more/better code, or some combination of the two. We'll try to get on this as soon as we can!

dantiberian · on Dec 9, 2015

I'm a maintainer of clj-rethinkdb. It makes a ton of sense for clj-rethinkdb to use as much as possible from the Java driver. I might be a bit biased, but I think the query language is really idiomatic for Clojure. The docs could definitely do with some love, I'd appreciate feedback on what else we could do to bring it up to par with the official drivers.

coffeemug · on Dec 10, 2015

Slava @ RethinkDB here. Thanks for all your awesome work on the driver!

hardwaresofton · on Dec 9, 2015

I actually didn't have the same problems with the clojure rethinkdb driver. I'll address my opinions in turn:

- query builder reliance

I'm not sure what you mean by this? I got a lot of mileage out of the library by writing partial functions that generate queries that were most commonly used. So, I'd have functions like:

  (defn find-user-by-id
    "Find a user by ID"
    [id]
    ;; r/get looks up a single document by primary key
    (-> (r/db "database")
        (r/table "users")
        (r/get id)))

or something similar. Of course, you can get a lot more generic with this -- but I found that keeping these queries with each other created good separation of "database-stuff" from the rest of my app (and made it easy for me to do versioning).

- Needing to look in the codebase for documentation on how to use the methods

While this is certainly unfortunate (and I believe true), I have found that it's actually good advice to tell someone to look at tests in a repo as backup-documentation on projects that are far along enough to have tests. Also... this is kind of a problem with the Clojure ecosystem in general, so I'd definitely +1.

Would like to take this moment to also thank apa512 for writing the library, thanks for doing so!

I think I ran into a problem once when working on changefeeds (the default implementation used a seq and I think I wanted a channel or something like that), but that boiled down to me misunderstanding how to use it, rather than it being "wrong" in a sense.

krat0sprakhar · on Dec 9, 2015

Queries like the one you wrote above are indeed quite idiomatic, but something as simple as providing a `read_mode` requires a query builder:

   (r/get (rethinkdb.query-builder/term
            :TABLE [(query/db db) table]
            {:read_mode read_mode})
          id)

whereas in JS, you can simply do r.db('heroes').table('marvel', {readMode: 'outdated'}).run(conn, callback)

hardwaresofton · on Dec 9, 2015

Thanks for the example -- while I have never needed to actually change the read mode, you shouldn't need to use that query builder; read mode is an optional argument to r/table. Apologies if I misunderstood what you were trying to do, or if this was a relatively recent addition that wasn't available when you last used clj-rethinkdb.

Here's an example (from tests in the repo): https://github.com/apa512/clj-rethinkdb/blob/1cd09c7c8a4ce27...

dantiberian · on Dec 9, 2015

Hey, I'm one of the maintainers for clj-rethinkdb. There's a few things I can discuss here.

> You need to rely on a "query-builder"

One of the beautiful things about RethinkDB is its query language. It is basically S-expressions but in vectors instead of lists. This pairs really well with Clojure as Clojure is made up of S-expressions too. All of the drivers do use some kind of a query builder internally (as far as I'm aware) to build up the query data structure, then they call run on it. This process is just a bit more explicit in clj-rethinkdb, and has all kinds of benefits, like being able to make partial functions, and compose ReQL functions really nicely.
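To make the "query as a data structure" idea concrete, here is a minimal Java sketch using only the standard library. It is not how any RethinkDB driver is implemented: the `term` helper, the `QueryAsData` class and the symbolic term names ("DB", "TABLE", "GET") are assumptions for illustration, and a real driver uses the protocol's own term codes and a fluent API on top.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: term names are symbolic stand-ins chosen for this example,
// not the encoding a real driver sends.
class QueryAsData {

    // Build a term as a vector [NAME, [args...]]: an S-expression in a list.
    static List<Object> term(String name, Object... args) {
        return Arrays.asList(name, Arrays.asList(args));
    }

    // Serialize nested lists to JSON text. Handles only lists and strings
    // and does no escaping, which is enough for this illustration.
    static String toJson(Object node) {
        if (node instanceof List) {
            return ((List<?>) node).stream()
                    .map(QueryAsData::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        return "\"" + node + "\"";
    }

    public static void main(String[] args) {
        // Composing queries is just nesting data inside data.
        List<Object> table = term("TABLE", term("DB", "heroes"), "marvel");
        List<Object> get = term("GET", table, "ironman");
        System.out.println(toJson(get));
    }
}
```

Because `table` is an ordinary value, it can be built once and reused inside several larger queries, which is the composability benefit described above.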

> The documentation is also quite incomplete

clj-rethinkdb follows the official JavaScript API quite closely. There were two options: create minimal docstrings in Clojure and leave the main documentation up to RethinkDB, or copy all of the documentation into clj-rethinkdb. There are a few issues with the second approach. The most obvious is how to efficiently copy it, but also different versions of the RethinkDB server support different functions and options, so we'd need to keep compatibility notes, etc. A really nice approach is in https://github.com/rethinkdb/docs/issues/710 and https://github.com/rethinkdb/docs/issues/803, but that requires more work.

All of that to say, at the moment the best option is to read the JavaScript docs first, then look at the Clojure docstrings for argument order or special notes.

> now that the official java client is out, should I use this from my Clojure app or keep using the other one?

It makes a lot of sense for clj-rethinkdb to make use of as much of the Java driver as possible. I haven't got any news yet, but it's likely we'll be at least poaching the connection and request/response handling.

I'd be happy to jump on a Hangout or text chat in https://gitter.im/apa512/clj-rethinkdb to help you get over any speed bumps using the clj-rethinkdb driver. Hearing about a first-time user's experience would be really good.

segphault · on Dec 10, 2015

Hey, I'm Ryan from RethinkDB. I wrote the OP blog post and I'm working on a follow-up that demonstrates how to use the Java driver in various alt-JVM languages.

You can take advantage of Clojure's Java interop to use the new official Java driver in Clojure, but it doesn't offer a very good user experience compared to clj-rethinkdb. It will work for a lot of simple cases, but it gets weird very quickly. When you define a function in Clojure, the underlying Java object is an IFn[0]. Unfortunately, you can't pass an IFn to native Java methods that expect to receive a Java 8 lambda (an anonymous class with a single method). You can work around this limitation by manually creating an object that conforms with the client driver's various ReqlFunction[1] interfaces, but it's not very pretty:

This ReQL expression in Java:

  r.table("fellowship").map(x -> x.getField("name")).coerceTo("array").run(conn)

Becomes this in Clojure:

  (-> (.table r "fellowship")
      (.map (reify com.rethinkdb.gen.ast.ReqlFunction1
              (apply [this, x] (.getField x "name"))))
      (.coerceTo "array")
      (.run conn))

Note the use of reify and the ReqlFunction1 interface. You need to be careful to pick the interface that matches the arity of your anonymous function. For example, if you have four parameters, you use ReqlFunction4 instead of ReqlFunction1.
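For readers less familiar with the Java side, here is a small self-contained sketch of why the arity matters. `Fn1`, `Fn2` and `callWith` are hypothetical stand-ins invented for this example, not the driver's ReqlFunction interfaces: the point is only that a Java lambda is converted by the compiler into an instance of a specific single-method interface, while any other caller has to supply such an object explicitly, which is the job reify does in the Clojure version.

```java
class ArityDemo {
    // Hypothetical stand-ins for arity-specific callback interfaces.
    @FunctionalInterface
    interface Fn1 { Object apply(Object a); }

    @FunctionalInterface
    interface Fn2 { Object apply(Object a, Object b); }

    // Accepts only the one-argument interface.
    static Object callWith(Fn1 f, Object arg) {
        return f.apply(arg);
    }

    // Accepts only the two-argument interface.
    static Object callWith(Fn2 f, Object a, Object b) {
        return f.apply(a, b);
    }

    public static void main(String[] args) {
        // In Java the compiler turns the lambda into an Fn1 instance.
        Object viaLambda = callWith(x -> x + "!", "frodo");

        // The explicit equivalent: an object implementing the interface
        // whose arity matches the function being passed.
        Object viaObject = callWith(new Fn1() {
            @Override
            public Object apply(Object a) {
                return a + "!";
            }
        }, "frodo");

        // Two parameters need the two-argument interface instead.
        Object twoArgs = callWith((a, b) -> a + " & " + b, "frodo", "sam");

        System.out.println(viaLambda + " " + viaObject + " " + twoArgs);
    }
}
```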

You could probably do some voodoo to abstract away some of the ugliness, but the way that clj-rethinkdb's r/fn macro works under the hood and translates into the wire protocol is frankly a lot more elegant than anything you could do with our official client driver and Clojure's Java interop IMO. That said, I'm not particularly well-versed in either Java or Clojure, so there may be something I'm overlooking. I hope this explanation is helpful to somebody. :-)

TL;DR: use clj-rethinkdb if you want to write sane-looking ReQL in Clojure.

[0]: https://clojure.github.io/clojure/javadoc/clojure/lang/IFn.h... [1]: https://github.com/rethinkdb/rethinkdb/blob/next/drivers/jav...

sharms · on Dec 9, 2015

I recently heard about RethinkDB on a few podcasts (SE Radio and The Changelog) and it sounds very promising, and has been designed to solve real-time web application use cases very well. Glad to see they now support Java so we can use it more easily from JVM-based languages.

hardwaresofton · on Dec 9, 2015

I would also like to point out that it's a pretty good document store even without the super awesome realtime bits:

This video is 2 years old (and rethinkdb has gotten way more awesome since then), but it comes with some super useful things out of the box, like a very nice Web UI, distributed system support, joins, and a great way to write complex queries (ReQL): https://www.youtube.com/watch?v=qKPKsBNw604

If a document model fits your data, I'd suggest giving rethink a try.

RyanZAG · on Dec 9, 2015

Looks great, and the vert.x bit was also great to see. The blockingHandler a bit less so. Any plans for making a non blocking API using netty channels? That would really put RethinkDB as an amazing choice for pairing vert.x/JVM with a data layer.

danielmewes · on Dec 9, 2015

Yes! We wanted to get the synchronous driver out first to get feedback. But a non-blocking mode is definitely coming up. https://github.com/rethinkdb/rethinkdb/issues/4802

moatra · on Dec 10, 2015

Just in time! This will be a nice reference. I recently started working on a new Scala driver that uses the v0_4 asynchronous protocol, built on top of Akka's IO module and Play's JSON module. I think I have the performance where I want it, but now I need to flesh out the DSL for proper ReQL support.

habitue · on Dec 10, 2015

Have a look at https://github.com/rethinkdb/rethinkdb/blob/next/drivers/jav...

It has a full listing of all ReQL terms and their signatures and optargs

gamesbrainiac · on Dec 9, 2015

This is interesting since a friend of mine was using a nodejs-based microservice layer to query Rethinkdb from his java application.

I personally don't like Java, but if you take a look at the API they've built, it's actually pretty nice.

jdoliner · on Dec 9, 2015

Awesome to see an official java driver for rethink. Any plans for a golang driver?

krat0sprakhar · on Dec 9, 2015

I've tried the (unofficial) rethinkdb client for Go and it's been working great for me. https://github.com/dancannon/gorethink. Is there a particular reason why you prefer an official one over this?

coffeemug · on Dec 9, 2015

The community-supported golang driver by Dan Cannon is so good that it's not worth writing a new driver. Over time we plan to adopt it under the official RethinkDB umbrella -- I should be able to talk to Dan about a collaboration soon.

jdoliner · on Dec 9, 2015

Yeah I don't think starting from scratch would make sense, bringing it under the rethink umbrella would be better.

It has a couple of things that I think could be better, in particular it could leverage go's native json support more. I have a few other ideas on this... you know where to find me though.

_dancannon · on Dec 10, 2015

Thanks for the feedback, I try to keep GoRethink up to date and to the same high standards as the official drivers. That being said I can understand why you would want an official driver.

Regarding the JSON support, the driver actually uses the native package; however, due to RethinkDB's use of pseudo-types an extra decoding step is needed, and that is why GoRethink doesn't let you use json tags directly. Regarding your other ideas, it would be great to discuss them.
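As a rough illustration of that extra decoding step (sketched in Java rather than Go, and not taken from GoRethink), the snippet below assumes the JSON has already been parsed into a plain map and then does a second pass that turns a TIME pseudo-type object into a native date value. The `PseudoTypeDemo` class and `decode` method are hypothetical names; the `$reql_type$`, `epoch_time` and `timezone` fields follow RethinkDB's TIME pseudo-type representation.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.HashMap;
import java.util.Map;

class PseudoTypeDemo {
    // Second pass over an already-parsed JSON object: ordinary maps are
    // returned unchanged, TIME pseudo-type maps become OffsetDateTime.
    static Object decode(Map<String, Object> obj) {
        if (!"TIME".equals(obj.get("$reql_type$"))) {
            return obj;
        }
        // epoch_time is seconds since the epoch, possibly fractional.
        double epochSeconds = ((Number) obj.get("epoch_time")).doubleValue();
        long millis = Math.round(epochSeconds * 1000.0);
        ZoneOffset offset = ZoneOffset.of((String) obj.get("timezone"));
        return OffsetDateTime.ofInstant(Instant.ofEpochMilli(millis), offset);
    }

    public static void main(String[] args) {
        // What a generic JSON parser would hand back for a stored time.
        Map<String, Object> wire = new HashMap<>();
        wire.put("$reql_type$", "TIME");
        wire.put("epoch_time", 1449619200.5);
        wire.put("timezone", "+00:00");
        System.out.println(decode(wire));
    }
}
```

A generic JSON-to-struct mapping has no way to know that this particular object shape means "a timestamp", which is why the driver cannot simply delegate everything to the language's JSON package.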

jongwook · on Dec 9, 2015

It seems that we can make an API call nonblocking but we'll have to block and wait for such call to complete. How can they mention reactive/event-driven in the middle of the blog and not provide an asynchronous API? Java 8 CompletableFuture would've been the perfect abstraction for it.
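Until an asynchronous API exists, one common stopgap is to wrap the blocking call yourself. The sketch below shows that pattern with the standard library only; `blockingQuery` is a hypothetical stand-in for a synchronous driver call, and `async` is a helper name made up for this example. Note that this still parks a pool thread for the duration of each query, so it keeps the caller's thread free but is not true non-blocking I/O.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class AsyncWrapperDemo {
    // Run a blocking call on a dedicated pool so the calling thread
    // (for example an event loop) is not held up by the database.
    static <T> CompletableFuture<T> async(Supplier<T> blockingCall, ExecutorService pool) {
        return CompletableFuture.supplyAsync(blockingCall, pool);
    }

    // Hypothetical stand-in for a synchronous driver call.
    static String blockingQuery(String id) {
        try {
            Thread.sleep(50); // simulate network latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "user:" + id;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Further processing is chained without blocking the caller.
        CompletableFuture<Integer> length =
                async(() -> blockingQuery("42"), pool).thenApply(String::length);
        // join() is used here only so the demo prints before exiting.
        System.out.println(length.join());
        pool.shutdown();
    }
}
```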

danielmewes · on Dec 9, 2015

Daniel @ RethinkDB here. We're going to add an asynchronous API soon. We decided to get the synchronous driver out to get some early feedback and because we know that a lot of our users have been waiting for official Java support, and not all of them need an async API.

You might be interested in following this GitHub issue: https://github.com/rethinkdb/rethinkdb/issues/4802

zerotosixty · on Dec 9, 2015

You know what would make Rethink more badass? If there was a cheap RethinkDB-as-a-Service provider.

sureshv · on Dec 9, 2015

Not affiliated but https://compose.io might do what you want.

coffeemug · on Dec 9, 2015

IMO $22.50/month for an autoscaled, managed database deployment with full backup, upgrades, security, etc. is a bargain. Compose is pretty great!

zerotosixty · on Dec 10, 2015

It would be amazing if they had a sandbox or freemium model. It's pretty expensive to spend $22.50 a month on a project.

merb · on Dec 10, 2015

What's missing is an async interface, especially with Java 8 and CompletableFuture, which Scala 2.12 will interoperate with.

foo3456 · on Dec 9, 2015

What does Java _driver_ mean -> is this a JDBC driver or just some non standard API to access this database system?

habitue · on Dec 9, 2015

This isn't a JDBC driver since RethinkDB isn't a SQL database, but it fulfills the same role as a JDBC driver.

In addition, RethinkDB doesn't have a textual query language, it's structured data, so the bare driver looks a lot like an ORM

tequila_shot · on Dec 9, 2015

Django driver please (to work with ORM)? :-/

Edit : I am not sure why I am being down voted. Help?

meat_fist · on Dec 9, 2015

Forgive my ignorance, but can you not just use the Python RethinkDB driver? Or does it have to be Django specific?

https://rethinkdb.com/docs/guide/python/

habitue · on Dec 9, 2015

Django is pretty tightly coupled with its orm. The last time I looked into it, the mongo/django solution was essentially a fork of django with the data layer completely rewritten

SuperKlaus · on Dec 9, 2015

I think tequila_shot is looking for a driver that lets him use RethinkDB with the Django ORM.

tequila_shot · on Dec 9, 2015

You are right. I am looking for a driver to work with ORM.

mglukhovsky · on Dec 9, 2015

While it is challenging to build a Django integration, there are some options for working with RethinkDB. You can check this StackOverflow answer for a starting base: http://stackoverflow.com/questions/28001980/how-to-setup-ret...

Jorge Silva (@thejsj) also wrote this demo app to show how you could pair MySQL with RethinkDB when using Django: https://github.com/thejsj/django-and-rethinkdb

vvpan · on Dec 9, 2015

I wouldn't hold my breath. Django isn't popular enough to create a niche driver. Besides as others point out vanilla Django was designed with SQL in mind.

tequila_shot · on Dec 9, 2015

> Besides as others point out vanilla Django was designed with SQL in mind.

I am not trying to challenge your statement and not trying to be sarcastic. Can you quote the link where this is stated so that I can learn the background?