Hi,
I'm Philip Moore - the founder of GizmoData, and creator of GizmoSQL - an Apache Arrow Flight SQL Server - with DuckDB (or SQLite) back-end execution engines.
GizmoSQL is a composable SQL server with Arrow Flight SQL, DuckDB, and SQLite - with the intention of making it easy to run DuckDB (or SQLite) as a server - usable by multiple people from a client (remote) computer. It also adds security (authentication) and encryption of traffic with TLS.
To run GizmoSQL - see the steps in the README.md - where you can see how easy it is to run the server as well as how to connect via ADBC and JDBC from a remote client - such as DBeaver, Python, etc. The easiest way to run GizmoSQL is via Docker - but there are downloads for Linux and macOS for both x86-64 and arm64 platforms (download links in the README).
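For the Python route, here is a minimal sketch of a remote query over ADBC using the `adbc-driver-flightsql` package. The port (31337), the environment variable names, and the helper names `flight_sql_uri` and `run_query` are assumptions for illustration; check the README for the values your server actually uses.

```python
import os


def flight_sql_uri(host: str, port: int, use_tls: bool = True) -> str:
    """Build a Flight SQL URI; the grpc+tls scheme selects an encrypted channel."""
    scheme = "grpc+tls" if use_tls else "grpc"
    return f"{scheme}://{host}:{port}"


def run_query(sql: str, host: str = "localhost", port: int = 31337):
    """Run one query against a Flight SQL server and return an Arrow table."""
    # Imported lazily so the URI helper works without the driver installed
    # (pip install adbc-driver-flightsql pyarrow).
    from adbc_driver_flightsql import DatabaseOptions, dbapi

    db_kwargs = {
        # Hypothetical environment variable names for the credentials.
        "username": os.environ["GIZMOSQL_USERNAME"],
        "password": os.environ["GIZMOSQL_PASSWORD"],
        # Only needed for a self-signed certificate; drop it with a CA-signed one.
        DatabaseOptions.TLS_SKIP_VERIFY.value: "true",
    }
    with dbapi.connect(flight_sql_uri(host, port), db_kwargs=db_kwargs) as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetch_arrow_table()
```

With a server running, `run_query("SELECT 42 AS answer")` would return the result as an Arrow table rather than a list of Python rows.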
Why?
As you may know, DuckDB and SQLite are embedded databases - they run inside the calling process, they don't offer remote client connectivity on their own, and they aren't really designed for concurrent use by many users.
I've built GizmoSQL to work around that - because I believe the DuckDB engine is very powerful, and I feel like a lot of customers overpay and run distributed compute (e.g. Spark) when they don't really need to. Making it easy to have remote connectivity to DuckDB can make it easier to migrate SQL workloads from Spark or other expensive commercial platforms to this engine - with a much simpler architecture/infrastructure.
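To make the "embedded" point concrete, here is a small sketch using Python's built-in `sqlite3` module (DuckDB's Python API follows the same in-process pattern). The function name `embedded_demo` is just for illustration.

```python
import sqlite3


def embedded_demo() -> tuple[int, bool]:
    # Opening an embedded database is a library call inside this process:
    # there is no host, port, or login involved.
    first = sqlite3.connect(":memory:")
    first.execute("CREATE TABLE t (x INTEGER)")
    first.execute("INSERT INTO t VALUES (1), (2), (3)")
    total = first.execute("SELECT sum(x) FROM t").fetchone()[0]

    # A second connection to ":memory:" gets its own private database,
    # so it cannot see the table created above.
    second = sqlite3.connect(":memory:")
    tables = second.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()

    first.close()
    second.close()
    return total, len(tables) == 0


print(embedded_demo())  # (6, True)
```

Nothing here listens on a network socket, which is the gap a server layer has to fill before other machines can query the same data.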
It is my intention to make GizmoSQL a commercial product - licensed for production use by organizations, but free for developers to code with, evaluate, and test.
A little bit of backstory:
* I built the initial version of this while working for a former employer - it wasn't their core focus, so they open-sourced that early version. After I left there, I forked the product and have improved it substantially - adding support for concurrent reads and writes, improving security, and keeping it up to date with the latest versions of Apache Arrow and DuckDB.
* This project evolved from a prototype created by the brilliant Tom Drabas.
* It feels a little weird trying to make a commercial product based upon DuckDB, but MotherDuck started it :P - and I've contributed (albeit very little) to the DuckDB and Apache Arrow projects in the form of a couple of PRs.
I'm really excited about this project - I have run benchmarks of this product against commercial platforms such as Snowflake and Databricks SQL - and it holds its own running the 22-query TPC-H SF1TB benchmark, especially on cost. See the graph at: https://gizmodata.com/gizmosql
Getting started:
GitHub README: https://github.com/gizmodata/gizmosql-public/blob/main/READM...
DockerHub: https://hub.docker.com/r/gizmodata/gizmosql
GizmoSQL homepage: https://gizmodata.com/gizmosql
Phil's GitHub profile: https://github.com/prmoore77
Thanks for your time and feedback in advance.
It should be extremely simple for databases that support ADBC (for example Snowflake, PostgreSQL).
For others it might just be a matter of mapping DDL, DML, DQL, etc. to a supported database protocol driver (JDBC, ODBC, etc.). Of course, this is where things may get challenging, as it would become the responsibility of your server to convert results to Arrow (tables/streams/etc.). That work could potentially be delegated to "worker" Flight servers (not Flight SQL servers), and then the server could return/forward their Arrow results (Flight results).
Some of this is already possible, to a degree, through DuckDB's MySQL/Postgres extensions.
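As a rough sketch of the result-conversion step for a driver that only returns rows: the core of it is pivoting row-oriented results into named columns, which is the layout Arrow expects. This uses only the standard library, with `sqlite3` standing in for a JDBC/ODBC-style driver; `fetch_columnar` is a hypothetical helper, and a real server would hand the columns to an Arrow library (for example `pyarrow.table(columns)`) instead of returning a plain dict.

```python
import sqlite3


def fetch_columnar(conn: sqlite3.Connection, sql: str) -> dict[str, list]:
    """Run a query and pivot its row-oriented result into named columns."""
    cur = conn.execute(sql)
    # DB-API cursors expose column names as the first field of each description entry.
    names = [d[0] for d in cur.description]
    columns: dict[str, list] = {name: [] for name in names}
    for row in cur.fetchall():
        for name, value in zip(names, row):
            columns[name].append(value)
    # Note: duplicate column names would collide in this simple dict layout.
    return columns


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nation (n_nationkey INTEGER, n_name TEXT)")
conn.execute("INSERT INTO nation VALUES (1, 'ARGENTINA'), (2, 'BRAZIL')")
result = fetch_columnar(conn, "SELECT n_nationkey, n_name FROM nation ORDER BY n_nationkey")
conn.close()
```

For large results, a production version would fetch in batches and stream them rather than materializing every row first.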
I imagine this could also be useful for developing/testing locally?
It might also provide a way to interchange databases while potentially easing database migrations (vendor to vendor) if ADBC isn't supported by the vendor.
Another potential value-add could be SQL dialect management via Substrait conversions (or sqlglot, but it looks like the server is Java, so I'm unsure whether that's possible; maybe via Graal?).