Hacker News

Would you recommend ClickHouse over DuckDB? And why?


IMO the only reason not to use ClickHouse is when you have either a "small" amount of data or "small" servers (<100 GB of data, servers with <64 GB of RAM). Otherwise ClickHouse is the better solution, since it's a standalone DB that supports replication and in general has very robust cluster support, easily scaling to hundreds of nodes.

Typically you only discover the need for an OLAP DB once you reach that scale, so to be completely honest I'm personally not sure what the real use case for DuckDB is.
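
To make that rule of thumb concrete, here is a minimal Python sketch. The function name `suggest_engine` and the returned labels are made up for illustration, and the 100 GB / 64 GB cutoffs are just the rough numbers from this comment, not hard limits.

```python
# Rough cutoffs taken from the comment above; these are opinions, not benchmarks.
SMALL_DATA_GB = 100
SMALL_RAM_GB = 64


def suggest_engine(data_gb: float, ram_gb: float) -> str:
    """Return a coarse suggestion based on data size and server RAM.

    Hypothetical helper: if either the data or the server is "small",
    ClickHouse is not the obvious pick; otherwise it is.
    """
    if data_gb < SMALL_DATA_GB or ram_gb < SMALL_RAM_GB:
        return "single-node tool may be enough"
    return "ClickHouse"


if __name__ == "__main__":
    print(suggest_engine(data_gb=20, ram_gb=16))
    print(suggest_engine(data_gb=5000, ram_gb=256))
```

Anything near the cutoffs is a judgment call, which is roughly what the rest of this thread argues about.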


There is another place where you should not use CH: a system with shared resources. CH loves to hog resources in spikes, and has earned the right to. They even allude to this in the Keeper setup: if you put the nodes for the two systems on the same machine, CH will inevitably push Keeper off the bed and the two will come to a disagreement. You should not run it in a k8s pod for that reason, for example. But then again, you shouldn't have ANY storage of that capacity in a k8s pod anyway.


DuckDB probably performs better per core than ClickHouse does for most queries. So as long as your workload fits on a single machine (and it likely does), it's often the most performant option.

Besides, it's so simple, just a single executable.

Of course, if you're at a scale where you need a cluster, it's not an option anymore.


The good parts of DuckDB that you've mentioned, including the fact that it is a single executable, are modeled after ClickHouse.


Can you provide a reference for that belief? To me that's not true. They started from solving very different problems.


I didn't express myself well. What I meant to say was that DuckDB runs as a single process. That simplifies things.

ClickHouse typically runs several interacting processes (server, clients), and that already makes things more complicated (and more powerful!).

That's not to say one is good and the other bad, they're just quite different tools.


Note that every use case is different and YMMV.

https://www.vantage.sh/blog/clickhouse-local-vs-duckdb


Great link. Curious how it compares now that DuckDB is 1.0+.


Not to mention Polars, DataFusion, etc. The single-node OLAP space is really heating up.


ClickHouse scales from a local tool like DuckDB to a database cluster that can back your reporting and other OLAP applications.



