Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

What kind of hardware requirements would be needed to store and query this much data?


- Depends - Just inserting, indexing, storing and simple querying can be done with little memory (i.e. 1:500 memory-disk-ratio 0.5GB RAM per 1TB disk). Typical production clusters with high query load are in the 1:150 range i.e. 64GB RAM for 10TB disk).

Otherwise typical general purpose hardware (Standard SSDs, 1:4 vCPU:memory ratios, ...)


Interesting, so that'd be about 1 vCPU and 4GB RAM per 625GB of data. That seems very price efficient. Would something like AWS's EBS be sufficient for this? Would you need one of the higher tiers? Or would you be looking at running this on a box with locally attached storage?


Most of CrateDB clusters run on cloud providers hardware (azure, aws, alibaba). Using EBS (GP2 or now GP3) is also quite common. Due to the indexing / storage engine, gp disks are typically sufficient and faster disks have little to no advantage


Wouldn't 0.5GB RAM per 1TB disk be more like 1:2000 memory-disk-ratio? Which is even better!


Sorry, mixed up the number 2GB memory (0.5GB heap). So 1:500 is correct




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: