
How? How does one report on 1.43 billion events stored in ES in 16 seconds?

And can it do this on a VM with 4 GB of RAM?




I did some irrelevant and crazy testing on a 4 GB DigitalOcean VM with part of the latest hosts dataset from Rapid7's Project Sonar. The data is pairs of IP and certificate thumbprint. With ~1,500,000 entries in ES (~300 MB with indices, much wow), a sort by IP occurrences runs at the speed of light: about 3.5 seconds if the data isn't cached and 300-350 ms if it is.


1.5 million entries isn't exactly equal to 1.5 billion. Extrapolating linearly from the uncached 3.5 seconds, the query would take almost an hour on ES.
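
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes query time scales linearly with row count, which is a simplification and not something measured in this thread. The function names are made up for illustration. The inputs are the figures quoted above: 3.5 seconds for 1.5 million rows on ES, and 1.43 billion events in 16 seconds.

```python
def extrapolate_seconds(measured_seconds, measured_rows, target_rows):
    """Scale a measured query time to a larger row count (linear assumption)."""
    return measured_seconds * (target_rows / measured_rows)


def rows_per_second(rows, seconds):
    """Average scan throughput implied by a row count and a wall-clock time."""
    return rows / seconds


# Figures quoted in the thread.
es_rows = 1_500_000          # entries in the ES test
es_seconds = 3.5             # uncached query time on ES
target_rows = 1_500_000_000  # 1.5 billion, the scale being compared against

estimate = extrapolate_seconds(es_seconds, es_rows, target_rows)
print(f"ES at 1.5 billion rows: {estimate:.0f} s, about {estimate / 60:.1f} minutes")

# Throughput implied by each claim.
es_rate = rows_per_second(es_rows, es_seconds)
claimed_rate = rows_per_second(1_430_000_000, 16)
print(f"ES test: {es_rate:,.0f} rows/s")
print(f"1.43 billion in 16 s: {claimed_rate:,.0f} rows/s")
```

Under that linear assumption the estimate is 3500 seconds, a bit over 58 minutes, which is where "almost an hour" comes from. Real scaling could be better or worse depending on caching, sharding, and memory pressure.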


The thing is, an ES index that could fit entirely in memory still performs slowly on a full-database query.


You didn't specify anything about RAM/resources...


I did above:

> I've been testing https://clickhouse.yandex/. I threw it on a single VM with 4G of ram and imported billions of flow records into it. queries rip through data at tens of millions of records a second.



