
The original version of MongoDB used mmap, and I worked at a company that had a ton of issues with cache warmup and the cache getting trashed by competing processes. Granted, this was a long time ago, but the main issue was the operating system's willingness to take large swaths of memory away from the mapped address space and hand them to whatever process was asking for memory right now.

Once the working set got trashed, performance would go through the floor, and our app would slow to a crawl while the cache went through the warmup cycle.

Long story short, with that model, Mongo couldn't "own" the memory it was using, and this led to chronic problems. WiredTiger fixed this completely, but I still think this is a cautionary tale for anyone considering building a DB without a dedicated memory manager.




The original sales pitch I heard for slab allocators was: use the standard libraries for general workloads, but if you know your data better than the stdlib, you might be able to do better.

mmap access patterns seem like something where you can do better, especially in the age of io_uring, when an n+1 pointer chasing situation doesn't particularly care what order the results are processed in as long as the last one shows up in a reasonable amount of time.
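
To make the n+1 point concrete, here is a rough sketch of "issue all the child reads at once, handle them in whatever order they finish." It is not io_uring: it uses a plain thread pool with positional reads as a stand-in, and the record layout, RECORD_SIZE, and the function names are all made up for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed

RECORD_SIZE = 16  # hypothetical fixed record size, in bytes


def read_record(fd, index):
    """Read one fixed-size record with a positional read (no shared file offset)."""
    return index, os.pread(fd, RECORD_SIZE, index * RECORD_SIZE)


def fetch_children(fd, child_indexes, max_workers=8):
    """Submit every child read up front, then collect them in completion order."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(read_record, fd, i) for i in child_indexes]
        # as_completed yields whichever read finishes first; order is irrelevant
        # here, the caller only waits until the last one has arrived.
        for future in as_completed(futures):
            index, data = future.result()
            results[index] = data
    return results


def demo():
    """Build a scratch file of 32 zero-padded records and fetch four of them."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for i in range(32):
            f.write(str(i).zfill(RECORD_SIZE).encode("ascii"))
        path = f.name
    fd = os.open(path, os.O_RDONLY)
    try:
        return fetch_children(fd, [3, 17, 5, 29])
    finally:
        os.close(fd)
        os.unlink(path)
```

With mmap, each of those child lookups would be a page fault that blocks the faulting thread one at a time; explicit reads let you put all of them in flight together, which is the part where knowing your access pattern pays off.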


Perhaps I misread your first sentence, but was MongoDB related to your cache warming issue? Or were these two distinct issues related to mmap-based data stores?



