Yup. Why use the operating system's async I/O system when you can simply burn a thread and do blocking I/O? </snark>
Been down that primrose path, have the road rash to prove it. mmap() is great until you realize that pretty much all you've avoided is some buffer management that you probably need to do anyway. The OS just doesn't have the information it needs to do a great (or even correct) job of caching database pages.
> Why use the operating system's async I/O system when you can simply burn a thread and do blocking I/O? </snark>
mmap isn't non-blocking; page faults are blocking, no different from a read or write to a (non-direct I/O) file using a syscall.
Until recently io_uring literally burned a thread (from a thread pool) for every read or write on a regular file, too. Now it finally has hooks into the buffer cache, so it can opportunistically perform the operation on the same thread that dequeued the command and only push it to a worker thread if it would need to wait for a cache fault.[1]
[1] Technically the same behavior could be implemented in user space using userfaultfd, but the latency would likely be higher on faults.
A user process doesn't have the information it needs to do a good job of coordinating updates from multiple writers to database pages and indices. With mmap, writers have access to shared atomics, which they can update using compare-exchange operations to prevent the data races that would be common when using read() and write() without locks.
There can be a data race any time a processor loads a value, modifies it, and writes it back. Without an atomic update operation like compare_exchange() generally you need to lock the database file against other processes and threads. The typical solution is to only have one process update the file, only have one thread perform the writes, and combine it with a TCP server.
Suppose you have a big data file and want to mark which pages are occupied and which pages are free. A writer wants to read a bit from an index page onto its stack to check whether a data page is occupied, set that bit in its local copy if no other process has claimed the page, and write the updated value back to memory so it can store its data value in that page.
If two processes each read() the index bits, they can both see that the page 2 bit in the index is unset and try to claim it, then write() back the updated index value. The updates to the index will collide, both writers will think they claimed page 2 when only one should have, and one of the data values written to that page will get lost.
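To make the lost update concrete, here is a minimal sketch in C. It assumes a made-up layout where byte 0 of the file is the index and bit N marks data page N as taken; the helper names (`load_index`, `store_claim`, `racy_double_claim`) are hypothetical, and the two "writers" are interleaved by hand in one process so the bad ordering is deterministic rather than left to the scheduler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed layout: byte 0 of the file is the index; bit N set = page N taken. */

/* Step 1 of a non-atomic claim: copy the index byte into private memory. */
static uint8_t load_index(int fd) {
    uint8_t idx = 0;
    if (pread(fd, &idx, 1, 0) != 1) { perror("pread"); exit(1); }
    return idx;
}

/* Step 2: decide using the (possibly stale) snapshot, then write it back.
   Returns 1 if this writer believes it claimed the page. */
static int store_claim(int fd, uint8_t snapshot, unsigned page) {
    assert(page < 8);
    if (snapshot & (1u << page)) return 0;      /* looked taken already */
    snapshot |= (uint8_t)(1u << page);
    if (pwrite(fd, &snapshot, 1, 0) != 1) { perror("pwrite"); exit(1); }
    return 1;
}

/* Both writers read before either writes. Returns how many of them
   think they own `page` afterwards. */
static int racy_double_claim(unsigned page) {
    char path[] = "/tmp/index-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) { perror("mkstemp"); exit(1); }
    unlink(path);                               /* file vanishes on close */

    uint8_t zero = 0;
    if (pwrite(fd, &zero, 1, 0) != 1) { perror("pwrite"); exit(1); }

    uint8_t a = load_index(fd);                 /* writer A sees bit unset */
    uint8_t b = load_index(fd);                 /* writer B sees bit unset */
    int winners = store_claim(fd, a, page)      /* A writes the bit */
                + store_claim(fd, b, page);     /* B writes it again */
    close(fd);
    return winners;
}
```

Calling `racy_double_claim(2)` reports two owners for page 2, which is exactly the collision described above: nothing between the read and the write tells the second writer that its snapshot went stale.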
This was never up for debate and is more diversion.
Was someone "saying you have to use mmap or you get data races??"
No, no one was saying that. You need it to do lock-free synchronization, because you need to map the same memory into two different processes to use atomics.
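For what that looks like in practice, here is a rough sketch in C11, not anyone's actual database code. It assumes 64-bit atomics are lock-free on the target (true on common 64-bit platforms), and it uses an anonymous `MAP_SHARED` mapping inherited across `fork()` as a stand-in for mapping the real index file; `claim_page` and `count_winners` are hypothetical names.

```c
#define _DEFAULT_SOURCE                 /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to set bit `page` in the shared index word. Returns 1 only for the
   caller whose compare-exchange actually flipped the bit from 0 to 1.
   Assumes _Atomic uint64_t is lock-free, so it works across processes. */
static int claim_page(_Atomic uint64_t *index, unsigned page) {
    assert(page < 64);
    uint64_t bit = UINT64_C(1) << page;
    uint64_t seen = atomic_load(index);
    while (!(seen & bit)) {
        /* On failure `seen` is reloaded with the current value, so the
           loop condition re-checks whether someone else took the page. */
        if (atomic_compare_exchange_weak(index, &seen, seen | bit))
            return 1;
    }
    return 0;
}

/* Fork `nprocs` children that all race for the same page and return how
   many of them won. Returns -1 if the mapping cannot be created. */
static int count_winners(int nprocs, unsigned page) {
    /* Stand-in for mmap()ing the index file with MAP_SHARED. */
    _Atomic uint64_t *index = mmap(NULL, sizeof *index,
                                   PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (index == MAP_FAILED) return -1;
    atomic_init(index, 0);

    int forked = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) _exit(claim_page(index, page)); /* status 1 = owner */
        forked++;
    }

    int winners = 0;
    for (int i = 0; i < forked; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status))
            winners += WEXITSTATUS(status);
    }
    munmap((void *)index, sizeof *index);
    return winners;
}
```

With, say, `count_winners(8, 2)`, exactly one child should come back as the owner no matter how the processes are scheduled. The same loop over private copies obtained with read() has nothing to compare-exchange against, which is why the mapping has to be shared.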