why settle for errno when you can have a segfault.

kabdib · on Jan 14, 2022

Yup. Why use the operating system's async I/O system when you can simply burn a thread and do blocking I/O? </snark>

Been down that primrose path, have the road rash to prove it. mmap() is great until you realize that pretty much all you've avoided is some buffer management that you probably need to do anyway. The OS just doesn't have the information it needs to do a great (or even correct) job of caching database pages.

wahern · on Jan 14, 2022

> Why use the operating system's async I/O system when you can simply burn a thread and do blocking I/O? </snark>

mmap isn't non-blocking; page faults are blocking, no different from a read or write to a (non-direct I/O) file using a syscall.

Until recently io_uring literally burned a thread (from a thread pool) for every read or write on a regular file, too. Now it finally has hooks into the buffer cache, so it can opportunistically perform the operation from the same thread that dequeued the command, pushing it to a worker thread only if it would need to wait for a cache fault.[1]

[1] Technically the same behavior could be implemented in user space using userfaultfd, but the latency would likely be higher on faults.

randbox · on Jan 14, 2022

A user process doesn't have the information it needs to do a good job of coordinating updates from multiple writers to database pages and indices. With mmap(), writers have access to shared atomics, which they can update using compare-exchange operations to prevent data races which would be common when using read() and write() without locks.

jstimpfle · on Jan 14, 2022

Are you saying that without mmap() there will be data races??

randbox · on Jan 14, 2022

There can be a data race any time a processor loads a value, modifies it, and writes it back. Without an atomic update operation like compare_exchange(), you generally need to lock the database file against other processes and threads. The typical solution is to have only one process update the file, have only one thread perform the writes, and combine it with a TCP server.

Suppose you have a big data file and want to mark which pages are occupied and which pages are free. A writer wants to read a bit from an index page onto its stack to check whether a data page is occupied, set that bit in its stack copy if no other process has claimed the page, and write the updated value back to claim the page and store its data value there.

If two processes each read() the index bits, they can both see that page 2's bit in the index is unset and try to claim it, then write() back the updated index value. The updates to the index will collide, both writers will think they claimed page 2 when only one should have, and one of the data values written to that page will get lost.
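A minimal sketch of the compare-exchange alternative, under assumptions that are not from the thread: the index is a single 64-bit bitmap at offset 0 of the file, and the platform's `_Atomic uint64_t` is lock-free and has the same representation as a plain `uint64_t` (true on mainstream 64-bit targets, but not guaranteed by the C standard). The names `page_bitmap`, `map_bitmap`, and `claim_page` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumed layout: the first 8 bytes of the file are a bitmap;
 * bit i set means data page i is occupied. */
typedef _Atomic uint64_t page_bitmap;

/* Map the bitmap word MAP_SHARED so every process mapping the same
 * file operates on the same memory. Returns NULL on failure. */
page_bitmap *map_bitmap(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Grow the file to hold the bitmap if it is too short. */
    struct stat st;
    if (fstat(fd, &st) != 0 ||
        (st.st_size < (off_t)sizeof(uint64_t) &&
         ftruncate(fd, sizeof(uint64_t)) != 0)) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, sizeof(uint64_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : (page_bitmap *)p;
}

/* Try to claim `page`. Returns 1 if this caller set the bit,
 * 0 if some other writer had already set it. */
int claim_page(page_bitmap *bm, unsigned page)
{
    assert(page < 64);
    uint64_t bit = UINT64_C(1) << page;
    uint64_t old = atomic_load(bm);

    while (!(old & bit)) {
        /* Succeeds only if the word still equals `old`; on failure
         * `old` is refreshed with the current value and we re-check. */
        if (atomic_compare_exchange_weak(bm, &old, old | bit))
            return 1;
    }
    return 0;
}
```

With this, two writers racing for page 2 cannot both get 1 back: whichever compare-exchange lands second sees the bit already set and returns 0. It only arbitrates the index bit; getting the data page itself durably written is a separate problem.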

CyberDildonics · on Jan 15, 2022

Their last sentence literally said "data races which would be common when using read() and write() without locks."

jstimpfle · on Jan 15, 2022

Which is not literally what I said.

CyberDildonics · on Jan 15, 2022

They said you get a data race if you use read() and write() without file locks.

You asked "Are you saying that without mmap() there will be data races??".

No, they are saying you get a data race if you use read() and write() without file locks.
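For comparison, a rough sketch of the read()/write() route with a file lock, applied to the page-claiming scenario described earlier in the thread. The assumptions here are mine, not the thread's: a 64-bit bitmap at offset 0 of the file, and POSIX advisory record locks via `fcntl()`. `lock_bitmap` and `claim_page_locked` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Take (F_WRLCK) or drop (F_UNLCK) an advisory lock covering the
 * bitmap bytes. F_SETLKW blocks until the lock is granted. */
int lock_bitmap(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = sizeof(uint64_t);
    return fcntl(fd, F_SETLKW, &fl);
}

/* Returns 1 if this caller claimed `page`, 0 if it was already
 * occupied, -1 on a locking or I/O error. */
int claim_page_locked(int fd, unsigned page)
{
    assert(page < 64);
    uint64_t bits = 0;
    uint64_t bit = UINT64_C(1) << page;
    int result = -1;

    if (lock_bitmap(fd, F_WRLCK) != 0)
        return -1;

    /* Read-modify-write happens entirely under the lock, so no other
     * cooperating process can slip in between pread and pwrite. */
    ssize_t n = pread(fd, &bits, sizeof bits, 0);
    if (n == 0 || n == (ssize_t)sizeof bits) { /* n == 0: empty file, all free */
        if (bits & bit) {
            result = 0;
        } else {
            bits |= bit;
            if (pwrite(fd, &bits, sizeof bits, 0) == (ssize_t)sizeof bits)
                result = 1;
        }
    }

    lock_bitmap(fd, F_UNLCK);
    return result;
}
```

Two caveats on this sketch: the lock is advisory, so it only protects against writers that also take it, and `fcntl()` record locks belong to the process, so threads inside one process still need their own mutex.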

jstimpfle · on Jan 15, 2022

> No, they are saying you get a data race if you use read() and write() without file locks.

Since you seem to understand better and my questions never made sense, maybe you can explain now why we would do that?

CyberDildonics · on Jan 15, 2022

This is just goal post shifting.

They said something that I thought was clear: you need file locks with read() and write().

I think you misunderstood that to mean only mmap can avoid data races.

What they actually said was that using mmap allows atomics so you can avoid file locks.

jstimpfle · on Jan 15, 2022

> They said something that I thought was clear: you need file locks with read() and write().

You need _synchronization_. Not necessarily one of mmap() or file locks.

CyberDildonics · on Jan 15, 2022

> You need _synchronization_.

This was never up for debate and is more diversion.

Was someone "saying you have to use mmap or you get data races??"

No, no one was saying that. You need mmap to do lock-free synchronization, because you need to map the same memory into two different processes to use atomics.

That's the whole thing.