> You weren't asking, you were saying it wasn't necessary, which you did in the sentence right before this one:
Quoting my OP: "Why do you need lockless atomic updates to a file-backed memory area? Genuinely curious." Dude.
> it seems now you were making claims without much behind them, which is disappointing.
Well thank you very much.
I get the feeling we might just be talking about the same thing. Or we might not be, I'm not sure.
> How do you have two processes writing to the same place in memory without memory mapping a file?
> You can't write outside your own memory from a process with normal permissions so how do you share memory with another process?
For example, on Linux, use shm_open() + mmap(). This is just an example, and granted, it uses a file-like API (shared memory objects show up under /dev/shm on a typical Linux), but it is not "file-backed" (I meant disk-backed, and this might be the misunderstanding), and in particular it's certainly not mapping the database file. It's just one way on one OS to map the same physical memory into different processes' address spaces.
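To make that concrete, here is a minimal sketch, in Python for brevity rather than C. It assumes a POSIX system, where the standard library's multiprocessing.shared_memory is built on shm_open() + mmap(). The function name and the "hello" payload are made up for illustration. The child attaches purely by name, the way an unrelated process would.

```python
import os
from multiprocessing import shared_memory


def demo_named_shared_memory() -> bytes:
    """Write from a child process into a named shared memory object (POSIX only)."""
    # Create a named object; on POSIX this goes through shm_open() + mmap().
    owner = shared_memory.SharedMemory(create=True, size=4096)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: attach by name only, as a process started separately could.
            status = 1
            try:
                peer = shared_memory.SharedMemory(name=owner.name)
                peer.buf[:5] = b"hello"
                peer.close()
                status = 0
            finally:
                os._exit(status)
        os.waitpid(pid, 0)
        # Parent reads what the child wrote through its own mapping.
        return bytes(owner.buf[:5])
    finally:
        owner.close()
        owner.unlink()  # remove the name; nothing persists afterwards


if __name__ == "__main__":
    print(demo_named_shared_memory())
```

The only thing the two sides share up front is the object's name. No disk-backed file is involved at any point.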
If this example approach is "file-backed" to you, then so be it but I think you have willfully misread my comments up to here.
Homework: go back through my comments and identify all the places where I was VERY CLEARLY pointing out that my statement is that no disk-backed file is needed, or where you could reasonably infer this from my use of the term "file-backed", as well as from the general context of the discussion.
> shm_open("/TESTOBJECT"
>>That's a file path
Pedantically, no. It's a name (https://man7.org/linux/man-pages/man3/shm_open.3.html) that identifies a memory object that is only coincidentally also mapped to the file path "/dev/shm/TESTOBJECT" on a typical Linux. shm_open() does return an "FD", though.
On Linux, as a sibling poster noted, you could also use mmap(.. MAP_SHARED | MAP_ANONYMOUS, /*fd*/ -1 ...) , which to my knowledge is entirely "file-free" by any meaning of the term "file". But then again, in my understanding this would only work with child processes because that mapping has to be inherited.
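A minimal sketch of that anonymous variant, again in Python for brevity and assuming a Unix-like system (the function name and the value 42 are just for illustration). Python's mmap.mmap(-1, ...) with MAP_SHARED | MAP_ANONYMOUS mirrors the C call, and the mapping reaches the child only because fork() inherits it.

```python
import mmap
import os
import struct


def demo_anonymous_shared_mapping() -> int:
    """Share an anonymous mapping with a forked child (Unix only)."""
    # fileno = -1 plus MAP_ANONYMOUS: no name, no path, no file descriptor.
    region = mmap.mmap(
        -1,
        mmap.PAGESIZE,
        flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        pid = os.fork()
        if pid == 0:
            # Child: the mapping was inherited across fork(); write into it.
            status = 1
            try:
                struct.pack_into("<I", region, 0, 42)
                status = 0
            finally:
                os._exit(status)
        os.waitpid(pid, 0)
        # Parent: MAP_SHARED makes the child's write visible here.
        (value,) = struct.unpack_from("<I", region, 0)
        return value
    finally:
        region.close()


if __name__ == "__main__":
    print(demo_anonymous_shared_mapping())
```

A process that was not forked from the creator has no handle with which to find this region, which is the restriction mentioned above.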
On other OSes, there may be completely different APIs to map shared memory that don't involve anything "file" like, either. Quite honestly I can't point you to any because I do only Linux and Windows, but let's just end the discussion here and let's agree that memory != file. I'm angry at myself for wasting another evening fighting a pointless discussion with somebody who would rather argue than try to get my point.
You conflated files with disks on your own. No one did that for you.
> rather argue than try to get my point.
I still don't know what your point is. You have to have something that coordinates between two processes for shared memory interprocess communication and that ends up being file paths for the OS. You asked questions, they were answered and you could have learned something.
The whole point was actually that you can map the same memory into two different processes and use atomics, which is an incredible technique. For some reason you wanted to ignore that and make claims without explanation.
If you didn't want to waste time, you would have explained what you meant or asked questions.
> If you didn't want to waste time, you would have explained what you meant or asked questions.
You clearly haven't done your homework, because I did.
> You conflated files with disks on your own. No one did that for you.
I did not really conflate this. It is just conventional but imprecise terminology, and everyone who gets into such a discussion (especially when starting personal attacks) is expected to know to be careful when one hears "file" that it could mean "filepath", "file descriptor", or "file data" - especially "persistent file data" / "file storage", and that it could or could not mean something specific Unix-y or not Unix-y, or just some unspecific "data object". My usage of the term "file-backed" is definitely clear enough. More so given all the other explanations I made. Even more in the context of mmapping database files.
How about this: You yourself are the one who wasn't clear (or just wrong, not really understanding virtual memory), and I was the one clarifying myself multiple times, and I was the one just trying to make a simple point that could be easily understood by not being stubborn.
> The whole point was actually that you can map the same memory into two different processes and use atomics, which is an incredible technique. For some reason you wanted to ignore that and make claims without explanation.
I never ignored that but said from the beginning that you should share memory, but not file-backed memory. It's standard to share memory between processes and threads (especially threads), not an "incredible technique". It's an essential part of virtual memory management.
Go right back here to my first reply to your first reply, https://news.ycombinator.com/item?id=29943137 . Which has it all. "Because it allows you to do lock free memory based interprocess communication, which can be extremely fast." > " There is no need for file-backed memory to do that. ". Also go read my OP's sibling comment. Go read TFA, or just the title of this discussion. How can you not stop pretending you were just caught in an argument that you could not get out of without acknowledging you were wrong?
My very next comment: https://news.ycombinator.com/item?id=29947339 , "You can do "lock-free memory based interprocess communication" with memory (obviously). There is no need to back this memory with files". That comment also explains the problems of using a persistent file as backing. WHAT THE HELL STOP PRETENDING I WASN'T CLEAR THAT THIS IS ABOUT FILES ON DISK.
The next comment: "you can use normal (non-file-backed) memory to do the necessary synchronization (lock-free or not). I'm still not seeing why the memory should be backed by a file"
Then you wouldn't explain it and eventually admit that you do need to have a file path to give to another process, but only after I asked you to show what you meant multiple times.
And there isn't. It seems you just don't really understand virtual memory, and don't want to acknowledge what everyone else understands by "file-backed memory". Given that, I find it courageous how stubborn you are, on top of starting personal attacks.
> Then you wouldn't explain it and eventually admit that you do need to have a file path
Need to have a file path IN WHICH ENVIRONMENT, IN WHICH CONTEXT??? Could YOU please clarify. We can easily make a simple OS which doesn't have "files" but does have processes that can share memory using virtual memory technology.
Shared memory IPC is fundamentally not about files, and you were even shown a way to set up shared memory mappings between Linux processes using the normal userland API entirely without the use of files or file paths, with the restriction that the mappings have to be inherited (fork()).
How someone, even with no real understanding of the topic, could not at the latest at https://news.ycombinator.com/item?id=29947339 acknowledge that I was being perfectly clear that I was talking about persistent files (I literally said on a hard drive), is beyond me. I should have stopped this discussion at that point.
Files being persistent on storage has nothing to do with communicating through shared memory. It isn't necessary and it doesn't interfere if it's there. It is completely orthogonal, I don't know why it would ever be a part of the conversation when talking about direct reading and writing to the same memory.
> Files being persistent on storage has nothing to do with communicating through shared memory.
Files (whether persistent or not) don't really have anything to do with communication through shared memory. In the implementation of an API like shm_open(), the VFS (virtual filesystem) is simply the namespace and lookup mechanism that an operating system like Linux happens to use in order to find the memory that should be shared.
> It isn't necessary and it doesn't interfere if it's there.
Sure it does interfere. By backing memory needlessly with a persistent file, you're causing disk I/O from the loading and flushing (that can't really be controlled) and potentially bad performance.
Also, as explained, if you use a persistent file to track the synchronization state, the synchronization state won't be reset when the communicating processes die unexpectedly, and this might be problematic.
> system like Linux happens to use in order to find the memory that should be shared.
Right. Is there some other mechanism to coordinate mapping the same memory between processes? That's all I ever asked.
> Sure it does interfere. By backing memory needlessly with a persistent file, you're causing disk I/O from the loading and flushing (that can't really be controlled) and potentially bad performance.
That is orthogonal, since once you have the memory mapped into both processes you can use atomics for lock free IPC. That's the whole thing. It doesn't matter what the OS does or doesn't do in the background, atomically reading and writing to memory is unaffected.
> It doesn't matter what the OS does or doesn't do in the background, atomically reading and writing to memory is unaffected.
That's not true. If this thing is file backed there is usually no guarantee that the page of virtual memory (i.e. a page of the file data) you're accessing is present in physical memory. You'll cause page faults and data transfers to/from disk. This can delay the execution of an atomic read or write potentially infinitely, or even cause a "crash" of some kind if the disk transfer fails.
You can avoid the page faulting part of this if you somehow pin the memory. Which is completely ridiculous given that all you ever wanted is anonymous memory. I've looked up a website that seems to explain this better (but I haven't checked it deeply). Maybe it helps: https://eric-lo.gitbook.io/memory-mapped-io/pin-the-page
> This can delay the execution of an atomic read or write
You can play "what if" all you want if you don't know what else is running, but this was always about lock-free interprocess communication, which is not broken by a page fault or process suspension.
An atomic instruction by design will do everything it needs to when the instruction runs.
Saying the OS can ultimately control the execution of a process is a nonsense cop-out to try to steer away from the original point.
> all you ever wanted is anonymous memory
This is local to a process tree and does not work for interprocess communication.
Dude, the example I gave you with shm_open() is creating anonymous memory. That's just what non-file-backed mappings are called, no matter how long you want to keep obsessing about any "file paths".
This doesn't even seem like a reply to what I said.
If you map memory anonymously you aren't doing interprocess communication.
If you don't, you have a file path that the other program can use to map the same memory.
That's it, there is nothing wrong with this. I don't know why this is so upsetting. Mapping memory anonymously is local to the process tree and doesn't work for two different programs communicating.
OK, I'm extremely embarrassed, but it looks like I got the terminology wrong with regard to "anonymous memory". And sorry for being so upset; at least I finally got something out of it.
It's also a fact that if I'm using disk swap space on a Unix, the same performance and stability issues apply as for disk backed file mappings. In that sense, there really is no difference.