Hacker News

Something I don't enjoy about remote/distributed locks is that unlike distributed transactions they're usually unable to provide any strict guarantees about things they protect.

E.g. if your algorithm is:

1) Hold the distributed lock

2) Do the thing

3) Release the lock

And the node goes dark for a while between steps 1 and 2 (e.g. under 100% CPU load), by the time it reaches 2 the lock may have already expired and another node may be holding it, resulting in a race. Adding steps like "1.1) double- or triple-check the lock is still held" obviously doesn't help, because the node can go dark right after that check and resume at 2. The probability of this is not high, but still: no guarantees. Furthermore, at a certain scale you actually do start seeing rogue nodes deemed dead hours ago suddenly coming back to life and doing unpleasant things.
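That failure mode is easy to reproduce with a toy in-process stand-in for a remote lock with a TTL lease (all names here are made up for illustration):

```python
import time

class LeaseLock:
    """Toy in-process stand-in for a remote lock with a TTL lease."""
    def __init__(self):
        self.owner, self.expires_at = None, 0.0

    def acquire(self, node, ttl):
        now = time.monotonic()
        if self.owner is None or now >= self.expires_at:
            self.owner, self.expires_at = node, now + ttl
            return True
        return False

    def held_by(self, node):
        return self.owner == node and time.monotonic() < self.expires_at

lock = LeaseLock()
assert lock.acquire("A", ttl=0.05)  # step 1: node A takes the lease
time.sleep(0.1)                     # A goes dark past its TTL (GC pause, CPU load...)
assert lock.acquire("B", ttl=10.0)  # the lease has expired, so B acquires legitimately
# A now wakes up and runs step 2 anyway. Even a held_by("A") check here
# doesn't close the window: A can stall again between the check and the write.
assert not lock.held_by("A")
```

With a real remote lock the window is the same, just harder to see: nothing stops the paused holder from resuming its work after the lease has moved on.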

The rule of thumb is usually "keep locks within the same transaction space as the thing they protect", and often you don't even need locks in that case; transactions alone can be enough. If you're trying to protect something that is inherently un-transactional then, well, good luck, because these efforts are always probabilistic in nature.

A good use-case for a remote lock is when it's not actually used to guarantee consistency or avoid races, but merely to prevent duplicate computation for cost/performance reasons. For all other cases I outright recommend avoiding them.



A lot of what you say is explained in detail in Martin Kleppmann's article[0]. As you said, there's no guarantee about when the lock will expire. The proper solution for this is a fencing token. The idea is similar to how people have used optimistic locking when updating data in a db to avoid two users overwriting each other's work.

[0]: https://martin.kleppmann.com/2016/02/08/how-to-do-distribute...
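The fencing idea can be sketched with a toy lock service and store (names hypothetical; in practice the storage layer itself has to check the token):

```python
class LockService:
    """Toy lock service: every grant carries a monotonically increasing token."""
    def __init__(self):
        self.token = 0

    def acquire(self):
        self.token += 1
        return self.token

class FencedStore:
    """Storage that rejects writes whose fencing token is older than one it has seen."""
    def __init__(self):
        self.max_token, self.value = 0, None

    def write(self, token, value):
        if token < self.max_token:
            raise RuntimeError("stale fencing token: this writer lost the lock")
        self.max_token, self.value = token, value

svc, store = LockService(), FencedStore()
t_a = svc.acquire()           # client A gets token 1, then stalls
t_b = svc.acquire()           # A's lease expires; client B gets token 2
store.write(t_b, "from B")    # B writes with the newer token
try:
    store.write(t_a, "from A")  # A wakes up and tries to write...
except RuntimeError:
    pass                        # ...and is safely rejected by the store
assert store.value == "from B"
```

The token turns "I think I hold the lock" into something the storage can verify, which is what the lease alone can't give you.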


Yes, exactly! We found out the hard way just how unreliable Redis-based locks are, and switched to Postgres locks. It works reliably since our code is already in a Postgres transaction.

Created a “lock” table with a single string key column, so you can “select key for update” on an arbitrary string key (similar UX to a Redis lock). I looked at advisory locks, but they don’t work when the lock key needs to be dynamically generated.
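For reference, a minimal sketch of that scheme (table and key names are my assumptions, not necessarily what they used):

```sql
-- One row per lockable key; rows are created lazily on first use.
CREATE TABLE IF NOT EXISTS lock (
    key text PRIMARY KEY
);

-- Inside the same transaction as the protected work:
INSERT INTO lock (key) VALUES ('report:2024-08') ON CONFLICT DO NOTHING;
SELECT key FROM lock WHERE key = 'report:2024-08' FOR UPDATE;
-- ...protected work here; the row lock is released at COMMIT/ROLLBACK.
```

Because the row lock is tied to the transaction, it can never outlive the work it protects, which is exactly the "same transaction space" property from upthread.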


After reading the current[1] top comment about Redlock, this was literally the next low-effort thing that came to mind, so I'm glad to find someone else's experiences with using a PostgreSQL table as a lock.

I will need a distributed lock soon, but I've never used one before so I'm taking this chance to learn about them.

[1]: https://news.ycombinator.com/item?id=41315621


If it goes dark a microsecond after #3 you might have an ambiguous success: the transaction was processed, but you didn't get a confirmation.

A lot of robust systems end up implementing their own bespoke WAL semantics on top of the system of record. You'd think we would have a formal solution for that by now.


We do have globally distributed ACID DBs like Spanner, CockroachDB, FoundationDB etc.


True. Even simple scenarios like "save a file in S3 IFF the S3 link is saved in Postgres", which show up in virtually any application, are rarely handled well.


Uggh, I was cornered into writing a couple of these over the years. The way I handled it was:

1. make sure both operations will be retried if they don't run to completion, and

2. think through how the rest of the system would react to one of them being present without the other

Then I used whichever of the two orderings was less bad from the perspective of #2. Obviously this depends on the exact use case -- I was simply lucky that the rest of the system was designed in such a way that it could tolerate that bad intermediate state.
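That ordering trick, in toy form, for the S3-plus-Postgres example upthread (dicts standing in for the blob store and the DB; the bucket name is made up):

```python
# Dicts standing in for the blob store and the database.
blob_store, db = {}, {}

def save_attachment(key, data):
    # Ordering chosen so the bad intermediate state is an orphaned blob
    # (harmless, can be garbage-collected later) rather than a DB row
    # pointing at a blob that was never written.
    blob_store[key] = data          # step 1: upload (retried by the caller on failure)
    db[key] = f"s3://bucket/{key}"  # step 2: record the link

save_attachment("img.png", b"...")
```

A crash between the two steps leaves only the orphaned upload, which step 1's retry (or a cleanup job) can deal with; the reverse ordering would leave a link to nothing, which the rest of the system usually can't tolerate.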





