> *Use regular bigserial (64bit) PKs for internal table relations and UUIDs (128...

franckpachot · 2024-09-11T06:43:23 1726037003

It depends on the use case and performance goals. You may want to distribute the rows that you insert, and then a random UUID makes sense. However, random values are too scattered for B-Tree indexes, and the problem is not only cache misses but also the amount of modifications caused by leaf block splits. This includes MySQL, which stores the primary key in a B-Tree index. Other use cases may benefit from colocating the rows that are inserted together. Think of time series, or simply an order entry system where you query the recent orders. A sequence makes sense there, to get a good correlation between the index (on time) and the primary key. This avoids too many random reads with low cache hit rates.
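To make the scattering point concrete, here is a rough sketch that counts how many leaf pages a batch of inserts would land on for sequential versus random keys. It is a toy model, not how any real engine lays out its B-Tree: the page capacity, row counts, and the `pages_touched` helper are all made up for illustration, the index is treated as a static sorted list, and page splits are ignored entirely.

```python
import bisect
import random

# Illustrative numbers only; real page capacity depends on key and row size.
PAGE_CAPACITY = 100      # keys per leaf page (assumed)
EXISTING_ROWS = 100_000  # keys already in the index
NEW_ROWS = 1_000         # keys in the insert batch


def pages_touched(existing_keys, new_keys):
    """Count distinct leaf pages that new_keys would land on.

    existing_keys must be sorted. Each new key is mapped to the page that
    holds its insertion position in this static snapshot of the index.
    """
    last = len(existing_keys) - 1
    touched = set()
    for key in new_keys:
        # Keys beyond the current maximum go to the rightmost page.
        pos = min(bisect.bisect_left(existing_keys, key), last)
        touched.add(pos // PAGE_CAPACITY)
    return len(touched)


rng = random.Random(42)  # seeded so the run is repeatable

# Sequence-style keys: every new key is larger than all existing ones.
seq_existing = list(range(EXISTING_ROWS))
seq_new = range(EXISTING_ROWS, EXISTING_ROWS + NEW_ROWS)

# Random 128-bit keys as a stand-in for random UUIDs.
rand_existing = sorted(rng.getrandbits(128) for _ in range(EXISTING_ROWS))
rand_new = [rng.getrandbits(128) for _ in range(NEW_ROWS)]

sequential_pages = pages_touched(seq_existing, seq_new)
random_pages = pages_touched(rand_existing, rand_new)

print("pages touched, sequential keys:", sequential_pages)
print("pages touched, random keys:    ", random_pages)
```

In this model the sequential batch stays on the rightmost page while the random batch spreads over a large share of the 1,000 pages, which is the pattern that hurts cache hit rates and multiplies the places where splits can happen.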

It is wrong to think that distributed databases do not need sequences. YugabyteDB allows them. With YugabyteDB you use hash sharding to distribute the values across a small number of hash ranges, so that they don't all go to the same place but are not scattered across the whole database either. CockroachDB and Spanner don't have hash sharding, and that's why they do not recommend sequences. There are also use cases where range sharding on the sequence is a good fit: when you don't need to distribute the data ingest but benefit from colocation when querying.
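A minimal sketch of the idea of hash sharding a sequence, under stated assumptions: a 16-bit hash space cut into four contiguous ranges, with SHA-256 standing in for the hash function. The names `tablet_for` and `route`, the number of ranges, and the hash itself are illustrative and are not YugabyteDB's actual implementation.

```python
import hashlib
from collections import defaultdict

HASH_SPACE = 1 << 16  # 16-bit hash space, assumed for illustration
NUM_TABLETS = 4       # a small number of contiguous hash ranges (assumed)


def tablet_for(seq_value: int) -> int:
    """Map a sequence value to the hash range (tablet) that owns it."""
    digest = hashlib.sha256(seq_value.to_bytes(8, "big")).digest()
    hash_code = int.from_bytes(digest[:2], "big")  # 0 .. HASH_SPACE - 1
    # Each tablet owns an equal, contiguous slice of the hash space.
    return hash_code * NUM_TABLETS // HASH_SPACE


def route(seq_values):
    """Group sequence values by the tablet they would be written to."""
    tablets = defaultdict(list)
    for value in seq_values:
        tablets[tablet_for(value)].append(value)
    return tablets


tablets = route(range(1, 10_001))
for tablet_id in sorted(tablets):
    print(f"tablet {tablet_id}: {len(tablets[tablet_id])} rows")
```

Consecutive sequence values end up spread over the four ranges, so the insert load is shared, but they only ever go to those four places rather than to arbitrary positions across the whole key space.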

paperplatter · 2024-09-10T18:36:23 1725993383

Yeah, Spanner is also clear about this. It doesn't even have sequences, and their docs say to use random pkeys rather than time-dependent things like UUIDv7.