Skimming the paper [1], the key difference they used is that their hash table in...

elchananHaas · 2025-02-10T21:55:10 1739224510

I might be misreading their algorithm, but from my reading of the paper the key improvement is a non-uniform strategy where they divide the array into buckets and focus on different buckets as they fill the table. This increases the average number of locations probed, even while the table is still fairly empty. They still place the item in the first empty slot they see with this strategy.

The "skipping slots" has to do with jumping ahead in the hash sequence.

SiempreViernes · 2025-02-10T21:11:01 1739221861

But you could do some hybrid, where you do greedy fill for a while and then switch to this fancier fill once your table is approaching full (using some heuristic)?

elchananHaas · 2025-02-10T21:56:12 1739224572

No, it has to do with the coupon collector's problem. The key idea behind this algorithm is to do more looking for an empty spot up front.
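
To make the coupon collector connection concrete, here is a rough sketch (not the paper's algorithm). It assumes a simplified greedy model where each probe picks a uniformly random slot, with repeats allowed. Under that assumption, an insertion into a table with `n_filled` of `n_slots` taken needs `n_slots / (n_slots - n_filled)` probes on average, and summing over a complete fill gives the familiar `n * H_n`. The function names are made up for illustration.

```python
import random

def expected_probes(n_slots, n_filled):
    """Expected number of uniform random probes needed to hit an empty slot."""
    return n_slots / (n_slots - n_filled)

def expected_total_probes(n_slots):
    """Expected probes to fill the whole table greedily: n * H_n."""
    return sum(expected_probes(n_slots, k) for k in range(n_slots))

def simulate_fill(n_slots, rng):
    """Fill a table by probing random slots until an empty one is found."""
    occupied = [False] * n_slots
    probes = 0
    for _ in range(n_slots):
        while True:
            probes += 1
            i = rng.randrange(n_slots)
            if not occupied[i]:
                occupied[i] = True
                break
    return probes

if __name__ == "__main__":
    n = 1000
    print("expected total probes:", round(expected_total_probes(n)))
    print("simulated total probes:", simulate_fill(n, random.Random(0)))
    print("expected probes for the last insert:", expected_probes(n, n - 1))
```

In this model the last few insertions dominate the cost, which is why the final slots are so expensive for a greedy strategy.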

layer8 · 2025-02-10T21:25:54 1739222754

I would assume that you need to start early to be able to reap the benefits once the table is almost full.

gizmo · 2025-02-10T21:19:15 1739222355

Not really, because you have to use the same algorithm during subsequent lookups. Imagine you add a bunch of items with insert algorithm A, then you cross some threshold and you insert some more items with algorithm B. Then you delete (tombstone) a bunch of items. Now you look up an item, and the slot is filled, and you're stuck. You have to continue your lookup with both algorithms to find what you're looking for (thereby giving up any performance benefits), because the current occupancy doesn't tell you anything about the occupancy at the time the item you're looking for was inserted.
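
A toy sketch of that failure mode, under stated assumptions: `probe_a` and `probe_b` are two made-up probe sequences (not the ones from the paper), keys are small non-negative integers, and duplicate inserts are not handled. The table picks its insert strategy from the current load factor, so after deletions a lookup that trusts the current occupancy follows the wrong sequence.

```python
TOMBSTONE = object()  # marks a deleted slot so probe chains stay intact

def probe_a(key, size):
    """Strategy A (illustrative): linear probing forward from key % size."""
    start = key % size
    return [(start + i) % size for i in range(size)]

def probe_b(key, size):
    """Strategy B (illustrative): linear probing backward from another start."""
    start = (key * 7 + 3) % size
    return [(start - i) % size for i in range(size)]

class HybridTable:
    def __init__(self, size=8, threshold=0.5):
        self.slots = [None] * size
        self.count = 0
        self.threshold = threshold

    def _policy(self):
        # Choose a strategy from the *current* load factor.
        below = self.count < self.threshold * len(self.slots)
        return probe_a if below else probe_b

    def insert(self, key):
        for i in self._policy()(key, len(self.slots)):
            if self.slots[i] is None or self.slots[i] is TOMBSTONE:
                self.slots[i] = key
                self.count += 1
                return
        raise RuntimeError("table full")

    def _find(self, key, probe):
        for i in probe(key, len(self.slots)):
            if self.slots[i] is None:  # a never-used slot ends the chain
                return None
            if self.slots[i] == key:
                return i
        return None

    def lookup_by_current_occupancy(self, key):
        # Broken: assumes the key was inserted under today's policy.
        return self._find(key, self._policy()) is not None

    def lookup_both(self, key):
        # Correct but slower: has to try every strategy ever used.
        return any(self._find(key, p) is not None for p in (probe_a, probe_b))

    def delete(self, key):
        for p in (probe_a, probe_b):
            i = self._find(key, p)
            if i is not None:
                self.slots[i] = TOMBSTONE
                self.count -= 1
                return

table = HybridTable()
for k in range(6):       # keys 0..3 go in with A, keys 4 and 5 with B
    table.insert(k)
for k in (0, 1, 2):      # deletions push the load back under the threshold
    table.delete(k)

print(table.lookup_by_current_occupancy(5))  # follows A, misses the key
print(table.lookup_both(5))                  # tries A, then B, finds it
```

Key 5 was inserted with B while the table was above the threshold, but by lookup time the load factor points at A, so only the try-both lookup finds it.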