Could someone explain how this works? I feel like unique shuffling given a rando...

throwaway_yy2Di · on April 4, 2016

The general idea is to draw n uniform samples from [1..N] and return them in sorted order. You can generate these incrementally, in order (no sort needed). The mean difference between successive samples is ~N/n. If the samples are uniformly distributed, their differences should have roughly an exponential [0] distribution (if N, n are large).

So the inductive step is: given k'th sample a[k], let a[k+1] = a[k] + S, where S is drawn from a discrete exponential distribution with mean 1/λ = N/n. That's the `exp(log(rand()) * stuff)' part of the code.

This almost works; there's more details [1] if you want it to actually work, and have accurate statistics, not overflow N, etc.

[0] https://en.wikipedia.org/wiki/Exponential_distribution

[1] https://hal.archives-ouvertes.fr/file/index/docid/75929/file...