Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
504 views
in Technique[技术] by (71.8m points)

random - Why is this shuffling algorithm wrong?

Before I read about Fisher-Yates, this is the algorithm I came up with:

def sort(arr):
    for i in range(len(arr)):
        swap(arr, i, rand.randint(0, len(arr) - 1))

From my understanding, the only difference between this and Fisher-Yates is that instead of:

swap(arr, i, rand.randint(0, len(arr) - 1))

I should write:

swap(arr, i, rand.randint(i, len(arr) - 1))

Could someone explain how the first algorithm is incorrect? (ie. does not produce a random shuffle).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

From Wikipedia:

Similarly, always selecting j from the entire range of valid array indices on every iteration also produces a result which is biased, albeit less obviously so. This can be seen from the fact that doing so yields nn distinct possible sequences of swaps, whereas there are only n! possible permutations of an n-element array. Since nn can never be evenly divisible by n! when n > 2 (as the latter is divisible by n?1, which shares no prime factors with n), some permutations must be produced by more of the nn sequences of swaps than others. As a concrete example of this bias, observe the distribution of possible outcomes of shuffling a three-element array [1, 2, 3]. There are 6 possible permutations of this array (3! = 6), but the algorithm produces 27 possible shuffles (33 = 27). In this case, [1, 2, 3], [3, 1, 2], and [3, 2, 1] each result from 4 of the 27 shuffles, while each of the remaining 3 permutations occurs in 5 of the 27 shuffles.

Essentially, you are introducing a subtle bias into the shuffle, which will cause some permutations to crop up a bit more often than others. It's often not very noticeable, but it could make some sensitive applications (e.g. Monte Carlo simulations on permutations) fail to produce accurate answers.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

57.0k users

...