Drexel dragonThe Math ForumDonate to the Math Forum



Search All of the Math Forum:

Views expressed in these public forums are not endorsed by Drexel University or The Math Forum.


Math Forum » Discussions » sci.math.* » sci.stat.math.independent

Topic: A practical alternative of the method of random perumtation of Fisher
& Yates

Replies: 1   Last Post: Jun 24, 2013 3:30 AM

Advanced Search

Back to Topic List Back to Topic List Jump to Tree View Jump to Tree View   Messages: [ Previous | Next ]
Mok-Kong Shen

Posts: 524
Registered: 12/8/04
A practical alternative of the method of random perumtation of Fisher
& Yates

Posted: Jun 22, 2013 8:08 AM
  Click to see the message monospaced in plain text Plain Text   Click to reply to this topic Reply

[I apologize for creating a new thread instead of putting the following
stuffs into a recent thread initiated by me. This is done for avoiding
confusions, since a modified computer code is to be dealt with here.]

For performing (pseudo-)random permutations the algorithm of Fisher &
Yates is theoretically perfect (i.e. disregarding the imperfectness of
the PRNG involved) but requires for dealing with a list of n elments
n-1 PRNs to generate one permutation. There are however cases in
practice where a couple of PRNs are available from the beginning due to
other reasons and thus it could be of some interest to see whether one
couldn't simply make use of such PRNs for random permutations as well
instead of specially acquiring the n-1 PRNs that are needed to do the
F&Y processing.

To avoid misunderstanding, it should be stated that the practical
situation we'll treat is one where n is moderately large (albeit not
too small) and that the number of random permutations to be generated,
while substantial, is nonetheless small in comparison to the
factorial of n, i.e. one doesn't need all, or almost all, of the
possible permutations of n elements.

Through experiments I found that the following function shuffle()
(written in Python), which requires only 2 PRNs in each invocation,
appears to be a viable candidate in the above mentioned context:

# in-shuffle
def riffle(cards,n):
newcards=[]
nh=n//2
for i in range(nh):
newcards.append(cards[nh+i])
newcards.append(cards[i])
if nh*2<n:
newcards.append(cards[n-1])
return(newcards)

# Shuffle n cards with 2 PRNs ur and us in [0,n-1].
def shuffle(cards,ur,us,n):
if us<ur:
t=ur
ur=us
us=t
newcards=cards[ur:n]+cards[0:ur] #1
newcards=riffle(newcards[us:n],n-us)+riffle(newcards[ur:us],us-ur)\
+riffle(newcards[0:ur],ur) #2
newcards=newcards[ur:n]+newcards[0:ur]
newcards=riffle(newcards[us:n],n-us)+riffle(newcards[ur:us],us-ur)\
+riffle(newcards[0:ur],ur) #2
newcards=newcards[ur:n]+newcards[0:ur] #1
return(newcards)

To explain the code in terms of a card deck: In the three statements
marked with #1 one does a "cut" at the position of ur. In the two
statements marked with #2 the 2 PRNs are used to separate the deck into
3 parts, which are individually subjected to a riffle shuffling and
then put in the reversed order, i.e. the 1st and 3rd part are exchanged.

For three chosen values of n, I took the list [0, 1, 2, ... n-1]
(hereafter termed the standard configuration) and applied shuffle() as
well as F&Y 10000000 times repeatedly. After each step the Hamming
distance (H-D) of the resulting permutation with respect to the
standard configuration is computed, with its frequency distribution
obtained in a typical run listed below, where the expected values
(rounded) and the computing times are also shown:

n=20 n=52 n=100

H-D shuffle() F&Y H-D shuffle() F&Y H-D shuffle()
F&Y Expected

0-9 0 0 0-41 0 0 0-89 0
0 0
10 1 2 42 0 2 90 3
2 1
11 10 12 43 7 12 91 13
9 10
12 96 100 44 93 90 92 97
100 91
13 743 714 45 720 742 93 760
753 730
14 5042 5144 46 5218 5108 94 4965
5099 5109
15 30792 30616 47 30775 30638 95 30999
30438 30657
16 153360 152709 48 153147 153195 96 152854
153881 153283
17 613189 611347 49 614813 613352 97 612850
612319 613132
18 1840137 1840471 50 1839321 1837951 98 1841054
1838083 1839397
19 3679788 3675790 51 3677310 3677840 99 3677166
3680107 3678794
20 3676842 3683095 52 3678596 3681070 100 3679239
3679209 3678794

Time 363.4 692.0 572.1 1800.4 888.5
3486.4


Further for a number of values of n I ran the two schemes 100000 times
and computed the number of those permutations generated that have
various numbers of duplicates as follows:

0 dupl. 1 dupl. 2 dupl. 3 dupl. 4 dupl. 5 dupl. 6
dupl. >6 dupl.

shuffle() 757779 104866 9872 678 30 2 0
0
F&Y n=10 759264 104475 9600 678 50 3 1
0

shuffle() 999303 349 0 0 0 0 0
0
F&Y n=15 1000001 0 0 0 0 0 0
0

shuffle() 999999 1 0 0 0 0 0
0
F&Y n=20 1000001 0 0 0 0 0 0
0

shuffle() 1000001 0 0 0 0 0 0
0
F&Y n=25 1000001 0 0 0 0 0 0
0

shuffle() 999979 11 0 0 0 0 0
0
F&Y n=30 1000001 0 0 0 0 0 0
0

shuffle() 1000001 0 0 0 0 0 0
0
F&Y n=40 1000001 0 0 0 0 0 0
0

shuffle() 1000001 0 0 0 0 0 0
0
F&Y n=50 1000001 0 0 0 0 0 0
0

shuffle() 1000001 0 0 0 0 0 0
0
F&Y n=60 1000001 0 0 0 0 0 0
0

Taking consideration also of the fact that the PRNGs used in practice
are not perfect, the function shuffle() thus seems to have fairly well
fulfilled its intended purpose. (The function has been used in a
software of mine in http://s13.zetaboards.com/Crypto/topic/6925232/1/)

M. K. Shen



Point your RSS reader here for a feed of the latest messages in this topic.

[Privacy Policy] [Terms of Use]

© Drexel University 1994-2014. All Rights Reserved.
The Math Forum is a research and educational enterprise of the Drexel University School of Education.