randint for long type (permutations)
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions?
Call randint until you get enough bits of entropy to for a long with the appropriate number of bits. def randwords(n): result = 0L for i in range(n): result = (result<<32) | randint(0,2<<32-1) return result -Kevin On 6/14/07, Will Woods <will.woods@ynic.york.ac.uk> wrote:
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions? _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Will Woods wrote:
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions?
Is it too slow or bandwidth-consuming to generate the permutations on the master node and then distribute the permuted arrays to the worker nodes? You can reduce that somewhat by permuting arange(N) as an index array and send that to each of the worker nodes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On 14/06/07, Will Woods <will.woods@ynic.york.ac.uk> wrote:
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions?
A single integer might not be the best representation of a permutation (I can't see just now how you encode it, actually). Why not represent it as a tuple of n integers a_i with a_i<=i? (to get a permutation from this, treat a_i as an instruction to swap element i with element a_i). This should be (I haven't proven it but I'm pretty sure) a bijective representation of permutations. Not very compact, though I suppose you could use an array of 8-bit integers for n<256. It will serve as a key in dictionaries, though, and converting from a permutation in some other representation (as returned by argsort?) to this shouldn't be too difficult. Converting these to longs is also not too difficult (sum(a_i[1:]*cumprod(i[1:])) nor is the reverse operation. And of course generating the permutation randomly becomes easy. Also, it's worth noting that if n is even moderately large, n! is such a staggering number that the probability you will ever generate the same permutation twice is less than the probability that your data will be modified undetectably in memory by cosmic rays. Anne
The range of N I need is from 5-100, which spans the highly likely to highly improbable for M around 1000-10000. The permutation can be derived from an integer using the algorithm here: http://en.wikipedia.org/wiki/Permutation I have found the solution to the random number generator in the random module (I had only looked in numpy.random previously!): getrandbits(k) Returns a python long int with k random bits. This method is supplied with the MersenneTwister generator and some other generators may also provide it as an optional part of the API. When available, getrandbits() enables randrange() to handle arbitrarily large ranges. New in version 2.4. Thanks for all the replies. Will Anne Archibald wrote:
On 14/06/07, Will Woods <will.woods@ynic.york.ac.uk> wrote:
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions?
A single integer might not be the best representation of a permutation (I can't see just now how you encode it, actually). Why not represent it as a tuple of n integers a_i with a_i<=i? (to get a permutation from this, treat a_i as an instruction to swap element i with element a_i). This should be (I haven't proven it but I'm pretty sure) a bijective representation of permutations. Not very compact, though I suppose you could use an array of 8-bit integers for n<256. It will serve as a key in dictionaries, though, and converting from a permutation in some other representation (as returned by argsort?) to this shouldn't be too difficult. Converting these to longs is also not too difficult (sum(a_i[1:]*cumprod(i[1:])) nor is the reverse operation.
And of course generating the permutation randomly becomes easy.
Also, it's worth noting that if n is even moderately large, n! is such a staggering number that the probability you will ever generate the same permutation twice is less than the probability that your data will be modified undetectably in memory by cosmic rays.
Anne _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
-- Will Woods York Neuroimaging Centre The Biocentre York Science Park Innovation Way Heslington York YO10 5DG http://www.ynic.york.ac.uk
The range of N I need is from 5-100, which spans the highly likely to highly improbable for M around 1000-10000. The permutation can be derived from an integer using the algorithm here: http://en.wikipedia.org/wiki/Permutation I have found the solution to the random number generator in the random module (I had only looked in numpy.random previously!): getrandbits(k) Returns a python long int with k random bits. This method is supplied with the MersenneTwister generator and some other generators may also provide it as an optional part of the API. When available, getrandbits() enables randrange() to handle arbitrarily large ranges. New in version 2.4. Thanks for all the replies. Will Anne Archibald wrote:
On 14/06/07, Will Woods <will.woods@ynic.york.ac.uk> wrote:
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions?
A single integer might not be the best representation of a permutation (I can't see just now how you encode it, actually). Why not represent it as a tuple of n integers a_i with a_i<=i? (to get a permutation from this, treat a_i as an instruction to swap element i with element a_i). This should be (I haven't proven it but I'm pretty sure) a bijective representation of permutations. Not very compact, though I suppose you could use an array of 8-bit integers for n<256. It will serve as a key in dictionaries, though, and converting from a permutation in some other representation (as returned by argsort?) to this shouldn't be too difficult. Converting these to longs is also not too difficult (sum(a_i[1:]*cumprod(i[1:])) nor is the reverse operation.
And of course generating the permutation randomly becomes easy.
Also, it's worth noting that if n is even moderately large, n! is such a staggering number that the probability you will ever generate the same permutation twice is less than the probability that your data will be modified undetectably in memory by cosmic rays.
Anne _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
-- Will Woods York Neuroimaging Centre The Biocentre York Science Park Innovation Way Heslington York YO10 5DG http://www.ynic.york.ac.uk
Will Woods wrote:
The range of N I need is from 5-100, which spans the highly likely to highly improbable for M around 1000-10000. The permutation can be derived from an integer using the algorithm here: http://en.wikipedia.org/wiki/Permutation
You really, really don't want to do it this way. 100! is a huge number that you *cannot* sample effectively. With the permutation algorithm given here, only the first fraction of the sequence will be shuffled at all. In [29]: def perm(k, s): fact = 1 for j in range(2, len(s)+1): fact *= j-1 i = j - ((k//fact) % j) - 1 tmp = s[i], s[j-1] s[j-1], s[i] = tmp return s ....: In [37]: s0 = range(100) In [38]: print s0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99] In [39]: print perm(10000000000000, s0[:]) [9, 12, 11, 4, 13, 14, 8, 7, 15, 5, 10, 0, 3, 1, 2, 6, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99] Please, take my advice and shuffle the sequences on the master node using numpy.random.permutation() and distribute the sequences among the worker nodes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Will Woods wrote:
The range of N I need is from 5-100, which spans the highly likely to highly improbable for M around 1000-10000. The permutation can be derived from an integer using the algorithm here: http://en.wikipedia.org/wiki/Permutation
Actually, I take it back. It's not as bad as I thought it was. However, instead of cobbling together the bits, just use random.randrange(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On 6/14/07, Will Woods <will.woods@ynic.york.ac.uk> wrote:
I want to choose a subset of all possible permutations of a sequence of length N, with each element of the subset unique. This is then going to be scattered across multiple machines using mpi. Since there is a one-to-one mapping between the integers in the range 0 <= x < N! and the possible permutations, one solution would be to choose M < N! integers randomly, check for uniqueness, and then scatter only the integers so that individual nodes can construct the permutations. However the integers need to be of type long, and randint doesn't work for numbers which cannot be converted to int. Any suggestions? ________
How big is N? Chuck
participants (6)
-
Anne Archibald
-
Charles R Harris
-
Kevin Jacobs <jacobs@bioinformed.com>
-
Robert Kern
-
Will Woods
-
Will Woods