additions to random: innovative names vs. algorithm specification

Rather than just looking for a new name (e.g., znormal), would it not be better to decide on a syntax for specifying PRNG algorithms? (E.g., MATLAB takes such an approach: http://www.mathworks.com/access/helpdesk/help/techdoc/math/brt5wsv.html) Wouldn't this meet the need for replicability with much greater generality? Alan Isaac

On Thu, Jul 29, 2010 at 8:25 AM, Alan G Isaac <alan.isaac@gmail.com> wrote:
Rather than just looking for a new name (e.g., znormal), would it not be better to decide on a syntax for specifying PRNG algorithms? (E.g., MATLAB takes such an approach: http://www.mathworks.com/access/helpdesk/help/techdoc/math/brt5wsv.html)
Wouldn't this meet the need for replicability with much greater generality?
That's probably a better idea. The sort functions have the 'kind' keyword to select between three algorithms and such an addition to the random number functions might be a good thing to have for the future if other algorithms get updated/changed. If such a keyword is added it can be given a value of None where there are no selections available. Chuck

On Thu, Jul 29, 2010 at 09:41, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Jul 29, 2010 at 8:25 AM, Alan G Isaac <alan.isaac@gmail.com> wrote:
Rather than just looking for a new name (e.g., znormal), would it not be better to decide on a syntax for specifying PRNG algorithms? (E.g., MATLAB takes such an approach: http://www.mathworks.com/access/helpdesk/help/techdoc/math/brt5wsv.html)
Wouldn't this meet the need for replicability with much greater generality?
That's probably a better idea. The sort functions have the 'kind' keyword to select between three algorithms and such an addition to the random number functions might be a good thing to have for the future if other algorithms get updated/changed. If such a keyword is added it can be given a value of None where there are no selections available.
There is a good design principle that when you have a keyword argument which you only expect to pass literals to, you should make multiple functions instead (Book of Guido, 7:42). It's worth noting that this MATLAB API is deprecated. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
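For readers comparing the two API shapes discussed here, a minimal sketch follows. It uses Python's standard `random` module instead of numpy.random, and the names `normal`, `normal_builtin`, `normal_boxmuller`, and the `method` keyword are hypothetical illustrations, not existing NumPy functions or arguments.

```python
import math
import random


def normal_builtin(rng):
    """Standard normal using the algorithm built into random.Random."""
    return rng.gauss(0.0, 1.0)


def normal_boxmuller(rng):
    """Standard normal using the basic Box-Muller transform."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so that log() is safe
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


# Keyword style: one public name, algorithm picked by a string literal.
_METHODS = {
    None: normal_builtin,          # None means "whatever the default is"
    "builtin": normal_builtin,
    "boxmuller": normal_boxmuller,
}


def normal(rng, method=None):
    if method not in _METHODS:
        raise ValueError("unknown method: %r" % (method,))
    return _METHODS[method](rng)


# Both spellings draw the same value from identically seeded generators.
a = normal(random.Random(1), method="boxmuller")
b = normal_boxmuller(random.Random(1))
```

With the keyword style, callers pass a literal; with the multiple-function style, the literal becomes part of the function name. Either way, replicability depends on naming the algorithm explicitly.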

On 7/29/2010 4:37 PM, Robert Kern wrote:
this MATLAB API is deprecated
The old API has been replaced by a constructor that still takes a string literal argument to determine the PRNG algorithm. See the bottom of http://www.mathworks.com/access/helpdesk/help/techdoc/math/brt5wsv.html This approach would match my suggestion. Even a module approach would match my suggestion (one module per underlying PRNG algorithm). I just think it will pay off to avoid simply multiplying function names (e.g., introducing znormal this year, and whatever new name next year). fwiw, Alan

On Thu, Jul 29, 2010 at 16:03, Alan G Isaac <alan.isaac@gmail.com> wrote:
On 7/29/2010 4:37 PM, Robert Kern wrote:
this MATLAB API is deprecated
The old API has been replaced by a constructor that still takes a string literal argument to determine the PRNG algorithm. See the bottom of http://www.mathworks.com/access/helpdesk/help/techdoc/math/brt5wsv.html This approach would match my suggestion.
Since your suggestion was so vague and your citation talks about multiple things, yes, I guess so. It's certainly not what Chuck interpreted your suggestion to be. "Matching your suggestion" is not the same thing as communicating clearly.
Even a module approach would match my suggestion (one module per underlying PRNG algorithm). I just think it will pay off to avoid simply multiplying function names (e.g., introducing znormal this year, and whatever new name next year).
New sampling algorithms aren't invented *all* that often. That said, it would be reasonable to add arguments to the RandomState constructor to allow it to select different algorithms for each of the distributions. -- Robert Kern
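One way to picture constructor-level selection is sketched below. The class `SelectableRandom` and its `normal` argument are hypothetical; this is not the RandomState API, and it is built on Python's standard `random` module purely to stay self-contained.

```python
import math
import random


class SelectableRandom:
    """Hypothetical stand-in for RandomState: the sampling algorithm for
    each distribution is fixed once, when the object is constructed."""

    def __init__(self, seed=None, normal="builtin"):
        samplers = {
            "builtin": self._normal_builtin,
            "boxmuller": self._normal_boxmuller,
        }
        if normal not in samplers:
            raise ValueError("unknown normal algorithm: %r" % (normal,))
        self._rng = random.Random(seed)
        self._normal = samplers[normal]
        # Record the choice so a result can be replicated later.
        self.algorithms = {"normal": normal}

    def _normal_builtin(self):
        return self._rng.gauss(0.0, 1.0)

    def _normal_boxmuller(self):
        u1 = 1.0 - self._rng.random()  # (0, 1], safe for log()
        u2 = self._rng.random()
        return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

    def normal(self, loc=0.0, scale=1.0):
        """The method name stays the same whichever algorithm is selected."""
        return loc + scale * self._normal()


rs = SelectableRandom(seed=42, normal="boxmuller")
sample = rs.normal()
```

Under this design the call sites never change; only the constructor call records which algorithm produced the stream.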

On Thu, Jul 29, 2010 at 16:03, Alan G Isaac <alan.isaac@gmail.com>
New sampling algorithms aren't invented *all* that often.
No, but it seems George Marsaglia posted a new PRNG called KISS4691 to sci.math last Saturday :) KISS4691 has an immense period (larger than 10**45000), and Marsaglia claims it can produce 138 million pseudorandom ints per second. That puts it far ahead of MT19937 both in terms of period and speed. http://groups.google.com/group/sci.math/msg/dc9ad178113a30fd

On Thu, Jul 29, 2010 at 19:26, Sturla Molden <sturla@molden.no> wrote:
On Thu, Jul 29, 2010 at 16:03, Alan G Isaac <alan.isaac@gmail.com>
New sampling algorithms aren't invented *all* that often.
No, but it seems George Marsaglia posted a new PRNG called KISS4691 to sci.math last Saturday :)
Of course. :-)
KISS4691 has an immense period (larger than 10**45000), and Marsaglia claims it can produce 138 million pseudorandom ints per second. That puts it far ahead of MT19937 both in terms of period and speed.
http://groups.google.com/group/sci.math/msg/dc9ad178113a30fd
It looks like it has a little bit more peer review to get through, but thanks for the pointer! That's probably the best motivation I've seen to do the refactoring of the numpy.random code to allow multiple core PRNGs. -- Robert Kern
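A rough sketch of what "multiple core PRNGs" could mean: the distribution code talks to any object that exposes a `random()` method returning a float in [0, 1). The classes `LCG64` and `Distributions` are hypothetical and are not numpy.random code. `LCG64` is a toy 64-bit linear congruential generator (Knuth's MMIX constants) used only as a second core; it is not KISS4691 and is not a recommendation.

```python
import math
import random


class LCG64:
    """Toy core generator: 64-bit LCG with Knuth's MMIX constants."""

    MASK = 0xFFFFFFFFFFFFFFFF

    def __init__(self, seed=0):
        self.state = seed & self.MASK

    def random(self):
        self.state = (6364136223846793005 * self.state
                      + 1442695040888963407) & self.MASK
        # Keep the top 53 bits and scale them into [0, 1).
        return (self.state >> 11) / float(1 << 53)


class Distributions:
    """Distribution samplers written once, independent of the core PRNG."""

    def __init__(self, core):
        self.core = core  # anything with a random() method

    def uniform(self, low, high):
        return low + (high - low) * self.core.random()

    def exponential(self, scale=1.0):
        # Inversion method; 1 - u lies in (0, 1], so log() is safe.
        return -scale * math.log(1.0 - self.core.random())


mt_based = Distributions(random.Random(2010))  # Mersenne Twister core
lcg_based = Distributions(LCG64(2010))         # toy LCG core
values = [mt_based.exponential(), lcg_based.exponential()]
```

Swapping the core changes the stream but not the distribution code, which is the separation the refactoring would aim for.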

On Thu, Jul 29, 2010 at 19:26, Sturla Molden <sturla@molden.no> wrote:
http://groups.google.com/group/sci.math/msg/dc9ad178113a30fd
It looks like it has a little bit more peer review to get through, but thanks for the pointer! That's probably the best motivation I've seen to do the refactoring of the numpy.random code to allow multiple core PRNGs.
Mersenne Twister is over 10 years old, and a well-tried algorithm. KISS4691 is to my knowledge a week old, and only published on a newsgroup (not in a peer-reviewed paper, though the author is among the world's foremost authorities on random numbers). KISS4691's advantage over MT19937 is speed and simplicity, but we cannot put an algorithm this new into NumPy. It still needs a lot of testing. I just thought your comment on sampling algorithms not being invented "all that often" was a bit funny a week after Marsaglia described KISS4691. :-) But ziggurat and random bit generators are different matters. Ziggurat can use KISS4691, MT19937 or any random bit generator to sample arbitrary monotonically decreasing distributions on [0, inf) or symmetric distributions on (-inf, inf). I wonder if we should consider a general ziggurat sampler for SciPy or NumPy. It could take two arguments: a PDF expression (e.g. a lambda function) and a fallback for the tail (a PRNG or a ufunc transform). We could use numeric integration to set up the ziggurat table, and amortize the Python overhead for the tail fallback. The other option would be specialized ziggurat generators for common distributions (e.g. standard normal, exponential and gamma). Sturla
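A minimal sketch of such a general ziggurat sampler, in pure Python, is given below. Several things are assumptions made for illustration: the function names, the requirement that the PDF be monotonically decreasing on [0, inf), and the fact that the caller supplies the inverse PDF and the tail area in closed form (the numeric integration mentioned above is not attempted). It follows the textbook equal-area layout and is not an optimized table-driven implementation.

```python
import math
import random


def _stack(pdf, pdf_inv, tail_area, r, n):
    """Stack n equal-area layers on a base whose right edge is r.
    Returns (x, y, v, gap); gap > 0 means the stack overshoots pdf(0)."""
    v = r * pdf(r) + tail_area(r)        # area of the base strip plus tail
    x, y, top = [r], [pdf(r)], pdf(0.0)
    for k in range(1, n):
        y_next = y[-1] + v / x[-1]       # next layer has the same area v
        if k == n - 1:                   # last layer should end at pdf(0)
            return x + [0.0], y + [top], v, y_next - top
        if y_next >= top:                # overshoot: r is too small
            return None, None, v, math.inf
        x.append(pdf_inv(y_next))
        y.append(y_next)


def make_ziggurat_sampler(rng, pdf, pdf_inv, tail_area, tail_sampler,
                          n=64, lo=0.1, hi=50.0):
    """rng needs random() and randrange(); [lo, hi] must bracket r."""
    for _ in range(200):                 # bisection on the base edge r
        mid = 0.5 * (lo + hi)
        if _stack(pdf, pdf_inv, tail_area, mid, n)[3] > 0:
            lo = mid
        else:
            hi = mid
    x, y, v, _ = _stack(pdf, pdf_inv, tail_area, hi, n)

    def draw():
        while True:
            k = rng.randrange(n)         # choose a layer uniformly
            u = rng.random()
            if k == 0:                   # base strip: rectangle or tail
                xx = u * v / y[0]
                return xx if xx < x[0] else tail_sampler(rng, x[0])
            xx = u * x[k - 1]
            if xx < x[k]:                # entirely under the curve
                return xx
            yy = y[k - 1] + rng.random() * (y[k] - y[k - 1])
            if yy < pdf(xx):             # wedge: explicit rejection test
                return xx
    return draw


# Usage with an unnormalized exponential PDF; the tail is memoryless.
draw = make_ziggurat_sampler(
    random.Random(2010),
    pdf=lambda t: math.exp(-t),
    pdf_inv=lambda p: -math.log(p),
    tail_area=lambda r: math.exp(-r),
    tail_sampler=lambda g, r: r + g.expovariate(1.0),
)
samples = [draw() for _ in range(1000)]
```

Here `rng` plays the role of the random bit generator, so any core with the same two methods could be substituted. A symmetric distribution on (-inf, inf) could reuse the same sampler with a random sign attached to each draw.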
participants (5): Alan G Isaac, Charles R Harris, Jason Grout, Robert Kern, Sturla Molden