[Numpy-discussion] Random number generators

Charles R Harris charlesr.harris at gmail.com
Mon Sep 4 02:42:23 EDT 2006


On 9/3/06, Robert Kern <robert.kern at gmail.com> wrote:
>
> Charles R Harris wrote:
> > Hi Robert,
> >
> > I am about to get started on some stuff for the random number generators
> > but thought I would run it by you first. I envisage the following:
> >
> > uniform short_doubles -- doubles generated from a single 32 bit random
> > number (advantage: speed)
> > uniform double, short_doubles on the interval (0,1) -- don't touch
> > singularities in functions like log (this is my preferred default)
> > fast_normal -- ziggurat method using single 32 bit random numbers
> > (advantage: speed)
> > fast_exponential -- ziggurat method using single 32 bit random numbers
> > (advantage: speed)
> > MWC8222 random number generator (advantage: speed on some machines,
> > different from mtrand)
> >
> > Except for the last, none conflict with current routines and can be
> > added without a branch. I expect adding MWC8222 might need more
> > extensive work and I will branch for that. So the questions are of
> > utility and naming. I see some utility for myself, otherwise I wouldn't
> > be considering doing the work. OTOH, I already have (C++) routines that
> > I use for these things, so a larger question might be if anyone else
> > sees a use for these. I like speed, but it is not always that important
> > in everyday apps.
>
> I would prefer not to expand the API of numpy.random. If it weren't
> necessary for numpy to provide all of the capabilities that came with
> Numeric's RandomArray, I wouldn't want numpy.random in there at all.


Yes, good point.

> Now, a very productive course of action would be to refactor numpy.random
> such that the distributions (the first four items on your list fall into
> this category) and the underlying PRNG (the fifth) are separated from one
> another so that they can be mixed and matched at runtime. A byproduct of
> this would be to expose the C API of both of these so that they are usable
> by other C extension modules, something that's been asked for about a dozen
> times now. The five items on your list could be implemented in an extension
> module distributed in scipy.


What sort of API should this be? It occurs to me that there are already four
sources of random bytes:

Initialization:

/dev/random (pseudo random, I think)
/dev/urandom
crypto system on Windows

Pseudo random generators:

mtrand

I suppose we could add some cryptographically secure source as well. That
indicates to me that one set of random number generators would just be
streams of random bytes, possibly in 4-byte chunks. If I were doing this for
Linux, these would all look like file systems; FUSE comes to mind. Another
set of functions would transform these into the different distributions. So,
how much should stay in numpy? What sort of API are folks asking for?
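
To make the split concrete, here is a minimal Python sketch of the two
layers: byte sources on one side, distribution functions on the other. It is
an illustration only, not existing numpy code. The names URandomSource,
LCGSource, next_uint32, short_double and exponential are all made up for
this example, the LCG is just a small stand-in for a real generator such as
mtrand, and the exponential uses plain inversion rather than the ziggurat
method.

```python
import math
import os
import struct


class URandomSource:
    """Hypothetical byte source backed by the operating system (os.urandom)."""

    def read(self, nbytes):
        return os.urandom(nbytes)


class LCGSource:
    """Hypothetical deterministic byte source.

    A 32-bit linear congruential generator, used only as a small stand-in
    for a real PRNG; it is not suitable for serious work.
    """

    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def read(self, nbytes):
        out = bytearray()
        while len(out) < nbytes:
            # One LCG step (Numerical Recipes constants), emitted as 4 bytes.
            self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
            out += struct.pack("<I", self.state)
        return bytes(out[:nbytes])


def next_uint32(source):
    """Pull one 4-byte chunk from any source and decode it as an unsigned int."""
    (value,) = struct.unpack("<I", source.read(4))
    return value


def short_double(source):
    """Uniform double on the open interval (0, 1) from a single 32-bit word.

    Adding 0.5 before scaling keeps the result strictly away from 0 and 1,
    so functions like log never see a singularity.
    """
    return (next_uint32(source) + 0.5) / 4294967296.0


def exponential(source, scale=1.0):
    """Exponential variate by inversion of a short double (assumed method)."""
    return -scale * math.log(short_double(source))
```

Any object with a read(nbytes) method can be handed to the distribution
functions, e.g. short_double(LCGSource(42)) or exponential(URandomSource()),
which is the sort of mixing and matching at runtime being discussed.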

> > I see that Pyrex is used for the interface, so I suppose that is one
> > more tool to become familiar with ;)
>
> Possibly not. Pyrex was getting in the way of exposing a C API the last
> time I took a stab at it. A possibility that just occurred to me is to make
> an extension module that *only* exposes the C API and mtrand could be
> rewritten to use that API. Hmmm. I like that.


Good, I can do without Pyrex.

> I can give some guidance about how to proceed and help you navigate the
> current code, but I'm afraid I don't have much time to actually code.


Thanks, that is all I ask.

> --
> Robert Kern


Chuck