[Numpy-discussion] Low-level API for Random

Robert Kern robert.kern at gmail.com
Thu Sep 19 10:52:16 EDT 2019

On Thu, Sep 19, 2019 at 5:24 AM Ralf Gommers <ralf.gommers at gmail.com> wrote:

> On Thu, Sep 19, 2019 at 10:28 AM Kevin Sheppard <kevin.k.sheppard at gmail.com> wrote:
>> There are some users of the NumPy C code in randomkit.  This was never
>> officially supported.  There has been a long open issue to provide this
>> officially.
>> When I wrote randomgen I supplied .pxd files that make it simpler to
>> write Cython code that uses the components.  The lower-level API has not
>> had much scrutiny and is in need of a clean-up.  I thought this would also
>> encourage users to extend the random machinery themselves as part of their
>> project or code so as to minimize the requests for new (exotic)
>> distributions to be included in Generator.
>> Most of the generator functions follow a pattern random_DISTRIBUTION.
>> Some have a bit more name mangling which can easily be cleaned up (like
>> random_gauss_zig, which should become PREFIX_standard_normal).
>> Ralf Gommers suggested unprefixed names.
> I suggested that the names should match the Python API, which I think
> isn't quite the same. The Python API doesn't contain things like "gamma",
> "t" or "f".

As the implementations evolve, they aren't going to match one-to-one 100%.
The implementations are shared by the legacy RandomState. When we update an
algorithm, we'll need to make a new function with the better algorithm for
Generator to use, then we'll have two C functions roughly corresponding to
the same method name (albeit on different classes). C doesn't give us as
many namespace options as Python. We could rely on conventional prefixes to
distinguish between the two classes of function (e.g. legacy_normal vs
random_normal). There are times when it would be nice to be more
descriptive about the algorithm difference (e.g. random_normal_polar vs
random_normal_ziggurat), but most of our algorithm updates will be minor
tweaks rather than changes to a new named algorithm.

>> I tried this in a local branch and it was a bit ugly since some of the
>> distributions have common math names (e.g., gamma) and others are very
>> short (e.g., t or f).  I think a prefix is needed, and after looking
>> through the C API docs npy_random_ seemed like a reasonable choice (since
>> these live in numpy.random).
>> Any thoughts on the following questions are welcome (others too):
>> 1. Should there be a prefix on the C functions?
>> 2. If so, what should the prefix be?
> Before worrying about naming details, can we start with "what should be in
> the C/Cython API"? If I look through the current pxd files, there's a lot
> there that looks like it should be private, and what we expose as Python
> API is not all present as far as I can tell (which may be fine, if the only
> goal is to let people write new generators rather than use the existing
> ones from Cython without the Python overhead).

Using the existing distributions from Cython was a requested feature and an
explicit goal, yes. There are users waiting for this.

Robert Kern