[Numpy-discussion] Low-level API for Random

Thu Sep 19 05:22:36 EDT 2019

On Thu, Sep 19, 2019 at 10:28 AM Kevin Sheppard <kevin.k.sheppard at gmail.com>
wrote:

> There are some users of the NumPy C code in randomkit.  This was never
> officially supported.  There has been a long open issue to provide this
> officially.
>
> When I wrote randomgen I supplied .pxd files that make it simpler to write
> Cython code that uses the components.  The lower-level API has not had much
> scrutiny and is in need of a clean-up.  I thought this would also
> encourage users to extend the random machinery themselves as part of their
> project or code so as to minimize the requests for new (exotic)
> distributions to be included in Generator.
>
> Most of the generator functions follow a pattern random_DISTRIBUTION.
> Some have a bit more name mangling which can easily be cleaned up (like
> random_gauss_zig, which should become PREFIX_standard_normal).
>
> Ralf Gommers suggested unprefixed names.
>

I suggested that the names should match the Python API, which I think isn't
quite the same. The Python API doesn't contain things like "gamma", "t" or
"f".

> I tried this in a local branch and it was a bit ugly since some of the
> distributions have common math names (e.g., gamma) and others are very
> short (e.g., t or f).  I think a prefix is needed, and after looking
> through the C API docs npy_random_ seemed like a reasonable choice (since
> these live in numpy.random).
>
> Any thoughts on the following questions are welcome (others too):
>
> 1. Should there be a prefix on the C functions?
> 2. If so, what should the prefix be?
>
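
To make the naming question concrete, here is a minimal sketch of how a prefix
keeps short or math-like names out of the way. It is only an illustration: the
npy_random_ prefix is the one proposed above and not a settled choice, the
helper names c_name and is_risky_bare_name are hypothetical, and Python's math
module merely stands in for "common math names".

```python
import math

PREFIX = "npy_random_"  # prefix proposed in the thread; not a settled choice

# A few distribution names, spelled as they appear in the thread.
DISTRIBUTIONS = ["standard_normal", "gamma", "t", "f"]


def c_name(distribution, prefix=PREFIX):
    """Return the prefixed C-level name for a distribution (hypothetical helper)."""
    return prefix + distribution


def is_risky_bare_name(name):
    """Flag names that are very short or match a common math function.

    Python's math module is used as a stand-in for "common math names";
    this is an assumption made for the sketch only.
    """
    return len(name) <= 2 or hasattr(math, name)


for dist in DISTRIBUTIONS:
    flag = "risky unprefixed" if is_risky_bare_name(dist) else "ok unprefixed"
    print(f"{dist:16s} -> {c_name(dist):32s} ({flag})")
```

With a prefix, even t and f become long, distinctive identifiers, which is the
property the proposal is after.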

Before worrying about naming details, can we start with "what should be in
the C/Cython API"? If I look through the current pxd files, there's a lot
there that looks like it should be private, and what we expose as Python
API is not all present as far as I can tell (which may be fine, if the only
goal is to let people write new generators rather than use the existing
ones from Cython without the Python overhead).

In the end we want to get to a doc section similar to
http://scipy.github.io/devdocs/special.cython_special.html I'd think.

> 3. Should the legacy C functions be part of the API -- these are mostly the
> ones that produce or depend on polar transform normals (Box-Muller). I have
> a feeling no, but there may be reasons to prefer BM since they do not
> depend on rejection sampling.
>
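
For readers unfamiliar with the polar transform mentioned in question 3, a
rough sketch follows. It is a plain-Python illustration of the Marsaglia polar
variant of Box-Muller built on the standard library's random module, not a
copy of NumPy's legacy C code; the function name polar_normal_pair is
hypothetical.

```python
import math
import random


def polar_normal_pair(rng):
    """Return two independent standard normal draws (Marsaglia polar method)."""
    while True:
        # Draw a point uniformly from the square [-1, 1) x [-1, 1).
        x1 = 2.0 * rng.random() - 1.0
        x2 = 2.0 * rng.random() - 1.0
        r2 = x1 * x1 + x2 * x2
        # Keep only points strictly inside the unit circle, excluding the origin.
        if 0.0 < r2 < 1.0:
            break
    f = math.sqrt(-2.0 * math.log(r2) / r2)
    return f * x1, f * x2


rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [z for _ in range(5000) for z in polar_normal_pair(rng)]
mean = sum(samples) / len(samples)
var = sum((z - mean) ** 2 for z in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

Each call yields two normals, which is why implementations of this method
typically cache the second value for the next request.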

Even if a couple of users were interested, it would be odd to start doing
this after deeming the code "legacy". So I agree with your "no".

> 4. Should low-level API be consumable like any other numpy C API by
> including the usual header locations and library locations?  Right now, the
> pxd simplifies writing Cython but users have to specify the location of the
> headers and source manually.  An alternative would be to provide a function
> like np.get_include() -> np.random.get_include() that would specialize in
> random.
>

Good question. I'm not sure this is "like any other NumPy C API". We don't
provide a C API for fft, linalg or other functionality further from core
either. It's possible of course, but does it really help library authors or
end users?

Cheers,
Ralf