[Numpy-discussion] Intel random number package

Wed Oct 26 15:41:21 EDT 2016

On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor
> <jtaylor.debian at googlemail.com> wrote:
> > On 10/26/2016 06:00 PM, Julian Taylor wrote:
> >>
> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor
> >>> <jtaylor.debian at googlemail.com> wrote:
> >>>
> >>>     On 26.10.2016 06:34, Charles R Harris wrote:
> >>>     > Hi All,
> >>>     >
> >>>     > There is a proposed random number package PR now up on github:
> >>>     > https://github.com/numpy/numpy/pull/8209. It is from
> >>>     > oleksandr-pavlyk <https://github.com/oleksandr-pavlyk> and
> >>>     > implements the numpy random number package using MKL for
> >>>     > increased speed. I think we are definitely interested in the
> >>>     > improved speed, but I'm not sure numpy is the best place to put
> >>>     > the package. I'd welcome any comments on the PR itself, as well
> >>>     > as any thoughts on the best way to organize or use this work.
> >>>     > Maybe scikit-random
> >>>
> >>>
> >>> Note that this thread is a continuation of
> >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html
> >>>
> >>>
> >>>
> >>>     I'm not a fan of putting code depending on a proprietary library
> >>>     into numpy.
> >>>     This should be a standalone package which may provide the same
> >>>     interface as numpy.
> >>>
> >>>
> >>> I don't really see a problem with that in principle. Numpy can use
> >>> Intel MKL (and Accelerate) as well if it's available. It needs some
> >>> thought put into the API though - a ``numpy.random_intel`` module is
> >>> certainly not what we want.
> >>>
> >>
> >> For me there is a difference between being able to optionally use a
> >> proprietary library as an alternative to free software libraries if the
> >> user wishes to do so and offering functionality that only works with
> >> non-free software.
> >> We are providing a form of advertisement for them by allowing it (hey if
> >> you buy this black box that you cannot modify or use freely you get this
> >> neat numpy feature!).
> >>
> >> I prefer for the full functionality of numpy to stay available with a
> >> stack of community owned software, even if it may be less powerful that
> >> way.
> >
> > But then if this is really just the same random numbers numpy already
> > provides just faster, it is probably acceptable in principle. I haven't
> > actually looked at the PR yet.
>
> The RNG stream is totally different, so yeah, it can't just be a
> silent drop-in replacement like BLAS/LAPACK.
>
> The patch also adds ~10,000 lines of code; here's an example of what
> some of it looks like:
>
>     https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833
>
> I don't see how we can realistically commit to maintaining this.
>
>

FYI:  numpy already maintains code exactly like that:
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397

Perhaps the point should be that the numpy devs won't want to maintain two
nearly identical versions of that code.

Warren

> I'm also not really seeing how shipping it as part of numpy provides
> extra benefits to maintainers or users? AFAICT right now it's
> basically structured as a standalone library that's been dropped into
> the numpy source tree, and it would be just as easy to ship separately
> (or am I wrong?). And since the public API is that all the
> functionality comes from importing this specific new module
> ('numpy.random_intel'), it'd be a one-line change for users to import
> from a non-numpy namespace, like 'mkl.random' or whatever. If it were
> more integrated with the rest of numpy then the trade-offs would be
> more complicated, but in its present form this seems like an easy
> call.
>
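
The "one-line change" described above could look roughly like the sketch
below. Both module names are stand-ins chosen for illustration:
'hypothetical_mkl.random' is not a real package (the message only suggests
'mkl.random' "or whatever"), and the standard library 'random' module plays
the role of numpy.random so that the sketch runs without numpy installed.

```python
# Prefer the separately shipped, MKL-backed module when it is installed,
# otherwise fall back to the default implementation.
# ASSUMPTION: 'hypothetical_mkl.random' is an invented name; stdlib 'random'
# stands in for numpy.random.
try:
    import hypothetical_mkl.random as rnd
except ImportError:
    import random as rnd

# Caller code is the same whichever module was picked, provided both
# expose the same interface.
sample = rnd.random()
```

This only works as a drop-in if the separate package keeps the same
interface as the default one, which is the arrangement proposed earlier in
the thread.
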
> The other question is whether it could/should change to *become* more
> integrated... that's more tricky. There's been some work towards
> supporting swappable backends inside np.random, but the focus has
> mostly been on allowing new core generators, and this code seems to
> want to take over the whole thing (core generator + distributions), so
> even once the swappable backends stuff is working I'm not sure it
> would be relevant here. The one case I can think of
> that does seem promising is that if we get an API for users to say "I
> don't care about stream compatibility, just give me un-reproducible
> variates as fast as you can", then it might make sense for that to
> silently use MKL if available -- this would be pretty analogous to the
> use of MKL in np.linalg. But we don't have that API yet, and I'm not
> sure how the MKL fallback could be maintainably implemented given that
> it would require somehow swapping the entire RandomState
> implementation. It's also entirely possible that once we figure out
> solutions to those problems, it'd still make sense for the actual MKL
> wrappers to live in a third-party library that numpy imports.
>
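
To make the "I don't care about stream compatibility" idea above concrete,
here is a minimal sketch of how such a dispatch might look. It is not an
existing numpy API. The function name, the 'hypothetical_fast_rng' module,
and its uniform(n) call are all assumptions made for illustration, and the
standard library generator stands in for RandomState.

```python
import importlib
import random


def _load_accelerated():
    """Return the optional fast backend, or None if it is not installed.

    ASSUMPTION: 'hypothetical_fast_rng' is an invented module name.
    """
    try:
        return importlib.import_module("hypothetical_fast_rng")
    except ImportError:
        return None


_ACCEL = _load_accelerated()


def uniform(n, seed=None):
    """Return n floats in [0, 1).

    A seed asks for a reproducible stream, so the reference generator is
    always used. seed=None means the caller does not care about stream
    compatibility, so the fast backend is used silently when present.
    """
    if seed is None and _ACCEL is not None:
        # ASSUMPTION: the backend exposes uniform(n) returning an iterable.
        return list(_ACCEL.uniform(n))
    rng = random.Random(seed)  # seed=None seeds from OS entropy
    return [rng.random() for _ in range(n)]
```

Seeded calls never touch the optional backend, so their output stays the
same whether or not it is installed. Only unseeded calls may be redirected.
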
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org