Re: [Numpy-discussion] Intel random number package

26 Oct 2016

On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser wrote:
...
On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith  wrote:
...
On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote:
...
On 10/26/2016 06:00 PM, Julian Taylor wrote:
...
On 10/26/2016 10:59 AM, Ralf Gommers wrote:
...
On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor
<jtaylor.debian@googlemail.com> wrote:
On 26.10.2016 06:34, Charles R Harris wrote:
    > Hi All,
    >
    > There is a proposed random number package PR now up on github:
    > https://github.com/numpy/numpy/pull/8209. It is from
    > oleksandr-pavlyk <https://github.com/oleksandr-pavlyk> and implements
    > the random number package using MKL for increased speed. I think
    > we are definitely interested in the improved speed, but I'm not sure
    > numpy is the best place to put the package. I'd welcome any comments on
    > the PR itself, as well as any thoughts on the best way to organize or
    > use this work. Maybe scikit-random
Note that this thread is a continuation of
https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html
I'm not a fan of putting code depending on a proprietary library into
numpy. This should be a standalone package which may provide the same
interface as numpy.
I don't really see a problem with that in principle. Numpy can use Intel
MKL (and Accelerate) as well if it's available. It needs some thought
put into the API though - a ``numpy.random_intel`` module is certainly
not what we want.
For me there is a difference between being able to optionally use a
proprietary library as an alternative to free software libraries if the
user wishes to do so and offering functionality that only works with
non-free software.
We are providing a form of advertisement for them by allowing it (hey,
if you buy this black box that you cannot modify or use freely you get
this neat numpy feature!).
I prefer for the full functionality of numpy to stay available with a
stack of community owned software, even if it may be less powerful that
way.
But then if this is really just the same random numbers numpy already
provides just faster, it is probably acceptable in principle. I haven't
actually looked at the PR yet.
The RNG stream is totally different, so yeah, it can't just be a
silent drop-in replacement like BLAS/LAPACK.
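
To make the "different stream" point concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not code from the PR: `ToyLCG` is a hypothetical stand-in for "some other generator", and has nothing to do with MKL or with numpy's internals. The stdlib `random.Random` is a Mersenne Twister, the same family of generator numpy's `RandomState` uses.

```python
import random


class ToyLCG:
    """Hypothetical second RNG backend (assumption: not MKL, not numpy)."""

    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def random(self):
        # One step of a 32-bit linear congruential generator,
        # scaled to a float in [0, 1).
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state / 2**32


def draw(rng, n=5):
    """Draw n uniform floats from any object exposing .random()."""
    return [rng.random() for _ in range(n)]


seed = 12345
mt_a = draw(random.Random(seed))  # Mersenne Twister, first run
mt_b = draw(random.Random(seed))  # Mersenne Twister, same seed again
lcg = draw(ToyLCG(seed))          # different algorithm, same seed

# Same algorithm + same seed reproduces the stream exactly.
same_backend_reproducible = (mt_a == mt_b)
# Different algorithm + same seed yields a different stream.
cross_backend_identical = (mt_a == lcg)
print(same_backend_reproducible, cross_backend_identical)
```

Swapping BLAS/LAPACK implementations changes results only at the level of floating-point rounding, whereas swapping the generator changes every value a seeded script produces, so users who rely on seeded reproducibility would notice immediately.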
The patch also adds ~10,000 lines of code; here's an example of what
some of it looks like:
https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b52095827...
I don't see how we can realistically commit to maintaining this.
FYI:  numpy already maintains code exactly like that:
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions...
Perhaps the point should be that the numpy devs won't want to maintain two
nearly identical versions of that code.
Heh, good catch! Okay, if random_intel is a massive copy-paste of
random with modifications applied on top, then that's its own issue...
on the one hand, yeah, we definitely don't want to carry around
massive copy/paste code. OTOH, it suggests that it might be possible
to refactor the code so that common parts are shared, and this would
be a benefit to integrating random and random_intel more closely. (And
this benefit would then have to be weighed against all the other
considerations, like how much sharing there actually was,
maintainability of the remaining random_intel-specific bits, the
desire to keep numpy free-and-open, etc.) Hard to make that call just
from skimming a 10,000 line patch, though...
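
One possible shape for such a refactoring, sketched in plain Python under loud assumptions: the class names `BackendA` and `BackendB` and the `uniform01` method are hypothetical, and this is not how numpy's `mtrand` or the PR is actually organized. The idea is only that distribution code is written once against a small uniform-source interface, and each backend supplies that interface.

```python
import math
import random


class BackendA:
    """Hypothetical backend wrapping the stdlib Mersenne Twister."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def uniform01(self):
        return self._rng.random()  # float in [0, 1)


class BackendB:
    """Hypothetical second backend: a toy 32-bit LCG (not MKL)."""

    def __init__(self, seed):
        self._state = seed & 0xFFFFFFFF

    def uniform01(self):
        self._state = (1664525 * self._state + 1013904223) & 0xFFFFFFFF
        return self._state / 2**32  # float in [0, 1)


def exponential(backend, scale=1.0):
    """Shared distribution code: inverse-transform sampling.

    Uses 1 - U so the argument of log lies in (0, 1].
    """
    return -scale * math.log(1.0 - backend.uniform01())


def sample(backend, n, scale=1.0):
    """Draw n exponential variates from whichever backend is passed in."""
    return [exponential(backend, scale) for _ in range(n)]


xs_a = sample(BackendA(42), 1000, scale=2.0)
xs_b = sample(BackendB(42), 1000, scale=2.0)
```

How much could really be shared this way is exactly the open question: if a backend gets its speed from generating whole arrays of a given distribution in one call, then a per-sample shared layer like this would give up much of that advantage.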

Oleksandr, or others at Intel: how much possibility do you think there
is for sharing code between random and random_intel?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org