From chris.barker at noaa.gov Sat Oct 1 14:38:16 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 1 Oct 2016 11:38:16 -0700 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Julian, This is really, really cool! I have been wanting something like this for years (over a decade? wow!), but always thought it would require hacking the interpreter to intercept operations. This is a really inspired idea, and could buy numpy a lot of performance. I'm afraid I can't say much about the implementation details -- but great work! -Chris On Fri, Sep 30, 2016 at 2:50 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 30.09.2016 23:09, josef.pktd at gmail.com wrote: > > On Fri, Sep 30, 2016 at 9:38 AM, Julian Taylor > > wrote: > >> hi, > >> Temporary arrays generated in expressions are expensive as the imply > >> extra memory bandwidth which is the bottleneck in most numpy operations. > >> For example: > >> > >> r = a + b + c > >> > >> creates the b + c temporary and then adds a to it. > >> This can be rewritten to be more efficient using inplace operations: > >> > >> r = b + c > >> r += a > > > > general question (I wouldn't understand the details even if I looked.) > > > > how is this affected by broadcasting and type promotion? > > > > Some of the main reasons that I don't like to use inplace operation in > > general is that I'm often not sure when type promotion occurs and when > > arrays expand during broadcasting. > > > > for example b + c is 1-D, a is 2-D, and r has the broadcasted shape. > > another case when I switch away from broadcasting is when b + c is int > > or bool and a is float. Thankfully, we get error messages for casting > > now. > > the temporary is only avoided when the casting follows the safe rule, so > it should be the same as what you get without inplace operations. E.g. > float32-temporary + float64 will not be converted to the unsafe float32 > += float64 which a normal inplace operations would allow. But > float64-temp + float32 is transformed. > > Currently the only broadcasting that will be transformed is temporary + > scalar value, otherwise it will only work on matching array sizes. > Though there is not really anything that prevents full broadcasting but > its not implemented yet in the PR. > > > > >> > >> This saves some memory bandwidth and can speedup the operation by 50% > >> for very large arrays or even more if the inplace operation allows it to > >> be completed completely in the cpu cache. > > > > I didn't realize the difference can be so large. That would make > > streamlining some code worth the effort. > > > > Josef > > > > > >> > >> The problem is that inplace operations are a lot less readable so they > >> are often only used in well optimized code. But due to pythons > >> refcounting semantics we can actually do some inplace conversions > >> transparently. > >> If an operand in python has a reference count of one it must be a > >> temporary so we can use it as the destination array. CPython itself does > >> this optimization for string concatenations. > >> > >> In numpy we have the issue that we can be called from the C-API directly > >> where the reference count may be one for other reasons. > >> To solve this we can check the backtrace until the python frame > >> evaluation function. If there are only numpy and python functions in > >> between that and our entry point we should be able to elide the > temporary. 
> >> > >> This PR implements this: > >> https://github.com/numpy/numpy/pull/7997 > >> > >> It currently only supports Linux with glibc (which has reliable > >> backtraces via unwinding) and maybe MacOS depending on how good their > >> backtrace is. On windows the backtrace APIs are different and I don't > >> know them but in theory it could also be done there. > >> > >> A problem is that checking the backtrace is quite expensive, so should > >> only be enabled when the involved arrays are large enough for it to be > >> worthwhile. In my testing this seems to be around 180-300KiB sized > >> arrays, basically where they start spilling out of the CPU L2 cache. > >> > >> I made a little crappy benchmark script to test this cutoff in this > branch: > >> https://github.com/juliantaylor/numpy/tree/elide-bench > >> > >> If you are interested you can run it with: > >> python setup.py build_ext -j 4 --inplace > >> ipython --profile=null check.ipy > >> > >> At the end it will plot the ratio between elided and non-elided runtime. > >> It should get larger than one around 180KiB on most cpus. > >> > >> If no one points out some flaw in the approach, I'm hoping to get this > >> into the next numpy version. > >> > >> cheers, > >> Julian > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Oct 1 15:54:01 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 1 Oct 2016 22:54:01 +0300 Subject: [Numpy-discussion] Vendorize tempita In-Reply-To: References: Message-ID: 01.10.2016 3:42 ???????????? "Charles R Harris" ???????: > > > > On Fri, Sep 30, 2016 at 10:36 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >> >> >> On Fri, Sep 30, 2016 at 10:10 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >>> >>> >>> >>> On Fri, Sep 30, 2016 at 9:48 AM, Evgeni Burovski < evgeny.burovskiy at gmail.com> wrote: >>>> >>>> On Fri, Sep 30, 2016 at 6:29 PM, Charles R Harris >>>> wrote: >>>> > >>>> > >>>> > On Fri, Sep 30, 2016 at 9:21 AM, Benjamin Root wrote: >>>> >> >>>> >> This is the first I am hearing of tempita (looks to be a templating >>>> >> language). How is it a dependency of numpy? Do I now need tempita in order >>>> >> to use numpy, or is it a build-time-only dependency? >>>> > >>>> > >>>> > Build time only. The virtue of tempita is that it can be used to generate >>>> > cython sources. We could adapt one of our current templating scripts to do >>>> > that also, but that would seem to be more work. Note that tempita is >>>> > currently included in cython, but the cython folks consider that an >>>> > implemention detail that should not be depended upon. 
>>>> > >>>> > >>>> > >>>> > Chuck >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at scipy.org >>>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> > >>>> >>>> >>>> Ideally, it's packaged in such a way that it's usable for scipy too -- >>>> at the moment it's used in scipy.sparse via Cython.Tempita + a >>>> fallback to system installed tempita if Cython.Tempita is not >>>> available (however I'm not sure that fallback is ever exercised). >>>> Since scipy needs to support numpy down to 1.8.2, a vendorized copy >>>> will not be usable for scipy for quite a while. >>>> >>>> So, it'd be great to handle it like numpydoc: to have npy_tempita as a >>>> small self-contained package with the repo under the numpy >>>> organization and include it via a git submodule. Chuck, do you think >>>> tempita would need much in terms of maintenance? >>>> >>>> To put some money where my mouth is, I can offer to do some legwork >>>> for packaging it up. >>>> >>> >>> It might be better to keep tempita and cythonize together so that the search path works out right. It is also possible that other scripts might be wanted as cythonize is currently restricted to cython files (*.pyx.in, *. pxi.in). There are two other templating scripts in numpy/distutils, and I think f2py has a dependency on one of those. >>> >>> If there is a set of tools that would be common to both scipy and numpy, having them included as a submodule would be a good idea. >>> >> >> Hmm, I suppose it just depends on where submodule is, so a npy_tempita alone would work fine. There isn't much maintenance needed if you resist the urge to refactor the code. I removed a six dependency, but that is now upstream as well. > > > There don't seem to be any objections, so I will put the current vendorization in. Evgeni, if you think it a good idea to make a repo for this and use submodules, go ahead with that. I have left out the testing infrastructure at https://github.com/gjhiggins/tempita which runs a sparse set of doctests. As long as it's being vendored into numpy/tools, I don't think there's much point in having one more copy. If any of cython.tempita, gjhiggins/tempita, and numpy/tools/npy_tempita disappears, we can reconsider adding a submodule. Thanks for working on this! Cheers, Evgeni -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 1 19:02:01 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 1 Oct 2016 17:02:01 -0600 Subject: [Numpy-discussion] Dropping sourceforge for releases. Message-ID: Hi All, Ralf has suggested dropping sourceforge as a NumPy release site. There was discussion of doing that some time back but we have not yet done it. Now that we put wheels up on PyPI for all supported architectures source forge is not needed. I note that there are still some 15,000 downloads a week from the site, so it is still used. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Sun Oct 2 09:10:45 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Sun, 2 Oct 2016 09:10:45 -0400 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Just thinking aloud, an idea I had recently takes a different approach. 
The problem with temporaries isn't so much that they exist, but rather that they keep getting malloc'ed and cleared. What if numpy kept a small LRU cache of weakref'ed temporaries? Whenever a new numpy array is requested, numpy could see if there is already one in its cache of matching size and use it. If you think about it, expressions that result in many temporaries would quite likely have many of them being the same size in memory.

Don't know how feasible it would be to implement though.

Cheers!
Ben Root

On Sat, Oct 1, 2016 at 2:38 PM, Chris Barker wrote:
> Julian,
>
> This is really, really cool!
>
> I have been wanting something like this for years (over a decade? wow!),
> but always thought it would require hacking the interpreter to intercept
> operations. This is a really inspired idea, and could buy numpy a lot of
> performance.
>
> I'm afraid I can't say much about the implementation details -- but great
> work!
>
> -Chris
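To make the cache idea above concrete, here is a rough, self-contained sketch of a small buffer pool keyed by (shape, dtype). It is only an illustration of the general approach being discussed, not numpy internals: the TempCache class and its get/release API are hypothetical, and unlike the weakref suggestion above it simply holds ordinary references to a bounded number of released scratch arrays.

    from collections import OrderedDict
    import numpy as np

    class TempCache:
        """Bounded pool of scratch arrays, keyed by (shape, dtype)."""
        def __init__(self, maxkeys=8):
            self.maxkeys = maxkeys
            self._pool = OrderedDict()           # (shape, dtype) -> list of arrays

        def get(self, shape, dtype=np.float64):
            key = (tuple(shape), np.dtype(dtype))
            bucket = self._pool.get(key)
            if bucket:
                return bucket.pop()              # reuse a previously released buffer
            return np.empty(shape, dtype=dtype)  # otherwise allocate as usual

        def release(self, arr):
            key = (arr.shape, np.dtype(arr.dtype))
            self._pool.setdefault(key, []).append(arr)
            self._pool.move_to_end(key)
            while len(self._pool) > self.maxkeys:
                self._pool.popitem(last=False)   # evict the least recently used size

    cache = TempCache()
    a = np.ones(10**6)
    b = np.ones(10**6)
    tmp = cache.get(a.shape, a.dtype)            # stands in for the "a + b" temporary
    np.add(a, b, out=tmp)
    result = tmp + 1.0                           # final result gets its own array
    cache.release(tmp)                           # scratch buffer goes back for reuse

Whether this pays off depends on how often expression temporaries really do repeat the same shape and dtype, which is exactly the open question raised above.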
From cournape at gmail.com Sun Oct 2 17:26:28 2016
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 2 Oct 2016 22:26:28 +0100
Subject: [Numpy-discussion] Dropping sourceforge for releases.
In-Reply-To: References: Message-ID:
+1 from me.

If we really need some distribution on top of github/pypi, note that bintray (https://bintray.com/) is free for OSS projects, and is a much better experience than sourceforge.

David

On Sun, Oct 2, 2016 at 12:02 AM, Charles R Harris wrote:
> Hi All,
>
> Ralf has suggested dropping sourceforge as a NumPy release site. There was
> discussion of doing that some time back but we have not yet done it. Now
> that we put wheels up on PyPI for all supported architectures source forge
> is not needed. I note that there are still some 15,000 downloads a week
> from the site, so it is still used.
>
> Thoughts?
>
> Chuck

From vincent at vincentdavis.net Sun Oct 2 19:53:32 2016
From: vincent at vincentdavis.net (Vincent Davis)
Date: Sun, 2 Oct 2016 17:53:32 -0600
Subject: [Numpy-discussion] Dropping sourceforge for releases.
In-Reply-To: References: Message-ID:
+1, I am very skeptical of anything on SourceForge, it negatively impacts my opinion of any project that requires me to download from sourceforge.

On Saturday, October 1, 2016, Charles R Harris wrote:
> Hi All,
>
> Ralf has suggested dropping sourceforge as a NumPy release site. There was
> discussion of doing that some time back but we have not yet done it.

--
Sent from mobile app.
Vincent Davis
720-301-3003

From lxx9xx at gmail.com Sun Oct 2 20:15:13 2016
From: lxx9xx at gmail.com (Hush Hush)
Date: Mon, 3 Oct 2016 09:15:13 +0900
Subject: [Numpy-discussion] automatically avoiding temporary arrays
Message-ID:
The same idea was published two years ago: http://hiperfit.dk/pdf/Doubling.pdf

From jorisvandenbossche at gmail.com Mon Oct 3 05:48:06 2016
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Mon, 3 Oct 2016 11:48:06 +0200
Subject: [Numpy-discussion] ANN: pandas v0.19.0 released
Message-ID:
Hi all,

I'm happy to announce pandas 0.19.0 has been released. This is a major release from 0.18.1 and includes a number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. See the Whatsnew file for more information.

We recommend that all users upgrade to this version.

This is the work of 5 months of development by 117 contributors. A big thank you to all contributors!

Joris

---

*What is it:*

pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

*Highlights of the 0.19.0 release include:* - New method merge_asof for asof-style time-series joining, see here - The .rolling() method is now time-series aware, see here - read_csv now supports parsing Categorical data, see here - A function union_categorical has been added for combining categoricals, see here - PeriodIndex now has its own period dtype, and changed to be more consistent with other Index classes.
See here - Sparse data structures gained enhanced support of int and bool dtypes, see here - Comparison operations with Series no longer ignores the index, see here for an overview of the API changes. - Introduction of a pandas development API for utility functions, see here . - Deprecation of Panel4D and PanelND. We recommend to represent these types of n-dimensional data with the xarray package . - Removal of the previously deprecated modules pandas.io.data, pandas.io.wb, pandas.tools.rplot. See the Whatsnew file for more information. *How to get it:* Source tarballs and windows/mac/linux wheels are available on PyPI (thanks to Christoph Gohlke for the windows wheels, and to Matthew Brett for setting up the mac/linux wheels). Conda packages are already available via the conda-forge channel (conda install pandas -c conda-forge). It will be available on the main channel shortly. *Issues:* Please report any issues on our issue tracker: https://github.com/pydata/pandas/issues *Thanks to all the contributors:* - adneu - Adrien Emery - agraboso - Alex Alekseyev - Alex Vig - Allen Riddell - Amol - Amol Agrawal - Andy R. Terrel - Anthonios Partheniou - babakkeyvani - Ben Kandel - Bob Baxley - Brett Rosen - c123w - Camilo Cota - Chris - chris-b1 - Chris Grinolds - Christian Hudon - Christopher C. Aycock - Chris Warth - cmazzullo - conquistador1492 - cr3 - Daniel Siladji - Douglas McNeil - Drewrey Lupton - dsm054 - Eduardo Blancas Reyes - Elliot Marsden - Evan Wright - Felix Marczinowski - Francis T. O?Donovan - G?bor Lipt?k - Geraint Duck - gfyoung - Giacomo Ferroni - Grant Roch - Haleemur Ali - harshul1610 - Hassan Shamim - iamsimha - Iulius Curt - Ivan Nazarov - jackieleng - Jeff Reback - Jeffrey Gerard - Jenn Olsen - Jim Crist - Joe Jevnik - John Evans - John Freeman - John Liekezer - Johnny Gill - John W. O?Brien - John Zwinck - Jordan Erenrich - Joris Van den Bossche - Josh Howes - Jozef Brandys - Kamil Sindi - Ka Wo Chen - Kerby Shedden - Kernc - Kevin Sheppard - Matthieu Brucher - Maximilian Roos - Michael Scherer - Mike Graham - Mortada Mehyar - mpuels - Muhammad Haseeb Tariq - Nate George - Neil Parley - Nicolas Bonnotte - OXPHOS - Pan Deng / Zora - Paul - Pauli Virtanen - Paul Mestemaker - Pawel Kordek - Pietro Battiston - pijucha - Piotr Jucha - priyankjain - Ravi Kumar Nimmi - Robert Gieseke - Robert Kern - Roger Thomas - Roy Keyes - Russell Smith - Sahil Dua - Sanjiv Lobo - Sa?o Stanovnik - Shawn Heide - sinhrks - Sinhrks - Stephen Kappel - Steve Choi - Stewart Henderson - Sudarshan Konge - Thomas A Caswell - Tom Augspurger - Tom Bird - Uwe Hoffmann - wcwagner - WillAyd - Xiang Zhang - Yadunandan - Yaroslav Halchenko - YG-Riku - Yuichiro Kaneko - yui-knk - zhangjinjie - znmean - ????Yan Facai? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Oct 3 06:16:48 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 3 Oct 2016 12:16:48 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> the problem with this approach is that we don't really want numpy hogging on to hundreds of megabytes of memory by default so it would need to be a user option. A context manager could work too but it would probably lead to premature optimization. 
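As a purely hypothetical illustration of what such a "user option" might look like, the sketch below wraps an opt-in limit in a context manager. The names set_temp_cache_limit and temp_cache do not exist in numpy; the point is only that callers, not the library, would decide how much memory may be retained for reuse.

    from contextlib import contextmanager

    _temp_cache_limit = 0                    # default: retain nothing, never hog memory

    def set_temp_cache_limit(nbytes):
        """Allow up to nbytes of released scratch buffers to be kept around."""
        global _temp_cache_limit
        _temp_cache_limit = int(nbytes)

    @contextmanager
    def temp_cache(nbytes):
        """Enable buffer reuse only for the duration of a block."""
        previous = _temp_cache_limit
        set_temp_cache_limit(nbytes)
        try:
            yield
        finally:
            set_temp_cache_limit(previous)   # drop back to the old limit afterwards

    # with temp_cache(256 * 2**20):
    #     r = a + b + c                      # expressions inside may reuse cached buffers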
Very new Linux versions (4.6+) now finally support MADV_FREE which gives memory back to the system but does not require refaulting it if nothing else needed it. So this might be an option now. But libc implementations will probably use that at some point too and then numpy doesn't need to do this.

On 02.10.2016 15:10, Benjamin Root wrote:
> Just thinking aloud, an idea I had recently takes a different approach.
> What if numpy kept a small LRU cache of weakref'ed temporaries? Whenever
> a new numpy array is requested, numpy could see if there is already one
> in its cache of matching size and use it.
From chris.barker at noaa.gov Mon Oct 3 14:23:53 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 3 Oct 2016 11:23:53 -0700
Subject: [Numpy-discussion] automatically avoiding temporary arrays
In-Reply-To: <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com>
References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com>
Message-ID:
On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor wrote:
> the problem with this approach is that we don't really want numpy
> hogging on to hundreds of megabytes of memory by default so it would
> need to be a user option.

indeed -- but one could set an LRU cache to be very small (few items, not small memory), and then it would get used within expressions, but not hold on to much outside of expressions.

However, is the allocation the only (or even biggest) source of the performance hit?

If you generate a temporary as a result of an operation, rather than doing it in-place, that temporary needs to be allocated, but it also means that an additional array needs to be pushed through the processor -- and that can make a big performance difference too.

I'm not entirely sure how to profile this correctly, but this seems to indicate that the allocation is cheap compared to the operations (for a million-element array):

* Regular old temporary creation

In [24]: def f1(arr1, arr2):
    ...:     result = arr1 + arr2
    ...:     return result

In [26]: %timeit f1(arr1, arr2)
1000 loops, best of 3: 1.13 ms per loop

* Completely in-place, no allocation of an extra array

In [27]: def f2(arr1, arr2):
    ...:     arr1 += arr2
    ...:     return arr1

In [28]: %timeit f2(arr1, arr2)
1000 loops, best of 3: 755 µs per loop

So that's about 30% faster

* allocate a temporary that isn't used -- but should catch the creation cost

In [29]: def f3(arr1, arr2):
    ...:     result = np.empty_like(arr1)
    ...:     arr1 += arr2
    ...:     return arr1

In [30]: %timeit f3(arr1, arr2)
1000 loops, best of 3: 756 µs per loop

only a µs slower!

Profiling is hard, and I'm not good at it, but this seems to indicate that the allocation is cheap.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From jtaylor.debian at googlemail.com Mon Oct 3 14:43:16 2016
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Mon, 3 Oct 2016 20:43:16 +0200
Subject: [Numpy-discussion] automatically avoiding temporary arrays
In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com>
Message-ID: <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com>
On 03.10.2016 20:23, Chris Barker wrote:
> On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor wrote:
>
>     the problem with this approach is that we don't really want numpy
>     hogging on to hundreds of megabytes of memory by default so it would
>     need to be a user option.
>
> indeed -- but one could set an LRU cache to be very small (few items,
> not small memory), and then it would get used within expressions, but not
> hold on to much outside of expressions.

numpy doesn't see the whole expression so we can't really do much.
(technically we could in 3.5 by using pep 523, but that would be a larger undertaking) > > However, is the allocation the only (Or even biggest) source of the > performance hit? > on large arrays the allocation is insignificant. What does cost some time is faulting the memory into the process which implies writing zeros into the pages (a page at a time as it is being used). By storing memory blocks in numpy we would save this portion. This is really the job of the libc, but these are usually tuned for general purpose workloads and thus tend to give back memory back to the system much earlier than numerical workloads would like. Note that numpy already has a small memory block cache but its only used for very small arrays where the allocation cost itself is significant, it is limited to a couple megabytes at most. From ben.v.root at gmail.com Mon Oct 3 15:07:28 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 3 Oct 2016 15:07:28 -0400 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com> References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com> Message-ID: With regards to arguments about holding onto large arrays, I would like to emphasize that my original suggestion mentioned weakref'ed numpy arrays. Essentially, the idea is to claw back only the raw memory blocks during that limbo period between discarding the numpy array python object and when python garbage-collects it. Ben Root On Mon, Oct 3, 2016 at 2:43 PM, Julian Taylor wrote: > On 03.10.2016 20:23, Chris Barker wrote: > > > > > > On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor > > > > > wrote: > > > > the problem with this approach is that we don't really want numpy > > hogging on to hundreds of megabytes of memory by default so it would > > need to be a user option. > > > > > > indeed -- but one could set an LRU cache to be very small (few items, > > not small memory), and then it get used within expressions, but not hold > > on to much outside of expressions. > > numpy doesn't see the whole expression so we can't really do much. > (technically we could in 3.5 by using pep 523, but that would be a > larger undertaking) > > > > > However, is the allocation the only (Or even biggest) source of the > > performance hit? > > > > on large arrays the allocation is insignificant. What does cost some > time is faulting the memory into the process which implies writing zeros > into the pages (a page at a time as it is being used). > By storing memory blocks in numpy we would save this portion. This is > really the job of the libc, but these are usually tuned for general > purpose workloads and thus tend to give back memory back to the system > much earlier than numerical workloads would like. > > Note that numpy already has a small memory block cache but its only used > for very small arrays where the allocation cost itself is significant, > it is limited to a couple megabytes at most. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Mon Oct 3 15:33:57 2016 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 3 Oct 2016 19:33:57 +0000 (UTC) Subject: [Numpy-discussion] automatically avoiding temporary arrays References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com> Message-ID: Mon, 03 Oct 2016 15:07:28 -0400, Benjamin Root kirjoitti: > With regards to arguments about holding onto large arrays, I would like > to emphasize that my original suggestion mentioned weakref'ed numpy > arrays. > Essentially, the idea is to claw back only the raw memory blocks during > that limbo period between discarding the numpy array python object and > when python garbage-collects it. CPython afaik deallocates immediately when the refcount hits zero. It's relatively rare that you have arrays hanging around waiting for cycle breaking by gc. If you have them hanging around, I don't think it's possible to distinguish these from other arrays without running the gc. Note also that an "is an array in use" check probably always requires Julian's stack based hack since you cannot rely on the refcount. Pauli From charlesr.harris at gmail.com Mon Oct 3 20:23:01 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Oct 2016 18:23:01 -0600 Subject: [Numpy-discussion] Dropping sourceforge for releases. In-Reply-To: References: Message-ID: On Sun, Oct 2, 2016 at 5:53 PM, Vincent Davis wrote: > +1, I am very skeptical of anything on SourceForge, it negatively impacts > my opinion of any project that requires me to download from sourceforge. > > > On Saturday, October 1, 2016, Charles R Harris > wrote: > >> Hi All, >> >> Ralf has suggested dropping sourceforge as a NumPy release site. There >> was discussion of doing that some time back but we have not yet done it. >> Now that we put wheels up on PyPI for all supported architectures source >> forge is not needed. I note that there are still some 15,000 downloads a >> week from the site, so it is still used. >> >> Thoughts? >> >> Chuck >> > I've uploaded the NumPy 1.11.2 release to sourceforge and made a note on the summary page that that will be the last release to be found there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 3 22:15:24 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Oct 2016 20:15:24 -0600 Subject: [Numpy-discussion] NumPy 1.11.2 released Message-ID: *Hi All,* I'm pleased to announce the release of Numpy 1.11.2. This release supports Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found on PyPI. Sources are available on both PyPI and Sourceforge . Thanks to all who were involved in this release. Contributors and merged pull requests are listed below. *Contributors to v1.11.2* - Allan Haldane - Bertrand Lefebvre - Charles Harris - Julian Taylor - Lo?c Est?ve - Marshall Bockrath-Vandegrift + - Michael Seifert + - Pauli Virtanen - Ralf Gommers - Sebastian Berg - Shota Kawabuchi + - Thomas A Caswell - Valentin Valls + - Xavier Abellan Ecija + A total of 14 people contributed to this release. People with a "+" by their names contributed a patch for the first time. *Pull requests merged for v1.11.2* - #7736 : Backport 4619, BUG: many functions silently drop keepdims kwarg - #7738 : Backport 5706, ENH: add extra kwargs and update doc of many MA... 
- #7778 : DOC: Update Numpy 1.11.1 release notes. - #7793 : Backport 7515, BUG: MaskedArray.count treats negative axes incorrectly - #7816 : Backport 7463, BUG: fix array too big error for wide dtypes. - #7821 : Backport 7817, BUG: Make sure npy_mul_with_overflow_ detects... - #7824 : Backport 7820, MAINT: Allocate fewer bytes for empty arrays. - #7847 : Backport 7791, MAINT,DOC: Fix some imp module uses and update... - #7849 : Backport 7848, MAINT: Fix remaining uses of deprecated Python... - #7851 : Backport 7840, Fix ATLAS version detection - #7870 : Backport 7853, BUG: Raise RuntimeError when reloading numpy is... - #7896 : Backport 7894, BUG: construct ma.array from np.array which contains... - #7904 : Backport 7903, BUG: fix float16 type not being called due to... - #7917 : BUG: Production install of numpy should not require nose. - #7919 : Backport 7908, BLD: Fixed MKL detection for recent versions of... - #7920 : Backport #7911: BUG: fix for issue#7835 (ma.median of 1d) - #7932 : Backport 7925, Monkey-patch _msvccompile.gen_lib_option like... - #7939 : Backport 7931, BUG: Check for HAVE_LDOUBLE_DOUBLE_DOUBLE_LE in... - #7953 : Backport 7937, BUG: Guard against buggy comparisons in generic... - #7954 : Backport 7952, BUG: Use keyword arguments to initialize Extension... - #7955 : Backport 7941, BUG: Make sure numpy globals keep identity after... - #7972 : Backport 7963, BUG: MSVCCompiler grows 'lib' & 'include' env... - #7990 : Backport 7977, DOC: Create 1.11.2 release notes. - #8005 : Backport 7956, BLD: remove __NUMPY_SETUP__ from builtins at end... - #8007 : Backport 8006, DOC: Update 1.11.2 release notes. - #8010 : Backport 8008, MAINT: Remove leftover imp module imports. - #8012 : Backport 8011, DOC: Update 1.11.2 release notes. - #8020 : Backport 8018, BUG: Fixes return for np.ma.count if keepdims... - #8024 : Backport 8016, BUG: Fix numpy.ma.median. - #8031 : Backport 8030, BUG: fix np.ma.median with only one non-masked... - #8032 : Backport 8028, DOC: Update 1.11.2 release notes. - #8044 : Backport 8042, BUG: core: fix bug in NpyIter buffering with discontinuous... - #8046 : Backport 8045, DOC: Update 1.11.2 release notes. Enjoy, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Oct 4 00:33:30 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Oct 2016 21:33:30 -0700 Subject: [Numpy-discussion] NumPy 1.11.2 released In-Reply-To: References: Message-ID: On Mon, Oct 3, 2016 at 7:15 PM, Charles R Harris wrote: > Hi All, > > I'm pleased to announce the release of Numpy 1.11.2. This release supports > Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in > Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found on PyPI. > Sources are available on both PyPI and Sourceforge. > > Thanks to all who were involved in this release. Contributors and merged > pull requests are listed below. > > > Contributors to v1.11.2 > > Allan Haldane > Bertrand Lefebvre > Charles Harris > Julian Taylor > Lo?c Est?ve > Marshall Bockrath-Vandegrift + > Michael Seifert + > Pauli Virtanen > Ralf Gommers > Sebastian Berg > Shota Kawabuchi + > Thomas A Caswell > Valentin Valls + > Xavier Abellan Ecija + > > A total of 14 people contributed to this release. People with a "+" by their > names contributed a patch for the first time. 
Thanks very much for doing all the release work, congratulations on the release,

Cheers,

Matthew

From m.h.vankerkwijk at gmail.com Tue Oct 4 00:45:04 2016
From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk)
Date: Mon, 3 Oct 2016 21:45:04 -0700
Subject: [Numpy-discussion] automatically avoiding temporary arrays
In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com>
Message-ID:
Note that numpy does store some larger arrays already, in the fft module. (In fact, this was a cache of unlimited size until #7686.) It might not be bad if the same cache were used more generally. That said, if newer versions of python are offering ways of doing this better, maybe that is the best way forward. -- Marten

From evgeny.burovskiy at gmail.com Tue Oct 4 01:29:53 2016
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Tue, 4 Oct 2016 08:29:53 +0300
Subject: [Numpy-discussion] NumPy 1.11.2 released
In-Reply-To: References: Message-ID:
Thank you Chuck!

On 04.10.2016 at 5:15, Charles R Harris wrote:
> I'm pleased to announce the release of Numpy 1.11.2. This release
> supports Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions
> found in Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found
> on PyPI. Sources are available on both PyPI and Sourceforge.

From ralf.gommers at gmail.com Tue Oct 4 06:18:22 2016
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 4 Oct 2016 23:18:22 +1300
Subject: [Numpy-discussion] update on mailing list issues
Message-ID:
Hi all,

We've had a number of issues with the reliability of the mailman setup that powers the mailing lists for NumPy, SciPy and several other projects. To address that we'll start migrating to the python.org provided infrastructure, which should be much more reliable.

The full set of lists is here: https://mail.scipy.org/mailman/listinfo. Looks like we have to migrate at least:
AstroPy
IPython-dev
IPython-user
NumPy-Discussion
SciPy-Dev
SciPy-User
SciPy-organisers

Some of the other ones that are not clearly obsolete but have almost zero activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets may be useful to archive. The other ones will just be cleaned up, unless someone indicates that there's a reason to keep them around.

And a pre-emptive thanks to Didrik and Enthought for taking on the task of migrating the archives and user details.

Cheers,
Ralf

From ndbecker2 at gmail.com Tue Oct 4 06:50:01 2016
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 04 Oct 2016 06:50:01 -0400
Subject: [Numpy-discussion] update on mailing list issues
References: Message-ID:
Ralf Gommers wrote:
> Hi all,
>
> We've had a number of issues with the reliability of the mailman setup
> that powers the mailing lists for NumPy, SciPy and several other projects.
> To address that we'll start migrating to the python.org provided
> infrastructure, which should be much more reliable.

Someone will need to update gmane nntp/mail gateway then, I suppose?

From ralf.gommers at gmail.com Tue Oct 4 13:51:25 2016
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 5 Oct 2016 06:51:25 +1300
Subject: [Numpy-discussion] update on mailing list issues
In-Reply-To: References: Message-ID:
On Tue, Oct 4, 2016 at 11:50 PM, Neal Becker wrote:
> Ralf Gommers wrote:
>
> > Hi all,
> >
> > We've had a number of issues with the reliability of the mailman setup
> > that powers the mailing lists for NumPy, SciPy and several other projects.
> > To address that we'll start migrating to the python.org provided
> > infrastructure, which should be much more reliable.
> >
> > The full set of lists is here: https://mail.scipy.org/mailman/listinfo.
> > Enjoy, > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 4 06:18:22 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 4 Oct 2016 23:18:22 +1300 Subject: [Numpy-discussion] update on mailing list issues Message-ID: Hi all, We've had a number of issues with the reliability of the mailman setup that powers the mailing lists for NumPy, SciPy and several other projects. To address that we'll start migrating to the python.org provided infrastructure, which should be much more reliable. The full set of lists is here: https://mail.scipy.org/mailman/listinfo. Looks like we have to migrate at least: AstroPy IPython-dev IPython-user NumPy-Discussion SciPy-Dev SciPy-User SciPy-organisers Some of the other ones that are not clearly obsolete but have almost zero activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets may be useful to archive. The other ones will just be cleaned up, unless someone indicates that there's a reason to keep them around. And a pre-emptive thanks to Didrik and Enthought for taking on the task of migrating the archives and user details. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Oct 4 06:50:01 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 04 Oct 2016 06:50:01 -0400 Subject: [Numpy-discussion] update on mailing list issues References: Message-ID: Ralf Gommers wrote: > Hi all, > > We've had a number of issues with the reliability of the mailman setup > that powers the mailing lists for NumPy, SciPy and several other projects. > To address that we'll start migrating to the python.org provided > infrastructure, which should be much more reliable. > > The full set of lists is here: https://mail.scipy.org/mailman/listinfo. > Looks like we have to migrate at least: > AstroPy > IPython-dev > IPython-user > NumPy-Discussion > SciPy-Dev > SciPy-User > SciPy-organisers > > Some of the other ones that are not clearly obsolete but have almost zero > activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets > may be useful to archive. The other ones will just be cleaned up, unless > someone indicates that there's a reason to keep them around. > > And a pre-emptive thanks to Didrik and Enthought for taking on the task of > migrating the archives and user details. > > Cheers, > Ralf Someone will need to update gmane nntp/mail gateway then, I suppose? From ralf.gommers at gmail.com Tue Oct 4 13:51:25 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Oct 2016 06:51:25 +1300 Subject: [Numpy-discussion] update on mailing list issues In-Reply-To: References: Message-ID: On Tue, Oct 4, 2016 at 11:50 PM, Neal Becker wrote: > Ralf Gommers wrote: > > > Hi all, > > > > We've had a number of issues with the reliability of the mailman setup > > that powers the mailing lists for NumPy, SciPy and several other > projects. > > To address that we'll start migrating to the python.org provided > > infrastructure, which should be much more reliable. > > > > The full set of lists is here: https://mail.scipy.org/mailman/listinfo. 
> > Looks like we have to migrate at least: > > AstroPy > > IPython-dev > > IPython-user > > NumPy-Discussion > > SciPy-Dev > > SciPy-User > > SciPy-organisers > > > > Some of the other ones that are not clearly obsolete but have almost zero > > activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets > > may be useful to archive. The other ones will just be cleaned up, unless > > someone indicates that there's a reason to keep them around. > > > > And a pre-emptive thanks to Didrik and Enthought for taking on the task > of > > migrating the archives and user details. > > > > Cheers, > > Ralf > > Someone will need to update gmane nntp/mail gateway then, I suppose? > Thanks for the reminder. Yes, guess we need to do something there. Not just yet though, this is what I got when I looked at how to edit list details on gmane: "Not all of Gmane is back yet - We're working hard to restore everything" Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Oct 4 14:44:35 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 4 Oct 2016 11:44:35 -0700 Subject: [Numpy-discussion] NumPy 1.11.2 released In-Reply-To: References: Message-ID: I'm pleased to announce the release of Numpy 1.11.2. This release supports > Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in > Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found on PyPI. > Sources are available on both PyPI and Sourceforge > . > and on conda-forge: https://anaconda.org/conda-forge/numpy Hmm, not Windows (darn fortran an openblas!) -- but thanks for getting that up fast! And of course, thanks to all in the numpy community for getting this build out. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Oct 5 02:45:11 2016 From: srean.list at gmail.com (srean) Date: Wed, 5 Oct 2016 12:15:11 +0530 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Good discussion, but was surprised by the absence of numexpr in the discussion., given how relevant it (numexpr) is to the topic. Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? On Fri, Sep 30, 2016 at 7:08 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > hi, > Temporary arrays generated in expressions are expensive as the imply > extra memory bandwidth which is the bottleneck in most numpy operations. > For example: > > r = a + b + c > > creates the b + c temporary and then adds a to it. > This can be rewritten to be more efficient using inplace operations: > > r = b + c > r += a > > This saves some memory bandwidth and can speedup the operation by 50% > for very large arrays or even more if the inplace operation allows it to > be completed completely in the cpu cache. > > The problem is that inplace operations are a lot less readable so they > are often only used in well optimized code. But due to pythons > refcounting semantics we can actually do some inplace conversions > transparently. > If an operand in python has a reference count of one it must be a > temporary so we can use it as the destination array. 
CPython itself does > this optimization for string concatenations. > > In numpy we have the issue that we can be called from the C-API directly > where the reference count may be one for other reasons. > To solve this we can check the backtrace until the python frame > evaluation function. If there are only numpy and python functions in > between that and our entry point we should be able to elide the temporary. > > This PR implements this: > https://github.com/numpy/numpy/pull/7997 > > It currently only supports Linux with glibc (which has reliable > backtraces via unwinding) and maybe MacOS depending on how good their > backtrace is. On windows the backtrace APIs are different and I don't > know them but in theory it could also be done there. > > A problem is that checking the backtrace is quite expensive, so should > only be enabled when the involved arrays are large enough for it to be > worthwhile. In my testing this seems to be around 180-300KiB sized > arrays, basically where they start spilling out of the CPU L2 cache. > > I made a little crappy benchmark script to test this cutoff in this branch: > https://github.com/juliantaylor/numpy/tree/elide-bench > > If you are interested you can run it with: > python setup.py build_ext -j 4 --inplace > ipython --profile=null check.ipy > > At the end it will plot the ratio between elided and non-elided runtime. > It should get larger than one around 180KiB on most cpus. > > If no one points out some flaw in the approach, I'm hoping to get this > into the next numpy version. > > cheers, > Julian > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Wed Oct 5 05:46:21 2016 From: faltet at gmail.com (Francesc Alted) Date: Wed, 5 Oct 2016 11:46:21 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: 2016-10-05 8:45 GMT+02:00 srean : > Good discussion, but was surprised by the absence of numexpr in the > discussion., given how relevant it (numexpr) is to the topic. > > Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? > Yes, the question about merging numexpr into numpy has been something that periodically shows up in this list. I think mostly everyone agree that it is a good idea, but things are not so easy, and so far nobody provided a good patch for this. Also, the fact that numexpr relies on grouping an expression by using a string (e.g. (y = ne.evaluate("x**3 + tanh(x**2) + 4")) does not play well with the way in that numpy evaluates expressions, so something should be suggested to cope with this too. > > On Fri, Sep 30, 2016 at 7:08 PM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> hi, >> Temporary arrays generated in expressions are expensive as the imply >> extra memory bandwidth which is the bottleneck in most numpy operations. >> For example: >> >> r = a + b + c >> >> creates the b + c temporary and then adds a to it. >> This can be rewritten to be more efficient using inplace operations: >> >> r = b + c >> r += a >> >> This saves some memory bandwidth and can speedup the operation by 50% >> for very large arrays or even more if the inplace operation allows it to >> be completed completely in the cpu cache. 
>> >> The problem is that inplace operations are a lot less readable so they >> are often only used in well optimized code. But due to pythons >> refcounting semantics we can actually do some inplace conversions >> transparently. >> If an operand in python has a reference count of one it must be a >> temporary so we can use it as the destination array. CPython itself does >> this optimization for string concatenations. >> >> In numpy we have the issue that we can be called from the C-API directly >> where the reference count may be one for other reasons. >> To solve this we can check the backtrace until the python frame >> evaluation function. If there are only numpy and python functions in >> between that and our entry point we should be able to elide the temporary. >> >> This PR implements this: >> https://github.com/numpy/numpy/pull/7997 >> >> It currently only supports Linux with glibc (which has reliable >> backtraces via unwinding) and maybe MacOS depending on how good their >> backtrace is. On windows the backtrace APIs are different and I don't >> know them but in theory it could also be done there. >> >> A problem is that checking the backtrace is quite expensive, so should >> only be enabled when the involved arrays are large enough for it to be >> worthwhile. In my testing this seems to be around 180-300KiB sized >> arrays, basically where they start spilling out of the CPU L2 cache. >> >> I made a little crappy benchmark script to test this cutoff in this >> branch: >> https://github.com/juliantaylor/numpy/tree/elide-bench >> >> If you are interested you can run it with: >> python setup.py build_ext -j 4 --inplace >> ipython --profile=null check.ipy >> >> At the end it will plot the ratio between elided and non-elided runtime. >> It should get larger than one around 180KiB on most cpus. >> >> If no one points out some flaw in the approach, I'm hoping to get this >> into the next numpy version. >> >> cheers, >> Julian >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Wed Oct 5 06:56:20 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Wed, 5 Oct 2016 12:56:20 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: All, On Wed, Oct 5, 2016 at 11:46 AM, Francesc Alted wrote: > 2016-10-05 8:45 GMT+02:00 srean : > >> Good discussion, but was surprised by the absence of numexpr in the >> discussion., given how relevant it (numexpr) is to the topic. >> >> Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? >> > > Yes, the question about merging numexpr into numpy has been something that > periodically shows up in this list. I think mostly everyone agree that it > is a good idea, but things are not so easy, and so far nobody provided a > good patch for this. Also, the fact that numexpr relies on grouping an > expression by using a string (e.g. (y = ne.evaluate("x**3 + tanh(x**2) + > 4")) does not play well with the way in that numpy evaluates expressions, > so something should be suggested to cope with this too. 
> As Francesc said, Numexpr is going to get most of its power through grouping a series of operations so it can send blocks to the CPU cache and run the entire series of operations on the cache before returning the block to system memory. If it was just used to back-end NumPy, it would only gain from the multi-threading portion inside each function call. I'm not sure how one would go about grouping successive numpy expressions without modifying the Python interpreter? I put a bit of effort into extending numexpr to use 4-byte word opcodes instead of 1-byte. Progress has been very slow, however, due to time constraints, but I have most of the numpy data types (u[1-4], i[1-4], f[4,8], c[8,16], S[1-4], U[1-4]). On Tuesday I finished writing a Python generator script that writes all the C-side opcode macros for opcodes.hpp. Now I have about 900 opcodes, and this could easily grow into thousands if more functions are added, so I also built a reverse lookup tree (based on collections.defaultdict) for the Python-side of numexpr. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Oct 5 07:11:15 2016 From: srean.list at gmail.com (srean) Date: Wed, 5 Oct 2016 16:41:15 +0530 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Thanks Francesc, Robert for giving me a broader picture of where this fits in. I believe numexpr does not handle slicing, so that might be another thing to look at. On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod wrote: > > As Francesc said, Numexpr is going to get most of its power through > grouping a series of operations so it can send blocks to the CPU cache and > run the entire series of operations on the cache before returning the block > to system memory. If it was just used to back-end NumPy, it would only > gain from the multi-threading portion inside each function call. > Is that so ? I thought numexpr also cuts down on number of temporary buffers that get filled (in other words copy operations) if the same expression was written as series of operations. My understanding can be wrong, and would appreciate correction. The 'out' parameter in ufuncs can eliminate extra temporaries but its not composable. Right now I have to manually carry along the array where the in place operations take place. I think the goal here is to eliminate that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Wed Oct 5 08:06:06 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Wed, 5 Oct 2016 14:06:06 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: On Wed, Oct 5, 2016 at 1:11 PM, srean wrote: > Thanks Francesc, Robert for giving me a broader picture of where this fits > in. I believe numexpr does not handle slicing, so that might be another > thing to look at. > Dereferencing would be relatively simple to add into numexpr, as it would just be some getattr() calls. Personally I will add that at some point because it will clean up my code. Slicing, maybe only for continuous blocks in memory? I.e. 
imageStack[0,:,:] would be possible, but imageStack[:, ::2, ::2] would not be trivial (I think...). I seem to remember someone asked David Cooke about slicing and he said something along the lines of, "that's what Numba is for." Perhaps NumPy backended by Numba is more so what you are looking for, as it hooks into the byte compiler? The main advantage of numexpr is that a series of numpy functions in can be enclosed in ne.evaluate( "" ) and it provides a big acceleration for little programmer effort, but it's not nearly as sophisticated as Numba or PyPy. > On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod > wrote: > >> >> As Francesc said, Numexpr is going to get most of its power through >> grouping a series of operations so it can send blocks to the CPU cache and >> run the entire series of operations on the cache before returning the block >> to system memory. If it was just used to back-end NumPy, it would only >> gain from the multi-threading portion inside each function call. >> > > Is that so ? > > I thought numexpr also cuts down on number of temporary buffers that get > filled (in other words copy operations) if the same expression was written > as series of operations. My understanding can be wrong, and would > appreciate correction. > > The 'out' parameter in ufuncs can eliminate extra temporaries but its not > composable. Right now I have to manually carry along the array where the in > place operations take place. I think the goal here is to eliminate that. > The numexpr virtual machine does create temporaries where needed when it parses the abstract syntax tree for all the operations it has to do. I believe the main advantage is that the temporaries are created on the CPU cache, and not in system memory. It's certainly true that numexpr doesn't create a lot of OP_COPY operations, rather it's optimized to minimize them, so probably it's fewer ops than naive successive calls to numpy within python, but I'm unsure if there's any difference in operation count between a hand-optimized numpy with out= set and numexpr. Numexpr just does it for you. This blog post from Tim Hochberg is useful for understanding the performance advantages of blocking versus multithreading: http://www.bitsofbits.com/2014/09/21/numpy-micro-optimization-and-numexpr/ Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Thu Oct 6 05:51:07 2016 From: srean.list at gmail.com (srean) Date: Thu, 6 Oct 2016 15:21:07 +0530 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: On Wed, Oct 5, 2016 at 5:36 PM, Robert McLeod wrote: > > It's certainly true that numexpr doesn't create a lot of OP_COPY > operations, rather it's optimized to minimize them, so probably it's fewer > ops than naive successive calls to numpy within python, but I'm unsure if > there's any difference in operation count between a hand-optimized numpy > with out= set and numexpr. Numexpr just does it for you. > That was my understanding as well. If it automatically does what one could achieve by carrying the state along in the 'out' parameter, that's as good as it can get in terms removing unnecessary ops. 
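To make sure we are talking about the same thing, here is a rough sketch of what I mean by carrying the state along in 'out' (the array names and sizes are made up, none of this is taken from the PR itself):

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)
c = np.random.rand(1000000)

# plain numpy: builds a temporary for a*b, then another for (a*b) + c
r = a * b + c

# hand-optimized numpy: reuse one output buffer by threading it through out=
r = np.multiply(a, b)
np.add(r, c, out=r)

# numexpr: a single call, blocking and buffer reuse happen inside its VM
r = ne.evaluate("a * b + c")

If the elision in the PR effectively gives the middle version's memory behaviour while letting me write the first version, that is exactly what I was hoping for.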
There are other speedup opportunities of course, but that's a separate matter. > This blog post from Tim Hochberg is useful for understanding the > performance advantages of blocking versus multithreading: > > http://www.bitsofbits.com/2014/09/21/numpy-micro-optimization-and-numexpr/ > Hadnt come across that one before. Great link. Thanks. using caches and vector registers well trumps threading, unless one has a lot of data and it helps to disable hyper-threading. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 7 21:12:53 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Oct 2016 19:12:53 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. Message-ID: Hi All, The time for NumPy 1.12.0 approaches and I like to have a final decision on the treatment of integers to negative integer powers with the `**` operator. The two alternatives looked to be *Raise an error for arrays and numpy scalars, including 1 and -1 to negative powers.* *Pluses* - Backward compatible - Allows common powers to be integer, e.g., arange(3)**2 - Consistent with inplace operators - Fixes current wrong behavior. - Preserves type *Minuses* - Integer overflow - Computational inconvenience - Inconsistent with Python integers *Always return a float * *Pluses* - Computational convenience *Minuses* - Loss of type - Possible backward incompatibilities - Not applicable to inplace operators Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 7 21:38:02 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Oct 2016 21:38:02 -0400 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Fri, Oct 7, 2016 at 9:12 PM, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final decision on > the treatment of integers to negative integer powers with the `**` operator. > The two alternatives looked to be > > Raise an error for arrays and numpy scalars, including 1 and -1 to negative > powers. > > Pluses > > Backward compatible > Allows common powers to be integer, e.g., arange(3)**2 > Consistent with inplace operators > Fixes current wrong behavior. > Preserves type > > > Minuses > > Integer overflow > Computational inconvenience > Inconsistent with Python integers > > > Always return a float > > Pluses > > Computational convenience > > > Minuses > > Loss of type > Possible backward incompatibilities > Not applicable to inplace operators > > > > Thoughts? 2: +1 I'm still in favor of number 2: less buggy code and less mental gymnastics (watch out for that int, or which int do I need) (upcasting is not applicable for any inplace operators, AFAIU *=0.5 ? zz = np.arange(5) zz**(-1) zz *= 0.5 tried in >>> np.__version__ '1.9.2rc1' >>> np.__version__ '1.10.4' backwards compatibility ? ) Josef > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From alan.isaac at gmail.com Fri Oct 7 23:13:21 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Fri, 7 Oct 2016 23:13:21 -0400 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. 
In-Reply-To: References: Message-ID: On 10/7/2016 9:12 PM, Charles R Harris wrote: > *Always return a float * > /Pluses/ > * Computational convenience Is the behavior of C++11 of any relevance to the choice? http://www.cplusplus.com/reference/cmath/pow/ Alan Isaac From sole at esrf.fr Sat Oct 8 01:33:49 2016 From: sole at esrf.fr (V. Armando Sole) Date: Sat, 08 Oct 2016 07:33:49 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: Hi all, Just to have the options clear. Is the operator '**' going to be handled in any different manner than pow? Thanks. Armando From njs at pobox.com Sat Oct 8 06:40:50 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 03:40:50 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final decision on > the treatment of integers to negative integer powers with the `**` operator. > The two alternatives looked to be > > Raise an error for arrays and numpy scalars, including 1 and -1 to negative > powers. > > Pluses > > Backward compatible > Allows common powers to be integer, e.g., arange(3)**2 > Consistent with inplace operators > Fixes current wrong behavior. > Preserves type > > > Minuses > > Integer overflow > Computational inconvenience > Inconsistent with Python integers > > > Always return a float > > Pluses > > Computational convenience > > > Minuses > > Loss of type > Possible backward incompatibilities > Not applicable to inplace operators I guess I could be wrong, but I think the backwards incompatibilities are going to be *way* too severe to make option 2 possible in practice. -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Sat Oct 8 09:59:06 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Oct 2016 07:59:06 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: > On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris > wrote: > > Hi All, > > > > The time for NumPy 1.12.0 approaches and I like to have a final decision > on > > the treatment of integers to negative integer powers with the `**` > operator. > > The two alternatives looked to be > > > > Raise an error for arrays and numpy scalars, including 1 and -1 to > negative > > powers. > > > > Pluses > > > > Backward compatible > > Allows common powers to be integer, e.g., arange(3)**2 > > Consistent with inplace operators > > Fixes current wrong behavior. > > Preserves type > > > > > > Minuses > > > > Integer overflow > > Computational inconvenience > > Inconsistent with Python integers > > > > > > Always return a float > > > > Pluses > > > > Computational convenience > > > > > > Minuses > > > > Loss of type > > Possible backward incompatibilities > > Not applicable to inplace operators > > I guess I could be wrong, but I think the backwards incompatibilities > are going to be *way* too severe to make option 2 possible in > practice. > > Backwards compatibility is also a major concern for me. Here are my current thoughts - Add an fpow ufunc that always converts to float, it would not accept object arrays. - Raise errors in current power ufunc (**), for ints to negative ints. 
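Roughly, the behaviour I have in mind for those two points is the following (only a sketch to illustrate the semantics; the name of the new ufunc and the exact error are not settled, and the wrapper below is purely illustrative, not the implementation):

import numpy as np

a = np.arange(1, 4)      # array([1, 2, 3]), integer dtype

a ** 2                   # stays integer: array([1, 4, 9])
a ** -1                  # would raise an error instead of returning array([1, 0, 0])

# illustrative stand-in for the proposed float-returning ufunc
def fpow(x, y):
    return np.power(np.asarray(x, dtype=float), y)

fpow(a, -1)              # array([ 1.        ,  0.5       ,  0.33333333])

The corner cases for the existing power ufunc are spelled out below.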
The power ufunc will change in the following ways - +1, -1 to negative ints will error, currently they work - n > 1 ints to negative ints will error, currently warn and return zero - 0 to negative ints will error, they currently return the minimum integer The `**` operator currently calls the power ufunc, leave that as is for backward almost compatibility. The remaining question is numpy scalars, which we can make either compatible with Python, or with NumPy arrays. I'm leaning towards NumPy array compatibility mostly on account of type preservation and the close relationship between zero dimensionaly arrays and scalars. The fpow function could be backported to NumPy 1.11 if that would be helpful going forward. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Oct 8 11:12:31 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 08:12:31 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris wrote: > > > On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: >> >> On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > The time for NumPy 1.12.0 approaches and I like to have a final decision >> > on >> > the treatment of integers to negative integer powers with the `**` >> > operator. >> > The two alternatives looked to be >> > >> > Raise an error for arrays and numpy scalars, including 1 and -1 to >> > negative >> > powers. >> > >> > Pluses >> > >> > Backward compatible >> > Allows common powers to be integer, e.g., arange(3)**2 >> > Consistent with inplace operators >> > Fixes current wrong behavior. >> > Preserves type >> > >> > >> > Minuses >> > >> > Integer overflow >> > Computational inconvenience >> > Inconsistent with Python integers >> > >> > >> > Always return a float >> > >> > Pluses >> > >> > Computational convenience >> > >> > >> > Minuses >> > >> > Loss of type >> > Possible backward incompatibilities >> > Not applicable to inplace operators >> >> I guess I could be wrong, but I think the backwards incompatibilities >> are going to be *way* too severe to make option 2 possible in >> practice. >> > > Backwards compatibility is also a major concern for me. Here are my current > thoughts > > Add an fpow ufunc that always converts to float, it would not accept object > arrays. Maybe call it `fpower` or even `float_power`, for consistency with `power`? > Raise errors in current power ufunc (**), for ints to negative ints. > > The power ufunc will change in the following ways > > +1, -1 to negative ints will error, currently they work > n > 1 ints to negative ints will error, currently warn and return zero > 0 to negative ints will error, they currently return the minimum integer > > The `**` operator currently calls the power ufunc, leave that as is for > backward almost compatibility. The remaining question is numpy scalars, > which we can make either compatible with Python, or with NumPy arrays. I'm > leaning towards NumPy array compatibility mostly on account of type > preservation and the close relationship between zero dimensionaly arrays and > scalars. Sounds good to me. I agree that we should prioritize within-numpy consistency over consistency with Python. > The fpow function could be backported to NumPy 1.11 if that would be helpful > going forward. I'm not a big fan of this kind of backport. 
Violating the "bug-fixes-only" rule makes it hard for people to understand our release versions. And it creates the situation where people can write code that they think requires numpy 1.11 (because it works with their numpy 1.11!), but then breaks on other people's computers (because those users have 1.11.(x-1)). And if there's some reason why people aren't willing to upgrade to 1.12 for new features, then probably better to spend energy addressing those instead of on putting together 1.11-and-a-half releases. -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Sat Oct 8 14:38:08 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Oct 2016 12:38:08 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 9:12 AM, Nathaniel Smith wrote: > On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris > wrote: > > > > > > On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: > >> > >> On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > The time for NumPy 1.12.0 approaches and I like to have a final > decision > >> > on > >> > the treatment of integers to negative integer powers with the `**` > >> > operator. > >> > The two alternatives looked to be > >> > > >> > Raise an error for arrays and numpy scalars, including 1 and -1 to > >> > negative > >> > powers. > >> > > >> > Pluses > >> > > >> > Backward compatible > >> > Allows common powers to be integer, e.g., arange(3)**2 > >> > Consistent with inplace operators > >> > Fixes current wrong behavior. > >> > Preserves type > >> > > >> > > >> > Minuses > >> > > >> > Integer overflow > >> > Computational inconvenience > >> > Inconsistent with Python integers > >> > > >> > > >> > Always return a float > >> > > >> > Pluses > >> > > >> > Computational convenience > >> > > >> > > >> > Minuses > >> > > >> > Loss of type > >> > Possible backward incompatibilities > >> > Not applicable to inplace operators > >> > >> I guess I could be wrong, but I think the backwards incompatibilities > >> are going to be *way* too severe to make option 2 possible in > >> practice. > >> > > > > Backwards compatibility is also a major concern for me. Here are my > current > > thoughts > > > > Add an fpow ufunc that always converts to float, it would not accept > object > > arrays. > > Maybe call it `fpower` or even `float_power`, for consistency with `power`? > > > Raise errors in current power ufunc (**), for ints to negative ints. > > > > The power ufunc will change in the following ways > > > > +1, -1 to negative ints will error, currently they work > > n > 1 ints to negative ints will error, currently warn and return zero > > 0 to negative ints will error, they currently return the minimum integer > > > > The `**` operator currently calls the power ufunc, leave that as is for > > backward almost compatibility. The remaining question is numpy scalars, > > which we can make either compatible with Python, or with NumPy arrays. > I'm > > leaning towards NumPy array compatibility mostly on account of type > > preservation and the close relationship between zero dimensionaly arrays > and > > scalars. > > Sounds good to me. I agree that we should prioritize within-numpy > consistency over consistency with Python. > > > The fpow function could be backported to NumPy 1.11 if that would be > helpful > > going forward. > > I'm not a big fan of this kind of backport. 
Violating the > "bug-fixes-only" rule makes it hard for people to understand our > release versions. And it creates the situation where people can write > code that they think requires numpy 1.11 (because it works with their > numpy 1.11!), but then breaks on other people's computers (because > those users have 1.11.(x-1)). And if there's some reason why people > aren't willing to upgrade to 1.12 for new features, then probably > better to spend energy addressing those instead of on putting together > 1.11-and-a-half releases. > The power ufunc is updated in https://github.com/numpy/numpy/pull/8127. -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sat Oct 8 15:31:56 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sat, 8 Oct 2016 21:31:56 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: Hello, I think it should be consistent with Python3. So, it should give back a float. Best regards, Krisztian On Sat, Oct 8, 2016 at 3:12 AM, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final decision > on the treatment of integers to negative integer powers with the `**` > operator. The two alternatives looked to be > > > *Raise an error for arrays and numpy scalars, including 1 and -1 to > negative powers.* > *Pluses* > > - Backward compatible > - Allows common powers to be integer, e.g., arange(3)**2 > - Consistent with inplace operators > - Fixes current wrong behavior. > - Preserves type > > > *Minuses* > > - Integer overflow > - Computational inconvenience > - Inconsistent with Python integers > > > *Always return a float * > > *Pluses* > > - Computational convenience > > > *Minuses* > > - Loss of type > - Possible backward incompatibilities > - Not applicable to inplace operators > > > > Thoughts? > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 8 15:36:40 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Oct 2016 13:36:40 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 1:31 PM, Kriszti?n Horv?th wrote: > Hello, > > I think it should be consistent with Python3. So, it should give back a > float. > > Best regards, > Krisztian > > Can't do that and also return integers for positive powers. It isn't possible to have behavior completely compatible with python for arrays: can't have mixed type returns, can't have arbitrary precision integers. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sat Oct 8 15:43:06 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sat, 8 Oct 2016 21:43:06 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: Sorry, I was not clear enough. I meant that the second option (always float) would be more coherent with Python3. On Oct 8, 2016 9:36 PM, "Charles R Harris" wrote: On Sat, Oct 8, 2016 at 1:31 PM, Kriszti?n Horv?th wrote: > Hello, > > I think it should be consistent with Python3. 
So, it should give back a > float. > > Best regards, > Krisztian > > Can't do that and also return integers for positive powers. It isn't possible to have behavior completely compatible with python for arrays: can't have mixed type returns, can't have arbitrary precision integers. Chuck _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From sole at esrf.fr Sat Oct 8 16:40:51 2016 From: sole at esrf.fr (V. Armando Sole) Date: Sat, 08 Oct 2016 22:40:51 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Well, testing under windows 64 bit, Python 3.5.2, positive powers of integers give integers and negative powers of integers give floats. So, do you want to raise an exception when taking a negative power of an element of an array of integers? Because not doing so would be inconsistent with raising the exception when applying the same operation to the array. Clearly things are broken now (I get zeros when calculating negative powers of numpy arrays of integers others than 1), but that behavior was consistent with python itself under python 2.x because the division of two integers was an integer. That does not hold under Python 3.5 where the division of two integers is a float. You have offered either to raise an exception or to always return a float (i.e. even with positive exponents). You have never offered to be consistent with what Python does. This last option would be my favorite. If it cannot be implemented, then I would prefer always float. At least one would be consistent with something and we would not invent yet another convention. On 08.10.2016 21:36, Charles R Harris wrote: > On Sat, Oct 8, 2016 at 1:31 PM, Kriszti?n Horv?th > wrote: > >> Hello, >> >> I think it should be consistent with Python3. So, it should give >> back a float. >> >> Best regards, >> Krisztian > > Can't do that and also return integers for positive powers. It isn't > possible to have behavior completely compatible with python for > arrays: can't have mixed type returns, can't have arbitrary precision > integers. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Sat Oct 8 17:51:58 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 14:51:58 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> References: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Message-ID: On Sat, Oct 8, 2016 at 1:40 PM, V. Armando Sole wrote: > Well, testing under windows 64 bit, Python 3.5.2, positive powers of > integers give integers and negative powers of integers give floats. So, do > you want to raise an exception when taking a negative power of an element of > an array of integers? Because not doing so would be inconsistent with > raising the exception when applying the same operation to the array. > > Clearly things are broken now (I get zeros when calculating negative powers > of numpy arrays of integers others than 1), but that behavior was consistent > with python itself under python 2.x because the division of two integers was > an integer. 
That does not hold under Python 3.5 where the division of two > integers is a float. Even on Python 2, negative powers gave floats: >>> sys.version_info sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0) >>> 2 ** -2 0.25 > You have offered either to raise an exception or to always return a float > (i.e. even with positive exponents). You have never offered to be consistent > with what Python does. This last option would be my favorite. If it cannot > be implemented, then I would prefer always float. At least one would be > consistent with something and we would not invent yet another convention. Numpy tries to be consistent with Python when it makes sense, but this is only one of several considerations. The use cases for numpy objects are different from the use cases for Python scalar objects, so we also consistently deviate in cases when that makes sense -- e.g., numpy bools are very different from Python bools (Python barely distinguishes between bools and integers, because they don't need to; indexing makes the distinction much more important to numpy), numpy integers are very different from Python integers (Python's arbitrary-width integers provide great semantics, but don't play nicely with large fixed-size arrays), numpy pays much more attention to type consistency between inputs and outputs than Python does (again because of the extra constraints imposed by working with memory-intensive type-consistent arrays), etc. For python, 2 ** 2 -> int, 2 ** -2 -> float. But numpy can't do this, because then 2 ** np.array([2, -2]) would have to be both int *and* float, which it can't be. Not a problem that Python has. Or we could say that the output is int if all the inputs are positive, and float if any of them are negative... but then that violates the numpy principle that output dtypes should be determined entirely by input dtypes, without peeking at the actual values. (And this rule is very important for avoiding nasty surprises when you run your code on new inputs.) And then there's backwards compatibility to consider. As mentioned, we *could* deviate from Python by making ** always return float... but this would almost certainly break tons and tons of people's code that is currently doing integer ** positive integer and expecting to get an integer back. Which is something we don't do without very careful weighing of the trade-offs, and my intuition is that this one is so disruptive we probably can't pull it off. Breaking working code needs a *very* compelling reason. -n -- Nathaniel J. Smith -- https://vorpus.org From saxri89 at gmail.com Sat Oct 8 18:11:49 2016 From: saxri89 at gmail.com (Xristos Xristoou) Date: Sun, 9 Oct 2016 01:11:49 +0300 Subject: [Numpy-discussion] delete pixel from the raster image with specific range value Message-ID: any idea how to delete pixel from the raster image with specific range value using numpy/scipy or gdal? for example i have a raster image with the 5 class : 1. 0-100 2. 100-200 3. 200-300 4. 300-500 5. 500-1000 and i want to delete class 1 range value or maybe i want to delete class 1,2,4,5 if i need only class 3 -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sat Oct 8 18:18:45 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sun, 9 Oct 2016 00:18:45 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. 
In-Reply-To: References: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Message-ID: but then that violates the numpy > principle that output dtypes should be determined entirely by input > dtypes, without peeking at the actual values. (And this rule is very > important for avoiding nasty surprises when you run your code on new > inputs.) > At division you get back an array of floats. >>> y = np.int64([1,2,4]) >>> y/1 array([ 1., 2., 4.]) >>> y/y array([ 1., 1., 1.]) Why is it different, if you calculate the power of something? > And then there's backwards compatibility to consider. As mentioned, we > *could* deviate from Python by making ** always return float... but > this would almost certainly break tons and tons of people's code that > is currently doing integer ** positive integer and expecting to get an > integer back. Which is something we don't do without very careful > weighing of the trade-offs, and my intuition is that this one is so > disruptive we probably can't pull it off. Breaking working code needs > a *very* compelling reason. > This is a valid reasoning. But it could be solved with raising an exception to warn the users for the new behaviour. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Oct 8 19:34:30 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 16:34:30 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Message-ID: On Sat, Oct 8, 2016 at 3:18 PM, Kriszti?n Horv?th wrote: > > > >> but then that violates the numpy >> principle that output dtypes should be determined entirely by input >> dtypes, without peeking at the actual values. (And this rule is very >> important for avoiding nasty surprises when you run your code on new >> inputs.) > > At division you get back an array of floats. > >>>> y = np.int64([1,2,4]) >>>> y/1 > array([ 1., 2., 4.]) >>>> y/y > array([ 1., 1., 1.]) > > Why is it different, if you calculate the power of something? The difference is that Python division always returns float. Python int ** int sometimes returns int and sometimes returns float, depending on which particular integers are used. We can't be consistent with Python because Python isn't consistent with itself. >> >> And then there's backwards compatibility to consider. As mentioned, we >> *could* deviate from Python by making ** always return float... but >> this would almost certainly break tons and tons of people's code that >> is currently doing integer ** positive integer and expecting to get an >> integer back. Which is something we don't do without very careful >> weighing of the trade-offs, and my intuition is that this one is so >> disruptive we probably can't pull it off. Breaking working code needs >> a *very* compelling reason. > > This is a valid reasoning. But it could be solved with raising an exception > to warn the users for the new behaviour. That is generally the best conservative strategy for making a backwards incompatible change like this: instead of going straight to the new behavior, first make it raise an error, and then once people have had time to stop depending on the old behavior, then you can add the new behavior. But in this case if we were going to make int ** int return float, this rule would mean that we have to make int ** int always raise an error for a few years, i.e. remove integer power support from numpy altogether. That's a non-starter. -n -- Nathaniel J. 
Smith -- https://vorpus.org From Permafacture at gmail.com Sat Oct 8 20:55:00 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Sat, 8 Oct 2016 19:55:00 -0500 Subject: [Numpy-discussion] delete pixel from the raster image with specific range value In-Reply-To: References: Message-ID: What do you mean delete? Set to zero or NaN? You want an (N-1) dimensional array of all the acceptable values from the N dimensional array? Elliot On Oct 8, 2016 5:11 PM, "Xristos Xristoou" wrote: > any idea how to delete pixel from the raster image with > specific range value using numpy/scipy or gdal? > > for example i have a raster image with the > 5 class : > > 1. 0-100 > 2. 100-200 > 3. 200-300 > 4. 300-500 > 5. 500-1000 > > and i want to delete class 1 range value > or maybe i want to delete class 1,2,4,5 if i need only class 3 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Oct 9 02:43:06 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 9 Oct 2016 19:43:06 +1300 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sun, Oct 9, 2016 at 4:12 AM, Nathaniel Smith wrote: > On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris > wrote: > > > > > > On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: > >> > >> On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > The time for NumPy 1.12.0 approaches and I like to have a final > decision > >> > on > >> > the treatment of integers to negative integer powers with the `**` > >> > operator. > >> > The two alternatives looked to be > >> > > >> > Raise an error for arrays and numpy scalars, including 1 and -1 to > >> > negative > >> > powers. > >> > > >> > Pluses > >> > > >> > Backward compatible > >> > Allows common powers to be integer, e.g., arange(3)**2 > >> > Consistent with inplace operators > >> > Fixes current wrong behavior. > >> > Preserves type > >> > > >> > > >> > Minuses > >> > > >> > Integer overflow > >> > Computational inconvenience > >> > Inconsistent with Python integers > >> > > >> > > >> > Always return a float > >> > > >> > Pluses > >> > > >> > Computational convenience > >> > > >> > > >> > Minuses > >> > > >> > Loss of type > >> > Possible backward incompatibilities > >> > Not applicable to inplace operators > >> > >> I guess I could be wrong, but I think the backwards incompatibilities > >> are going to be *way* too severe to make option 2 possible in > >> practice. > >> > > > > Backwards compatibility is also a major concern for me. Here are my > current > > thoughts > > > > Add an fpow ufunc that always converts to float, it would not accept > object > > arrays. > > Maybe call it `fpower` or even `float_power`, for consistency with `power`? > > > Raise errors in current power ufunc (**), for ints to negative ints. > > > > The power ufunc will change in the following ways > > > > +1, -1 to negative ints will error, currently they work > > n > 1 ints to negative ints will error, currently warn and return zero > > 0 to negative ints will error, they currently return the minimum integer > > > > The `**` operator currently calls the power ufunc, leave that as is for > > backward almost compatibility. 
The remaining question is numpy scalars, > > which we can make either compatible with Python, or with NumPy arrays. > I'm > > leaning towards NumPy array compatibility mostly on account of type > > preservation and the close relationship between zero dimensionaly arrays > and > > scalars. > > Sounds good to me. I agree that we should prioritize within-numpy > consistency over consistency with Python. > +1 sounds good to me too. > > > The fpow function could be backported to NumPy 1.11 if that would be > helpful > > going forward. > > I'm not a big fan of this kind of backport. Violating the > "bug-fixes-only" rule makes it hard for people to understand our > release versions. And it creates the situation where people can write > code that they think requires numpy 1.11 (because it works with their > numpy 1.11!), but then breaks on other people's computers (because > those users have 1.11.(x-1)). And if there's some reason why people > aren't willing to upgrade to 1.12 for new features, then probably > better to spend energy addressing those instead of on putting together > 1.11-and-a-half releases. > Agreed, this is not something we want to backport. Ralf > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sun Oct 9 06:42:44 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sun, 9 Oct 2016 12:42:44 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: > Sounds good to me. I agree that we should prioritize within-numpy > consistency over consistency with Python. > I agree with that. Because of numpy consitetncy, the `**` operator should always return float. Right now the case is: >>> aa = np.arange(2, 10, dtype=int) array([2, 3, 4, 5, 6, 7, 8, 9]) >>> bb = np.linspace(0, 7, 8, dtype=int) array([0, 1, 2, 3, 4, 5, 6, 7]) >>> 1/aa array([ 0.5 , 0.33333333, 0.25 , 0.2 , 0.16666667, 0.14285714, 0.125 , 0.11111111]) >>> aa**-1 array([0, 0, 0, 0, 0, 0, 0, 0]) >>> 1/aa**2 array([ 0.25 , 0.11111111, 0.0625 , 0.04 , 0.02777778, 0.02040816, 0.015625 , 0.01234568]) >>> aa**-2 array([0, 0, 0, 0, 0, 0, 0, 0]) >>> aa**bb array([ 1, 3, 16, 125, 1296, 16807, 262144, 4782969]) >>> 1/aa**bb array([ 1.00000000e+00, 3.33333333e-01, 6.25000000e-02, 8.00000000e-03, 7.71604938e-04, 5.94990183e-05, 3.81469727e-06, 2.09075158e-07]) >>> aa**(-bb) array([1, 0, 0, 0, 0, 0, 0, 0]) For me this behaviour is confusing. But I am not an expert just a user. I can live together with anything if I know what to expect. And I greatly appreciate the work of any developer for this excellent package. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Oct 9 09:25:11 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 09 Oct 2016 15:25:11 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: <1476019511.6762.10.camel@sipsolutions.net> On Fr, 2016-10-07 at 19:12 -0600, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final > decision on the treatment of integers to negative integer powers with > the `**` operator. 
The two alternatives looked to be > > Raise an error for arrays and numpy scalars, including 1 and -1 to > negative powers. > For what its worth, I still feel it is probably the only real option to go with error, changing to float may have weird effects. Which does not mean it is impossible, I admit, though I would like some data on how downstream would handle it. Also would we need an int power? The fpower seems more straight forward/common pattern. If errors turned out annoying in some cases, a seterr might be plausible too (as well as a deprecation). - Sebastian > Pluses > Backward compatible > Allows common powers to be integer, e.g., arange(3)**2 > Consistent with inplace operators > Fixes current wrong behavior. > Preserves type > > Minuses > Integer overflow > Computational inconvenience > Inconsistent with Python integers > > Always return a float? > > Pluses > Computational convenience > > Minuses > Loss of type > Possible backward incompatibilities > Not applicable to inplace operators > > > Thoughts? > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From shoyer at gmail.com Sun Oct 9 14:59:10 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 9 Oct 2016 11:59:10 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: <1476019511.6762.10.camel@sipsolutions.net> References: <1476019511.6762.10.camel@sipsolutions.net> Message-ID: On Sun, Oct 9, 2016 at 6:25 AM, Sebastian Berg wrote: > For what its worth, I still feel it is probably the only real option to > go with error, changing to float may have weird effects. Which does not > mean it is impossible, I admit, though I would like some data on how > downstream would handle it. Also would we need an int power? The fpower > seems more straight forward/common pattern. > If errors turned out annoying in some cases, a seterr might be > plausible too (as well as a deprecation). > I agree with Sebastian and Nathaniel. I don't think we can deviating from the existing behavior (int ** int -> int) without breaking lots of existing code, and if we did, yes, we would need a new integer power function. I think it's better to preserve the existing behavior when it gives sensible results, and error when it doesn't. Adding another function float_power for the case that is currently broken seems like the right way to go. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Mon Oct 10 12:38:37 2016 From: rmay31 at gmail.com (Ryan May) Date: Mon, 10 Oct 2016 10:38:37 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: <1476019511.6762.10.camel@sipsolutions.net> Message-ID: On Sun, Oct 9, 2016 at 12:59 PM, Stephan Hoyer wrote: > On Sun, Oct 9, 2016 at 6:25 AM, Sebastian Berg > wrote: > >> For what its worth, I still feel it is probably the only real option to >> go with error, changing to float may have weird effects. Which does not >> mean it is impossible, I admit, though I would like some data on how >> downstream would handle it. Also would we need an int power? The fpower >> seems more straight forward/common pattern. 
>> If errors turned out annoying in some cases, a seterr might be >> plausible too (as well as a deprecation). >> > > I agree with Sebastian and Nathaniel. I don't think we can deviating from > the existing behavior (int ** int -> int) without breaking lots of existing > code, and if we did, yes, we would need a new integer power function. > > I think it's better to preserve the existing behavior when it gives > sensible results, and error when it doesn't. Adding another function > float_power for the case that is currently broken seems like the right way > to go. > +1 Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Tue Oct 11 21:23:44 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 11 Oct 2016 18:23:44 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. Message-ID: > On Sun, Oct 9, 2016 at 12:59 PM, Stephan Hoyer wrote: > >> >> I agree with Sebastian and Nathaniel. I don't think we can deviating from >> the existing behavior (int ** int -> int) without breaking lots of existing >> code, and if we did, yes, we would need a new integer power function. >> >> I think it's better to preserve the existing behavior when it gives >> sensible results, and error when it doesn't. Adding another function >> float_power for the case that is currently broken seems like the right way >> to go. >> > I actually suspect that the amount of code broken by int**int->float may be relatively small (though extremely annoying for those that it happens to, and it would definitely be good to have statistics). I mean, Numpy silently transitioned to int32+uint64->float64 not so long ago which broke my code, but the world didn?t end. If the primary argument against int**int->float seems to be the difficulty of managing the transition, with int**int->Error being the seen as the required yet *very* painful intermediate step for the large fraction of the int**int users who didn?t care if it was int or float (e.g. the output is likely to be cast to float in the next step anyway), and fail loudly for those users who need int**int->int, then if you are prepared to risk a less conservative transition (i.e. we think that latter group is small enough) you could skip the error on users and just throw a warning for a couple of releases, along the lines of: WARNING int**int -> int is going to be deprecated in favour of int**int->float in Numpy 1.16. To avoid seeing this message, either use ?from numpy import __future_float_power__? or explicitly set the type of one of your inputs to float, or use the new ipower(x,y) function for integer powers. Peter From m.h.vankerkwijk at gmail.com Wed Oct 12 12:02:59 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 12 Oct 2016 12:02:59 -0400 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: I still strongly favour ending up at int**int -> float, and like Peter's suggestion of raising a general warning rather than an exception for negative powers. 
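For concreteness, the difference under discussion is easy to see with a couple of lines (illustration only; the truncating result in the comments is what current releases do, and raising an error is the other option on the table):

    import numpy as np

    a = np.arange(1, 5)                  # int64 array: [1, 2, 3, 4]

    # Integer ** negative integer currently truncates toward zero,
    # e.g. a ** -1 gives array([1, 0, 0, 0]) rather than reciprocals;
    # under the "raise an error" option it would fail instead.

    # The float result most users expect appears as soon as either operand
    # is a float, which is also what a float-returning power would give:
    print(a ** -1.0)                     # [1.  0.5  0.333...  0.25]
    print(a.astype(np.float64) ** -1)    # same values, via an explicit upcast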
-- Marten From allanhaldane at gmail.com Fri Oct 14 13:00:28 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 14 Oct 2016 13:00:28 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve Message-ID: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> Hi all, Eric Wieser has a PR which defines new functions np.ma.correlate and np.ma.convolve: https://github.com/numpy/numpy/pull/7922 We're deciding how to name the keyword arg which determines whether masked elements are "propagated" in the convolution sums. Currently we are leaning towards calling it "contagious", with default of True: def convolve(a, v, mode='full', contagious=True): Any thoughts? Cheers, Allan From sebastian at sipsolutions.net Fri Oct 14 13:08:17 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 14 Oct 2016 19:08:17 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> Message-ID: <1476464897.22194.2.camel@sipsolutions.net> On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > Hi all, > > Eric Wieser has a PR which defines new functions np.ma.correlate and > np.ma.convolve: > > https://github.com/numpy/numpy/pull/7922 > > We're deciding how to name the keyword arg which determines whether > masked elements are "propagated" in the convolution sums. Currently > we > are leaning towards calling it "contagious", with default of True: > > def convolve(a, v, mode='full', contagious=True): > > Any thoughts? > Sounds a bit overly odd to me to be honest. Just brain storming, you could think/name it the other way around maybe? Should the masked values be considered as zero/ignored?
> > - Sebastian > > > > Cheers, > > Allan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Oct 14 14:23:09 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 14 Oct 2016 14:23:09 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> Message-ID: <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> I think the possibilities that have been mentioned so far (here or in the PR) are: contagious contagious_mask propagate propagate_mask propagated `propogate_mask=False` seemed to imply that the mask would never be set, so Eric also suggested propagate_mask='any' or propagate_mask='all' I would be happy with 'propagated=False' as the name/default. As Eric pointed out, most MaskedArray functions like sum implicitly don't propagate, currently, so maybe we should do likewise here. Allan On 10/14/2016 01:44 PM, Benjamin Root wrote: > Why not "propagated"? > > On Fri, Oct 14, 2016 at 1:08 PM, Sebastian Berg > > wrote: > > On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > > Hi all, > > > > Eric Wieser has a PR which defines new functions np.ma.correlate and > > np.ma.convolve: > > > > https://github.com/numpy/numpy/pull/7922 > > > > > We're deciding how to name the keyword arg which determines whether > > masked elements are "propagated" in the convolution sums. Currently > > we > > are leaning towards calling it "contagious", with default of True: > > > > def convolve(a, v, mode='full', contagious=True): > > > > Any thoughts? > > > > Sounds a bit overly odd to me to be honest. Just brain storming, you > could think/name it the other way around maybe? Should the masked > values be considered as zero/ignored? > > - Sebastian > > > > Cheers, > > Allan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From jni.soma at gmail.com Fri Oct 14 19:49:48 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Sat, 15 Oct 2016 10:49:48 +1100 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> Message-ID: +1 for propagate_mask. That is the only proposal that immediately makes sense to me. "contagious" may be cute but I think approximately 0% of users would guess its purpose on first use. Can you elaborate on what happens with the masks exactly? 
I didn't quite get why propagate_mask=False was unintuitive. My expectation is that any mask present in the input will not be set in the output, but the mask will be "respected" by the function. On 15 Oct. 2016, 5:23 AM +1100, Allan Haldane , wrote: > I think the possibilities that have been mentioned so far (here or in > the PR) are: > > contagious > contagious_mask > propagate > propagate_mask > propagated > > `propogate_mask=False` seemed to imply that the mask would never be set, > so Eric also suggested > propagate_mask='any' or propagate_mask='all' > > > I would be happy with 'propagated=False' as the name/default. As Eric > pointed out, most MaskedArray functions like sum implicitly don't > propagate, currently, so maybe we should do likewise here. > > > Allan > > On 10/14/2016 01:44 PM, Benjamin Root wrote: > > Why not "propagated"? > > > > On Fri, Oct 14, 2016 at 1:08 PM, Sebastian Berg > > > wrote: > > > > On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > > > Hi all, > > > > > > Eric Wieser has a PR which defines new functions np.ma.correlate and > > > np.ma.convolve: > > > > > > https://github.com/numpy/numpy/pull/7922 > > > > > > > We're deciding how to name the keyword arg which determines whether > > > masked elements are "propagated" in the convolution sums. Currently > > > we > > > are leaning towards calling it "contagious", with default of True: > > > > > > def convolve(a, v, mode='full', contagious=True): > > > > > > Any thoughts? > > > > > > > Sounds a bit overly odd to me to be honest. Just brain storming, you > > could think/name it the other way around maybe? Should the masked > > values be considered as zero/ignored? > > > > - Sebastian > > > > > > > Cheers, > > > Allan > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sat Oct 15 21:21:13 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 15 Oct 2016 21:21:13 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> Message-ID: <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: > +1 for propagate_mask. That is the only proposal that immediately makes > sense to me. "contagious" may be cute but I think approximately 0% of > users would guess its purpose on first use. > > Can you elaborate on what happens with the masks exactly? I didn't quite > get why propagate_mask=False was unintuitive. My expectation is that any > mask present in the input will not be set in the output, but the mask > will be "respected" by the function. 
Here's an illustration of how the PR currently works with convolve, using the name "propagate_mask": >>> m = np.ma.masked >>> a = np.ma.array([1,1,1,m,1,1,1,m,m,m,1,1,1]) >>> b = np.ma.array([1,1,1]) >>> >>> print np.ma.convolve(a, b, propagate_mask=True) [1 2 3 -- -- -- 3 -- -- -- -- -- 3 2 1] >>> print np.ma.convolve(a, b, propagate_mask=False) [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] Allan > On 15 Oct. 2016, 5:23 AM +1100, Allan Haldane , > wrote: >> I think the possibilities that have been mentioned so far (here or in >> the PR) are: >> >> contagious >> contagious_mask >> propagate >> propagate_mask >> propagated >> >> `propogate_mask=False` seemed to imply that the mask would never be set, >> so Eric also suggested >> propagate_mask='any' or propagate_mask='all' >> >> >> I would be happy with 'propagated=False' as the name/default. As Eric >> pointed out, most MaskedArray functions like sum implicitly don't >> propagate, currently, so maybe we should do likewise here. >> >> >> Allan From klemm at phys.ethz.ch Sun Oct 16 05:52:57 2016 From: klemm at phys.ethz.ch (Hanno Klemm) Date: Sun, 16 Oct 2016 11:52:57 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: > On 16 Oct 2016, at 03:21, Allan Haldane wrote: > >> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: >> +1 for propagate_mask. That is the only proposal that immediately makes >> sense to me. "contagious" may be cute but I think approximately 0% of >> users would guess its purpose on first use. >> >> Can you elaborate on what happens with the masks exactly? I didn't quite >> get why propagate_mask=False was unintuitive. My expectation is that any >> mask present in the input will not be set in the output, but the mask >> will be "respected" by the function. > > Here's an illustration of how the PR currently works with convolve, using the name "propagate_mask": > > >>> m = np.ma.masked > >>> a = np.ma.array([1,1,1,m,1,1,1,m,m,m,1,1,1]) > >>> b = np.ma.array([1,1,1]) > >>> > >>> print np.ma.convolve(a, b, propagate_mask=True) > [1 2 3 -- -- -- 3 -- -- -- -- -- 3 2 1] > >>> print np.ma.convolve(a, b, propagate_mask=False) > [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] > > Allan > Given this behaviour, I'm actually more concerned about the logic ma.convolve uses in the propagate_mask=False case. It appears that the masked values are essentially replaced by zero. Is my interpretation correct and if so does this make sense? When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. Hanno From harrigan.matthew at gmail.com Sun Oct 16 08:47:38 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Sun, 16 Oct 2016 08:47:38 -0400 Subject: [Numpy-discussion] add elementwise addition & subtraction to einsum Message-ID: Hello, This is a follow on for issue 8139 . I propose adding elementwise addition and subtraction functionality to einsum. I love einsum as it clearly and concisely defines complex linear algebra. 
However elementwise addition is a very common linear algebra operation and einsum does not currently support it. The Einstein field equations , what the notation was originally developed to document, contain that functionality. It is fairly common in stress analysis (my background), for example see these lectures notes . Specifically I propose adding "+" and "-" characters which separate current einsum statements which are then combined elementwise. An example is A = np.einsum('ij,jk+ij,jk', B, C, D, E), which is A = B * C + D * E. I wrote a crude function to demonstrate the functionality. I believe the functionality is useful, in keeping with the spirit of a clean concise API, and doesn't break the existing API, which could warrant acceptance. Additionally I believe it opens the possibility of many interesting performance optimizations. For instance, many of the optimizations in this NEP could be done internally to the einsum function, which may be easier to accomplish given the narrower scope (but I am ignorant of all the low level C internals of numpy). The example in the beginning could become A = np.einsum('...+...+...', B, C, D). Thank you for your time and consideration. Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Mon Oct 17 13:01:14 2016 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 17 Oct 2016 19:01:14 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: Hi, Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : > When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. When estimating the autocorrelation of a signal, it make sense to drop missing pairs of values. Only in this use case, it opens the question of correcting or not correcting for the number of missing elements when computing the mean. I don't remember what R function "acf" is doing. Also, coming back to the initial question, I feel that it is necessary that the name "mask" (or "na" or similar) appears in the parameter name. Otherwise, people will wonder : "what on earth is contagious/being propagated...." just thinking of yet another keyword name : ignore_masked (or drop_masked) If I remember well, in R it is dropna. It would be nice if the boolean switch followed the same logic. Now of course the convolution function is more general than just autocorrelation... best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 837 bytes Desc: OpenPGP digital signature URL: From Shreyank.Amartya at itcinfotech.com Tue Oct 18 03:18:15 2016 From: Shreyank.Amartya at itcinfotech.com (Shreyank Amartya) Date: Tue, 18 Oct 2016 07:18:15 +0000 Subject: [Numpy-discussion] Scipy installation on Window with mingw32 Message-ID: Hi, I am trying install to theano which also requires numpy and scipy on windows 7 with mingw32 compilers. 
I have successfully installed numpy using mingw32 but however when trying to install scipy I get this error: Looking for python27.dll Building msvcr library: "c:\python27\libs\libmsvcr90.a" (from C:\Windows\win sxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\ msvcr90.dll) objdump.exe: C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0 .21022.8_none_750b37ff97f4f68b\msvcr90.dll: File format not recognized Traceback (most recent call last): File "", line 1, in File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\setup.py", line 415, in setup_package() File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\setup.py", line 411, in setup_package setup(**metadata) File "c:\python27\lib\site-packages\numpy\distutils\core.py", line 169, in setup return old_setup(**new_attr) File "c:\python27\lib\distutils\core.py", line 151, in setup dist.run_commands() File "c:\python27\lib\distutils\dist.py", line 953, in run_commands self.run_command(cmd) File "c:\python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", l ine 62, in run r = self.setuptools_run() File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", l ine 36, in setuptools_run return distutils_install.run(self) File "c:\python27\lib\distutils\command\install.py", line 563, in run self.run_command('build') File "c:\python27\lib\distutils\cmd.py", line 326, in run_command self.distribution.run_command(command) File "c:\python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "c:\python27\lib\site-packages\numpy\distutils\command\build.py", lin e 47, in run old_build.run(self) File "c:\python27\lib\distutils\command\build.py", line 127, in run self.run_command(cmd_name) File "c:\python27\lib\distutils\cmd.py", line 326, in run_command self.distribution.run_command(command) File "c:\python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 147, in run self.build_sources() File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 164, in build_sources self.build_extension_sources(ext) File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 323, in build_extension_sources sources = self.generate_sources(sources, ext) File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 376, in generate_sources source = func(extension, build_dir) File "scipy\spatial\setup.py", line 35, in get_qhull_misc_config if config_cmd.check_func('open_memstream', decl=True, call=True): File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", li ne 312, in check_func self._check_compiler() File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", li ne 39, in _check_compiler old_config._check_compiler(self) File "c:\python27\lib\distutils\command\config.py", line 102, in _check_co mpiler dry_run=self.dry_run, force=1) File "c:\python27\lib\site-packages\numpy\distutils\ccompiler.py", line 59 6, in new_compiler compiler = klass(None, dry_run, force) File "c:\python27\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 96, in __init__ msvcr_success = build_msvcr_library() File "c:\python27\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 360, in build_msvcr_library generate_def(dll_file, def_file) File "c:\python27\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 274, in 
generate_def raise ValueError("Symbol table not found") ValueError: Symbol table not found ---------------------------------------- Command "c:\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\ \users\\22193\\appdata\\local\\temp\\pip-build-d3f_pb\\scipy\\setup.py';exec(com pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __f ile__, 'exec'))" install --record c:\users\22193\appdata\local\temp\pip-wxyrfu-r ecord\install-record.txt --single-version-externally-managed --compile" failed w ith error code 1 in c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\ C:\Python27\Scripts> I realize this is due to some problem with resolving symbols in python27.dll. Is there a workaround for this? I need this to work on this setup as this is my workstation at office and I cannot install Ubuntu which would have been way easier. Things I have tried: I used to get an error while installing scipy for lapack/blas resources not found, I was able to get through by downloading and compiling them from source. I have tried to install scipy from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and it does install scipy successfully but I get a 64-bit compatibility error when I try to import theano. Please help as I'm stuck here. Thanks Shreyank Disclaimer: This communication is for the exclusive use of the intended recipient(s) and shall not attach any liability on the originator or ITC Infotech India Ltd./its Holding company/ its Subsidiaries/ its Group Companies. If you are the addressee, the contents of this e-mail are intended for your use only and it shall not be forwarded to any third party, without first obtaining written authorization from the originator or ITC Infotech India Ltd./ its Holding company/its Subsidiaries/ its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with by any third party in any manner whatsoever without the specific consent of ITC Infotech India Ltd./ its Holding company/ its Subsidiaries/ its Group Companies. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 18 05:07:04 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Oct 2016 22:07:04 +1300 Subject: [Numpy-discussion] Scipy installation on Window with mingw32 In-Reply-To: References: Message-ID: Hi, A few comments: - you really really want to use a scientific Python distribution to avoid these issues on Windows. see http://scipy.org/install.html - we used to build scipy .exe installers with mingw32 but don't do that anymore because it's just too much of a pain. IIRC the last release we did that for was 0.16.0, with the toolchain in https://github.com/numpy/numpy-vendor. - I don't recognize the error; looks not specific to recent changes in scipy so there's probably something in your environment not set up quite right. Cheers, Ralf On Tue, Oct 18, 2016 at 8:18 PM, Shreyank Amartya < Shreyank.Amartya at itcinfotech.com> wrote: > Hi, > > > > I am trying install to theano which also requires numpy and scipy on > windows 7 with mingw32 compilers. 
> > I have successfully installed numpy using mingw32 but however when trying > to install scipy I get this error: > > > > Looking for python27.dll > > Building msvcr library: "c:\python27\libs\libmsvcr90.a" (from > C:\Windows\win > > sxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_ > none_750b37ff97f4f68b\ > > msvcr90.dll) > > objdump.exe: C:\Windows\winsxs\amd64_microsoft.vc90.crt_ > 1fc8b3b9a1e18e3b_9.0 > > .21022.8_none_750b37ff97f4f68b\msvcr90.dll: File format not recognized > > Traceback (most recent call last): > > File "", line 1, in > > File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\ > setup.py", > > line 415, in > > setup_package() > > File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\ > setup.py", > > line 411, in setup_package > > setup(**metadata) > > File "c:\python27\lib\site-packages\numpy\distutils\core.py", line > 169, in > > setup > > return old_setup(**new_attr) > > File "c:\python27\lib\distutils\core.py", line 151, in setup > > dist.run_commands() > > File "c:\python27\lib\distutils\dist.py", line 953, in run_commands > > self.run_command(cmd) > > File "c:\python27\lib\distutils\dist.py", line 972, in run_command > > cmd_obj.run() > > File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", > l > > ine 62, in run > > r = self.setuptools_run() > > File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", > l > > ine 36, in setuptools_run > > return distutils_install.run(self) > > File "c:\python27\lib\distutils\command\install.py", line 563, in > run > > self.run_command('build') > > File "c:\python27\lib\distutils\cmd.py", line 326, in run_command > > self.distribution.run_command(command) > > File "c:\python27\lib\distutils\dist.py", line 972, in run_command > > cmd_obj.run() > > File "c:\python27\lib\site-packages\numpy\distutils\command\build.py", > lin > > e 47, in run > > old_build.run(self) > > File "c:\python27\lib\distutils\command\build.py", line 127, in run > > self.run_command(cmd_name) > > File "c:\python27\lib\distutils\cmd.py", line 326, in run_command > > self.distribution.run_command(command) > > File "c:\python27\lib\distutils\dist.py", line 972, in run_command > > cmd_obj.run() > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 147, in run > > self.build_sources() > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 164, in build_sources > > self.build_extension_sources(ext) > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 323, in build_extension_sources > > sources = self.generate_sources(sources, ext) > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 376, in generate_sources > > source = func(extension, build_dir) > > File "scipy\spatial\setup.py", line 35, in get_qhull_misc_config > > if config_cmd.check_func('open_memstream', decl=True, call=True): > > File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", > li > > ne 312, in check_func > > self._check_compiler() > > File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", > li > > ne 39, in _check_compiler > > old_config._check_compiler(self) > > File "c:\python27\lib\distutils\command\config.py", line 102, in > _check_co > > mpiler > > dry_run=self.dry_run, force=1) > > File "c:\python27\lib\site-packages\numpy\distutils\ccompiler.py", > line 59 > > 6, in new_compiler > > compiler = klass(None, dry_run, force) > > File 
"c:\python27\lib\site-packages\numpy\distutils\ > mingw32ccompiler.py", > > line 96, in __init__ > > msvcr_success = build_msvcr_library() > > File "c:\python27\lib\site-packages\numpy\distutils\ > mingw32ccompiler.py", > > line 360, in build_msvcr_library > > generate_def(dll_file, def_file) > > File "c:\python27\lib\site-packages\numpy\distutils\ > mingw32ccompiler.py", > > line 274, in generate_def > > raise ValueError("Symbol table not found") > > ValueError: Symbol table not found > > > > ---------------------------------------- > > Command "c:\python27\python.exe -u -c "import setuptools, > tokenize;__file__='c:\ > > \users\\22193\\appdata\\local\\temp\\pip-build-d3f_pb\\ > scipy\\setup.py';exec(com > > pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', > '\n'), __f > > ile__, 'exec'))" install --record c:\users\22193\appdata\local\ > temp\pip-wxyrfu-r > > ecord\install-record.txt --single-version-externally-managed --compile" > failed w > > ith error code 1 in c:\users\22193\appdata\local\ > temp\pip-build-d3f_pb\scipy\ > > > > C:\Python27\Scripts> > > > > I realize this is due to some problem with resolving symbols in > python27.dll. Is there a workaround for this? > > I need this to work on this setup as this is my workstation at office and > I cannot install Ubuntu which would have been way easier. > > > > Things I have tried: > > I used to get an error while installing scipy for lapack/blas resources > not found, I was able to get through by downloading and compiling them from > source. > > I have tried to install scipy from http://www.lfd.uci.edu/~ > gohlke/pythonlibs/ and it does install scipy successfully but I get a > 64-bit compatibility error when I try to import theano. > > Please help as I?m stuck here. > > > > Thanks > > Shreyank > > > Disclaimer: This communication is for the exclusive use of the intended > recipient(s) and shall not attach any liability on the originator or ITC > Infotech India Ltd./its Holding company/ its Subsidiaries/ its Group > Companies. If you are the addressee, the contents of this e-mail are > intended for your use only and it shall not be forwarded to any third > party, without first obtaining written authorization from the originator or > ITC Infotech India Ltd./ its Holding company/its Subsidiaries/ its Group > Companies. It may contain information which is confidential and legally > privileged and the same shall not be used or dealt with by any third party > in any manner whatsoever without the specific consent of ITC Infotech India > Ltd./ its Holding company/ its Subsidiaries/ its Group Companies. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 18 13:25:37 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Oct 2016 13:25:37 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On Mon, Oct 17, 2016 at 1:01 PM, Pierre Haessig wrote: > Hi, > > > Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >> When I have similar situations, I usually interpolate between the valid values. 
I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. > When estimating the autocorrelation of a signal, it make sense to drop > missing pairs of values. Only in this use case, it opens the question of > correcting or not correcting for the number of missing elements when > computing the mean. I don't remember what R function "acf" is doing. > > > Also, coming back to the initial question, I feel that it is necessary > that the name "mask" (or "na" or similar) appears in the parameter name. > Otherwise, people will wonder : "what on earth is contagious/being > propagated...." > > just thinking of yet another keyword name : ignore_masked (or drop_masked) > > If I remember well, in R it is dropna. It would be nice if the boolean > switch followed the same logic. > > Now of course the convolution function is more general than just > autocorrelation... I think "drop" or "ignore" is too generic, for correlation it would be for example ignore pairs versus ignore cases. To me propagate sounds ok to me, but something with `valid` might be more explicit for convolution or `correlate`, however `valid` also refers to the end points, so maybe valid_na or valid_masked=True Josef > > best, > Pierre > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Oct 18 13:30:52 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Oct 2016 13:30:52 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On Tue, Oct 18, 2016 at 1:25 PM, wrote: > On Mon, Oct 17, 2016 at 1:01 PM, Pierre Haessig > wrote: >> Hi, >> >> >> Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >>> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. >> When estimating the autocorrelation of a signal, it make sense to drop >> missing pairs of values. Only in this use case, it opens the question of >> correcting or not correcting for the number of missing elements when >> computing the mean. I don't remember what R function "acf" is doing. as aside: statsmodels has now an option for acf and similar missing : str A string in ['none', 'raise', 'conservative', 'drop'] specifying how the NaNs are to be treated. Josef >> >> >> Also, coming back to the initial question, I feel that it is necessary >> that the name "mask" (or "na" or similar) appears in the parameter name. >> Otherwise, people will wonder : "what on earth is contagious/being >> propagated...." >> >> just thinking of yet another keyword name : ignore_masked (or drop_masked) >> >> If I remember well, in R it is dropna. It would be nice if the boolean >> switch followed the same logic. >> >> Now of course the convolution function is more general than just >> autocorrelation... 
> > I think "drop" or "ignore" is too generic, for correlation it would be > for example ignore pairs versus ignore cases. > > To me propagate sounds ok to me, but something with `valid` might be > more explicit for convolution or `correlate`, however `valid` also > refers to the end points, so maybe valid_na or valid_masked=True > > Josef > >> >> best, >> Pierre >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> From josef.pktd at gmail.com Tue Oct 18 13:49:13 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Oct 2016 13:49:13 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On Tue, Oct 18, 2016 at 1:30 PM, wrote: > On Tue, Oct 18, 2016 at 1:25 PM, wrote: >> On Mon, Oct 17, 2016 at 1:01 PM, Pierre Haessig >> wrote: >>> Hi, >>> >>> >>> Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >>>> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. >>> When estimating the autocorrelation of a signal, it make sense to drop >>> missing pairs of values. Only in this use case, it opens the question of >>> correcting or not correcting for the number of missing elements when >>> computing the mean. I don't remember what R function "acf" is doing. > > as aside: statsmodels has now an option for acf and similar > > missing : str > A string in ['none', 'raise', 'conservative', 'drop'] > specifying how the NaNs > are to be treated. aside to the aside: statsmodels was just catching up in this The original for masked array acf including correct counting of "valid" terms is https://github.com/pierregm/scikits.timeseries/blob/master/scikits/timeseries/lib/avcf.py (which I looked at way before statsmodels had any acf) Josef > > Josef > >>> >>> >>> Also, coming back to the initial question, I feel that it is necessary >>> that the name "mask" (or "na" or similar) appears in the parameter name. >>> Otherwise, people will wonder : "what on earth is contagious/being >>> propagated...." >>> >>> just thinking of yet another keyword name : ignore_masked (or drop_masked) >>> >>> If I remember well, in R it is dropna. It would be nice if the boolean >>> switch followed the same logic. >>> >>> Now of course the convolution function is more general than just >>> autocorrelation... >> >> I think "drop" or "ignore" is too generic, for correlation it would be >> for example ignore pairs versus ignore cases. 
>> >> To me propagate sounds ok to me, but something with `valid` might be >> more explicit for convolution or `correlate`, however `valid` also >> refers to the end points, so maybe valid_na or valid_masked=True >> >> Josef >> >>> >>> best, >>> Pierre >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> From allanhaldane at gmail.com Tue Oct 18 18:37:56 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 18 Oct 2016 18:37:56 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On 10/17/2016 01:01 PM, Pierre Haessig wrote: > Hi, > > > Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. > When estimating the autocorrelation of a signal, it make sense to drop > missing pairs of values. Only in this use case, it opens the question of > correcting or not correcting for the number of missing elements when > computing the mean. I don't remember what R function "acf" is doing. > > > Also, coming back to the initial question, I feel that it is necessary > that the name "mask" (or "na" or similar) appears in the parameter name. > Otherwise, people will wonder : "what on earth is contagious/being > propagated...." > > just thinking of yet another keyword name : ignore_masked (or drop_masked) > > If I remember well, in R it is dropna. It would be nice if the boolean > switch followed the same logic. There is an old unimplemented NEP which uses similar language, like "ignorena", and np.NA. http://docs.scipy.org/doc/numpy/neps/missing-data.html But right now that isn't part of numpy, so I think it would be confusing to use that terminology. Allan From allanhaldane at gmail.com Tue Oct 18 18:49:16 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 18 Oct 2016 18:49:16 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On 10/16/2016 05:52 AM, Hanno Klemm wrote: > > >> On 16 Oct 2016, at 03:21, Allan Haldane wrote: >> >>> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: >>> +1 for propagate_mask. That is the only proposal that immediately makes >>> sense to me. "contagious" may be cute but I think approximately 0% of >>> users would guess its purpose on first use. >>> >>> Can you elaborate on what happens with the masks exactly? I didn't quite >>> get why propagate_mask=False was unintuitive. My expectation is that any >>> mask present in the input will not be set in the output, but the mask >>> will be "respected" by the function. 
>> >> Here's an illustration of how the PR currently works with convolve, using the name "propagate_mask": >> >> >>> m = np.ma.masked >> >>> a = np.ma.array([1,1,1,m,1,1,1,m,m,m,1,1,1]) >> >>> b = np.ma.array([1,1,1]) >> >>> >> >>> print np.ma.convolve(a, b, propagate_mask=True) >> [1 2 3 -- -- -- 3 -- -- -- -- -- 3 2 1] >> >>> print np.ma.convolve(a, b, propagate_mask=False) >> [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] >> >> Allan >> > > Given this behaviour, I'm actually more concerned about the logic ma.convolve uses in the propagate_mask=False case. It appears that the masked values are essentially replaced by zero. Is my interpretation correct and if so does this make sense? > I think that's right. Its usefulness wasn't obvious to me either, but googling shows that in matlab people like the file "nanconv.m" which works this way, using nans similarly to how the mask is used here. Just as convolution functions often add zero-padding around an image, here the mask behavior would allow you to have different borders, eg [m,m,m,1,1,1,1,m,m,m,m] using my notation from before. Octave's "nanconv" does this too. I still agree that in most cases people should be handling the missing values more carefully (manually) if they are doing convolutions, but this default behaviour maybe seems reasonable to me. Allan From allanhaldane at gmail.com Tue Oct 18 19:18:18 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 18 Oct 2016 19:18:18 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> On 10/17/2016 01:01 PM, Pierre Haessig wrote: > Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. > When estimating the autocorrelation of a signal, it make sense to drop > missing pairs of values. Only in this use case, it opens the question of > correcting or not correcting for the number of missing elements when > computing the mean. I don't remember what R function "acf" is doing. > > > Also, coming back to the initial question, I feel that it is necessary > that the name "mask" (or "na" or similar) appears in the parameter name. > Otherwise, people will wonder : "what on earth is contagious/being > propagated...." Based on feedback so far, I think "propagate_mask" sounds like the best word to use. Let's go with that. As for whether it should default to "True" or "False", the arguments I see are: * False, because that is the way most functions like `np.ma.sum` already work, as well as matlab and octave's similar "nanconv". * True, because its effects are more visible and might lead to less surprises. The "False" case seems like it is often not what the user intended. Eg, it affects the overall normalization of normalized kernels, and the choice of 0 seems arbitrary. If no one says anything, I'd probably go with True. 
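To make the two candidate defaults concrete, the behaviours can be emulated with plain NumPy; this is only a sketch of the semantics (not the PR's actual implementation), using the same kind of input as the earlier illustration:

    import numpy as np

    m = np.ma.masked
    a = np.ma.array([1., 1., 1., m, 1., 1., 1.])
    b = np.ones(3)

    # propagate_mask=False: masked entries simply contribute zero.
    data = np.convolve(a.filled(0.0), b)              # [1 2 3 2 2 2 3 2 1]

    # propagate_mask=True: every output a masked input touched is masked.
    touched = np.convolve(a.mask.astype(float), b) > 0
    result = np.ma.array(data, mask=touched)          # [1 2 3 -- -- -- 3 2 1]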
Allan From shoyer at gmail.com Tue Oct 18 19:44:03 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 18 Oct 2016 16:44:03 -0700 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> Message-ID: On Tue, Oct 18, 2016 at 4:18 PM, Allan Haldane wrote: > As for whether it should default to "True" or "False", the arguments I > see are: > > * False, because that is the way most functions like `np.ma.sum` > already work, as well as matlab and octave's similar "nanconv". > > * True, because its effects are more visible and might lead to less > surprises. The "False" case seems like it is often not what the user > intended. Eg, it affects the overall normalization of normalized > kernels, and the choice of 0 seems arbitrary. > > If no one says anything, I'd probably go with True > I also have serious concerns about if it ever actually makes sense to use `propagate_mask=False`. So, I think it's definitely appropriate to default to `propagate_mask=True`. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Wed Oct 19 04:10:18 2016 From: pierre.haessig at crans.org (Pierre Haessig) Date: Wed, 19 Oct 2016 10:10:18 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> Message-ID: <27f8ecde-e3a3-db74-9b2f-333a85b6ba78@crans.org> Le 19/10/2016 ? 01:18, Allan Haldane a ?crit : > Based on feedback so far, I think "propagate_mask" sounds like the best > word to use. Let's go with that. > > As for whether it should default to "True" or "False", the arguments I > see are: > > * False, because that is the way most functions like `np.ma.sum` > already work, as well as matlab and octave's similar "nanconv". > > * True, because its effects are more visible and might lead to less > surprises. The "False" case seems like it is often not what the user > intended. Eg, it affects the overall normalization of normalized > kernels, and the choice of 0 seems arbitrary. > > If no one says anything, I'd probably go with True. Sounds good! Pierre From charlesr.harris at gmail.com Thu Oct 20 13:16:03 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Oct 2016 11:16:03 -0600 Subject: [Numpy-discussion] assert_allclose equal_nan default value. Message-ID: Hi All, Just a heads up that there is a PR changing the default value of `equal_nan` to `True` in the `assert_allclose` test function. The `equal_nan` argument was previously ineffective due to a bug that has recently been fixed. The current default value of `False` is not backward compatible and causes test failures in scipy. See the extended argument at https://github.com/numpy/numpy/pull/8184. I think this change is the right thing to do but want to make sure everyone is aware of it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan12343 at gmail.com Thu Oct 20 13:18:52 2016 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Thu, 20 Oct 2016 12:18:52 -0500 Subject: [Numpy-discussion] assert_allclose equal_nan default value. In-Reply-To: References: Message-ID: Agreed, especially given the prevalence of using this function in downstream test suites: https://github.com/search?utf8=%E2%9C%93&q=numpy+assert_allclose&type=Code&ref=searchresults On Thu, Oct 20, 2016 at 12:16 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > Just a heads up that there is a PR changing the default value of > `equal_nan` to `True` in the `assert_allclose` test function. The > `equal_nan` argument was previously ineffective due to a bug that has > recently been fixed. The current default value of `False` is not backward > compatible and causes test failures in scipy. See the extended argument at > https://github.com/numpy/numpy/pull/8184. I think this change is the > right thing to do but want to make sure everyone is aware of it. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Oct 20 13:21:40 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 20 Oct 2016 13:21:40 -0400 Subject: [Numpy-discussion] assert_allclose equal_nan default value. In-Reply-To: References: Message-ID: +1. I was almost always setting it to True anyway. On Thu, Oct 20, 2016 at 1:18 PM, Nathan Goldbaum wrote: > Agreed, especially given the prevalence of using this function in > downstream test suites: > > https://github.com/search?utf8=%E2%9C%93&q=numpy+assert_ > allclose&type=Code&ref=searchresults > > On Thu, Oct 20, 2016 at 12:16 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Just a heads up that there is a PR changing the default value of >> `equal_nan` to `True` in the `assert_allclose` test function. The >> `equal_nan` argument was previously ineffective due to a bug that has >> recently been fixed. The current default value of `False` is not backward >> compatible and causes test failures in scipy. See the extended argument at >> https://github.com/numpy/numpy/pull/8184. I think this change is the >> right thing to do but want to make sure everyone is aware of it. >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Oct 20 16:38:27 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 20 Oct 2016 16:38:27 -0400 Subject: [Numpy-discussion] assert_allclose equal_nan default value. In-Reply-To: References: Message-ID: Good, that means I can revert some changes to astropy, which made the tests less readable. 
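In concrete terms the change means that matching NaNs no longer trip the assertion by default; a quick sketch (assuming a NumPy in which the keyword is actually honoured, i.e. after the bug fix mentioned above):

    import numpy as np
    from numpy.testing import assert_allclose

    a = np.array([1.0, np.nan, 3.0])
    b = np.array([1.0, np.nan, 3.0])

    assert_allclose(a, b, equal_nan=True)     # passes: NaNs compare as equal

    try:
        assert_allclose(a, b, equal_nan=False)
    except AssertionError:
        print("fails: nan != nan when equal_nan is False")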
-- Marten From rays at blue-cove.com Thu Oct 20 18:25:49 2016 From: rays at blue-cove.com (R Schumacher) Date: Thu, 20 Oct 2016 15:25:49 -0700 Subject: [Numpy-discussion] invalid value treatment, in filter_design Message-ID: <201610202226.u9KMQ4IO008573@blue-cove.com> In an attempt to computationally invert the effect of an analog RC filter on a data set and reconstruct the "true" signal, a co-worker suggested: "Mathematically, you just reverse the a and b parameters. Then the zeros become the poles, but if the new poles are not inside the unit circle, the filter is not stable." So, to "stabilize" the poles' issue seen, I test for the DIV/0 error and set it to 2./N+0.j in scipy/signal/filter_design.py ~ line 244 d = polyval(a[::-1], zm1) if d[0]==0.0+0.j: d[0] = 2./N+0.j h = polyval(b[::-1], zm1) / d - Question is, is this a mathematically valid treatment? - Is there a better way to invert a Butterworth filter, or work with the DIV/0 that occurs without modifying the signal library? - Should I post to *-users instead? I noted d[0] > 2./N+0.j makes the zero bin result spike low; 2/N gives a reasonable "extension" of the response curve. This whole tweak causes a zero offset however, which I remove. An example attached... Ray Schumacher Programmer/Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- import numpy as np from scipy.signal import butter, lfilter, freqz import matplotlib.pyplot as plt def butter_highpass(cutoff, fs, order=5): nyq = 0.5 * fs normal_cutoff = cutoff / nyq b, a = butter(order, Wn=normal_cutoff, btype='highpass', analog=False) return b, a def butter_inv_highpass(cutoff, fs, order=5): nyq = 0.5 * fs normal_cutoff = cutoff / nyq b, a = butter(order, Wn=normal_cutoff, btype='highpass', analog=False) ## swap the components return a, b def butter_highpass_filter(data, cutoff, fs, order=5): b, a = butter_highpass(cutoff, fs, order=order) y = lfilter(b, a, data) return y def butter_inv_highpass_filter(data, cutoff, fs, order=5): b, a = butter_inv_highpass(cutoff, fs, order=1) offset = data.mean() y = lfilter(b, a, data) ## remove new offset y -= (y.mean() - offset) return y # Filter requirements. order = 1 fs = 1024.0 # sample rate, Hz cutoff = 11.6 # desired cutoff frequency of the filter, Hz nyquist = fs/2. # Get the filter coefficients so we can check its frequency response. b, a = butter_highpass(cutoff, fs, order) bi, ai = butter_inv_highpass(cutoff, fs, order) # Plot the frequency response. plt.subplot(2, 1, 1) w, h = freqz(b, a, worN=8000) plt.plot(0.5*fs*w/np.pi, np.abs(h), 'g', label='high pass resp') wi, hi = freqz(bi, ai, worN=8000) plt.plot(0.5*fs*wi/np.pi, np.abs(hi), 'r', label='inv. high pass resp') plt.plot(cutoff, 0.5*np.sqrt(2), 'ko') plt.axvline(cutoff, color='k') plt.xlim(0, 0.05*fs) plt.ylim(0, 5) plt.title("Lowpass Filter Frequency Response") plt.xlabel('Frequency [Hz]') # add the legend in the middle of the plot leg = plt.legend(fancybox=True) # set the alpha value of the legend: it will be translucent leg.get_frame().set_alpha(0.5) plt.subplots_adjust(hspace=0.35) plt.grid() # Demonstrate the use of the filter. # First make some data to be filtered. T = 5.0 # seconds n = int(T * fs) # total number of samples t = np.linspace(0, T, n, endpoint=False) # "Noisy" data. We want to recover the 1.2 Hz signal from this. data = np.sin(1.2*2*np.pi*t)# + 1.5*np.cos(9*2*np.pi*t) + 0.5*np.sin(12.0*2*np.pi*t) # Filter the data, and plot both the original and filtered signals. 
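# Round trip: apply the high-pass filter, then the "inverse" filter built by
# swapping (b, a) as discussed in the message above.  The original zeros
# become the inverse filter's poles, so the swapped filter is only stable if
# those zeros lie inside the unit circle; presumably this is also why freqz
# divides by zero at the zero-frequency bin and why a DC offset has to be
# removed inside butter_inv_highpass_filter.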
y = butter_highpass_filter(data, cutoff, fs, order) yi = butter_inv_highpass_filter(y, cutoff, fs, order) plt.subplot(2, 1, 2) plt.plot(t, data, 'b-', label='1.2Hz "real" data') plt.plot(t, y, 'g-', linewidth=2, label='blue box data') plt.plot(t, yi, 'r--', linewidth=2, label='round-trip data') plt.xlabel('Time [sec]') plt.grid() #plt.legend() # add the legend in the middle of the plot leg = plt.legend(fancybox=True) # set the alpha value of the legend: it will be translucent leg.get_frame().set_alpha(0.5) plt.subplots_adjust(hspace=0.35) plt.show() From charlesr.harris at gmail.com Thu Oct 20 22:58:06 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Oct 2016 20:58:06 -0600 Subject: [Numpy-discussion] fpower ufunc Message-ID: Hi All, I've put up a preliminary PR for the proposed fpower ufunc. Apart from adding more tests and documentation, I'd like to settle a few other things. The first is the name, two names have been proposed and we should settle on one - fpower (short) - float_power (obvious) The second thing is the minimum precision. In the preliminary version I have used float32, but perhaps it makes more sense for the intended use to make the minimum precision float64 instead. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Oct 20 23:11:13 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Oct 2016 20:11:13 -0700 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: References: Message-ID: On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris wrote: > Hi All, > > I've put up a preliminary PR for the proposed fpower ufunc. Apart from > adding more tests and documentation, I'd like to settle a few other things. > The first is the name, two names have been proposed and we should settle on > one > > fpower (short) > float_power (obvious) +0.6 for float_power > The second thing is the minimum precision. In the preliminary version I have > used float32, but perhaps it makes more sense for the intended use to make > the minimum precision float64 instead. Can you elaborate on what you're thinking? I guess this is because float32 has limited range compared to float64, so is more likely to see overflow? float32 still goes up to 10**38 which is < int64_max**2, FWIW. Or maybe there's some subtlety with the int->float casting here? -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Thu Oct 20 23:38:33 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Oct 2016 21:38:33 -0600 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: References: Message-ID: On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith wrote: > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > wrote: > > Hi All, > > > > I've put up a preliminary PR for the proposed fpower ufunc. Apart from > > adding more tests and documentation, I'd like to settle a few other > things. > > The first is the name, two names have been proposed and we should settle > on > > one > > > > fpower (short) > > float_power (obvious) > > +0.6 for float_power > > > The second thing is the minimum precision. In the preliminary version I > have > > used float32, but perhaps it makes more sense for the intended use to > make > > the minimum precision float64 instead. > > Can you elaborate on what you're thinking? I guess this is because > float32 has limited range compared to float64, so is more likely to > see overflow? float32 still goes up to 10**38 which is < int64_max**2, > FWIW. 
Or maybe there's some subtlety with the int->float casting here? > logical, (u)int8, (u)int16, and float16 get converted to float32, which is probably sufficient to avoid overflow and such. My thought was that float32 is something of a "specialized" type these days, while float64 is the standard floating point precision for everyday computation. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Oct 21 03:45:11 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Oct 2016 09:45:11 +0200 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: References: Message-ID: <1477035911.18447.1.camel@sipsolutions.net> On Do, 2016-10-20 at 21:38 -0600, Charles R Harris wrote: > > > On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith > wrote: > > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > > wrote: > > > Hi All, > > > > > > I've put up a preliminary PR for the proposed fpower ufunc. Apart > > from > > > adding more tests and documentation, I'd like to settle a few > > other things. > > > The first is the name, two names have been proposed and we should > > settle on > > > one > > > > > > fpower (short) > > > float_power (obvious) > > > > +0.6 for float_power > > > > > The second thing is the minimum precision. In the preliminary > > version I have > > > used float32, but perhaps it makes more sense for the intended > > use to make > > > the minimum precision float64 instead. > > > > Can you elaborate on what you're thinking? I guess this is because > > float32 has limited range compared to float64, so is more likely to > > see overflow? float32 still goes up to 10**38 which is < > > int64_max**2, > > FWIW. Or maybe there's some subtlety with the int->float casting > > here? > logical, (u)int8, (u)int16, and float16 get converted to float32, > which is probably sufficient to avoid overflow and such. My thought > was that float32 is something of a "specialized" type these days, > while float64 is the standard floating point precision for everyday > computation. > Isn't the behaviour we already have (e.g. such as mean). ints -> float64 inexacts do not get upcast? - Sebastian > Chuck? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Fri Oct 21 04:29:30 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Oct 2016 10:29:30 +0200 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: <1477035911.18447.1.camel@sipsolutions.net> References: <1477035911.18447.1.camel@sipsolutions.net> Message-ID: <1477038570.18447.3.camel@sipsolutions.net> On Fr, 2016-10-21 at 09:45 +0200, Sebastian Berg wrote: > On Do, 2016-10-20 at 21:38 -0600, Charles R Harris wrote: > > > > > > > > On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith > > wrote: > > > > > > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > > > wrote: > > > > > > > > Hi All, > > > > > > > > I've put up a preliminary PR for the proposed fpower ufunc. > > > > Apart > > > from > > > > > > > > adding more tests and documentation, I'd like to settle a few > > > other things. 
> > > > > > > > The first is the name, two names have been proposed and we > > > > should > > > settle on > > > > > > > > one > > > > > > > > fpower (short) > > > > float_power (obvious) > > > +0.6 for float_power > > > > > > > > > > > The second thing is the minimum precision. In the preliminary > > > version I have > > > > > > > > used float32, but perhaps it makes more sense for the intended > > > use to make > > > > > > > > the minimum precision float64 instead. > > > Can you elaborate on what you're thinking? I guess this is > > > because > > > float32 has limited range compared to float64, so is more likely > > > to > > > see overflow? float32 still goes up to 10**38 which is < > > > int64_max**2, > > > FWIW. Or maybe there's some subtlety with the int->float casting > > > here? > > logical, (u)int8, (u)int16, and float16 get converted to float32, > > which is probably sufficient to avoid overflow and such. My thought > > was that float32 is something of a "specialized" type these days, > > while float64 is the standard floating point precision for everyday > > computation. > > > > Isn't the behaviour we already have (e.g. such as mean). > > ints -> float64 > inexacts do not get upcast? > Ah, on the other hand, some/most of the float only ufuncs probably do it as you made it work? > - Sebastian > > > > > > Chuck? > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Fri Oct 21 12:26:23 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Oct 2016 10:26:23 -0600 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: <1477035911.18447.1.camel@sipsolutions.net> References: <1477035911.18447.1.camel@sipsolutions.net> Message-ID: On Fri, Oct 21, 2016 at 1:45 AM, Sebastian Berg wrote: > On Do, 2016-10-20 at 21:38 -0600, Charles R Harris wrote: > > > > > > On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith > > wrote: > > > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > > > wrote: > > > > Hi All, > > > > > > > > I've put up a preliminary PR for the proposed fpower ufunc. Apart > > > from > > > > adding more tests and documentation, I'd like to settle a few > > > other things. > > > > The first is the name, two names have been proposed and we should > > > settle on > > > > one > > > > > > > > fpower (short) > > > > float_power (obvious) > > > > > > +0.6 for float_power > > > > > > > The second thing is the minimum precision. In the preliminary > > > version I have > > > > used float32, but perhaps it makes more sense for the intended > > > use to make > > > > the minimum precision float64 instead. > > > > > > Can you elaborate on what you're thinking? I guess this is because > > > float32 has limited range compared to float64, so is more likely to > > > see overflow? float32 still goes up to 10**38 which is < > > > int64_max**2, > > > FWIW. Or maybe there's some subtlety with the int->float casting > > > here? > > logical, (u)int8, (u)int16, and float16 get converted to float32, > > which is probably sufficient to avoid overflow and such. 
My thought > > was that float32 is something of a "specialized" type these days, > > while float64 is the standard floating point precision for everyday > > computation. > > > > > Isn't the behaviour we already have (e.g. such as mean). > > ints -> float64 > inexacts do not get upcast? > > Hmm... The best way to do that would be to put the function in `fromnumeric` and do it in python rather than as a ufunc, then for integer types call power with `dtype=float64`. I like that idea better than the current implementation, my mind was stuck in the ufunc universe. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Mon Oct 24 08:44:46 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Mon, 24 Oct 2016 08:44:46 -0400 Subject: [Numpy-discussion] padding options for diff Message-ID: I posted a pull request which adds optional padding kwargs "to_begin" and "to_end" to diff. Those options are based on what's available in ediff1d. It closes this issue -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 24 11:14:32 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 24 Oct 2016 08:14:32 -0700 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: This looks like a welcome addition in functionality! It will be nice to be able to finally (soft) deprecate ediff1d. On Mon, Oct 24, 2016 at 5:44 AM, Matthew Harrigan < harrigan.matthew at gmail.com> wrote: > I posted a pull request which > adds optional padding kwargs "to_begin" and "to_end" to diff. Those > options are based on what's available in ediff1d. It closes this issue > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 24 18:41:00 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Oct 2016 16:41:00 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again Message-ID: Hi All, I've been thinking about this some (a lot) more and have an alternate proposal for the behavior of the `**` operator - if both base and power are numpy/python scalar integers, convert to python integers and call the `**` operator. That would solve both the precision and compatibility problems and I think is the option of least surprise. For those who need type preservation and modular arithmetic, the np.power function remains, although the type conversions can be surpirising as it seems that the base and power should play different roles in determining the type, at least to me. - Array, 0-d or not, are treated differently from scalars and integers raised to negative integer powers always raise an error. I think this solves most problems and would not be difficult to implement. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
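To make the proposal concrete, a rough sketch of the intended semantics; the lines marked "proposed" describe the suggested behaviour, not what numpy currently does:

import numpy as np

a, b = np.int64(2), np.int64(100)

# Proposed: scalar ** scalar would defer to Python ints, so the result is
# exact rather than wrapping modulo 2**64:
int(a) ** int(b)        # 1267650600228229401496703205376

# np.power keeps the integer dtype and therefore does modular arithmetic;
# it stays available for users who rely on type preservation:
np.power(a, b)

# Proposed: integer arrays raised to negative integer powers keep raising:
# np.arange(1, 5) ** -1   -> error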
URL: From njs at pobox.com Mon Oct 24 19:30:43 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Oct 2016 16:30:43 -0700 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris wrote: > Hi All, > > I've been thinking about this some (a lot) more and have an alternate > proposal for the behavior of the `**` operator > > if both base and power are numpy/python scalar integers, convert to python > integers and call the `**` operator. That would solve both the precision and > compatibility problems and I think is the option of least surprise. For > those who need type preservation and modular arithmetic, the np.power > function remains, although the type conversions can be surpirising as it > seems that the base and power should play different roles in determining > the type, at least to me. > Array, 0-d or not, are treated differently from scalars and integers raised > to negative integer powers always raise an error. > > I think this solves most problems and would not be difficult to implement. > > Thoughts? My main concern about this is that it adds more special cases to numpy scalars, and a new behavioral deviation between 0d arrays and scalars, when ideally we should be trying to reduce the duplication/discrepancies between these. It's also inconsistent with how other operations on integer scalars work, e.g. regular addition overflows rather than promoting to Python int: In [8]: np.int64(2 ** 63 - 1) + 1 /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: overflow encountered in long_scalars #!/home/njs/.user-python3.5-64bit/bin/python3.5 Out[8]: -9223372036854775808 So I'm inclined to try and keep it simple, like in your previous proposal... theoretically of course it would be nice to have the perfect solution here, but at this point it feels like we might be overthinking this trying to get that last 1% of improvement. The thing where 2 ** -1 returns 0 is just broken and bites people so we should definitely fix it, but beyond that I'm not sure it really matters *that* much what we do, and "special cases aren't special enough to break the rules" and all that. -n -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Tue Oct 25 12:14:40 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 25 Oct 2016 09:14:40 -0700 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: I am also concerned about adding more special cases for NumPy scalars vs arrays. These cases are already confusing (e.g., making no distinction between 0d arrays and scalars) and poorly documented. On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: > On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris > wrote: > > Hi All, > > > > I've been thinking about this some (a lot) more and have an alternate > > proposal for the behavior of the `**` operator > > > > if both base and power are numpy/python scalar integers, convert to > python > > integers and call the `**` operator. That would solve both the precision > and > > compatibility problems and I think is the option of least surprise. For > > those who need type preservation and modular arithmetic, the np.power > > function remains, although the type conversions can be surpirising as it > > seems that the base and power should play different roles in determining > > the type, at least to me. 
> > Array, 0-d or not, are treated differently from scalars and integers > raised > > to negative integer powers always raise an error. > > > > I think this solves most problems and would not be difficult to > implement. > > > > Thoughts? > > My main concern about this is that it adds more special cases to numpy > scalars, and a new behavioral deviation between 0d arrays and scalars, > when ideally we should be trying to reduce the > duplication/discrepancies between these. It's also inconsistent with > how other operations on integer scalars work, e.g. regular addition > overflows rather than promoting to Python int: > > In [8]: np.int64(2 ** 63 - 1) + 1 > /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: > overflow encountered in long_scalars > #!/home/njs/.user-python3.5-64bit/bin/python3.5 > Out[8]: -9223372036854775808 > > So I'm inclined to try and keep it simple, like in your previous > proposal... theoretically of course it would be nice to have the > perfect solution here, but at this point it feels like we might be > overthinking this trying to get that last 1% of improvement. The thing > where 2 ** -1 returns 0 is just broken and bites people so we should > definitely fix it, but beyond that I'm not sure it really matters > *that* much what we do, and "special cases aren't special enough to > break the rules" and all that. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Tue Oct 25 13:26:39 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 25 Oct 2016 10:26:39 -0700 Subject: [Numpy-discussion] padding options for diff Message-ID: > Date: Mon, 24 Oct 2016 08:44:46 -0400 > From: Matthew Harrigan > > I posted a pull request which > adds optional padding kwargs "to_begin" and "to_end" to diff. Those > options are based on what's available in ediff1d. It closes this issue > I like the proposal, though I suspect that making it general has obscured that the most common use-case for padding is to make the inverse of np.cumsum (at least that?s what I frequently need), and now in the multidimensional case you have the somewhat unwieldy: >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) rather than >>> np.diff(a, axis=axis, keep_left=True) which of course could just be an option upon what you already have. Best, Peter From shoyer at gmail.com Tue Oct 25 15:38:16 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 25 Oct 2016 12:38:16 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling Message-ID: With a custom wrapper class, it's possible to preserve NumPy views when pickling: https://stackoverflow.com/questions/13746601/preserving-numpy-view-when-pickling This can result in significant time/space savings with pickling views along with base arrays and brings the behavior of NumPy more in line with Python proper. Is this something that we can/should port into NumPy itself? -------------- next part -------------- An HTML attachment was scrubbed... 
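The linked recipe is not reproduced here, but a rough sketch of one possible wrapper gives the idea (names and details are illustrative only, and it assumes the view's base is a contiguous array that owns its data): describe a view by its base plus a byte offset, shape and strides, rebuild it on load, and let pickle's memo keep the base shared between arrays pickled in the same call.

import pickle
import numpy as np

def _rebuild_view(base, offset, shape, strides, dtype):
    # Re-create a view onto the unpickled base buffer.
    return np.ndarray(shape, dtype=dtype, buffer=base,
                      offset=offset, strides=strides)

class PickleableView(object):
    def __init__(self, arr):
        self.arr = arr

    def __reduce__(self):
        a = self.arr
        if isinstance(a.base, np.ndarray):
            offset = (a.__array_interface__['data'][0]
                      - a.base.__array_interface__['data'][0])
            return (_rebuild_view,
                    (a.base, offset, a.shape, a.strides, a.dtype))
        # Not a view: fall back to normal array pickling.
        return (np.asarray, (a,))

base = np.zeros(1000)
v1, v2 = base[:10], base[10:20]
out_base, out_v1, out_v2 = pickle.loads(
    pickle.dumps((base, PickleableView(v1), PickleableView(v2))))

# Unpickling yields plain ndarrays (not wrappers), and because pickle
# serializes the shared base object only once, the rebuilt views still
# alias it:
out_v1[:] = 1.0
assert out_base[0] == 1.0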
URL: From njs at pobox.com Tue Oct 25 16:07:52 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Oct 2016 13:07:52 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 12:38 PM, Stephan Hoyer wrote: > With a custom wrapper class, it's possible to preserve NumPy views when > pickling: > https://stackoverflow.com/questions/13746601/preserving-numpy-view-when-pickling > > This can result in significant time/space savings with pickling views along > with base arrays and brings the behavior of NumPy more in line with Python > proper. Is this something that we can/should port into NumPy itself? Concretely, what do would you suggest should happen with: base = np.zeros(100000000) view = base[:10] # case 1 pickle.dump(view, file) # case 2 pickle.dump(base, file) pickle.dump(view, file) # case 3 pickle.dump(view, file) pickle.dump(base, file) ? -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Tue Oct 25 18:07:04 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 25 Oct 2016 15:07:04 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith wrote: > Concretely, what do would you suggest should happen with: > > base = np.zeros(100000000) > view = base[:10] > > # case 1 > pickle.dump(view, file) > > # case 2 > pickle.dump(base, file) > pickle.dump(view, file) > > # case 3 > pickle.dump(view, file) > pickle.dump(base, file) > > ? > I see what you're getting at here. We would need a rule for when to include the base in the pickle and when not to. Otherwise, pickle.dump(view, file) always contains data from the base pickle, even with view is much smaller than base. The safe answer is "only use views in the pickle when base is already being pickled", but that isn't possible to check unless all the arrays are together in a custom container. So, this isn't really feasible for NumPy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Oct 25 19:28:22 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 16:28:22 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 3:07 PM, Stephan Hoyer wrote: > > On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith wrote: >> >> Concretely, what do would you suggest should happen with: >> >> base = np.zeros(100000000) >> view = base[:10] >> >> # case 1 >> pickle.dump(view, file) >> >> # case 2 >> pickle.dump(base, file) >> pickle.dump(view, file) >> >> # case 3 >> pickle.dump(view, file) >> pickle.dump(base, file) >> >> ? > > I see what you're getting at here. We would need a rule for when to include the base in the pickle and when not to. Otherwise, pickle.dump(view, file) always contains data from the base pickle, even with view is much smaller than base. > > The safe answer is "only use views in the pickle when base is already being pickled", but that isn't possible to check unless all the arrays are together in a custom container. So, this isn't really feasible for NumPy. It would be possible with a custom Pickler/Unpickler since they already keep track of objects previously (un)pickled. That would handle [base, view] okay but not [view, base], so it's probably not going to be all that useful outside of special situations. It would make a neat recipe, but I probably would not provide it in numpy itself. 
-- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Tue Oct 25 20:09:09 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Tue, 25 Oct 2016 20:09:09 -0400 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: It seems pickle keeps track of references for basic python types. x = [1] y = [x] x,y = pickle.loads(pickle.dumps((x,y))) x.append(2) print(y) >>> [[1,2]] Numpy arrays are different but references are forgotten after pickle/unpickle. Shared objects do not remain shared. Based on the quote below it could be considered bug with numpy/pickle. Object sharing (references to the same object in different places): This is similar to self-referencing objects; pickle stores the object once, and ensures that all other references point to the master copy. Shared objects remain shared, which can be very important for mutable objects. link Another example with ndarrays: x = np.arange(5) y = x[::-1] x, y = pickle.loads(pickle.dumps((x, y))) x[0] = 9 print(y) >>> [4, 3, 2, 1, 0] In this case the two arrays share the exact same object for the data buffer (although object might not be the right word here) On Tue, Oct 25, 2016 at 7:28 PM, Robert Kern wrote: > On Tue, Oct 25, 2016 at 3:07 PM, Stephan Hoyer wrote: > > > > On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith wrote: > >> > >> Concretely, what do would you suggest should happen with: > >> > >> base = np.zeros(100000000) > >> view = base[:10] > >> > >> # case 1 > >> pickle.dump(view, file) > >> > >> # case 2 > >> pickle.dump(base, file) > >> pickle.dump(view, file) > >> > >> # case 3 > >> pickle.dump(view, file) > >> pickle.dump(base, file) > >> > >> ? > > > > I see what you're getting at here. We would need a rule for when to > include the base in the pickle and when not to. Otherwise, > pickle.dump(view, file) always contains data from the base pickle, even > with view is much smaller than base. > > > > The safe answer is "only use views in the pickle when base is already > being pickled", but that isn't possible to check unless all the arrays are > together in a custom container. So, this isn't really feasible for NumPy. > > It would be possible with a custom Pickler/Unpickler since they already > keep track of objects previously (un)pickled. That would handle [base, > view] okay but not [view, base], so it's probably not going to be all that > useful outside of special situations. It would make a neat recipe, but I > probably would not provide it in numpy itself. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Oct 25 20:29:54 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 17:29:54 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan < harrigan.matthew at gmail.com> wrote: > > It seems pickle keeps track of references for basic python types. > > x = [1] > y = [x] > x,y = pickle.loads(pickle.dumps((x,y))) > x.append(2) > print(y) > >>> [[1,2]] > > Numpy arrays are different but references are forgotten after pickle/unpickle. Shared objects do not remain shared. 
Based on the quote below it could be considered bug with numpy/pickle. Not a bug, but an explicit design decision on numpy's part. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainwoodman at gmail.com Tue Oct 25 22:05:39 2016 From: rainwoodman at gmail.com (Feng Yu) Date: Tue, 25 Oct 2016 19:05:39 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: Hi, Just another perspective. base' and 'data' in PyArrayObject are two separate variables. base can point to any PyObject, but it is `data` that defines where data is accessed in memory. 1. There is no clear way to pickle a pointer (`data`) in a meaningful way. In order for `data` member to make sense we still need to 'readout' the values stored at `data` pointer in the pickle. 2. By definition base is not necessary a numpy array but it is just some other object for managing the memory. 3. One can surely pickle the `base` object as a reference, but it is useless if the data memory has been reconstructed independently during unpickling. 4. Unless there is clear way to notify the referencing numpy array of the new data pointer. There probably isn't. BTW, is the stride information is lost during pickling, too? The behavior shall probably be documented if not yet. Yu On Tue, Oct 25, 2016 at 5:29 PM, Robert Kern wrote: > On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan > wrote: >> >> It seems pickle keeps track of references for basic python types. >> >> x = [1] >> y = [x] >> x,y = pickle.loads(pickle.dumps((x,y))) >> x.append(2) >> print(y) >> >>> [[1,2]] >> >> Numpy arrays are different but references are forgotten after >> pickle/unpickle. Shared objects do not remain shared. Based on the quote >> below it could be considered bug with numpy/pickle. > > Not a bug, but an explicit design decision on numpy's part. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Tue Oct 25 22:39:14 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 19:39:14 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 7:05 PM, Feng Yu wrote: > > Hi, > > Just another perspective. base' and 'data' in PyArrayObject are two > separate variables. > > base can point to any PyObject, but it is `data` that defines where > data is accessed in memory. > > 1. There is no clear way to pickle a pointer (`data`) in a meaningful > way. In order for `data` member to make sense we still need to > 'readout' the values stored at `data` pointer in the pickle. > > 2. By definition base is not necessary a numpy array but it is just > some other object for managing the memory. In general, yes, but most often it's another ndarray, and the child is related to the parent by a slice operation that could be computed by comparing the `data` tuples. The exercise here isn't to always represent the general case in this way, but to see what can be done opportunistically and if that actually helps solve a practical problem. > 3. One can surely pickle the `base` object as a reference, but it is > useless if the data memory has been reconstructed independently during > unpickling. > > 4. Unless there is clear way to notify the referencing numpy array of > the new data pointer. There probably isn't. 
> > BTW, is the stride information is lost during pickling, too? The > behavior shall probably be documented if not yet. The stride information may be lost, yes. We reserve the right to retain it, though (for example, if .T is contiguous then we might well serialize the transposed data linearly and return a view on that data upon deserialization). I don't believe that we guarantee that the unpickled result is contiguous. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 25 23:36:29 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Oct 2016 20:36:29 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan wrote: > It seems pickle keeps track of references for basic python types. > > x = [1] > y = [x] > x,y = pickle.loads(pickle.dumps((x,y))) > x.append(2) > print(y) >>>> [[1,2]] Yes, but the problem is: suppose I have a 10 gigabyte array, and then take a 20 byte slice of it, and then pickle that slice. Do you expect the pickle file to be 20 bytes, or 10 gigabytes? Both options are possible, but you have to pick one, and numpy picks 20 bytes. The advantage is obviously that you don't have mysterious 10 gigabyte pickle files; the disadvantage is that you can't reconstruct the view relationships afterwards. (You might think: oh, but we can be clever, and only record the view relationships if the user pickles both objects together. But while pickle might know whether the user is pickling both objects together, it unfortunately doesn't tell numpy, so we can't really do anything clever or different in this case.) -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Wed Oct 26 00:34:42 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Oct 2016 22:34:42 -0600 Subject: [Numpy-discussion] Intel random number package Message-ID: Hi All, There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Oct 26 00:41:32 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 21:41:32 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris wrote: > > Hi All, > > There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from > oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random This is what ng-numpy-randomstate is for. https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Wed Oct 26 01:22:54 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Oct 2016 23:22:54 -0600 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern wrote: > On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > > > > Hi All, > > > > There is a proposed random number package PR now up on github: > https://github.com/numpy/numpy/pull/8209. It is from > > oleksandr-pavlyk and implements the number random number package using > MKL for increased speed. I think we are definitely interested in the > improved speed, but I'm not sure numpy is the best place to put the > package. I'd welcome any comments on the PR itself, as well as any thoughts > on the best way organize or use of this work. Maybe scikit-random > > This is what ng-numpy-randomstate is for. > > https://github.com/bashtage/ng-numpy-randomstate > Interesting, despite old fashioned original ziggurat implementation of the normal and gnu c style... Does that project seek to preserve all the bytestreams or is it still in flux? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Oct 26 01:29:29 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 22:29:29 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 10:22 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern wrote: >> >> On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> > >> > Hi All, >> > >> > There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from >> > oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random >> >> This is what ng-numpy-randomstate is for. >> >> https://github.com/bashtage/ng-numpy-randomstate > > Interesting, despite old fashioned original ziggurat implementation of the normal and gnu c style... Does that project seek to preserve all the bytestreams or is it still in flux? I would assume some flux for now, but you can ask the author by submitting a corrected ziggurat PR as a trial balloon. ;-) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Oct 26 03:33:17 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Oct 2016 09:33:17 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> On 26.10.2016 06:34, Charles R Harris wrote: > Hi All, > > There is a proposed random number package PR now up on github: > https://github.com/numpy/numpy/pull/8209. It is from > oleksandr-pavlyk and implements > the number random number package using MKL for increased speed. I think > we are definitely interested in the improved speed, but I'm not sure > numpy is the best place to put the package. 
I'd welcome any comments on > the PR itself, as well as any thoughts on the best way organize or use > of this work. Maybe scikit-random > I'm not a fan of putting code depending on a proprietary library into numpy. This should be a standalone package which may provide the same interface as numpy. From ralf.gommers at gmail.com Wed Oct 26 04:59:23 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 26 Oct 2016 21:59:23 +1300 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 26.10.2016 06:34, Charles R Harris wrote: > > Hi All, > > > > There is a proposed random number package PR now up on github: > > https://github.com/numpy/numpy/pull/8209. It is from > > oleksandr-pavlyk and implements > > the number random number package using MKL for increased speed. I think > > we are definitely interested in the improved speed, but I'm not sure > > numpy is the best place to put the package. I'd welcome any comments on > > the PR itself, as well as any thoughts on the best way organize or use > > of this work. Maybe scikit-random > Note that this thread is a continuation of https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html > > I'm not a fan of putting code depending on a proprietary library into > numpy. > This should be a standalone package which may provide the same interface > as numpy. > I don't really see a problem with that in principle. Numpy can use Intel MKL (and Accelerate) as well if it's available. It needs some thought put into the API though - a ``numpy.random_intel`` module is certainly not what we want. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Wed Oct 26 09:05:41 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Wed, 26 Oct 2016 09:05:41 -0400 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: The inverse of cumsum is actually a little more unweildy since you can't drop a dimension with take. This returns the original array (numerical caveats aside): np.cumsum(np.diff(x, to_begin=x.take([0], axis=axis), axis=axis), axis=axis) That's certainly not going to win any beauty contests. The 1d case is clean though: np.cumsum(np.diff(x, to_begin=x[0])) I'm not sure if this means the API should change, and if so how. Higher dimensional arrays seem to just have extra complexity. On Tue, Oct 25, 2016 at 1:26 PM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > > Date: Mon, 24 Oct 2016 08:44:46 -0400 > > From: Matthew Harrigan > > > > I posted a pull request which > > adds optional padding kwargs "to_begin" and "to_end" to diff. Those > > options are based on what's available in ediff1d. It closes this issue > > > > I like the proposal, though I suspect that making it general has > obscured that the most common use-case for padding is to make the > inverse of np.cumsum (at least that?s what I frequently need), and now > in the multidimensional case you have the somewhat unwieldy: > > >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) > > rather than > > >>> np.diff(a, axis=axis, keep_left=True) > > which of course could just be an option upon what you already have. 
> > Best, > Peter > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Oct 26 12:00:21 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Oct 2016 18:00:21 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> Message-ID: <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> On 10/26/2016 10:59 AM, Ralf Gommers wrote: > > > On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor > > > wrote: > > On 26.10.2016 06:34, Charles R Harris wrote: > > Hi All, > > > > There is a proposed random number package PR now up on github: > > https://github.com/numpy/numpy/pull/8209 > . It is from > > oleksandr-pavlyk > and implements > > the number random number package using MKL for increased speed. I think > > we are definitely interested in the improved speed, but I'm not sure > > numpy is the best place to put the package. I'd welcome any comments on > > the PR itself, as well as any thoughts on the best way organize or use > > of this work. Maybe scikit-random > > > Note that this thread is a continuation of > https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html > > > > I'm not a fan of putting code depending on a proprietary library > into numpy. > This should be a standalone package which may provide the same interface > as numpy. > > > I don't really see a problem with that in principle. Numpy can use Intel > MKL (and Accelerate) as well if it's available. It needs some thought > put into the API though - a ``numpy.random_intel`` module is certainly > not what we want. > For me there is a difference between being able to optionally use a proprietary library as an alternative to free software libraries if the user wishes to do so and offering functionality that only works with non-free software. We are providing a form of advertisement for them by allowing it (hey if you buy this black box that you cannot modify or use freely you get this neat numpy feature!). I prefer for the full functionality of numpy to stay available with a stack of community owned software, even if it may be less powerful that way. From jtaylor.debian at googlemail.com Wed Oct 26 12:10:36 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Oct 2016 18:10:36 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On 10/26/2016 06:00 PM, Julian Taylor wrote: > On 10/26/2016 10:59 AM, Ralf Gommers wrote: >> >> >> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >> > >> wrote: >> >> On 26.10.2016 06:34, Charles R Harris wrote: >> > Hi All, >> > >> > There is a proposed random number package PR now up on github: >> > https://github.com/numpy/numpy/pull/8209 >> . It is from >> > oleksandr-pavlyk > > and implements >> > the number random number package using MKL for increased speed. >> I think >> > we are definitely interested in the improved speed, but I'm not >> sure >> > numpy is the best place to put the package. I'd welcome any >> comments on >> > the PR itself, as well as any thoughts on the best way organize >> or use >> > of this work. 
Maybe scikit-random >> >> >> Note that this thread is a continuation of >> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html >> >> >> >> I'm not a fan of putting code depending on a proprietary library >> into numpy. >> This should be a standalone package which may provide the same >> interface >> as numpy. >> >> >> I don't really see a problem with that in principle. Numpy can use Intel >> MKL (and Accelerate) as well if it's available. It needs some thought >> put into the API though - a ``numpy.random_intel`` module is certainly >> not what we want. >> > > For me there is a difference between being able to optionally use a > proprietary library as an alternative to free software libraries if the > user wishes to do so and offering functionality that only works with > non-free software. > We are providing a form of advertisement for them by allowing it (hey if > you buy this black box that you cannot modify or use freely you get this > neat numpy feature!). > > I prefer for the full functionality of numpy to stay available with a > stack of community owned software, even if it may be less powerful that > way. But then if this is really just the same random numbers numpy already provides just faster, it is probably acceptable in principle. I haven't actually looked at the PR yet. From robert.kern at gmail.com Wed Oct 26 12:29:42 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Oct 2016 09:29:42 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful that >> way. > > But then if this is really just the same random numbers numpy already provides just faster, it is probably acceptable in principle. I haven't actually looked at the PR yet. I think the stream is different in some places, at least. And it's not a silent backend drop-in like np.linalg being built against an optimized BLAS, just a separate module that is inoperative without MKL. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Oct 26 12:36:35 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 26 Oct 2016 18:36:35 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: <1477499795.12923.1.camel@sipsolutions.net> On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote: > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor mail.com> wrote: > > > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > > >> I prefer for the full functionality of numpy to stay available > with a > >> stack of community owned software, even if it may be less powerful > that > >> way. > > > > But then if this is really just the same random numbers numpy > already provides just faster, it is probably acceptable in principle. > I haven't actually looked at the PR yet. > > I think the stream is different in some places, at least. 
And it's > not a silent backend drop-in like np.linalg being built against an > optimized BLAS, just a separate module that is inoperative without > MKL. > I might be swayed, but my gut feeling would be that a backend change (if the default stream changes, an explicit one, though maybe one could make a "fastest") would be the only reasonable way to provide such a thing in numpy itself. - Sebastian > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From robert.kern at gmail.com Wed Oct 26 12:53:29 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Oct 2016 09:53:29 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <1477499795.12923.1.camel@sipsolutions.net> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <1477499795.12923.1.camel@sipsolutions.net> Message-ID: On Wed, Oct 26, 2016 at 9:36 AM, Sebastian Berg wrote: > > On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote: > > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor > mail.com> wrote: > > > > > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > > > > >> I prefer for the full functionality of numpy to stay available > > with a > > >> stack of community owned software, even if it may be less powerful > > that > > >> way. > > > > > > But then if this is really just the same random numbers numpy > > already provides just faster, it is probably acceptable in principle. > > I haven't actually looked at the PR yet. > > > > I think the stream is different in some places, at least. And it's > > not a silent backend drop-in like np.linalg being built against an > > optimized BLAS, just a separate module that is inoperative without > > MKL. > > I might be swayed, but my gut feeling would be that a backend change > (if the default stream changes, an explicit one, though maybe one could > make a "fastest") would be the only reasonable way to provide such a > thing in numpy itself. That mostly argues for distributing it as a separate package, not part of numpy at all. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 13:27:50 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 13:27:50 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call Message-ID: Hi all, I posted a pull request: https://github.com/numpy/numpy/pull/8211 which adds a function `numpy.covcorr` that calculates both the covariance matrix and correlation coefficient with a single call to `numpy.cov` (which is often an expensive call for large data-sets). A function `numpy.covtocorr` has also been added that converts a covariance matrix to a correlation coefficent, and `numpy.corrcoef` has been modified to call this. The motivation here is that one often needs the covariance for subsequent analysis and the correlation coefficient for visualization, so instead of forcing the user to write their own code to convert one to the other, we want to allow both to be obtained from `numpy` as efficiently as possible. 
Best, Mathew -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Oct 26 13:46:48 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 26 Oct 2016 10:46:48 -0700 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: I wonder if the goals of this addition could be achieved by simply adding an optional `cov` argument to np.corr, which would provide a pre-computed covariance. Either way, `covcorr` feels like a helper function that could exist in user code rather than numpy proper. On Wed, Oct 26, 2016 at 10:27 AM, Mathew S. Madhavacheril < mathewsyriac at gmail.com> wrote: > Hi all, > > I posted a pull request: > https://github.com/numpy/numpy/pull/8211 > > which adds a function `numpy.covcorr` that calculates both > the covariance matrix and correlation coefficient with a single > call to `numpy.cov` (which is often an expensive call for large > data-sets). A function `numpy.covtocorr` has also been added > that converts a covariance matrix to a correlation coefficent, > and `numpy.corrcoef` has been modified to call this. The > motivation here is that one often needs the covariance for > subsequent analysis and the correlation coefficient for > visualization, so instead of forcing the user to write their own > code to convert one to the other, we want to allow both to > be obtained from `numpy` as efficiently as possible. > > Best, > Mathew > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 14:03:36 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 14:03:36 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: > I wonder if the goals of this addition could be achieved by simply adding > an optional `cov` argument > to np.corr, which would provide a pre-computed covariance. > That's a fair suggestion which I'm happy to switch to. This eliminates the need for two new functions. I'll add an optional `cov = False` argument to numpy.corrcoef that returns a tuple (corr, cov) instead. > > Either way, `covcorr` feels like a helper function that could exist in > user code rather than numpy proper. > The user would have to re-implement the part that converts the covariance matrix to a correlation coefficient. I made this PR to avoid that code duplication. Mathew > > On Wed, Oct 26, 2016 at 10:27 AM, Mathew S. Madhavacheril < > mathewsyriac at gmail.com> wrote: > >> Hi all, >> >> I posted a pull request: >> https://github.com/numpy/numpy/pull/8211 >> >> which adds a function `numpy.covcorr` that calculates both >> the covariance matrix and correlation coefficient with a single >> call to `numpy.cov` (which is often an expensive call for large >> data-sets). A function `numpy.covtocorr` has also been added >> that converts a covariance matrix to a correlation coefficent, >> and `numpy.corrcoef` has been modified to call this. 
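For reference, the conversion being discussed is essentially the following normalisation (a short sketch, not the PR's actual implementation):

import numpy as np

def cov_to_corr(cov):
    # Normalise the covariance by the outer product of the standard
    # deviations to obtain the correlation coefficient matrix.
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)
    # Clip tiny rounding excursions outside [-1, 1].
    return np.clip(corr, -1.0, 1.0)

x = np.random.randn(3, 100)
assert np.allclose(cov_to_corr(np.cov(x)), np.corrcoef(x))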
The >> motivation here is that one often needs the covariance for >> subsequent analysis and the correlation coefficient for >> visualization, so instead of forcing the user to write their own >> code to convert one to the other, we want to allow both to >> be obtained from `numpy` as efficiently as possible. >> >> Best, >> Mathew >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Oct 26 14:13:54 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 26 Oct 2016 11:13:54 -0700 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril < mathewsyriac at gmail.com> wrote: > On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: > >> I wonder if the goals of this addition could be achieved by simply adding >> an optional `cov` argument >> > to np.corr, which would provide a pre-computed covariance. >> > > That's a fair suggestion which I'm happy to switch to. This eliminates the > need for two new functions. > I'll add an optional `cov = False` argument to numpy.corrcoef that returns > a tuple (corr, cov) instead. > > >> >> Either way, `covcorr` feels like a helper function that could exist in >> user code rather than numpy proper. >> > > The user would have to re-implement the part that converts the covariance > matrix to a correlation > coefficient. I made this PR to avoid that code duplication. > With the API I was envisioning (or even your proposed API, for that matter), this function would only be a few lines, e.g., def covcorr(x): cov = np.cov(x) corr = np.corrcoef(x, cov=cov) return (cov, corr) Generally, functions this short should be provided as recipes (if at all) rather than be added to numpy proper, unless the need for them is extremely common. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 14:26:32 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 14:26:32 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 2:13 PM, Stephan Hoyer wrote: > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril < > mathewsyriac at gmail.com> wrote: > >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: >> >>> I wonder if the goals of this addition could be achieved by simply >>> adding an optional `cov` argument >>> >> to np.corr, which would provide a pre-computed covariance. >>> >> >> That's a fair suggestion which I'm happy to switch to. This eliminates >> the need for two new functions. >> I'll add an optional `cov = False` argument to numpy.corrcoef that >> returns a tuple (corr, cov) instead. >> >> >>> >>> Either way, `covcorr` feels like a helper function that could exist in >>> user code rather than numpy proper. >>> >> >> The user would have to re-implement the part that converts the covariance >> matrix to a correlation >> coefficient. I made this PR to avoid that code duplication. 
>> > > With the API I was envisioning (or even your proposed API, for that > matter), this function would only be a few lines, e.g., > > def covcorr(x): > cov = np.cov(x) > corr = np.corrcoef(x, cov=cov) > return (cov, corr) > > Generally, functions this short should be provided as recipes (if at all) > rather than be added to numpy proper, unless the need for them is extremely > common. > Ah, I see what you were suggesting now. I agree that a function like covcorr need not be provided by numpy itself, but it would be tremendously useful if a pre-computed covariance could be provided to np.corrcoef. I can update this PR to just add `cov = None` to numpy.corrcoef and do an `if cov is not None` before calculating the covariance. Note however that in the case that `cov` is specified for np.corrcoef, the non-optional `x` argument is redundant. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 26 14:56:41 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 11:56:41 -0700 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer wrote: > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril > wrote: >> >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: >>> >>> I wonder if the goals of this addition could be achieved by simply adding >>> an optional `cov` argument >>> >>> to np.corr, which would provide a pre-computed covariance. >> >> >> That's a fair suggestion which I'm happy to switch to. This eliminates the >> need for two new functions. >> I'll add an optional `cov = False` argument to numpy.corrcoef that returns >> a tuple (corr, cov) instead. >> >>> >>> >>> Either way, `covcorr` feels like a helper function that could exist in >>> user code rather than numpy proper. >> >> >> The user would have to re-implement the part that converts the covariance >> matrix to a correlation >> coefficient. I made this PR to avoid that code duplication. > > > With the API I was envisioning (or even your proposed API, for that matter), > this function would only be a few lines, e.g., > > def covcorr(x): > cov = np.cov(x) > corr = np.corrcoef(x, cov=cov) IIUC, if you have a covariance matrix then you can compute the correlation matrix directly, without looking at 'x', so corrcoef(x, cov=cov) is a bit odd-looking. I think probably the API that makes the most sense is just to expose something like the covtocorr function (maybe it could have a less telegraphic name?)? And then, yeah, users can use that to build their own covcorr or whatever if they want it. -n -- Nathaniel J. Smith -- https://vorpus.org From mathewsyriac at gmail.com Wed Oct 26 15:11:22 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 15:11:22 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 2:56 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer wrote: > > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. 
Madhavacheril > > wrote: > >> > >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer > wrote: > >>> > >>> I wonder if the goals of this addition could be achieved by simply > adding > >>> an optional `cov` argument > >>> > >>> to np.corr, which would provide a pre-computed covariance. > >> > >> > >> That's a fair suggestion which I'm happy to switch to. This eliminates > the > >> need for two new functions. > >> I'll add an optional `cov = False` argument to numpy.corrcoef that > returns > >> a tuple (corr, cov) instead. > >> > >>> > >>> > >>> Either way, `covcorr` feels like a helper function that could exist in > >>> user code rather than numpy proper. > >> > >> > >> The user would have to re-implement the part that converts the > covariance > >> matrix to a correlation > >> coefficient. I made this PR to avoid that code duplication. > > > > > > With the API I was envisioning (or even your proposed API, for that > matter), > > this function would only be a few lines, e.g., > > > > def covcorr(x): > > cov = np.cov(x) > > corr = np.corrcoef(x, cov=cov) > > IIUC, if you have a covariance matrix then you can compute the > correlation matrix directly, without looking at 'x', so corrcoef(x, > cov=cov) is a bit odd-looking. I think probably the API that makes the > most sense is just to expose something like the covtocorr function > (maybe it could have a less telegraphic name?)? And then, yeah, users > can use that to build their own covcorr or whatever if they want it. > Right, agreed, this is why I said `x` becomes redundant when `cov` is specified when calling `numpy.corrcoef`. So we have two alternatives: 1) Have `np.corrcoef` accept a boolean optional argument `covmat = False` that lets one obtain a tuple containing the covariance and the correlation matrices in the same call 2) Modify my original PR so that `np.covtocorr` remains (with possibly a better name) but remove `np.covcorr` since this is easy for the user to add. My preference is option 2. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 26 15:20:15 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 15:20:15 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:11 PM, Mathew S. Madhavacheril < mathewsyriac at gmail.com> wrote: > > > On Wed, Oct 26, 2016 at 2:56 PM, Nathaniel Smith wrote: > >> On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer wrote: >> > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril >> > wrote: >> >> >> >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer >> wrote: >> >>> >> >>> I wonder if the goals of this addition could be achieved by simply >> adding >> >>> an optional `cov` argument >> >>> >> >>> to np.corr, which would provide a pre-computed covariance. >> >> >> >> >> >> That's a fair suggestion which I'm happy to switch to. This eliminates >> the >> >> need for two new functions. >> >> I'll add an optional `cov = False` argument to numpy.corrcoef that >> returns >> >> a tuple (corr, cov) instead. >> >> >> >>> >> >>> >> >>> Either way, `covcorr` feels like a helper function that could exist in >> >>> user code rather than numpy proper. >> >> >> >> >> >> The user would have to re-implement the part that converts the >> covariance >> >> matrix to a correlation >> >> coefficient. I made this PR to avoid that code duplication. 
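For concreteness, the covariance-to-correlation conversion under discussion (covtocorr / cov2corr in the PR) amounts to a few lines of array arithmetic. A minimal sketch, with cov_to_corr used only as an illustrative name rather than the API numpy would necessarily expose:

import numpy as np

def cov_to_corr(cov):
    # normalize a covariance matrix into a correlation matrix:
    # corr[i, j] = cov[i, j] / (std[i] * std[j])
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # guard against tiny floating-point excursions outside [-1, 1]
    return np.clip(corr, -1.0, 1.0)

For data x of shape (nvar, nobs), cov_to_corr(np.cov(x)) should agree with np.corrcoef(x) to floating-point precision, which is exactly the duplication the PR is trying to let users avoid.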
>> > >> > >> > With the API I was envisioning (or even your proposed API, for that >> matter), >> > this function would only be a few lines, e.g., >> > >> > def covcorr(x): >> > cov = np.cov(x) >> > corr = np.corrcoef(x, cov=cov) >> >> IIUC, if you have a covariance matrix then you can compute the >> correlation matrix directly, without looking at 'x', so corrcoef(x, >> cov=cov) is a bit odd-looking. I think probably the API that makes the >> most sense is just to expose something like the covtocorr function >> (maybe it could have a less telegraphic name?)? And then, yeah, users >> can use that to build their own covcorr or whatever if they want it. >> > > Right, agreed, this is why I said `x` becomes redundant when `cov` is > specified > when calling `numpy.corrcoef`. So we have two alternatives: > > 1) Have `np.corrcoef` accept a boolean optional argument `covmat = False` > that lets > one obtain a tuple containing the covariance and the correlation matrices > in the same call > 2) Modify my original PR so that `np.covtocorr` remains (with possibly a > better > name) but remove `np.covcorr` since this is easy for the user to add. > > My preference is option 2. > cov2corr is a useful function http://www.statsmodels.org/dev/generated/statsmodels.stats.moment_helpers.cov2corr.html I also wrote the inverse function corr2cov, but AFAIR use it only in some test cases. I don't think adding any of the options to corrcoef or covcor is useful since there is no computational advantage to it. What I'm missing are functions that return the intermediate results, e.g. var and mean or cov and mean. (For statsmodels I decided to return mean and cov or mean and var in the related functions. Some R packages return the mean as an option.) Josef > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Oct 26 15:23:08 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Oct 2016 13:23:08 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer wrote: > I am also concerned about adding more special cases for NumPy scalars vs > arrays. These cases are already confusing (e.g., making no distinction > between 0d arrays and scalars) and poorly documented. > > On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: > >> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > I've been thinking about this some (a lot) more and have an alternate >> > proposal for the behavior of the `**` operator >> > >> > if both base and power are numpy/python scalar integers, convert to >> python >> > integers and call the `**` operator. That would solve both the >> precision and >> > compatibility problems and I think is the option of least surprise. For >> > those who need type preservation and modular arithmetic, the np.power >> > function remains, although the type conversions can be surpirising as it >> > seems that the base and power should play different roles in >> determining >> > the type, at least to me. >> > Array, 0-d or not, are treated differently from scalars and integers >> raised >> > to negative integer powers always raise an error. >> > >> > I think this solves most problems and would not be difficult to >> implement. 
>> > >> > Thoughts? >> >> My main concern about this is that it adds more special cases to numpy >> scalars, and a new behavioral deviation between 0d arrays and scalars, >> when ideally we should be trying to reduce the >> duplication/discrepancies between these. It's also inconsistent with >> how other operations on integer scalars work, e.g. regular addition >> overflows rather than promoting to Python int: >> >> In [8]: np.int64(2 ** 63 - 1) + 1 >> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >> overflow encountered in long_scalars >> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >> Out[8]: -9223372036854775808 >> >> So I'm inclined to try and keep it simple, like in your previous >> proposal... theoretically of course it would be nice to have the >> perfect solution here, but at this point it feels like we might be >> overthinking this trying to get that last 1% of improvement. The thing >> where 2 ** -1 returns 0 is just broken and bites people so we should >> definitely fix it, but beyond that I'm not sure it really matters >> *that* much what we do, and "special cases aren't special enough to >> break the rules" and all that. >> >> What I have been concerned about are the follow combinations that currently return floats num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: The other combinations of signed and unsigned integers to signed powers currently raise ValueError due to the change to the power ufunc. The exceptions that aren't covered by uint64 + signed (which won't change) seem to occur when the exponent can be safely cast to the base type. I suspect that people have already come to depend on that, especially as python integers on 64 bit linux convert to int64. So in those cases we should perhaps raise a FutureWarning instead of an error. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 26 15:24:48 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 12:24:48 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote: > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: >>> >>> >>> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >>> > >>> wrote: >>> >>> On 26.10.2016 06:34, Charles R Harris wrote: >>> > Hi All, >>> > >>> > There is a proposed random number package PR now up on github: >>> > https://github.com/numpy/numpy/pull/8209 >>> . It is from >>> > oleksandr-pavlyk >> > and implements >>> > the number random number package using MKL for increased speed. >>> I think >>> > we are definitely interested in the improved speed, but I'm not >>> sure >>> > numpy is the best place to put the package. I'd welcome any >>> comments on >>> > the PR itself, as well as any thoughts on the best way organize >>> or use >>> > of this work. Maybe scikit-random >>> >>> >>> Note that this thread is a continuation of >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html >>> >>> >>> >>> I'm not a fan of putting code depending on a proprietary library >>> into numpy. 
>>> This should be a standalone package which may provide the same >>> interface >>> as numpy. >>> >>> >>> I don't really see a problem with that in principle. Numpy can use Intel >>> MKL (and Accelerate) as well if it's available. It needs some thought >>> put into the API though - a ``numpy.random_intel`` module is certainly >>> not what we want. >>> >> >> For me there is a difference between being able to optionally use a >> proprietary library as an alternative to free software libraries if the >> user wishes to do so and offering functionality that only works with >> non-free software. >> We are providing a form of advertisement for them by allowing it (hey if >> you buy this black box that you cannot modify or use freely you get this >> neat numpy feature!). >> >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful that >> way. > > But then if this is really just the same random numbers numpy already > provides just faster, it is probably acceptable in principle. I haven't > actually looked at the PR yet. The RNG stream is totally different, so yeah, it can't just be a silent drop-in replacement like BLAS/LAPACK. The patch also adds ~10,000 lines of code; here's an example of what some of it looks like: https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 I don't see how we can realistically commit to maintaining this. I'm also not really seeing how shipping it as part of numpy provides extra benefits to maintainers or users? AFAICT right now it's basically structured as a standalone library that's been dropped into the numpy source tree, and it would be just as easy to ship separately (or am I wrong?). And since the public API is that all the functionality comes from importing this specific new module ('numpy.random_intel'), it'd be a one-line change for users to import from a non-numpy namespace, like 'mkl.random' or whatever. If it were more integrated with the rest of numpy then the trade-offs would be more complicated, but in its present form this seems like an easy call. The other question is whether it could/should change to *become* more integrated... that's more tricky. There's been some work towards supporting swappable backends inside np.random; but the focus has mostly been on allowing new core generators, though, and this code seems to want to take over the whole thing (core generator + distributions), so even once the swappable backends stuff is working I'm not sure it would be relevant here. The one case I can think of that does seem promising is that if we get an API for users to say "I don't care about stream compatibility, just give me un-reproducible variates as fast as you can", then it might make sense for that to silently use MKL if available -- this would be pretty analogous to the use of MKL in np.linalg. But we don't have that API yet, I'm not sure how the MKL fallback could be maintainably implemented given that it would require somehow swapping the entire RandomState implementation, and it's entirely possible that once we figure out solutions to those then it'd still make sense for the actual MKL wrappers to live in a third-party library that numpy imports. -n -- Nathaniel J. 
Smith -- https://vorpus.org From p.e.creasey.00 at googlemail.com Wed Oct 26 15:35:50 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Wed, 26 Oct 2016 12:35:50 -0700 Subject: [Numpy-discussion] padding options for diff Message-ID: > Date: Wed, 26 Oct 2016 09:05:41 -0400 > From: Matthew Harrigan > > np.cumsum(np.diff(x, to_begin=x.take([0], axis=axis), axis=axis), axis=axis) > > That's certainly not going to win any beauty contests. The 1d case is > clean though: > > np.cumsum(np.diff(x, to_begin=x[0])) > > I'm not sure if this means the API should change, and if so how. Higher > dimensional arrays seem to just have extra complexity. > >> >> I like the proposal, though I suspect that making it general has >> obscured that the most common use-case for padding is to make the >> inverse of np.cumsum (at least that?s what I frequently need), and now >> in the multidimensional case you have the somewhat unwieldy: >> >> >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) >> >> rather than >> >> >>> np.diff(a, axis=axis, keep_left=True) >> >> which of course could just be an option upon what you already have. >> So my suggestion was intended that you might want an additional keyword argument (keep_left=False) to make the inverse np.cumsum use-case easier, i.e. you would have something in your np.diff like: if keep_left: if to_begin is None: to_begin = np.take(a, [0], axis=axis) else: raise ValueError(?np.diff(a, keep_left=False, to_begin=None) can be used with either keep_left or to_begin, but not both.?) Generally I try to avoid optional keyword argument overlap, but in this case it is probably justified. Peter From josef.pktd at gmail.com Wed Oct 26 15:39:16 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 15:39:16 -0400 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:23 PM, Charles R Harris wrote: > > > On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer wrote: > >> I am also concerned about adding more special cases for NumPy scalars vs >> arrays. These cases are already confusing (e.g., making no distinction >> between 0d arrays and scalars) and poorly documented. >> >> On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: >> >>> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >>> wrote: >>> > Hi All, >>> > >>> > I've been thinking about this some (a lot) more and have an alternate >>> > proposal for the behavior of the `**` operator >>> > >>> > if both base and power are numpy/python scalar integers, convert to >>> python >>> > integers and call the `**` operator. That would solve both the >>> precision and >>> > compatibility problems and I think is the option of least surprise. For >>> > those who need type preservation and modular arithmetic, the np.power >>> > function remains, although the type conversions can be surpirising as >>> it >>> > seems that the base and power should play different roles in >>> determining >>> > the type, at least to me. >>> > Array, 0-d or not, are treated differently from scalars and integers >>> raised >>> > to negative integer powers always raise an error. >>> > >>> > I think this solves most problems and would not be difficult to >>> implement. >>> > >>> > Thoughts? 
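To make the inverse-of-cumsum use-case behind Peter's keep_left suggestion concrete, here is a rough sketch using only existing numpy calls; diff_keep_left is an illustrative helper for this thread, not a proposed API:

import numpy as np

def diff_keep_left(a, axis=-1):
    # pad the differences with the leading slice so that cumsum inverts them
    first = np.take(a, [0], axis=axis)
    return np.concatenate([first, np.diff(a, axis=axis)], axis=axis)

a = np.arange(12.0).reshape(3, 4) ** 2
d = diff_keep_left(a, axis=1)
assert np.allclose(np.cumsum(d, axis=1), a)   # cumsum undoes the padded diff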
>>> >>> My main concern about this is that it adds more special cases to numpy >>> scalars, and a new behavioral deviation between 0d arrays and scalars, >>> when ideally we should be trying to reduce the >>> duplication/discrepancies between these. It's also inconsistent with >>> how other operations on integer scalars work, e.g. regular addition >>> overflows rather than promoting to Python int: >>> >>> In [8]: np.int64(2 ** 63 - 1) + 1 >>> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >>> overflow encountered in long_scalars >>> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >>> Out[8]: -9223372036854775808 >>> >>> So I'm inclined to try and keep it simple, like in your previous >>> proposal... theoretically of course it would be nice to have the >>> perfect solution here, but at this point it feels like we might be >>> overthinking this trying to get that last 1% of improvement. The thing >>> where 2 ** -1 returns 0 is just broken and bites people so we should >>> definitely fix it, but beyond that I'm not sure it really matters >>> *that* much what we do, and "special cases aren't special enough to >>> break the rules" and all that. >>> >>> > What I have been concerned about are the follow combinations that > currently return floats > > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > > The other combinations of signed and unsigned integers to signed powers > currently raise ValueError due to the change to the power ufunc. The > exceptions that aren't covered by uint64 + signed (which won't change) seem > to occur when the exponent can be safely cast to the base type. I suspect > that people have already come to depend on that, especially as python > integers on 64 bit linux convert to int64. So in those cases we should > perhaps raise a FutureWarning instead of an error. > >>> np.int64(2)**np.array(-1, np.int64) 0.5 >>> np.__version__ '1.10.4' >>> np.int64(2)**np.array([-1, 2], np.int64) array([0, 4], dtype=int64) >>> np.array(2, np.uint64)**np.array([-1, 2], np.int64) array([0, 4], dtype=int64) >>> np.array([2], np.uint64)**np.array([-1, 2], np.int64) array([ 0.5, 4. ]) >>> np.array([2], np.uint64).squeeze()**np.array([-1, 2], np.int64) array([0, 4], dtype=int64) (IMO: If you have to break backwards compatibility, break forwards not backwards.) Josef http://www.stanlaurelandoliverhardy.com/nicemess.htm > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 26 15:39:15 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 12:39:15 -0700 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 12:23 PM, Charles R Harris wrote: [...] 
> What I have been concerned about are the follow combinations that currently > return floats > > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> What's this referring to? For both arrays and scalars I get: In [8]: (np.array(2, dtype=np.int8) ** np.array(2, dtype=np.int8)).dtype Out[8]: dtype('int8') In [9]: (np.int8(2) ** np.int8(2)).dtype Out[9]: dtype('int8') -n -- Nathaniel J. Smith -- https://vorpus.org From warren.weckesser at gmail.com Wed Oct 26 15:41:21 2016 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 26 Oct 2016 15:41:21 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor > wrote: > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > >> > >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: > >>> > >>> > >>> > >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor > >>> > > >>> wrote: > >>> > >>> On 26.10.2016 06:34, Charles R Harris wrote: > >>> > Hi All, > >>> > > >>> > There is a proposed random number package PR now up on github: > >>> > https://github.com/numpy/numpy/pull/8209 > >>> . It is from > >>> > oleksandr-pavlyk >>> > and implements > >>> > the number random number package using MKL for increased speed. > >>> I think > >>> > we are definitely interested in the improved speed, but I'm not > >>> sure > >>> > numpy is the best place to put the package. I'd welcome any > >>> comments on > >>> > the PR itself, as well as any thoughts on the best way organize > >>> or use > >>> > of this work. Maybe scikit-random > >>> > >>> > >>> Note that this thread is a continuation of > >>> https://mail.scipy.org/pipermail/numpy-discussion/ > 2016-July/075822.html > >>> > >>> > >>> > >>> I'm not a fan of putting code depending on a proprietary library > >>> into numpy. > >>> This should be a standalone package which may provide the same > >>> interface > >>> as numpy. > >>> > >>> > >>> I don't really see a problem with that in principle. Numpy can use > Intel > >>> MKL (and Accelerate) as well if it's available. It needs some thought > >>> put into the API though - a ``numpy.random_intel`` module is certainly > >>> not what we want. > >>> > >> > >> For me there is a difference between being able to optionally use a > >> proprietary library as an alternative to free software libraries if the > >> user wishes to do so and offering functionality that only works with > >> non-free software. > >> We are providing a form of advertisement for them by allowing it (hey if > >> you buy this black box that you cannot modify or use freely you get this > >> neat numpy feature!). > >> > >> I prefer for the full functionality of numpy to stay available with a > >> stack of community owned software, even if it may be less powerful that > >> way. 
> > > > But then if this is really just the same random numbers numpy already > > provides just faster, it is probably acceptable in principle. I haven't > > actually looked at the PR yet. > > The RNG stream is totally different, so yeah, it can't just be a > silent drop-in replacement like BLAS/LAPACK. > > The patch also adds ~10,000 lines of code; here's an example of what > some of it looks like: > > https://github.com/oleksandr-pavlyk/numpy/blob/ > b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/ > mklrand/mkl_distributions.cpp#L1724-L1833 > > I don't see how we can realistically commit to maintaining this. > > FYI: numpy already maintains code exactly like that: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 Perhaps the point should be that the numpy devs won't want to maintain two nearly identical versions of that code. Warren > I'm also not really seeing how shipping it as part of numpy provides > extra benefits to maintainers or users? AFAICT right now it's > basically structured as a standalone library that's been dropped into > the numpy source tree, and it would be just as easy to ship separately > (or am I wrong?). And since the public API is that all the > functionality comes from importing this specific new module > ('numpy.random_intel'), it'd be a one-line change for users to import > from a non-numpy namespace, like 'mkl.random' or whatever. If it were > more integrated with the rest of numpy then the trade-offs would be > more complicated, but in its present form this seems like an easy > call. > > The other question is whether it could/should change to *become* more > integrated... that's more tricky. There's been some work towards > supporting swappable backends inside np.random; but the focus has > mostly been on allowing new core generators, though, and this code > seems to want to take over the whole thing (core generator + > distributions), so even once the swappable backends stuff is working > I'm not sure it would be relevant here. The one case I can think of > that does seem promising is that if we get an API for users to say "I > don't care about stream compatibility, just give me un-reproducible > variates as fast as you can", then it might make sense for that to > silently use MKL if available -- this would be pretty analogous to the > use of MKL in np.linalg. But we don't have that API yet, I'm not sure > how the MKL fallback could be maintainably implemented given that it > would require somehow swapping the entire RandomState implementation, > and it's entirely possible that once we figure out solutions to those > then it'd still make sense for the actual MKL wrappers to live in a > third-party library that numpy imports. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Wed Oct 26 15:47:43 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Oct 2016 12:47:43 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith wrote: >> The patch also adds ~10,000 lines of code; here's an example of what >> some of it looks like: >> >> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 >> >> I don't see how we can realistically commit to maintaining this. > > FYI: numpy already maintains code exactly like that: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 > > Perhaps the point should be that the numpy devs won't want to maintain two nearly identical versions of that code. Indeed. That's how the algorithm was published. The /* sigh ... */ is my own. ;-) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 26 15:49:36 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 15:49:36 -0400 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:39 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 12:23 PM, Charles R Harris > wrote: > [...] > > What I have been concerned about are the follow combinations that > currently > > return floats > > > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > What's this referring to? For both arrays and scalars I get: > > In [8]: (np.array(2, dtype=np.int8) ** np.array(2, dtype=np.int8)).dtype > Out[8]: dtype('int8') > > In [9]: (np.int8(2) ** np.int8(2)).dtype > Out[9]: dtype('int8') > >>> (np.array([2], dtype=np.int8) ** np.array(-1, dtype=np.int8).squeeze()).dtype dtype('int8') >>> (np.array([2], dtype=np.int8)[0] ** np.array(-1, dtype=np.int8).squeeze()).dtype dtype('float32') >>> (np.int8(2)**np.int8(-1)).dtype dtype('float32') >>> (np.int8(2)**np.int8(2)).dtype dtype('int8') The last one looks like value dependent scalar dtype Josef > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Wed Oct 26 15:49:50 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 12:49:50 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser wrote: > > > On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith wrote: >> >> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor >> wrote: >> > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> >> >> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: >> >>> >> >>> >> >>> >> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >> >>> > >> >>> wrote: >> >>> >> >>> On 26.10.2016 06:34, Charles R Harris wrote: >> >>> > Hi All, >> >>> > >> >>> > There is a proposed random number package PR now up on github: >> >>> > https://github.com/numpy/numpy/pull/8209 >> >>> . It is from >> >>> > oleksandr-pavlyk > >>> > and implements >> >>> > the number random number package using MKL for increased speed. >> >>> I think >> >>> > we are definitely interested in the improved speed, but I'm not >> >>> sure >> >>> > numpy is the best place to put the package. I'd welcome any >> >>> comments on >> >>> > the PR itself, as well as any thoughts on the best way organize >> >>> or use >> >>> > of this work. Maybe scikit-random >> >>> >> >>> >> >>> Note that this thread is a continuation of >> >>> >> >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html >> >>> >> >>> >> >>> >> >>> I'm not a fan of putting code depending on a proprietary library >> >>> into numpy. >> >>> This should be a standalone package which may provide the same >> >>> interface >> >>> as numpy. >> >>> >> >>> >> >>> I don't really see a problem with that in principle. Numpy can use >> >>> Intel >> >>> MKL (and Accelerate) as well if it's available. It needs some thought >> >>> put into the API though - a ``numpy.random_intel`` module is certainly >> >>> not what we want. >> >>> >> >> >> >> For me there is a difference between being able to optionally use a >> >> proprietary library as an alternative to free software libraries if the >> >> user wishes to do so and offering functionality that only works with >> >> non-free software. >> >> We are providing a form of advertisement for them by allowing it (hey >> >> if >> >> you buy this black box that you cannot modify or use freely you get >> >> this >> >> neat numpy feature!). >> >> >> >> I prefer for the full functionality of numpy to stay available with a >> >> stack of community owned software, even if it may be less powerful that >> >> way. >> > >> > But then if this is really just the same random numbers numpy already >> > provides just faster, it is probably acceptable in principle. I haven't >> > actually looked at the PR yet. >> >> The RNG stream is totally different, so yeah, it can't just be a >> silent drop-in replacement like BLAS/LAPACK. >> >> The patch also adds ~10,000 lines of code; here's an example of what >> some of it looks like: >> >> >> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 >> >> I don't see how we can realistically commit to maintaining this. 
>> > > > FYI: numpy already maintains code exactly like that: > https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 > > Perhaps the point should be that the numpy devs won't want to maintain two > nearly identical versions of that code. Heh, good catch! Okay, if random_intel is a massive copy-paste of random with modifications applied on top, then that's its own issue... on the one hand, yeah, we definitely don't want to carry around massive copy/paste code. OTOH, it suggests that it might be possible to refactor the code so that common parts are shared, and this would be a benefit to integrating random and random_intel more closely. (And this benefit would then have to be weighed against all the other considerations, like how much sharing there actually was, maintainability of the remaining random_intel-specific bits, the desire to keep numpy free-and-open, etc.) Hard to make that call just from skimming a 10,000 line patch, though... Oleksandr, or others at Intel: how much possibility do you think there is for sharing code between random and random_intel? -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Wed Oct 26 15:57:29 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Oct 2016 13:57:29 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 1:39 PM, wrote: > > > On Wed, Oct 26, 2016 at 3:23 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer wrote: >> >>> I am also concerned about adding more special cases for NumPy scalars vs >>> arrays. These cases are already confusing (e.g., making no distinction >>> between 0d arrays and scalars) and poorly documented. >>> >>> On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: >>> >>>> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >>>> wrote: >>>> > Hi All, >>>> > >>>> > I've been thinking about this some (a lot) more and have an alternate >>>> > proposal for the behavior of the `**` operator >>>> > >>>> > if both base and power are numpy/python scalar integers, convert to >>>> python >>>> > integers and call the `**` operator. That would solve both the >>>> precision and >>>> > compatibility problems and I think is the option of least surprise. >>>> For >>>> > those who need type preservation and modular arithmetic, the np.power >>>> > function remains, although the type conversions can be surpirising as >>>> it >>>> > seems that the base and power should play different roles in >>>> determining >>>> > the type, at least to me. >>>> > Array, 0-d or not, are treated differently from scalars and integers >>>> raised >>>> > to negative integer powers always raise an error. >>>> > >>>> > I think this solves most problems and would not be difficult to >>>> implement. >>>> > >>>> > Thoughts? >>>> >>>> My main concern about this is that it adds more special cases to numpy >>>> scalars, and a new behavioral deviation between 0d arrays and scalars, >>>> when ideally we should be trying to reduce the >>>> duplication/discrepancies between these. It's also inconsistent with >>>> how other operations on integer scalars work, e.g. 
regular addition >>>> overflows rather than promoting to Python int: >>>> >>>> In [8]: np.int64(2 ** 63 - 1) + 1 >>>> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >>>> overflow encountered in long_scalars >>>> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >>>> Out[8]: -9223372036854775808 >>>> >>>> So I'm inclined to try and keep it simple, like in your previous >>>> proposal... theoretically of course it would be nice to have the >>>> perfect solution here, but at this point it feels like we might be >>>> overthinking this trying to get that last 1% of improvement. The thing >>>> where 2 ** -1 returns 0 is just broken and bites people so we should >>>> definitely fix it, but beyond that I'm not sure it really matters >>>> *that* much what we do, and "special cases aren't special enough to >>>> break the rules" and all that. >>>> >>>> >> What I have been concerned about are the follow combinations that >> currently return floats >> >> num: , exp: , res: > 'numpy.float32'> >> num: , exp: , res: > 'numpy.float32'> >> num: , exp: , res: > 'numpy.float32'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> >> The other combinations of signed and unsigned integers to signed powers >> currently raise ValueError due to the change to the power ufunc. The >> exceptions that aren't covered by uint64 + signed (which won't change) seem >> to occur when the exponent can be safely cast to the base type. I suspect >> that people have already come to depend on that, especially as python >> integers on 64 bit linux convert to int64. So in those cases we should >> perhaps raise a FutureWarning instead of an error. >> > > > >>> np.int64(2)**np.array(-1, np.int64) > 0.5 > >>> np.__version__ > '1.10.4' > >>> np.int64(2)**np.array([-1, 2], np.int64) > array([0, 4], dtype=int64) > >>> np.array(2, np.uint64)**np.array([-1, 2], np.int64) > array([0, 4], dtype=int64) > >>> np.array([2], np.uint64)**np.array([-1, 2], np.int64) > array([ 0.5, 4. ]) > >>> np.array([2], np.uint64).squeeze()**np.array([-1, 2], np.int64) > array([0, 4], dtype=int64) > > > (IMO: If you have to break backwards compatibility, break forwards not > backwards.) > Current master is different. I'm not too worried in the array cases as the results for negative exponents were zero except then raising -1 to a power. Since that result is incorrect raising an error falls on the fine line between bug fix and compatibility break. If the pre-releases cause too much trouble. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Oct 26 15:58:20 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Oct 2016 13:58:20 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 1:39 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 12:23 PM, Charles R Harris > wrote: > [...] 
> > What I have been concerned about are the follow combinations that > currently > > return floats > > > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > What's this referring to? For both arrays and scalars I get: > > In [8]: (np.array(2, dtype=np.int8) ** np.array(2, dtype=np.int8)).dtype > Out[8]: dtype('int8') > > In [9]: (np.int8(2) ** np.int8(2)).dtype > Out[9]: dtype('int8') > > You need a negative exponent to see the effect. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 16:12:05 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 16:12:05 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:20 PM, wrote: > > > On Wed, Oct 26, 2016 at 3:11 PM, Mathew S. Madhavacheril < > mathewsyriac at gmail.com> wrote: > >> >> >> On Wed, Oct 26, 2016 at 2:56 PM, Nathaniel Smith wrote: >> >>> On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer >>> wrote: >>> > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril >>> > wrote: >>> >> >>> >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer >>> wrote: >>> >>> >>> >>> I wonder if the goals of this addition could be achieved by simply >>> adding >>> >>> an optional `cov` argument >>> >>> >>> >>> to np.corr, which would provide a pre-computed covariance. >>> >> >>> >> >>> >> That's a fair suggestion which I'm happy to switch to. This >>> eliminates the >>> >> need for two new functions. >>> >> I'll add an optional `cov = False` argument to numpy.corrcoef that >>> returns >>> >> a tuple (corr, cov) instead. >>> >> >>> >>> >>> >>> >>> >>> Either way, `covcorr` feels like a helper function that could exist >>> in >>> >>> user code rather than numpy proper. >>> >> >>> >> >>> >> The user would have to re-implement the part that converts the >>> covariance >>> >> matrix to a correlation >>> >> coefficient. I made this PR to avoid that code duplication. >>> > >>> > >>> > With the API I was envisioning (or even your proposed API, for that >>> matter), >>> > this function would only be a few lines, e.g., >>> > >>> > def covcorr(x): >>> > cov = np.cov(x) >>> > corr = np.corrcoef(x, cov=cov) >>> >>> IIUC, if you have a covariance matrix then you can compute the >>> correlation matrix directly, without looking at 'x', so corrcoef(x, >>> cov=cov) is a bit odd-looking. I think probably the API that makes the >>> most sense is just to expose something like the covtocorr function >>> (maybe it could have a less telegraphic name?)? And then, yeah, users >>> can use that to build their own covcorr or whatever if they want it. >>> >> >> Right, agreed, this is why I said `x` becomes redundant when `cov` is >> specified >> when calling `numpy.corrcoef`. 
So we have two alternatives: >> >> 1) Have `np.corrcoef` accept a boolean optional argument `covmat = False` >> that lets >> one obtain a tuple containing the covariance and the correlation matrices >> in the same call >> 2) Modify my original PR so that `np.covtocorr` remains (with possibly a >> better >> name) but remove `np.covcorr` since this is easy for the user to add. >> >> My preference is option 2. >> > > cov2corr is a useful function > http://www.statsmodels.org/dev/generated/statsmodels.stats. > moment_helpers.cov2corr.html > I also wrote the inverse function corr2cov, but AFAIR use it only in some > test cases. > > > I don't think adding any of the options to corrcoef or covcor is useful > since there is no computational advantage to it. > I'm not sure I agree with that statement. If a user wants to calculate both a covariance and correlation matrix, they currently have two options: A) Call np.cov and np.corrcoef separately, which takes at least twice as long as one call to np.cov. For data-sets that I am used to, a np.cov call takes 5-10 seconds. B) Call np.cov and then separately implement their own correlation matrix code, which means the user isn't able to fully take advantage of code that is already in numpy. In any case, I've updated the PR: https://github.com/numpy/numpy/pull/8211 Relative to my original PR, it: a) removes the numpy.covcorr function which the user can easily implement b) have numpy.cov2corr be the function exposed in the API (previously called numpy.covtocorr in the PR), which accepts a pre-calculated covariance matrix c) have numpy.corrcoef call numpy.cov2corr > > > >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Wed Oct 26 16:18:05 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Wed, 26 Oct 2016 16:18:05 -0400 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: Would it be preferable to have to_begin='first' as an option under the existing kwarg to avoid overlapping? On Wed, Oct 26, 2016 at 3:35 PM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > > Date: Wed, 26 Oct 2016 09:05:41 -0400 > > From: Matthew Harrigan > > > > np.cumsum(np.diff(x, to_begin=x.take([0], axis=axis), axis=axis), > axis=axis) > > > > That's certainly not going to win any beauty contests. The 1d case is > > clean though: > > > > np.cumsum(np.diff(x, to_begin=x[0])) > > > > I'm not sure if this means the API should change, and if so how. Higher > > dimensional arrays seem to just have extra complexity. > > > >> > >> I like the proposal, though I suspect that making it general has > >> obscured that the most common use-case for padding is to make the > >> inverse of np.cumsum (at least that?s what I frequently need), and now > >> in the multidimensional case you have the somewhat unwieldy: > >> > >> >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) > >> > >> rather than > >> > >> >>> np.diff(a, axis=axis, keep_left=True) > >> > >> which of course could just be an option upon what you already have. 
> >> > > So my suggestion was intended that you might want an additional > keyword argument (keep_left=False) to make the inverse np.cumsum > use-case easier, i.e. you would have something in your np.diff like: > > if keep_left: > if to_begin is None: > to_begin = np.take(a, [0], axis=axis) > else: > raise ValueError(?np.diff(a, keep_left=False, to_begin=None) > can be used with either keep_left or to_begin, but not both.?) > > Generally I try to avoid optional keyword argument overlap, but in > this case it is probably justified. > > Peter > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 26 16:20:55 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 16:20:55 -0400 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:57 PM, Charles R Harris wrote: > > > On Wed, Oct 26, 2016 at 1:39 PM, wrote: > >> >> >> On Wed, Oct 26, 2016 at 3:23 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer >>> wrote: >>> >>>> I am also concerned about adding more special cases for NumPy scalars >>>> vs arrays. These cases are already confusing (e.g., making no distinction >>>> between 0d arrays and scalars) and poorly documented. >>>> >>>> On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: >>>> >>>>> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >>>>> wrote: >>>>> > Hi All, >>>>> > >>>>> > I've been thinking about this some (a lot) more and have an alternate >>>>> > proposal for the behavior of the `**` operator >>>>> > >>>>> > if both base and power are numpy/python scalar integers, convert to >>>>> python >>>>> > integers and call the `**` operator. That would solve both the >>>>> precision and >>>>> > compatibility problems and I think is the option of least surprise. >>>>> For >>>>> > those who need type preservation and modular arithmetic, the np.power >>>>> > function remains, although the type conversions can be surpirising >>>>> as it >>>>> > seems that the base and power should play different roles in >>>>> determining >>>>> > the type, at least to me. >>>>> > Array, 0-d or not, are treated differently from scalars and integers >>>>> raised >>>>> > to negative integer powers always raise an error. >>>>> > >>>>> > I think this solves most problems and would not be difficult to >>>>> implement. >>>>> > >>>>> > Thoughts? >>>>> >>>>> My main concern about this is that it adds more special cases to numpy >>>>> scalars, and a new behavioral deviation between 0d arrays and scalars, >>>>> when ideally we should be trying to reduce the >>>>> duplication/discrepancies between these. It's also inconsistent with >>>>> how other operations on integer scalars work, e.g. regular addition >>>>> overflows rather than promoting to Python int: >>>>> >>>>> In [8]: np.int64(2 ** 63 - 1) + 1 >>>>> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >>>>> overflow encountered in long_scalars >>>>> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >>>>> Out[8]: -9223372036854775808 >>>>> >>>>> So I'm inclined to try and keep it simple, like in your previous >>>>> proposal... 
theoretically of course it would be nice to have the >>>>> perfect solution here, but at this point it feels like we might be >>>>> overthinking this trying to get that last 1% of improvement. The thing >>>>> where 2 ** -1 returns 0 is just broken and bites people so we should >>>>> definitely fix it, but beyond that I'm not sure it really matters >>>>> *that* much what we do, and "special cases aren't special enough to >>>>> break the rules" and all that. >>>>> >>>>> >>> What I have been concerned about are the follow combinations that >>> currently return floats >>> >>> num: , exp: , res: >> 'numpy.float32'> >>> num: , exp: , res: >> 'numpy.float32'> >>> num: , exp: , res: >> 'numpy.float32'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> >>> The other combinations of signed and unsigned integers to signed powers >>> currently raise ValueError due to the change to the power ufunc. The >>> exceptions that aren't covered by uint64 + signed (which won't change) seem >>> to occur when the exponent can be safely cast to the base type. I suspect >>> that people have already come to depend on that, especially as python >>> integers on 64 bit linux convert to int64. So in those cases we should >>> perhaps raise a FutureWarning instead of an error. >>> >> >> >> >>> np.int64(2)**np.array(-1, np.int64) >> 0.5 >> >>> np.__version__ >> '1.10.4' >> >>> np.int64(2)**np.array([-1, 2], np.int64) >> array([0, 4], dtype=int64) >> >>> np.array(2, np.uint64)**np.array([-1, 2], np.int64) >> array([0, 4], dtype=int64) >> >>> np.array([2], np.uint64)**np.array([-1, 2], np.int64) >> array([ 0.5, 4. ]) >> >>> np.array([2], np.uint64).squeeze()**np.array([-1, 2], np.int64) >> array([0, 4], dtype=int64) >> >> >> (IMO: If you have to break backwards compatibility, break forwards not >> backwards.) >> > > Current master is different. I'm not too worried in the array cases as the > results for negative exponents were zero except then raising -1 to a power. > Since that result is incorrect raising an error falls on the fine line > between bug fix and compatibility break. If the pre-releases cause too much > trouble. > naive question: if cleaning up the inconsistencies already (kind of) breaks backwards compatibility and didn't result in a big outcry, why can we not go with a Future warning all the way to float. (i.e. use the power function with specified dtype instead of ** if you insist on int return) Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
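For reference, a small sketch of the explicit-dtype workaround suggested above: np.power is a ufunc, so it already accepts a dtype argument, letting users state the result type they want instead of relying on what `**` infers:

    import numpy as np

    np.power(2, 3, dtype=np.int64)     # 8    -- integer result on request
    np.power(2, -2, dtype=np.float64)  # 0.25 -- float result for negative exponents
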
URL: From oleksandr.pavlyk at intel.com Wed Oct 26 16:30:51 2016 From: oleksandr.pavlyk at intel.com (Pavlyk, Oleksandr) Date: Wed, 26 Oct 2016 20:30:51 +0000 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> Hi, Thanks a lot everybody for the feedback. The package can certainly be made a stand-alone drop-in replacement for np.random. There are many points raised and unraised in favor of this, and it is easy to accomplish. I will create a stand-alone package on github, but would still appreciate some help in reviewing it and making it available at PyPI. Interestingly, Nathaniel's link to a representative changes, specifically https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 point at an unused code borrowed directly from mtrand/distributions.c: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L297 More representative change would be the implementation of Student's T-distribution: https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L232-L262 The module under review, similarly to randomstate package, provides alternative basic pseudo-random number generators (BRNGs), like MT2203, MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with randomstate implementing some generators absent in MKL and vice-versa. Thinking about the possibility of providing the functionality of this module within the framework of randomstate, I find that randomstate implements samplers from statistical distributions as functions that take the state of the underlying BRNG, and produce a single variate, e.g.: https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26 This design stands in a way of efficient use of MKL, which generates a whole vector of variates at a time. This can be done faster than sampling a variate at a time by using vectorized instructions. So I wrote mkl_distributions.cpp to provide functions that return a given size vector of sampled variates from each supported distribution. mklrand.pyx was then written by modifying mtrand.pyx to work with such vector generators. In particular, this allowed for efficient sampling from product distributions of Poisson distributions with different rate parameters, which is implemented in MKL: https://software.intel.com/en-us/node/521894 https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1071 Another point already raised by Nathaniel is that for numpy's randomness ideally should provide a way to override default algorithm for sampling from a particular distribution. For example RandomState object that implements PCG may rely on default acceptance-rejection algorithm for sampling from Gamma, while the RandomState object that provides interface to MKL might want to call into MKL directly. While at this topic, I also would like to point out the need for C-API interface to randomness, particularly felt writing parallel algorithms, where Python's GIL and use of Lock() in RandomState hurt scalability. 
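To make the vector-at-a-time argument concrete, here is a small timing sketch using only the existing numpy.random (nothing MKL-specific is assumed); filling a whole array in one call is typically one to two orders of magnitude faster than drawing the same variates one at a time:

    import numpy as np
    import timeit

    rs = np.random.RandomState(1234)

    # one call that fills a 100000-element vector, repeated 10 times
    t_vec = timeit.timeit(lambda: rs.standard_normal(100000), number=10)
    # the same number of variates drawn one at a time
    t_one = timeit.timeit(lambda: [rs.standard_normal() for _ in range(100000)],
                          number=10)
    print(t_vec, t_one)
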
Oleksandr -----Original Message----- From: NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Nathaniel Smith Sent: Wednesday, October 26, 2016 2:25 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Intel random number package On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote: > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: >>> >>> >>> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >>> >> > >>> wrote: >>> >>> On 26.10.2016 06:34, Charles R Harris wrote: >>> > Hi All, >>> > >>> > There is a proposed random number package PR now up on github: >>> > https://github.com/numpy/numpy/pull/8209 >>> . It is from >>> > oleksandr-pavlyk >> > and implements >>> > the number random number package using MKL for increased speed. >>> I think >>> > we are definitely interested in the improved speed, but I'm >>> not sure >>> > numpy is the best place to put the package. I'd welcome any >>> comments on >>> > the PR itself, as well as any thoughts on the best way >>> organize or use >>> > of this work. Maybe scikit-random >>> >>> >>> Note that this thread is a continuation of >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.h >>> tml >>> >>> >>> >>> I'm not a fan of putting code depending on a proprietary library >>> into numpy. >>> This should be a standalone package which may provide the same >>> interface >>> as numpy. >>> >>> >>> I don't really see a problem with that in principle. Numpy can use >>> Intel MKL (and Accelerate) as well if it's available. It needs some >>> thought put into the API though - a ``numpy.random_intel`` module is >>> certainly not what we want. >>> >> >> For me there is a difference between being able to optionally use a >> proprietary library as an alternative to free software libraries if >> the user wishes to do so and offering functionality that only works >> with non-free software. >> We are providing a form of advertisement for them by allowing it (hey >> if you buy this black box that you cannot modify or use freely you >> get this neat numpy feature!). >> >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful >> that way. > > But then if this is really just the same random numbers numpy already > provides just faster, it is probably acceptable in principle. I > haven't actually looked at the PR yet. The RNG stream is totally different, so yeah, it can't just be a silent drop-in replacement like BLAS/LAPACK. The patch also adds ~10,000 lines of code; here's an example of what some of it looks like: https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 I don't see how we can realistically commit to maintaining this. I'm also not really seeing how shipping it as part of numpy provides extra benefits to maintainers or users? AFAICT right now it's basically structured as a standalone library that's been dropped into the numpy source tree, and it would be just as easy to ship separately (or am I wrong?). And since the public API is that all the functionality comes from importing this specific new module ('numpy.random_intel'), it'd be a one-line change for users to import from a non-numpy namespace, like 'mkl.random' or whatever. If it were more integrated with the rest of numpy then the trade-offs would be more complicated, but in its present form this seems like an easy call. 
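Concretely, the one-line change mentioned above could look like the following sketch; the module name mkl_random is purely hypothetical here, standing in for whatever name a standalone distribution of the code would use:

    try:
        import mkl_random as rnd      # hypothetical standalone package
    except ImportError:
        import numpy.random as rnd    # fall back to stock numpy

    rs = rnd.RandomState(1234)
    x = rs.standard_normal(1000)
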
The other question is whether it could/should change to *become* more integrated... that's more tricky. There's been some work towards supporting swappable backends inside np.random; but the focus has mostly been on allowing new core generators, though, and this code seems to want to take over the whole thing (core generator + distributions), so even once the swappable backends stuff is working I'm not sure it would be relevant here. The one case I can think of that does seem promising is that if we get an API for users to say "I don't care about stream compatibility, just give me un-reproducible variates as fast as you can", then it might make sense for that to silently use MKL if available -- this would be pretty analogous to the use of MKL in np.linalg. But we don't have that API yet, I'm not sure how the MKL fallback could be maintainably implemented given that it would require somehow swapping the entire RandomState implementation, and it's entirely possible that once we figure out solutions to those then it'd still make sense for the actual MKL wrappers to live in a third-party library that numpy imports. -n -- Nathaniel J. Smith -- https://vorpus.org _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From p.e.creasey.00 at googlemail.com Wed Oct 26 16:31:18 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Wed, 26 Oct 2016 13:31:18 -0700 Subject: [Numpy-discussion] padding options for diff Message-ID: > Date: Wed, 26 Oct 2016 16:18:05 -0400 > From: Matthew Harrigan > > Would it be preferable to have to_begin='first' as an option under the > existing kwarg to avoid overlapping? > >> if keep_left: >> if to_begin is None: >> to_begin = np.take(a, [0], axis=axis) >> else: >> raise ValueError(?np.diff(a, keep_left=False, to_begin=None) >> can be used with either keep_left or to_begin, but not both.?) >> >> Generally I try to avoid optional keyword argument overlap, but in >> this case it is probably justified. >> It works for me. I can't *think* of a case where you could have a np.diff on a string array and 'first' could be confused with an element, since you're not allowed diff on strings in the present numpy anyway (unless wiser heads than me know something!). Feel free to move the conversation to github btw. Peter From toddrjen at gmail.com Wed Oct 26 17:03:37 2016 From: toddrjen at gmail.com (Todd) Date: Wed, 26 Oct 2016 17:03:37 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> Message-ID: On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > > The module under review, similarly to randomstate package, provides > alternative basic pseudo-random number generators (BRNGs), like MT2203, > MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with > randomstate implementing some generators absent in MKL and vice-versa. > > Is there a reason that randomstate shouldn't implement those generators? 
> Thinking about the possibility of providing the functionality of this > module within the framework of randomstate, I find that randomstate > implements samplers from statistical distributions as functions that take > the state of the underlying BRNG, and produce a single variate, e.g.: > > https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/ > distributions.c#L23-L26 > > This design stands in a way of efficient use of MKL, which generates a > whole vector of variates at a time. This can be done faster than sampling a > variate at a time by using vectorized instructions. So I wrote > mkl_distributions.cpp to provide functions that return a given size vector > of sampled variates from each supported distribution. > I don't know a huge amount about pseudo-random number generators, but this seems superficially to be something that would benefit random number generation as a whole independently of whether MKL is used. Might it be possible to modify the numpy implementation to support this sort of vectorized approach? Another point already raised by Nathaniel is that for numpy's randomness > ideally should provide a way to override default algorithm for sampling > from a particular distribution. For example RandomState object that > implements PCG may rely on default acceptance-rejection algorithm for > sampling from Gamma, while the RandomState object that provides interface > to MKL might want to call into MKL directly. > The approach that pyfftw uses at least for scipy, which may also work here, is that you can monkey-patch the scipy.fftpack module at runtime, replacing it with pyfftw's drop-in replacement. scipy then proceeds to use pyfftw instead of its built-in fftpack implementation. Might such an approach work here? Users can either use this alternative randomstate replacement directly, or they can replace numpy's with it at runtime and numpy will then proceed to use the alternative. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oleksandr.pavlyk at intel.com Wed Oct 26 17:25:40 2016 From: oleksandr.pavlyk at intel.com (Pavlyk, Oleksandr) Date: Wed, 26 Oct 2016 21:25:40 +0000 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> Message-ID: <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Please see responses inline. From: NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Todd Sent: Wednesday, October 26, 2016 4:04 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Intel random number package On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > wrote: The module under review, similarly to randomstate package, provides alternative basic pseudo-random number generators (BRNGs), like MT2203, MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with randomstate implementing some generators absent in MKL and vice-versa. Is there a reason that randomstate shouldn't implement those generators? No, randomstate certainly can implement all the BRNGs implemented in MKL. It is at developer?s discretion. 
Thinking about the possibility of providing the functionality of this module within the framework of randomstate, I find that randomstate implements samplers from statistical distributions as functions that take the state of the underlying BRNG, and produce a single variate, e.g.: https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26 This design stands in a way of efficient use of MKL, which generates a whole vector of variates at a time. This can be done faster than sampling a variate at a time by using vectorized instructions. So I wrote mkl_distributions.cpp to provide functions that return a given size vector of sampled variates from each supported distribution. I don't know a huge amount about pseudo-random number generators, but this seems superficially to be something that would benefit random number generation as a whole independently of whether MKL is used. Might it be possible to modify the numpy implementation to support this sort of vectorized approach? I also think that adopting vectorized mindset would benefit np.random. For example, Gaussians are currently generated using Box-Muller algorithm which produces two variate at a time, so one currently needs to be saved in the random state struct itself, along with an indicator that it should be used on the next iteration. With vectorized approach one could populate the vector two elements at a time with better memory locality, resulting in better performance. Vectorized approach has merits with or without use of MKL. Another point already raised by Nathaniel is that for numpy's randomness ideally should provide a way to override default algorithm for sampling from a particular distribution. For example RandomState object that implements PCG may rely on default acceptance-rejection algorithm for sampling from Gamma, while the RandomState object that provides interface to MKL might want to call into MKL directly. The approach that pyfftw uses at least for scipy, which may also work here, is that you can monkey-patch the scipy.fftpack module at runtime, replacing it with pyfftw's drop-in replacement. scipy then proceeds to use pyfftw instead of its built-in fftpack implementation. Might such an approach work here? Users can either use this alternative randomstate replacement directly, or they can replace numpy's with it at runtime and numpy will then proceed to use the alternative. I think the monkey-patching approach will work. RandomState was written with a view to replace numpy.random at some point in the future. It is standalone at the moment, from what I understand, only because it is still being worked on and extended. One particularly important development is the ability to sample continuous distributions in floats, or to populate a given preallocated buffer with random samples. These features are missing from numpy.random_intel and we thought it providing them. As I have said earlier, another missing feature in the C-API for randomness in numpy. Oleksandr -------------- next part -------------- An HTML attachment was scrubbed... 
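For what it's worth, the monkey-patching approach referred to above amounts to something like the sketch below; alt_random is a hypothetical drop-in module exposing the numpy.random API, and the patch only redirects lookups made through the numpy package attribute after it is applied:

    import numpy as np
    import alt_random          # hypothetical drop-in replacement for numpy.random

    np.random = alt_random                 # rebind the attribute on the package
    x = np.random.standard_normal(1000)    # now served by the replacement

    # Note: code that did `from numpy.random import standard_normal` before the
    # patch, or C extensions using numpy's internals, is not redirected.
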
URL: From ralf.gommers at gmail.com Thu Oct 27 04:25:33 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 27 Oct 2016 21:25:33 +1300 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > Please see responses inline. > > > > *From:* NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] *On > Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > *Subject:* Re: [Numpy-discussion] Intel random number package > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr < > oleksandr.pavlyk at intel.com> wrote: > > Another point already raised by Nathaniel is that for numpy's randomness > ideally should provide a way to override default algorithm for sampling > from a particular distribution. For example RandomState object that > implements PCG may rely on default acceptance-rejection algorithm for > sampling from Gamma, while the RandomState object that provides interface > to MKL might want to call into MKL directly. > > > > The approach that pyfftw uses at least for scipy, which may also work > here, is that you can monkey-patch the scipy.fftpack module at runtime, > replacing it with pyfftw's drop-in replacement. scipy then proceeds to use > pyfftw instead of its built-in fftpack implementation. Might such an > approach work here? Users can either use this alternative randomstate > replacement directly, or they can replace numpy's with it at runtime and > numpy will then proceed to use the alternative. > The only reason that pyfftw uses monkeypatching is that the better approach is not possible due to license constraints with FFTW (it's GPL). > I think the monkey-patching approach will work. > It will work, for a while at least, but it's bad design. We're all on the same page I think that a separate submodule for random_intel is a no go, but as an explicitly switchable backend for functions with the same signature it would be fine imho. Of course we don't have that backend infrastructure today, but it's something we want and have been discussing anyway. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vikramsingh001 at gmail.com Thu Oct 27 06:30:59 2016 From: vikramsingh001 at gmail.com (Vikram Singh) Date: Thu, 27 Oct 2016 13:30:59 +0300 Subject: [Numpy-discussion] Problem with compiling openacc with f2py Message-ID: I am a newbie to f2py so I have been creating simple test cases. Eventually I want to be able to use openacc subroutine from python. 
So here's the test case:

module test

  use iso_c_binding, only: sp => C_FLOAT, dp => C_DOUBLE, i8 => C_INT
  use omp_lib
  use openacc

  implicit none

contains

  subroutine add_acc (a, b, n, c)
    integer(kind=i8), intent(in) :: n
    real(kind=dp), intent(in) :: a(n)
    real(kind=dp), intent(in) :: b(n)
    real(kind=dp), intent(out) :: c(n)
    integer(kind=i8) :: i

    !$acc kernels
    do i = 1, n
       c(i) = a(i) + b(i)
    end do
    !$acc end kernels
  end subroutine add_acc

  subroutine add_omp (a, b, n, c)
    integer(kind=i8), intent(in) :: n
    real(kind=dp), intent(in) :: a(n)
    real(kind=dp), intent(in) :: b(n)
    real(kind=dp), intent(out) :: c(n)
    integer(kind=i8) :: i, j

    !$omp parallel do
    do i = 1, n
       c(i) = a(i) + b(i)
    end do
    !$omp end parallel do
  end subroutine add_omp

  subroutine nt (c)
    integer(kind=i8), intent(out) :: c
    c = omp_get_max_threads()
  end subroutine nt

  subroutine mult (a, b, c)
    real(kind=dp), intent(in) :: a
    real(kind=dp), intent(in) :: b
    real(kind=dp), intent(out) :: c
    c = a * b
  end subroutine mult

end module test

I compile using:

f2py -c -m --f90flags='-fopenacc -foffload=nvptx-none -foffload=-O3 -O3 -fPIC' hello hello.f90 -L/usr/local/cuda/lib64 -lcublas -lcudart -lgomp

Now, until I add the acc directives everything works fine. But as soon as I add the acc directives I get this error. 
Vikram From toddrjen at gmail.com Thu Oct 27 10:30:36 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 10:30:36 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr < > oleksandr.pavlyk at intel.com> wrote: > >> Please see responses inline. >> >> >> >> *From:* NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] *On >> Behalf Of *Todd >> *Sent:* Wednesday, October 26, 2016 4:04 PM >> *To:* Discussion of Numerical Python >> *Subject:* Re: [Numpy-discussion] Intel random number package >> >> >> >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr < >> oleksandr.pavlyk at intel.com> wrote: >> >> Another point already raised by Nathaniel is that for numpy's randomness >> ideally should provide a way to override default algorithm for sampling >> from a particular distribution. For example RandomState object that >> implements PCG may rely on default acceptance-rejection algorithm for >> sampling from Gamma, while the RandomState object that provides interface >> to MKL might want to call into MKL directly. >> >> >> >> The approach that pyfftw uses at least for scipy, which may also work >> here, is that you can monkey-patch the scipy.fftpack module at runtime, >> replacing it with pyfftw's drop-in replacement. scipy then proceeds to use >> pyfftw instead of its built-in fftpack implementation. Might such an >> approach work here? Users can either use this alternative randomstate >> replacement directly, or they can replace numpy's with it at runtime and >> numpy will then proceed to use the alternative. >> > > The only reason that pyfftw uses monkeypatching is that the better > approach is not possible due to license constraints with FFTW (it's GPL). > Yes, that is exactly why I brought it up. Better approaches are also not possible with MKL due to license constraints. It is a very similar situation overall. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 27 10:43:40 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 27 Oct 2016 16:43:40 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On 10/27/2016 04:30 PM, Todd wrote: > On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr > > wrote: > > Please see responses inline. > > > > *From:*NumPy-Discussion > [mailto:numpy-discussion-bounces at scipy.org > ] *On Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > > *Subject:* Re: [Numpy-discussion] Intel random number package > > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > > > wrote: > > Another point already raised by Nathaniel is that for > numpy's randomness ideally should provide a way to override > default algorithm for sampling from a particular > distribution. 
For example RandomState object that > implements PCG may rely on default acceptance-rejection > algorithm for sampling from Gamma, while the RandomState > object that provides interface to MKL might want to call > into MKL directly. > > > > The approach that pyfftw uses at least for scipy, which may also > work here, is that you can monkey-patch the scipy.fftpack module > at runtime, replacing it with pyfftw's drop-in replacement. > scipy then proceeds to use pyfftw instead of its built-in > fftpack implementation. Might such an approach work here? > Users can either use this alternative randomstate replacement > directly, or they can replace numpy's with it at runtime and > numpy will then proceed to use the alternative. > > > The only reason that pyfftw uses monkeypatching is that the better > approach is not possible due to license constraints with FFTW (it's > GPL). > > > Yes, that is exactly why I brought it up. Better approaches are also > not possible with MKL due to license constraints. It is a very similar > situation overall. > Its not that similar, the better approach is certainly possible with FFTW, the GPL is compatible with numpys license. It is only a concern users of binary distributions. Nobody provided the code to use fftw yet, but it would certainly be accepted. From toddrjen at gmail.com Thu Oct 27 10:52:48 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 10:52:48 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 10/27/2016 04:30 PM, Todd wrote: > >> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > > wrote: >> >> >> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr >> > >> wrote: >> >> Please see responses inline. >> >> >> >> *From:*NumPy-Discussion >> [mailto:numpy-discussion-bounces at scipy.org >> ] *On Behalf Of *Todd >> *Sent:* Wednesday, October 26, 2016 4:04 PM >> *To:* Discussion of Numerical Python > > >> *Subject:* Re: [Numpy-discussion] Intel random number package >> >> >> >> >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr >> > >> wrote: >> >> Another point already raised by Nathaniel is that for >> numpy's randomness ideally should provide a way to override >> default algorithm for sampling from a particular >> distribution. For example RandomState object that >> implements PCG may rely on default acceptance-rejection >> algorithm for sampling from Gamma, while the RandomState >> object that provides interface to MKL might want to call >> into MKL directly. >> >> >> >> The approach that pyfftw uses at least for scipy, which may also >> work here, is that you can monkey-patch the scipy.fftpack module >> at runtime, replacing it with pyfftw's drop-in replacement. >> scipy then proceeds to use pyfftw instead of its built-in >> fftpack implementation. Might such an approach work here? >> Users can either use this alternative randomstate replacement >> directly, or they can replace numpy's with it at runtime and >> numpy will then proceed to use the alternative. >> >> >> The only reason that pyfftw uses monkeypatching is that the better >> approach is not possible due to license constraints with FFTW (it's >> GPL). 
>> >> >> Yes, that is exactly why I brought it up. Better approaches are also >> not possible with MKL due to license constraints. It is a very similar >> situation overall. >> >> > Its not that similar, the better approach is certainly possible with FFTW, > the GPL is compatible with numpys license. It is only a concern users of > binary distributions. Nobody provided the code to use fftw yet, but it > would certainly be accepted. Although it is technically compatible, it would make numpy effectively GPL. Suggestions for this have been explicitly rejected on these grounds [1] [1] https://github.com/numpy/numpy/issues/3485 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 27 11:14:30 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 27 Oct 2016 17:14:30 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On 10/27/2016 04:52 PM, Todd wrote: > On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor > > > wrote: > > On 10/27/2016 04:30 PM, Todd wrote: > > On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > > >> > wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr > > >> wrote: > > Please see responses inline. > > > > *From:*NumPy-Discussion > [mailto:numpy-discussion-bounces at scipy.org > > >] *On Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > > >> > *Subject:* Re: [Numpy-discussion] Intel random number > package > > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > > >> > wrote: > > Another point already raised by Nathaniel is that for > numpy's randomness ideally should provide a way to > override > default algorithm for sampling from a particular > distribution. For example RandomState object that > implements PCG may rely on default acceptance-rejection > algorithm for sampling from Gamma, while the RandomState > object that provides interface to MKL might want to call > into MKL directly. > > > > The approach that pyfftw uses at least for scipy, which > may also > work here, is that you can monkey-patch the > scipy.fftpack module > at runtime, replacing it with pyfftw's drop-in replacement. > scipy then proceeds to use pyfftw instead of its built-in > fftpack implementation. Might such an approach work here? > Users can either use this alternative randomstate > replacement > directly, or they can replace numpy's with it at runtime and > numpy will then proceed to use the alternative. > > > The only reason that pyfftw uses monkeypatching is that the > better > approach is not possible due to license constraints with > FFTW (it's > GPL). > > > Yes, that is exactly why I brought it up. Better approaches are > also > not possible with MKL due to license constraints. It is a very > similar > situation overall. > > > Its not that similar, the better approach is certainly possible with > FFTW, the GPL is compatible with numpys license. It is only a > concern users of binary distributions. Nobody provided the code to > use fftw yet, but it would certainly be accepted. > > > Although it is technically compatible, it would make numpy effectively > GPL. 
Suggestions for this have been explicitly rejected on these > grounds [1] > > [1] https://github.com/numpy/numpy/issues/3485 > Yes it would make numpy GPL, but that is not a concern for a lot of users. Users for who it is a problem can still use the non-GPL version. A more interesting debate is whether our binary wheels should then be GPL wheels by default or not. Probably not, but that is something that should be discussed when its an actual issue. But to clarify what I said, it would be accepted if the value it provides is sufficient compared to the code maintenance it adds. Given that pyfftw already exists the value is probably relatively small, but personally I'd still be interested in code that allows switching the fft backend as that could also allow plugging e.g. gpu based implementations (though again this is already covered by other third party modules). From robbmcleod at gmail.com Thu Oct 27 11:42:36 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Thu, 27 Oct 2016 17:42:36 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: Releasing NumPy under GPL would make it incompatible with SciPy, which may be _slightly_ inconvenient to the scientific Python community: https://scipy.github.io/old-wiki/pages/License_Compatibility.html https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html Robert On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 10/27/2016 04:52 PM, Todd wrote: > >> On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor >> > >> wrote: >> >> On 10/27/2016 04:30 PM, Todd wrote: >> >> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers >> >> >> >> wrote: >> >> >> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr >> > >> > >> wrote: >> >> Please see responses inline. >> >> >> >> *From:*NumPy-Discussion >> [mailto:numpy-discussion-bounces at scipy.org >> >> > >] *On Behalf Of *Todd >> *Sent:* Wednesday, October 26, 2016 4:04 PM >> *To:* Discussion of Numerical Python >> >> > >> >> *Subject:* Re: [Numpy-discussion] Intel random number >> package >> >> >> >> >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr >> > >> > >> >> >> wrote: >> >> Another point already raised by Nathaniel is that for >> numpy's randomness ideally should provide a way to >> override >> default algorithm for sampling from a particular >> distribution. For example RandomState object that >> implements PCG may rely on default >> acceptance-rejection >> algorithm for sampling from Gamma, while the >> RandomState >> object that provides interface to MKL might want to >> call >> into MKL directly. >> >> >> >> The approach that pyfftw uses at least for scipy, which >> may also >> work here, is that you can monkey-patch the >> scipy.fftpack module >> at runtime, replacing it with pyfftw's drop-in >> replacement. >> scipy then proceeds to use pyfftw instead of its built-in >> fftpack implementation. Might such an approach work here? >> Users can either use this alternative randomstate >> replacement >> directly, or they can replace numpy's with it at runtime >> and >> numpy will then proceed to use the alternative. 
>> >> >> The only reason that pyfftw uses monkeypatching is that the >> better >> approach is not possible due to license constraints with >> FFTW (it's >> GPL). >> >> >> Yes, that is exactly why I brought it up. Better approaches are >> also >> not possible with MKL due to license constraints. It is a very >> similar >> situation overall. >> >> >> Its not that similar, the better approach is certainly possible with >> FFTW, the GPL is compatible with numpys license. It is only a >> concern users of binary distributions. Nobody provided the code to >> use fftw yet, but it would certainly be accepted. >> >> >> Although it is technically compatible, it would make numpy effectively >> GPL. Suggestions for this have been explicitly rejected on these >> grounds [1] >> >> [1] https://github.com/numpy/numpy/issues/3485 >> >> > Yes it would make numpy GPL, but that is not a concern for a lot of users. > Users for who it is a problem can still use the non-GPL version. > A more interesting debate is whether our binary wheels should then be GPL > wheels by default or not. Probably not, but that is something that should > be discussed when its an actual issue. > > But to clarify what I said, it would be accepted if the value it provides > is sufficient compared to the code maintenance it adds. Given that pyfftw > already exists the value is probably relatively small, but personally I'd > still be interested in code that allows switching the fft backend as that > could also allow plugging e.g. gpu based implementations (though again this > is already covered by other third party modules). > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Thu Oct 27 11:57:17 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 11:57:17 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: It would still be compatible with SciPy, it would "just" mean that SciPy (and anything else that uses numpy) would be effectively GPL. On Thu, Oct 27, 2016 at 11:42 AM, Robert McLeod wrote: > Releasing NumPy under GPL would make it incompatible with SciPy, which may > be _slightly_ inconvenient to the scientific Python community: > > https://scipy.github.io/old-wiki/pages/License_Compatibility.html > > https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html > > Robert > > On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 10/27/2016 04:52 PM, Todd wrote: >> >>> On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor >>> > >>> wrote: >>> >>> On 10/27/2016 04:30 PM, Todd wrote: >>> >>> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers >>> >>> >> >>> wrote: >>> >>> >>> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr >>> >> >>> >> >> wrote: >>> >>> Please see responses inline. 
>>> >>> >>> >>> *From:*NumPy-Discussion >>> [mailto:numpy-discussion-bounces at scipy.org >>> >>> >> >] *On Behalf Of >>> *Todd >>> *Sent:* Wednesday, October 26, 2016 4:04 PM >>> *To:* Discussion of Numerical Python >>> >>> >> >> >>> *Subject:* Re: [Numpy-discussion] Intel random number >>> package >>> >>> >>> >>> >>> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr >>> >> >>> >> >>> >> >>> wrote: >>> >>> Another point already raised by Nathaniel is that for >>> numpy's randomness ideally should provide a way to >>> override >>> default algorithm for sampling from a particular >>> distribution. For example RandomState object that >>> implements PCG may rely on default >>> acceptance-rejection >>> algorithm for sampling from Gamma, while the >>> RandomState >>> object that provides interface to MKL might want to >>> call >>> into MKL directly. >>> >>> >>> >>> The approach that pyfftw uses at least for scipy, which >>> may also >>> work here, is that you can monkey-patch the >>> scipy.fftpack module >>> at runtime, replacing it with pyfftw's drop-in >>> replacement. >>> scipy then proceeds to use pyfftw instead of its built-in >>> fftpack implementation. Might such an approach work >>> here? >>> Users can either use this alternative randomstate >>> replacement >>> directly, or they can replace numpy's with it at runtime >>> and >>> numpy will then proceed to use the alternative. >>> >>> >>> The only reason that pyfftw uses monkeypatching is that the >>> better >>> approach is not possible due to license constraints with >>> FFTW (it's >>> GPL). >>> >>> >>> Yes, that is exactly why I brought it up. Better approaches are >>> also >>> not possible with MKL due to license constraints. It is a very >>> similar >>> situation overall. >>> >>> >>> Its not that similar, the better approach is certainly possible with >>> FFTW, the GPL is compatible with numpys license. It is only a >>> concern users of binary distributions. Nobody provided the code to >>> use fftw yet, but it would certainly be accepted. >>> >>> >>> Although it is technically compatible, it would make numpy effectively >>> GPL. Suggestions for this have been explicitly rejected on these >>> grounds [1] >>> >>> [1] https://github.com/numpy/numpy/issues/3485 >>> >>> >> Yes it would make numpy GPL, but that is not a concern for a lot of >> users. Users for who it is a problem can still use the non-GPL version. >> A more interesting debate is whether our binary wheels should then be GPL >> wheels by default or not. Probably not, but that is something that should >> be discussed when its an actual issue. >> >> But to clarify what I said, it would be accepted if the value it provides >> is sufficient compared to the code maintenance it adds. Given that pyfftw >> already exists the value is probably relatively small, but personally I'd >> still be interested in code that allows switching the fft backend as that >> could also allow plugging e.g. gpu based implementations (though again this >> is already covered by other third party modules). >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Robert McLeod, Ph.D. 
> Center for Cellular Imaging and Nano Analytics (C-CINA) > Biozentrum der Universit?t Basel > Mattenstrasse 26, 4058 Basel > Work: +41.061.387.3225 > robert.mcleod at unibas.ch > robert.mcleod at bsse.ethz.ch > robbmcleod at gmail.com > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 27 12:01:41 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 27 Oct 2016 18:01:41 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: <577d02ab-4fa7-d4fa-98a8-7e1227b7ebb2@googlemail.com> As I understand it the wiki is talking about including code in numpy/scipy itself, all code in numpy and scipy must be permissively licensed so it is easy to reason about when building your binaries. The license of the binaries produced from the code is a different matter, which at that time didn't really exist as we didn't distribute binaries at all (except for windows). A GPL licensed binary containing numpy is perfectly compatible with SciPy. It may not be compatible with some other component which has an actually incompatible license (e.g. anything you cannot distribute the source of as required by the GPL). I it is not numpy that is GPL licensed it is the restriction of another component in the binary distribution that makes the full product adhere to the most restrictive license But numpy itself is always permissive, the distributor can always build a permissive numpy binary without the viral component in it. On 10/27/2016 05:42 PM, Robert McLeod wrote: > Releasing NumPy under GPL would make it incompatible with SciPy, which > may be _slightly_ inconvenient to the scientific Python community: > > https://scipy.github.io/old-wiki/pages/License_Compatibility.html > > https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html > > Robert > > On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor > > > wrote: > > On 10/27/2016 04:52 PM, Todd wrote: > > On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor > > >> > wrote: > > On 10/27/2016 04:30 PM, Todd wrote: > > On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > > > > >>> > wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr > > > > > >>> wrote: > > Please see responses inline. > > > > *From:*NumPy-Discussion > [mailto:numpy-discussion-bounces at scipy.org > > > > > >>] *On Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > > > > > >>> > *Subject:* Re: [Numpy-discussion] Intel random > number > package > > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > > > > > > >>> > wrote: > > Another point already raised by Nathaniel is > that for > numpy's randomness ideally should provide a > way to > override > default algorithm for sampling from a particular > distribution. For example RandomState > object that > implements PCG may rely on default > acceptance-rejection > algorithm for sampling from Gamma, while the > RandomState > object that provides interface to MKL might > want to call > into MKL directly. 
> > > > The approach that pyfftw uses at least for > scipy, which > may also > work here, is that you can monkey-patch the > scipy.fftpack module > at runtime, replacing it with pyfftw's drop-in > replacement. > scipy then proceeds to use pyfftw instead of its > built-in > fftpack implementation. Might such an approach > work here? > Users can either use this alternative randomstate > replacement > directly, or they can replace numpy's with it at > runtime and > numpy will then proceed to use the alternative. > > > The only reason that pyfftw uses monkeypatching is > that the > better > approach is not possible due to license constraints with > FFTW (it's > GPL). > > > Yes, that is exactly why I brought it up. Better > approaches are > also > not possible with MKL due to license constraints. It is > a very > similar > situation overall. > > > Its not that similar, the better approach is certainly > possible with > FFTW, the GPL is compatible with numpys license. It is only a > concern users of binary distributions. Nobody provided the > code to > use fftw yet, but it would certainly be accepted. > > > Although it is technically compatible, it would make numpy > effectively > GPL. Suggestions for this have been explicitly rejected on these > grounds [1] > > [1] https://github.com/numpy/numpy/issues/3485 > > > > Yes it would make numpy GPL, but that is not a concern for a lot of > users. Users for who it is a problem can still use the non-GPL version. > A more interesting debate is whether our binary wheels should then > be GPL wheels by default or not. Probably not, but that is something > that should be discussed when its an actual issue. > > But to clarify what I said, it would be accepted if the value it > provides is sufficient compared to the code maintenance it adds. > Given that pyfftw already exists the value is probably relatively > small, but personally I'd still be interested in code that allows > switching the fft backend as that could also allow plugging e.g. gpu > based implementations (though again this is already covered by other > third party modules). > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Robert McLeod, Ph.D. 
> Center for Cellular Imaging and Nano Analytics (C-CINA) > Biozentrum der Universit?t Basel > Mattenstrasse 26, 4058 Basel > Work: +41.061.387.3225 > robert.mcleod at unibas.ch > robert.mcleod at bsse.ethz.ch > robbmcleod at gmail.com > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Thu Oct 27 12:12:58 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 27 Oct 2016 09:12:58 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Oct 27, 2016 8:42 AM, "Robert McLeod" wrote: > > Releasing NumPy under GPL would make it incompatible with SciPy, which may be _slightly_ inconvenient to the scientific Python community: > > https://scipy.github.io/old-wiki/pages/License_Compatibility.html > > https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html There's 0 chance that numpy is going to switch to the GPL in general, so please don't panic. Also, you're misunderstanding license compatibility, so let's back up a step :-). The discussion was about whether numpy might potentially, at some unspecified future date, be available with *optional* GPL code. A numpy build with optional GPL bits included would be similar to how the numpy builds that many people use which that are linked to MKL, and thus subject to MKL's license terms. In both cases the license is no longer numpy's regular bsd, but has these extra bits added. Neither changes the availability of bsd-licensed numpy; they just give another option. And, both numpy+GPL-bits and numpy+MKL-bits are/would be license *compatible* with scipy in the sense that matters to end users: you can absolutely use and distribute numpy+(pick one of the above)+scipy together, and the licenses are happy to allow that. The sense in which they're both *in*compatible with scipy is just that if you want to *add code to scipy itself*, then that code can't be GPL like pyfftw, or proprietary like MKL, because the scipy devs have decided that they don't want to allow that. That's a decision they've made for good reasons, but it isn't a legal inevitability, and it doesn't stop *you* from using and distributing scipy and GPL code together, or scipy and proprietary code together. (The real license incompatibility is between GPL and proprietary. Either one can be mixed with BSD, but they can't be mixed with each other and then distributed. Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship both MKL and pyfftw, and they picked MKL. Even then, though, this license restriction only applies to software distributors: if you as an end user go and install MKL and pyfftw together in the privacy of your own cluster, then that's also totally legal.) -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Thu Oct 27 13:45:24 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 13:45:24 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith wrote: > Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship > both MKL and pyfftw, and they picked MKL. Anaconda does ship GPL code [1]. They even ship GPL code that depends on numpy, such as cvxcanon and pystan, and there doesn't seem to be anything that prevents me from installing them alongside the MKL version of numpy. So I don't see how it would be any different for pyfftw. [1] https://docs.continuum.io/anaconda/pkg-docs -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Oct 27 14:01:08 2016 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 Oct 2016 11:01:08 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 10:45 AM, Todd wrote: > > On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith wrote: >> >> Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship both MKL and pyfftw, and they picked MKL. > > Anaconda does ship GPL code [1]. They even ship GPL code that depends on numpy, such as cvxcanon and pystan, and there doesn't seem to be anything that prevents me from installing them alongside the MKL version of numpy. So I don't see how it would be any different for pyfftw. I think we've exhausted the relevance of this tangent to Oleksander's contributions. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Thu Oct 27 14:29:58 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Thu, 27 Oct 2016 14:29:58 -0400 Subject: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d In-Reply-To: References: <1467880459.17128.10.camel@sipsolutions.net> Message-ID: Hi, I would like to revitalize the discussion on including PR#7804 (atleast_nd function) at Stephan Hoyer's request. atleast_nd has come up as a convenient workaround for #8206 (adding padding options to diff) to be able to do broadcasting with the required dimensions reversed. Regards, -Joe On Mon, Jul 11, 2016 at 10:41 AM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > I would like to follow up on my original PR (7804). While there > appears to be some debate as to whether the PR is numpy material to > begin with, there do not appear to be any technical issues with it. To > make the decision more straightforward, I factored out the > non-controversial bug fixes to masked arrays into PR #7823, along with > their regression tests. This way, the original enhancement can be > closed or left hanging indefinitely, (even though I hope neither > happens). PR 7804 still has the bug fixes duplicated in it. 
> > Regards, > > -Joe > > > On Thu, Jul 7, 2016 at 9:11 AM, Joseph Fox-Rabinovitz > wrote: > > On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg > > wrote: > >> On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote: > >>> I don't see how one could define a spec that would take an arbitrary > >>> array of indices at which to place new dimensions. By definition, you > >>> > >> > >> You just give a reordered range, so that (1, 0, 2) would be the current > >> 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, > >> add everything of course). > > > > I was originally thinking (-1, 0) for the 2D case. Just go along the > > list and fill as many dims as necessary. Your way is much better since > > it does not require a different operation for positive and negative > > indices. > > > >> However, I have my doubts that it is actually easier to understand then > >> to write yourself ;). > > > > A dictionary or ragged list would be better for that: either {1: (1, > > 0), 2: (2,)} or [(1, 0), (2,)]. The first is more clear since the > > index in the list is the starting ndim - 1. > > > >> > >> - Sebastian > >> > >> > >>> don't know how many dimensions are going to be added. If you knew, > >>> then you wouldn't be calling this function. I can only imagine simple > >>> rules such as 'left' or 'right' or maybe something akin to what > >>> at_least3d() implements. > >>> > >>> On Wed, Jul 6, 2016 at 3:20 PM, Joseph Fox-Rabinovitz >>> @gmail.com> wrote: > >>> > On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing > >>> > wrote: > >>> > > On 2016/07/06 8:25 AM, Benjamin Root wrote: > >>> > >> > >>> > >> I wouldn't have the keyword be "where", as that collides with > >>> > the notion > >>> > >> of "where" elsewhere in numpy. > >>> > > > >>> > > > >>> > > Agreed. Maybe "side"? > >>> > > >>> > I have tentatively changed it to "pos". The reason that I don't > >>> > like > >>> > "side" is that it implies only a subset of the possible ways that > >>> > that > >>> > the position of the new dimensions can be specified. The current > >>> > implementation only puts things on one side or the other, but I > >>> > have > >>> > considered also allowing an array of indices at which to place new > >>> > dimensions, and/or a dictionary keyed by the starting ndims. I do > >>> > not > >>> > think "side" would be appropriate for these extended cases, even if > >>> > they are very unlikely to ever materialize. > >>> > > >>> > -Joe > >>> > > >>> > > (I find atleast_1d and atleast_2d to be very helpful for handling > >>> > inputs, as > >>> > > Ben noted; I'm skeptical as to the value of atleast_3d and > >>> > atleast_nd.) > >>> > > > >>> > > Eric > >>> > > > >>> > > _______________________________________________ > >>> > > NumPy-Discussion mailing list > >>> > > NumPy-Discussion at scipy.org > >>> > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > >>> > _______________________________________________ > >>> > NumPy-Discussion mailing list > >>> > NumPy-Discussion at scipy.org > >>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion > >>> > > >>> _______________________________________________ > >>> NumPy-Discussion mailing list > >>> NumPy-Discussion at scipy.org > >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > -------------- next part -------------- An HTML attachment was scrubbed... 
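To make the behaviour under discussion a bit more concrete, here is a minimal sketch of the kind of thing `atleast_nd` is meant to do. This is not the code from PR 7804, and the `side` keyword below is only a stand-in for whichever spelling ("side", "pos", or something richer) the discussion above settles on:

```python
import numpy as np

def atleast_nd(a, ndim, side='left'):
    """Rough sketch only: pad with size-1 axes until `a` has at least `ndim` dims."""
    a = np.asanyarray(a)
    missing = ndim - a.ndim
    if missing <= 0:
        return a
    pad = (1,) * missing
    shape = pad + a.shape if side == 'left' else a.shape + pad
    return a.reshape(shape)

atleast_nd(np.arange(3), 3).shape                # (1, 1, 3)
atleast_nd(np.arange(3), 3, side='right').shape  # (3, 1, 1)
atleast_nd(np.ones((2, 2)), 4).shape             # (1, 1, 2, 2)
```

The open design question in the thread is precisely how much more than "left" or "right" the position argument should be able to express.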
URL: From djxvillain at gmail.com Thu Oct 27 15:58:25 2016 From: djxvillain at gmail.com (djxvillain) Date: Thu, 27 Oct 2016 12:58:25 -0700 (MST) Subject: [Numpy-discussion] How to use user input as equation directly Message-ID: <1477598305150-43665.post@n7.nabble.com> Hello all, I am an electrical engineer and new to numpy. I need the ability to take in user input, and use that input as a variable. For example: t = input('enter t: ') x = input('enter x: ') I need the user to be able to enter something like x = 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just typed it in the .py file. There's no clean way to cast or evaluate it that I've found. I could make a function to parse this string character by character, but I figured this is probably a common problem and someone else has probably figured it out and created an object for it. I can't find a library that does it though. If I can provide any more information please let me know. Thank you in advance for your help. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From rmay31 at gmail.com Thu Oct 27 17:26:28 2016 From: rmay31 at gmail.com (Ryan May) Date: Thu, 27 Oct 2016 15:26:28 -0600 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: <1477598305150-43665.post@n7.nabble.com> References: <1477598305150-43665.post@n7.nabble.com> Message-ID: On Thu, Oct 27, 2016 at 1:58 PM, djxvillain wrote: > Hello all, > > I am an electrical engineer and new to numpy. I need the ability to take > in > user input, and use that input as a variable. For example: > > t = input('enter t: ') > x = input('enter x: ') > > I need the user to be able to enter something like x = > 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just typed > it in the .py file. There's no clean way to cast or evaluate it that I've > found. > Are you aware of Python's eval function: https://docs.python.org/3/library/functions.html#eval ? Ryan -- Ryan May -------------- next part -------------- An HTML attachment was scrubbed... URL: From djxvillain at gmail.com Thu Oct 27 16:06:44 2016 From: djxvillain at gmail.com (djxvillain) Date: Thu, 27 Oct 2016 13:06:44 -0700 (MST) Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: <1477598804771-43667.post@n7.nabble.com> That worked perfectly. I've been googling how to do this, I guess I didn't phrase it correctly. Thank you very much. You just saved me a ton of time. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665p43667.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From jladasky at itu.edu Thu Oct 27 17:33:03 2016 From: jladasky at itu.edu (John Ladasky) Date: Thu, 27 Oct 2016 14:33:03 -0700 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: <1477598305150-43665.post@n7.nabble.com> References: <1477598305150-43665.post@n7.nabble.com> Message-ID: This isn't just a Numpy issue. You are interested in Python's eval(). Keep in mind that any programming language that blurs the line between code and data (many do not) has a potential security vulnerability. 
What if your user doesn't type "x = 2*np.sin(2*np.pi*44100*t+np.pi/2)" but instead types this: "import os ; os.remove('/home')" I do NOT recommend that you eval() the second statement. You can try to write code which traps unwanted input before you eval() it. It's apparently quite hard to stop everything bad from getting through. On Thu, Oct 27, 2016 at 12:58 PM, djxvillain wrote: > Hello all, > > I am an electrical engineer and new to numpy. I need the ability to take > in > user input, and use that input as a variable. For example: > > t = input('enter t: ') > x = input('enter x: ') > > I need the user to be able to enter something like x = > 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just typed > it in the .py file. There's no clean way to cast or evaluate it that I've > found. > > I could make a function to parse this string character by character, but I > figured this is probably a common problem and someone else has probably > figured it out and created an object for it. I can't find a library that > does it though. > > If I can provide any more information please let me know. Thank you in > advance for your help. > > > > -- > View this message in context: http://numpy-discussion.10968. > n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *John J. Ladasky Jr., Ph.D.* *Research Scientist* *International Technological University* *2711 N. First St, San Jose, CA 95134 USA* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Oct 27 17:35:53 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Oct 2016 17:35:53 -0400 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: Perhaps the numexpr package might be safer? Not exactly meant for this situation (meant for optimizations), but the evaluator is pretty darn safe. Ben Root On Thu, Oct 27, 2016 at 5:33 PM, John Ladasky wrote: > This isn't just a Numpy issue. You are interested in Python's eval(). > > Keep in mind that any programming language that blurs the line between > code and data (many do not) has a potential security vulnerability. What > if your user doesn't type > > "x = 2*np.sin(2*np.pi*44100*t+np.pi/2)" > > but instead types this: > > "import os ; os.remove('/home')" > > I do NOT recommend that you eval() the second statement. > > You can try to write code which traps unwanted input before you eval() > it. It's apparently quite hard to stop everything bad from getting through. > > > On Thu, Oct 27, 2016 at 12:58 PM, djxvillain wrote: > >> Hello all, >> >> I am an electrical engineer and new to numpy. I need the ability to take >> in >> user input, and use that input as a variable. For example: >> >> t = input('enter t: ') >> x = input('enter x: ') >> >> I need the user to be able to enter something like x = >> 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just >> typed >> it in the .py file. There's no clean way to cast or evaluate it that I've >> found. >> >> I could make a function to parse this string character by character, but I >> figured this is probably a common problem and someone else has probably >> figured it out and created an object for it. 
I can't find a library that >> does it though. >> >> If I can provide any more information please let me know. Thank you in >> advance for your help. >> >> >> >> -- >> View this message in context: http://numpy-discussion.10968. >> n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665.html >> Sent from the Numpy-discussion mailing list archive at Nabble.com. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > *John J. Ladasky Jr., Ph.D.* > *Research Scientist* > *International Technological University* > *2711 N. First St, San Jose, CA 95134 USA* > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From djxvillain at gmail.com Thu Oct 27 16:21:18 2016 From: djxvillain at gmail.com (djxvillain) Date: Thu, 27 Oct 2016 13:21:18 -0700 (MST) Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: <1477599678422-43670.post@n7.nabble.com> This will not be a public product and will only be used by other engineers/scientists for research. I don't think security should be a huge issue, but I appreciate your input and concern for the quality of my code. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665p43670.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ben.v.root at gmail.com Thu Oct 27 17:52:47 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Oct 2016 17:52:47 -0400 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: <1477599678422-43670.post@n7.nabble.com> References: <1477598305150-43665.post@n7.nabble.com> <1477599678422-43670.post@n7.nabble.com> Message-ID: "only be used by engineers/scientists for research" Famous last words. I know plenty of scientists who would love to "do research" with an exposed eval(). Full disclosure, I personally added a security hole into matplotlib thinking I covered all my bases in protecting an eval() statement. Ben Root On Thu, Oct 27, 2016 at 4:21 PM, djxvillain wrote: > This will not be a public product and will only be used by other > engineers/scientists for research. I don't think security should be a huge > issue, but I appreciate your input and concern for the quality of my code. > > > > -- > View this message in context: http://numpy-discussion.10968. > n7.nabble.com/How-to-use-user-input-as-equation-directly- > tp43665p43670.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Fri Oct 28 09:23:20 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 28 Oct 2016 14:23:20 +0100 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: Matthew has made what looks like a very nice implementation of padding in np.diff in https://github.com/numpy/numpy/pull/8206. 
I raised two general questions about desired behaviour there that Matthew thought we should put out on the mailiing list as well. This indeed seemed a good opportunity to get feedback, so herewith a copy of https://github.com/numpy/numpy/pull/8206#issuecomment-256909027 -- Marten 1. I'm not sure that treating a 1-d array as something that will just extend the result along `axis` is a good idea, as it breaks standard broadcasting rules. E.g., consider ``` np.diff([[1, 2], [4, 8]], to_begin=[1, 4]) # with your PR: array([[1, 4, 1], [1, 4, 4]]) # but from regular broadcasting I would expect array([[1, 1], [4, 4]]) # i.e., the same as if I did to_begin=[[1, 4]] ``` I think it is slightly odd to break the broadcasting expectation here, especially since the regular use case surely is just to add a single element so that one keeps the original shape. The advantage of assuming this is that you do not have to do *any* array shaping of `to_begin` and `to_end` (which perhaps also suggests it is the right thing to do). 2. As I mentioned above, I think it may be worth thinking through a little what to do with higher order differences, at least for `to_begin='first'`. If the goal is to ensure that with that option, it becomes the inverse of `cumsum`, then I think for higher order one should add multiple elements in front, i.e., for that case, the recursive call should be ``` return np.diff(np.diff(a, to_begin='first'), n-1, to_begin='first') ``` From bennyrowland at mac.com Fri Oct 28 09:29:31 2016 From: bennyrowland at mac.com (Ben Rowland) Date: Fri, 28 Oct 2016 09:29:31 -0400 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> <1477599678422-43670.post@n7.nabble.com> Message-ID: It is important to bear in mind where the code is being run - if this is something running on a researcher?s own system, they almost certainly have lots of other ways of messing it up. These kind of security vulnerabilities are normally only relevant when you are running code that came from somewhere else. That being said, this use case sounds like it could work with the Jupyter notebook. If you want something that is like typing code into a .py file but evaluated at run time instead, why not just use an interactive Python REPL instead of eval(input()). Ben > On 27 Oct 2016, at 17:52, Benjamin Root wrote: > > "only be used by engineers/scientists for research" > > Famous last words. I know plenty of scientists who would love to "do research" with an exposed eval(). Full disclosure, I personally added a security hole into matplotlib thinking I covered all my bases in protecting an eval() statement. > > Ben Root > > On Thu, Oct 27, 2016 at 4:21 PM, djxvillain > wrote: > This will not be a public product and will only be used by other > engineers/scientists for research. I don't think security should be a huge > issue, but I appreciate your input and concern for the quality of my code. > > > > -- > View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665p43670.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. 
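Returning to point 2 of the np.diff questions above: `to_begin` is a keyword proposed in PR 8206 and does not exist in released numpy, but the "inverse of cumsum" property it is meant to provide can be sketched with plain concatenation (an illustration of the intended semantics, not of the PR's implementation):

```python
import numpy as np

a = np.array([3, 5, 9, 17])

# What np.diff(a, to_begin='first') is proposed to return: the first
# element kept in front of the first-order differences.
d = np.concatenate(([a[0]], np.diff(a)))   # array([3, 2, 4, 8])

# cumsum then recovers the original array, i.e. diff with 'first'
# prepended acts as the inverse of cumsum.
np.cumsum(d)                               # array([ 3,  5,  9, 17])
```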
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Fri Oct 28 10:18:05 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Fri, 28 Oct 2016 16:18:05 +0200 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: On Thu, Oct 27, 2016 at 11:35 PM, Benjamin Root wrote: > Perhaps the numexpr package might be safer? Not exactly meant for this > situation (meant for optimizations), but the evaluator is pretty darn safe. > > It would not be able to evaluate something like 'np.arange(50)' for example, since it only has a limited subset of numpy functionality. In the example provided, that or linspace is likely the natural input for the variable 't'. -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 28 20:18:23 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Oct 2016 18:18:23 -0600 Subject: [Numpy-discussion] Numpy scalar integers to negative scalar integer powers. Message-ID: Hi All, I've put up a PR to deal with the numpy scalar integer powers at https://github.com/numpy/numpy/pull/8221. Note that for now everything goes through the np.power function. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 29 09:02:21 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Oct 2016 07:02:21 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ Message-ID: Hi All, Does anyone remember discussion of numpy scalars apropos __numpy_ufunc__? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Oct 29 21:03:10 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 29 Oct 2016 18:03:10 -0700 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: I'm happy to revisit the __numpy_ufunc__ discussion (I still want to see it happen!), but I don't recall scalars being a point of contention. The obvious thing to do with scalars would be to treat them the same as 0-dimensional arrays, though I might be missing some nuance... On Sat, Oct 29, 2016 at 6:02 AM, Charles R Harris wrote: > Hi All, > > Does anyone remember discussion of numpy scalars apropos __numpy_ufunc__? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed...
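As a small illustration of the "scalars as 0-d arrays" point above: as far as the computed result goes, the two are already interchangeable, which is why treating them the same in the override protocol seems natural (this only demonstrates the values, not the override machinery):

```python
import numpy as np

x = np.arange(3)
np.multiply(np.float64(2.0), x)   # array([0., 2., 4.])  -- numpy scalar
np.multiply(np.array(2.0), x)     # array([0., 2., 4.])  -- 0-d array, same result
```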
URL: From charlesr.harris at gmail.com Sat Oct 29 21:56:12 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Oct 2016 19:56:12 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Sat, Oct 29, 2016 at 7:03 PM, Stephan Hoyer wrote: > I'm happy to revisit the __numpy_ufunc__ discussion (I still want to see > it happen!), but I don't recall scalars being a point of contention. > The __numpy_ufunc__ functionality is the last bit I want for 1.12.0, the rest of the remaining changes I can kick forward to 1.13.0. I will start taking a look tomorrow, probably starting with Nathaniel's work. > > The obvious thing to do with scalars would be to treat them the same as > 0-dimensional arrays, though I might be missing some nuance... > That's my thought. Currently they just look at __array_priority__ and call the corresponding array method if needed, so that maybe needs some improvement and a formal statement of intent. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Oct 30 06:34:35 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 30 Oct 2016 10:34:35 +0000 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: > The __numpy_ufunc__ functionality is the last bit I want for 1.12.0, the > rest of the remaining changes I can kick forward to 1.13.0. I will start > taking a look tomorrow, probably starting with Nathaniel's work. Great! I'll revive the Quantity PRs that implement __numpy_ufunc__! -- Marten From vikramsingh001 at gmail.com Sun Oct 30 10:12:27 2016 From: vikramsingh001 at gmail.com (Vikram Singh) Date: Sun, 30 Oct 2016 16:12:27 +0200 Subject: [Numpy-discussion] Problem with compiling openacc with f2py In-Reply-To: References: Message-ID: Ok, I got it to compile using f2py -c -m --f90flags='-fopenmp -fopenacc -foffload=nvptx-none -foffload=-O3 -O3 -fPIC' hello hello.f90 -L/usr/local/cuda/lib64 -lcublas -lcudart -lgomp But now I get the import error, /home/Experiments/fortran_python/hello.cpython-35m-x86_64-linux-gnu.so: undefined symbol: __offload_func_table Seems to me I have to link another library. But where is __offload_func_table On Thu, Oct 27, 2016 at 1:30 PM, Vikram Singh wrote: > I am a newbie to f2py so I have been creating simple test cases. > Eventually I want to be able to use openacc subroutine from python. 
So > here's the test case > > module test > > use iso_c_binding, only: sp => C_FLOAT, dp => C_DOUBLE, i8 => C_INT > use omp_lib > use openacc > > implicit none > > contains > > subroutine add_acc (a, b, n, c) > integer(kind=i8), intent(in) :: n > real(kind=dp), intent(in) :: a(n) > real(kind=dp), intent(in) :: b(n) > real(kind=dp), intent(out) :: c(n) > > integer(kind=i8) :: i > > !$acc kernels > do i = 1, n > c(i) = a(i) + b(i) > end do > !$acc end kernels > > end subroutine add_acc > > subroutine add_omp (a, b, n, c) > integer(kind=i8), intent(in) :: n > real(kind=dp), intent(in) :: a(n) > real(kind=dp), intent(in) :: b(n) > real(kind=dp), intent(out) :: c(n) > > integer(kind=i8) :: i, j > > !$omp parallel do > do i = 1, n > c(i) = a(i) + b(i) > end do > !$omp end parallel do > > end subroutine add_omp > > subroutine nt (c) > integer(kind=i8), intent(out) :: c > > c = omp_get_max_threads() > > end subroutine nt > > subroutine mult (a, b, c) > real(kind=dp), intent(in) :: a > real(kind=dp), intent(in) :: b > real(kind=dp), intent(out) :: c > > c = a * b > > end subroutine mult > > end module test > > I compile using > > f2py -c -m --f90flags='-fopenacc -foffload=nvptx-none -foffload=-O3 > -O3 -fPIC' hello hello.f90 -L/usr/local/cuda/lib64 -lcublas -lcudart > -lgomp > > Now, until I add the acc directives everything works fine. But as soon > as I add the acc directives I get this error. > > gfortran:f90: /tmp/tmpld6ssow3/src.linux-x86_64-3.5/hello-f2pywrappers2.f90 > /home//Experiments/Nvidia/OpenACC/OLCFHack15/gcc6/install/bin/gfortran > -Wall -g -Wall -g -shared > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hellomodule.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/fortranobject.o > /tmp/tmpld6ssow3/hello.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hello-f2pywrappers2.o > -L/usr/local/cuda/lib64 -L/home//usr/local/miniconda/lib -lcublas > -lcudart -lgomp -lpython3.5m -lgfortran -o > ./hello.cpython-35m-x86_64-linux-gnu.so > /usr/bin/ld: /tmp/cc2yQ89d.target.o: relocation R_X86_64_32 against > `.rodata' can not be used when making a shared object; recompile with > -fPIC > /tmp/cc2yQ89d.target.o: error adding symbols: Bad value > collect2: error: ld returned 1 exit status > /usr/bin/ld: /tmp/cc2yQ89d.target.o: relocation R_X86_64_32 against > `.rodata' can not be used when making a shared object; recompile with > -fPIC > /tmp/cc2yQ89d.target.o: error adding symbols: Bad value > collect2: error: ld returned 1 exit status > error: Command "/home//Experiments/Nvidia/OpenACC/OLCFHack15/gcc6/install/bin/gfortran > -Wall -g -Wall -g -shared > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hellomodule.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/fortranobject.o > /tmp/tmpld6ssow3/hello.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hello-f2pywrappers2.o > -L/usr/local/cuda/lib64 -L/home//usr/local/miniconda/lib -lcublas > -lcudart -lgomp -lpython3.5m -lgfortran -o > ./hello.cpython-35m-x86_64-linux-gnu.so" failed with exit status 1 > > I don't get why just putting acc directives should create errors, when > omp does not. > > Vikram From m.h.vankerkwijk at gmail.com Mon Oct 31 13:08:22 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 31 Oct 2016 17:08:22 +0000 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Hi Chuck, I've revived my Quantity PRs that use __numpy_ufunc__ but is it correct that at present in *dev, one cannot use it? 
All the best, Marten From charlesr.harris at gmail.com Mon Oct 31 13:31:06 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 31 Oct 2016 11:31:06 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Mon, Oct 31, 2016 at 11:08 AM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Chuck, > > I've revived my Quantity PRs that use __numpy_ufunc__ but is it > correct that at present in *dev, one cannot use it? > It's not enabled yet. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 31 13:39:51 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 31 Oct 2016 10:39:51 -0700 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Recall that I think we wanted to rename this to __array_ufunc__, so we could change the function signature: https://github.com/numpy/numpy/issues/5986 I'm still a little nervous about this. Chunk -- what is your proposal for resolving the outstanding issues from https://github.com/numpy/numpy/issues/5844? On Mon, Oct 31, 2016 at 10:31 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Mon, Oct 31, 2016 at 11:08 AM, Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Hi Chuck, >> >> I've revived my Quantity PRs that use __numpy_ufunc__ but is it >> correct that at present in *dev, one cannot use it? >> > > It's not enabled yet. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 31 13:47:16 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 31 Oct 2016 11:47:16 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Mon, Oct 31, 2016 at 11:39 AM, Stephan Hoyer wrote: > Recall that I think we wanted to rename this to __array_ufunc__, so we > could change the function signature: https://github.com/numpy/ > numpy/issues/5986 > > I'm still a little nervous about this. Chunk -- what is your proposal for > resolving the outstanding issues from https://github.com/numpy/ > numpy/issues/5844? > We were pretty close. IIRC, the outstanding issue was some sort of override. At the developer meeting at scipy 2015 it was agreed that it would be easy to finish things up under the rubric "make Pauli happy". But that wasn't happening which is why I asked Nathaniel to disable it for 1.10.0. It is now a year later, things have cooled, and, IMHO, it is time to take another shot at it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Oct 31 18:37:47 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 31 Oct 2016 22:37:47 +0000 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Hi Chuck, > We were pretty close. IIRC, the outstanding issue was some sort of override. Correct. With a general sentiment of those downstream that it would be great to merge in any form, as it will be really helpful! (Generic speedup of factor of 2 for computationally expensive ufuncs (sin, cos, etc.) that needs scaling in Quantity...) > At the developer meeting at scipy 2015 it was agreed that it would be easy > to finish things up under the rubric "make Pauli happy". 
That would certainly make me happy too! Other items that were brought up (trying to summarize from issues linked above, and links therein):

1. Remove index argument
2. Out always a tuple
3. Let ndarray have a __numpy_ufunc__ stub, so one can super it.

Here, the first item implied a possible name change (to __array_ufunc__); if that's too troublesome, I don't think it really hurts to have the argument, though it is somewhat "unclean" for the case that only the output has __numpy_ufunc__. All the best, Marten
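For readers trying to picture what such an override looks like in practice, here is a very rough sketch. It assumes the renamed hook (__array_ufunc__) with the index argument removed and out always passed as a tuple, i.e. the protocol as discussed above, and a numpy build in which the hook is actually enabled (which, as noted above, was not yet the case in *dev at the time). It is not astropy's Quantity implementation, just a toy scaled-array class standing in for that use case:

```python
import numpy as np

class ScaledArray:
    """Toy stand-in for Quantity: plain data plus a scale factor."""

    def __init__(self, value, scale=1.0):
        self.value = np.asarray(value)
        self.scale = scale

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Item 1 above: no index argument -- the signature is just
        # (ufunc, method, *inputs, **kwargs).
        # Unwrap ScaledArray inputs, folding the scale into the data;
        # a real implementation would also unwrap the `out` tuple (item 2)
        # and handle methods such as 'reduce'.
        raw = [x.value * x.scale if isinstance(x, ScaledArray) else x
               for x in inputs]
        result = getattr(ufunc, method)(*raw, **kwargs)
        return ScaledArray(result)

s = ScaledArray([1.0, 2.0], scale=3.0)
print(np.add(s, 1.0).value)   # [4. 7.] -- np.add dispatched to the override
print(np.sin(s).value)        # sin applied once to the already-scaled data
```

Whatever the final name and signature end up being, the body of such an override would look essentially the same; the speedup Marten mentions above presumably comes from being able to convert the inputs and call the ufunc in a single pass rather than fixing up the result afterwards.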