From chris.barker at noaa.gov Sat Oct 1 14:38:16 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 1 Oct 2016 11:38:16 -0700 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Julian, This is really, really cool! I have been wanting something like this for years (over a decade? wow!), but always thought it would require hacking the interpreter to intercept operations. This is a really inspired idea, and could buy numpy a lot of performance. I'm afraid I can't say much about the implementation details -- but great work! -Chris On Fri, Sep 30, 2016 at 2:50 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 30.09.2016 23:09, josef.pktd at gmail.com wrote: > > On Fri, Sep 30, 2016 at 9:38 AM, Julian Taylor > > wrote: > >> hi, > >> Temporary arrays generated in expressions are expensive as the imply > >> extra memory bandwidth which is the bottleneck in most numpy operations. > >> For example: > >> > >> r = a + b + c > >> > >> creates the b + c temporary and then adds a to it. > >> This can be rewritten to be more efficient using inplace operations: > >> > >> r = b + c > >> r += a > > > > general question (I wouldn't understand the details even if I looked.) > > > > how is this affected by broadcasting and type promotion? > > > > Some of the main reasons that I don't like to use inplace operation in > > general is that I'm often not sure when type promotion occurs and when > > arrays expand during broadcasting. > > > > for example b + c is 1-D, a is 2-D, and r has the broadcasted shape. > > another case when I switch away from broadcasting is when b + c is int > > or bool and a is float. Thankfully, we get error messages for casting > > now. > > the temporary is only avoided when the casting follows the safe rule, so > it should be the same as what you get without inplace operations. E.g. > float32-temporary + float64 will not be converted to the unsafe float32 > += float64 which a normal inplace operations would allow. But > float64-temp + float32 is transformed. > > Currently the only broadcasting that will be transformed is temporary + > scalar value, otherwise it will only work on matching array sizes. > Though there is not really anything that prevents full broadcasting but > its not implemented yet in the PR. > > > > >> > >> This saves some memory bandwidth and can speedup the operation by 50% > >> for very large arrays or even more if the inplace operation allows it to > >> be completed completely in the cpu cache. > > > > I didn't realize the difference can be so large. That would make > > streamlining some code worth the effort. > > > > Josef > > > > > >> > >> The problem is that inplace operations are a lot less readable so they > >> are often only used in well optimized code. But due to pythons > >> refcounting semantics we can actually do some inplace conversions > >> transparently. > >> If an operand in python has a reference count of one it must be a > >> temporary so we can use it as the destination array. CPython itself does > >> this optimization for string concatenations. > >> > >> In numpy we have the issue that we can be called from the C-API directly > >> where the reference count may be one for other reasons. > >> To solve this we can check the backtrace until the python frame > >> evaluation function. If there are only numpy and python functions in > >> between that and our entry point we should be able to elide the > temporary. 
> >> > >> This PR implements this: > >> https://github.com/numpy/numpy/pull/7997 > >> > >> It currently only supports Linux with glibc (which has reliable > >> backtraces via unwinding) and maybe MacOS depending on how good their > >> backtrace is. On windows the backtrace APIs are different and I don't > >> know them but in theory it could also be done there. > >> > >> A problem is that checking the backtrace is quite expensive, so should > >> only be enabled when the involved arrays are large enough for it to be > >> worthwhile. In my testing this seems to be around 180-300KiB sized > >> arrays, basically where they start spilling out of the CPU L2 cache. > >> > >> I made a little crappy benchmark script to test this cutoff in this > branch: > >> https://github.com/juliantaylor/numpy/tree/elide-bench > >> > >> If you are interested you can run it with: > >> python setup.py build_ext -j 4 --inplace > >> ipython --profile=null check.ipy > >> > >> At the end it will plot the ratio between elided and non-elided runtime. > >> It should get larger than one around 180KiB on most cpus. > >> > >> If no one points out some flaw in the approach, I'm hoping to get this > >> into the next numpy version. > >> > >> cheers, > >> Julian > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Oct 1 15:54:01 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 1 Oct 2016 22:54:01 +0300 Subject: [Numpy-discussion] Vendorize tempita In-Reply-To: References: Message-ID: 01.10.2016 3:42 ???????????? "Charles R Harris" ???????: > > > > On Fri, Sep 30, 2016 at 10:36 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >> >> >> On Fri, Sep 30, 2016 at 10:10 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >>> >>> >>> >>> On Fri, Sep 30, 2016 at 9:48 AM, Evgeni Burovski < evgeny.burovskiy at gmail.com> wrote: >>>> >>>> On Fri, Sep 30, 2016 at 6:29 PM, Charles R Harris >>>> wrote: >>>> > >>>> > >>>> > On Fri, Sep 30, 2016 at 9:21 AM, Benjamin Root wrote: >>>> >> >>>> >> This is the first I am hearing of tempita (looks to be a templating >>>> >> language). How is it a dependency of numpy? Do I now need tempita in order >>>> >> to use numpy, or is it a build-time-only dependency? >>>> > >>>> > >>>> > Build time only. The virtue of tempita is that it can be used to generate >>>> > cython sources. We could adapt one of our current templating scripts to do >>>> > that also, but that would seem to be more work. Note that tempita is >>>> > currently included in cython, but the cython folks consider that an >>>> > implemention detail that should not be depended upon. 
>>>> > >>>> > >>>> > >>>> > Chuck >>>> > >>>> > _______________________________________________ >>>> > NumPy-Discussion mailing list >>>> > NumPy-Discussion at scipy.org >>>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> > >>>> >>>> >>>> Ideally, it's packaged in such a way that it's usable for scipy too -- >>>> at the moment it's used in scipy.sparse via Cython.Tempita + a >>>> fallback to system installed tempita if Cython.Tempita is not >>>> available (however I'm not sure that fallback is ever exercised). >>>> Since scipy needs to support numpy down to 1.8.2, a vendorized copy >>>> will not be usable for scipy for quite a while. >>>> >>>> So, it'd be great to handle it like numpydoc: to have npy_tempita as a >>>> small self-contained package with the repo under the numpy >>>> organization and include it via a git submodule. Chuck, do you think >>>> tempita would need much in terms of maintenance? >>>> >>>> To put some money where my mouth is, I can offer to do some legwork >>>> for packaging it up. >>>> >>> >>> It might be better to keep tempita and cythonize together so that the search path works out right. It is also possible that other scripts might be wanted as cythonize is currently restricted to cython files (*.pyx.in, *. pxi.in). There are two other templating scripts in numpy/distutils, and I think f2py has a dependency on one of those. >>> >>> If there is a set of tools that would be common to both scipy and numpy, having them included as a submodule would be a good idea. >>> >> >> Hmm, I suppose it just depends on where submodule is, so a npy_tempita alone would work fine. There isn't much maintenance needed if you resist the urge to refactor the code. I removed a six dependency, but that is now upstream as well. > > > There don't seem to be any objections, so I will put the current vendorization in. Evgeni, if you think it a good idea to make a repo for this and use submodules, go ahead with that. I have left out the testing infrastructure at https://github.com/gjhiggins/tempita which runs a sparse set of doctests. As long as it's being vendored into numpy/tools, I don't think there's much point in having one more copy. If any of cython.tempita, gjhiggins/tempita, and numpy/tools/npy_tempita disappears, we can reconsider adding a submodule. Thanks for working on this! Cheers, Evgeni -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 1 19:02:01 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 1 Oct 2016 17:02:01 -0600 Subject: [Numpy-discussion] Dropping sourceforge for releases. Message-ID: Hi All, Ralf has suggested dropping sourceforge as a NumPy release site. There was discussion of doing that some time back but we have not yet done it. Now that we put wheels up on PyPI for all supported architectures source forge is not needed. I note that there are still some 15,000 downloads a week from the site, so it is still used. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Sun Oct 2 09:10:45 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Sun, 2 Oct 2016 09:10:45 -0400 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Just thinking aloud, an idea I had recently takes a different approach. 
The problem with temporaries isn't so much that they exist, but rather that they keep getting malloc'ed and cleared. What if numpy kept a small LRU cache of weakref'ed temporaries? Whenever a new numpy array is requested, numpy could see if there is already one in its cache of matching size and use it. If you think about it, expressions that result in many temporaries would quite likely have many of them being the same size in memory.

Don't know how feasible it would be to implement though.

Cheers!
Ben Root

On Sat, Oct 1, 2016 at 2:38 PM, Chris Barker wrote:
> Julian,
>
> This is really, really cool!
>
> I have been wanting something like this for years (over a decade? wow!),
> but always thought it would require hacking the interpreter to intercept
> operations. This is a really inspired idea, and could buy numpy a lot of
> performance.
>
> I'm afraid I can't say much about the implementation details -- but great
> work!
>
> -Chris
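To make the cache idea above concrete, here is a rough, self-contained sketch of a small buffer pool keyed by (shape, dtype). It is only an illustration of the general approach being discussed, not numpy internals: the TempCache class and its get/release API are hypothetical, and unlike the weakref suggestion above it simply holds ordinary references to a bounded number of released scratch arrays.

    from collections import OrderedDict
    import numpy as np

    class TempCache:
        """Bounded pool of scratch arrays, keyed by (shape, dtype)."""
        def __init__(self, maxkeys=8):
            self.maxkeys = maxkeys
            self._pool = OrderedDict()           # (shape, dtype) -> list of arrays

        def get(self, shape, dtype=np.float64):
            key = (tuple(shape), np.dtype(dtype))
            bucket = self._pool.get(key)
            if bucket:
                return bucket.pop()              # reuse a previously released buffer
            return np.empty(shape, dtype=dtype)  # otherwise allocate as usual

        def release(self, arr):
            key = (arr.shape, np.dtype(arr.dtype))
            self._pool.setdefault(key, []).append(arr)
            self._pool.move_to_end(key)
            while len(self._pool) > self.maxkeys:
                self._pool.popitem(last=False)   # evict the least recently used size

    cache = TempCache()
    a = np.ones(10**6)
    b = np.ones(10**6)
    tmp = cache.get(a.shape, a.dtype)            # stands in for the "a + b" temporary
    np.add(a, b, out=tmp)
    result = tmp + 1.0                           # final result gets its own array
    cache.release(tmp)                           # scratch buffer goes back for reuse

Whether this pays off depends on how often expression temporaries really do repeat the same shape and dtype, which is exactly the open question raised above.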
From cournape at gmail.com Sun Oct 2 17:26:28 2016
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 2 Oct 2016 22:26:28 +0100
Subject: [Numpy-discussion] Dropping sourceforge for releases.
In-Reply-To: References: Message-ID:
+1 from me.

If we really need some distribution on top of github/pypi, note that bintray (https://bintray.com/) is free for OSS projects, and is a much better experience than sourceforge.

David

On Sun, Oct 2, 2016 at 12:02 AM, Charles R Harris wrote:
> Hi All,
>
> Ralf has suggested dropping sourceforge as a NumPy release site. There was
> discussion of doing that some time back but we have not yet done it. Now
> that we put wheels up on PyPI for all supported architectures source forge
> is not needed. I note that there are still some 15,000 downloads a week
> from the site, so it is still used.
>
> Thoughts?
>
> Chuck

From vincent at vincentdavis.net Sun Oct 2 19:53:32 2016
From: vincent at vincentdavis.net (Vincent Davis)
Date: Sun, 2 Oct 2016 17:53:32 -0600
Subject: [Numpy-discussion] Dropping sourceforge for releases.
In-Reply-To: References: Message-ID:
+1, I am very skeptical of anything on SourceForge, it negatively impacts my opinion of any project that requires me to download from sourceforge.

On Saturday, October 1, 2016, Charles R Harris wrote:
> Hi All,
>
> Ralf has suggested dropping sourceforge as a NumPy release site. There was
> discussion of doing that some time back but we have not yet done it.

--
Sent from mobile app.
Vincent Davis
720-301-3003

From lxx9xx at gmail.com Sun Oct 2 20:15:13 2016
From: lxx9xx at gmail.com (Hush Hush)
Date: Mon, 3 Oct 2016 09:15:13 +0900
Subject: [Numpy-discussion] automatically avoiding temporary arrays
Message-ID:
The same idea was published two years ago: http://hiperfit.dk/pdf/Doubling.pdf

From jorisvandenbossche at gmail.com Mon Oct 3 05:48:06 2016
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Mon, 3 Oct 2016 11:48:06 +0200
Subject: [Numpy-discussion] ANN: pandas v0.19.0 released
Message-ID:
Hi all,

I'm happy to announce pandas 0.19.0 has been released. This is a major release from 0.18.1 and includes a number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. See the Whatsnew file for more information.

We recommend that all users upgrade to this version.

This is the work of 5 months of development by 117 contributors. A big thank you to all contributors!

Joris

---

*What is it:*

pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

*Highlights of the 0.19.0 release include:* - New method merge_asof for asof-style time-series joining, see here - The .rolling() method is now time-series aware, see here - read_csv now supports parsing Categorical data, see here - A function union_categorical has been added for combining categoricals, see here - PeriodIndex now has its own period dtype, and changed to be more consistent with other Index classes.
See here - Sparse data structures gained enhanced support of int and bool dtypes, see here - Comparison operations with Series no longer ignores the index, see here for an overview of the API changes. - Introduction of a pandas development API for utility functions, see here . - Deprecation of Panel4D and PanelND. We recommend to represent these types of n-dimensional data with the xarray package . - Removal of the previously deprecated modules pandas.io.data, pandas.io.wb, pandas.tools.rplot. See the Whatsnew file for more information. *How to get it:* Source tarballs and windows/mac/linux wheels are available on PyPI (thanks to Christoph Gohlke for the windows wheels, and to Matthew Brett for setting up the mac/linux wheels). Conda packages are already available via the conda-forge channel (conda install pandas -c conda-forge). It will be available on the main channel shortly. *Issues:* Please report any issues on our issue tracker: https://github.com/pydata/pandas/issues *Thanks to all the contributors:* - adneu - Adrien Emery - agraboso - Alex Alekseyev - Alex Vig - Allen Riddell - Amol - Amol Agrawal - Andy R. Terrel - Anthonios Partheniou - babakkeyvani - Ben Kandel - Bob Baxley - Brett Rosen - c123w - Camilo Cota - Chris - chris-b1 - Chris Grinolds - Christian Hudon - Christopher C. Aycock - Chris Warth - cmazzullo - conquistador1492 - cr3 - Daniel Siladji - Douglas McNeil - Drewrey Lupton - dsm054 - Eduardo Blancas Reyes - Elliot Marsden - Evan Wright - Felix Marczinowski - Francis T. O?Donovan - G?bor Lipt?k - Geraint Duck - gfyoung - Giacomo Ferroni - Grant Roch - Haleemur Ali - harshul1610 - Hassan Shamim - iamsimha - Iulius Curt - Ivan Nazarov - jackieleng - Jeff Reback - Jeffrey Gerard - Jenn Olsen - Jim Crist - Joe Jevnik - John Evans - John Freeman - John Liekezer - Johnny Gill - John W. O?Brien - John Zwinck - Jordan Erenrich - Joris Van den Bossche - Josh Howes - Jozef Brandys - Kamil Sindi - Ka Wo Chen - Kerby Shedden - Kernc - Kevin Sheppard - Matthieu Brucher - Maximilian Roos - Michael Scherer - Mike Graham - Mortada Mehyar - mpuels - Muhammad Haseeb Tariq - Nate George - Neil Parley - Nicolas Bonnotte - OXPHOS - Pan Deng / Zora - Paul - Pauli Virtanen - Paul Mestemaker - Pawel Kordek - Pietro Battiston - pijucha - Piotr Jucha - priyankjain - Ravi Kumar Nimmi - Robert Gieseke - Robert Kern - Roger Thomas - Roy Keyes - Russell Smith - Sahil Dua - Sanjiv Lobo - Sa?o Stanovnik - Shawn Heide - sinhrks - Sinhrks - Stephen Kappel - Steve Choi - Stewart Henderson - Sudarshan Konge - Thomas A Caswell - Tom Augspurger - Tom Bird - Uwe Hoffmann - wcwagner - WillAyd - Xiang Zhang - Yadunandan - Yaroslav Halchenko - YG-Riku - Yuichiro Kaneko - yui-knk - zhangjinjie - znmean - ????Yan Facai? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Oct 3 06:16:48 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 3 Oct 2016 12:16:48 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> the problem with this approach is that we don't really want numpy hogging on to hundreds of megabytes of memory by default so it would need to be a user option. A context manager could work too but it would probably lead to premature optimization. 
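As a purely hypothetical illustration of what such a "user option" might look like, the sketch below wraps an opt-in limit in a context manager. The names set_temp_cache_limit and temp_cache do not exist in numpy; the point is only that callers, not the library, would decide how much memory may be retained for reuse.

    from contextlib import contextmanager

    _temp_cache_limit = 0                    # default: retain nothing, never hog memory

    def set_temp_cache_limit(nbytes):
        """Allow up to nbytes of released scratch buffers to be kept around."""
        global _temp_cache_limit
        _temp_cache_limit = int(nbytes)

    @contextmanager
    def temp_cache(nbytes):
        """Enable buffer reuse only for the duration of a block."""
        previous = _temp_cache_limit
        set_temp_cache_limit(nbytes)
        try:
            yield
        finally:
            set_temp_cache_limit(previous)   # drop back to the old limit afterwards

    # with temp_cache(256 * 2**20):
    #     r = a + b + c                      # expressions inside may reuse cached buffers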
Very new Linux versions (4.6+) now finally support MADV_FREE which gives memory back to the system but does not require refaulting it if nothing else needed it. So this might be an option now. But libc implementations will probably use that at some point too and then numpy doesn't need to do this.

On 02.10.2016 15:10, Benjamin Root wrote:
> Just thinking aloud, an idea I had recently takes a different approach.
> What if numpy kept a small LRU cache of weakref'ed temporaries? Whenever
> a new numpy array is requested, numpy could see if there is already one
> in its cache of matching size and use it.
From chris.barker at noaa.gov Mon Oct 3 14:23:53 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 3 Oct 2016 11:23:53 -0700
Subject: [Numpy-discussion] automatically avoiding temporary arrays
In-Reply-To: <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com>
References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com>
Message-ID:
On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor wrote:
> the problem with this approach is that we don't really want numpy
> hogging on to hundreds of megabytes of memory by default so it would
> need to be a user option.

indeed -- but one could set an LRU cache to be very small (few items, not small memory), and then it would get used within expressions, but not hold on to much outside of expressions.

However, is the allocation the only (or even biggest) source of the performance hit?

If you generate a temporary as a result of an operation, rather than doing it in-place, that temporary needs to be allocated, but it also means that an additional array needs to be pushed through the processor -- and that can make a big performance difference too.

I'm not entirely sure how to profile this correctly, but this seems to indicate that the allocation is cheap compared to the operations (for a million-element array):

* Regular old temporary creation

In [24]: def f1(arr1, arr2):
    ...:     result = arr1 + arr2
    ...:     return result

In [26]: %timeit f1(arr1, arr2)
1000 loops, best of 3: 1.13 ms per loop

* Completely in-place, no allocation of an extra array

In [27]: def f2(arr1, arr2):
    ...:     arr1 += arr2
    ...:     return arr1

In [28]: %timeit f2(arr1, arr2)
1000 loops, best of 3: 755 µs per loop

So that's about 30% faster

* allocate a temporary that isn't used -- but should catch the creation cost

In [29]: def f3(arr1, arr2):
    ...:     result = np.empty_like(arr1)
    ...:     arr1 += arr2
    ...:     return arr1

In [30]: %timeit f3(arr1, arr2)
1000 loops, best of 3: 756 µs per loop

only a µs slower!

Profiling is hard, and I'm not good at it, but this seems to indicate that the allocation is cheap.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From jtaylor.debian at googlemail.com Mon Oct 3 14:43:16 2016
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Mon, 3 Oct 2016 20:43:16 +0200
Subject: [Numpy-discussion] automatically avoiding temporary arrays
In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com>
Message-ID: <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com>
On 03.10.2016 20:23, Chris Barker wrote:
> On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor wrote:
>
>     the problem with this approach is that we don't really want numpy
>     hogging on to hundreds of megabytes of memory by default so it would
>     need to be a user option.
>
> indeed -- but one could set an LRU cache to be very small (few items,
> not small memory), and then it would get used within expressions, but not
> hold on to much outside of expressions.

numpy doesn't see the whole expression so we can't really do much.
(technically we could in 3.5 by using pep 523, but that would be a larger undertaking) > > However, is the allocation the only (Or even biggest) source of the > performance hit? > on large arrays the allocation is insignificant. What does cost some time is faulting the memory into the process which implies writing zeros into the pages (a page at a time as it is being used). By storing memory blocks in numpy we would save this portion. This is really the job of the libc, but these are usually tuned for general purpose workloads and thus tend to give back memory back to the system much earlier than numerical workloads would like. Note that numpy already has a small memory block cache but its only used for very small arrays where the allocation cost itself is significant, it is limited to a couple megabytes at most. From ben.v.root at gmail.com Mon Oct 3 15:07:28 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 3 Oct 2016 15:07:28 -0400 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com> References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com> Message-ID: With regards to arguments about holding onto large arrays, I would like to emphasize that my original suggestion mentioned weakref'ed numpy arrays. Essentially, the idea is to claw back only the raw memory blocks during that limbo period between discarding the numpy array python object and when python garbage-collects it. Ben Root On Mon, Oct 3, 2016 at 2:43 PM, Julian Taylor wrote: > On 03.10.2016 20:23, Chris Barker wrote: > > > > > > On Mon, Oct 3, 2016 at 3:16 AM, Julian Taylor > > > > > wrote: > > > > the problem with this approach is that we don't really want numpy > > hogging on to hundreds of megabytes of memory by default so it would > > need to be a user option. > > > > > > indeed -- but one could set an LRU cache to be very small (few items, > > not small memory), and then it get used within expressions, but not hold > > on to much outside of expressions. > > numpy doesn't see the whole expression so we can't really do much. > (technically we could in 3.5 by using pep 523, but that would be a > larger undertaking) > > > > > However, is the allocation the only (Or even biggest) source of the > > performance hit? > > > > on large arrays the allocation is insignificant. What does cost some > time is faulting the memory into the process which implies writing zeros > into the pages (a page at a time as it is being used). > By storing memory blocks in numpy we would save this portion. This is > really the job of the libc, but these are usually tuned for general > purpose workloads and thus tend to give back memory back to the system > much earlier than numerical workloads would like. > > Note that numpy already has a small memory block cache but its only used > for very small arrays where the allocation cost itself is significant, > it is limited to a couple megabytes at most. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Mon Oct 3 15:33:57 2016 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 3 Oct 2016 19:33:57 +0000 (UTC) Subject: [Numpy-discussion] automatically avoiding temporary arrays References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com> Message-ID: Mon, 03 Oct 2016 15:07:28 -0400, Benjamin Root kirjoitti: > With regards to arguments about holding onto large arrays, I would like > to emphasize that my original suggestion mentioned weakref'ed numpy > arrays. > Essentially, the idea is to claw back only the raw memory blocks during > that limbo period between discarding the numpy array python object and > when python garbage-collects it. CPython afaik deallocates immediately when the refcount hits zero. It's relatively rare that you have arrays hanging around waiting for cycle breaking by gc. If you have them hanging around, I don't think it's possible to distinguish these from other arrays without running the gc. Note also that an "is an array in use" check probably always requires Julian's stack based hack since you cannot rely on the refcount. Pauli From charlesr.harris at gmail.com Mon Oct 3 20:23:01 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Oct 2016 18:23:01 -0600 Subject: [Numpy-discussion] Dropping sourceforge for releases. In-Reply-To: References: Message-ID: On Sun, Oct 2, 2016 at 5:53 PM, Vincent Davis wrote: > +1, I am very skeptical of anything on SourceForge, it negatively impacts > my opinion of any project that requires me to download from sourceforge. > > > On Saturday, October 1, 2016, Charles R Harris > wrote: > >> Hi All, >> >> Ralf has suggested dropping sourceforge as a NumPy release site. There >> was discussion of doing that some time back but we have not yet done it. >> Now that we put wheels up on PyPI for all supported architectures source >> forge is not needed. I note that there are still some 15,000 downloads a >> week from the site, so it is still used. >> >> Thoughts? >> >> Chuck >> > I've uploaded the NumPy 1.11.2 release to sourceforge and made a note on the summary page that that will be the last release to be found there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 3 22:15:24 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Oct 2016 20:15:24 -0600 Subject: [Numpy-discussion] NumPy 1.11.2 released Message-ID: *Hi All,* I'm pleased to announce the release of Numpy 1.11.2. This release supports Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found on PyPI. Sources are available on both PyPI and Sourceforge . Thanks to all who were involved in this release. Contributors and merged pull requests are listed below. *Contributors to v1.11.2* - Allan Haldane - Bertrand Lefebvre - Charles Harris - Julian Taylor - Lo?c Est?ve - Marshall Bockrath-Vandegrift + - Michael Seifert + - Pauli Virtanen - Ralf Gommers - Sebastian Berg - Shota Kawabuchi + - Thomas A Caswell - Valentin Valls + - Xavier Abellan Ecija + A total of 14 people contributed to this release. People with a "+" by their names contributed a patch for the first time. *Pull requests merged for v1.11.2* - #7736 : Backport 4619, BUG: many functions silently drop keepdims kwarg - #7738 : Backport 5706, ENH: add extra kwargs and update doc of many MA... 
- #7778 : DOC: Update Numpy 1.11.1 release notes. - #7793 : Backport 7515, BUG: MaskedArray.count treats negative axes incorrectly - #7816 : Backport 7463, BUG: fix array too big error for wide dtypes. - #7821 : Backport 7817, BUG: Make sure npy_mul_with_overflow_ detects... - #7824 : Backport 7820, MAINT: Allocate fewer bytes for empty arrays. - #7847 : Backport 7791, MAINT,DOC: Fix some imp module uses and update... - #7849 : Backport 7848, MAINT: Fix remaining uses of deprecated Python... - #7851 : Backport 7840, Fix ATLAS version detection - #7870 : Backport 7853, BUG: Raise RuntimeError when reloading numpy is... - #7896 : Backport 7894, BUG: construct ma.array from np.array which contains... - #7904 : Backport 7903, BUG: fix float16 type not being called due to... - #7917 : BUG: Production install of numpy should not require nose. - #7919 : Backport 7908, BLD: Fixed MKL detection for recent versions of... - #7920 : Backport #7911: BUG: fix for issue#7835 (ma.median of 1d) - #7932 : Backport 7925, Monkey-patch _msvccompile.gen_lib_option like... - #7939 : Backport 7931, BUG: Check for HAVE_LDOUBLE_DOUBLE_DOUBLE_LE in... - #7953 : Backport 7937, BUG: Guard against buggy comparisons in generic... - #7954 : Backport 7952, BUG: Use keyword arguments to initialize Extension... - #7955 : Backport 7941, BUG: Make sure numpy globals keep identity after... - #7972 : Backport 7963, BUG: MSVCCompiler grows 'lib' & 'include' env... - #7990 : Backport 7977, DOC: Create 1.11.2 release notes. - #8005 : Backport 7956, BLD: remove __NUMPY_SETUP__ from builtins at end... - #8007 : Backport 8006, DOC: Update 1.11.2 release notes. - #8010 : Backport 8008, MAINT: Remove leftover imp module imports. - #8012 : Backport 8011, DOC: Update 1.11.2 release notes. - #8020 : Backport 8018, BUG: Fixes return for np.ma.count if keepdims... - #8024 : Backport 8016, BUG: Fix numpy.ma.median. - #8031 : Backport 8030, BUG: fix np.ma.median with only one non-masked... - #8032 : Backport 8028, DOC: Update 1.11.2 release notes. - #8044 : Backport 8042, BUG: core: fix bug in NpyIter buffering with discontinuous... - #8046 : Backport 8045, DOC: Update 1.11.2 release notes. Enjoy, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Oct 4 00:33:30 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Oct 2016 21:33:30 -0700 Subject: [Numpy-discussion] NumPy 1.11.2 released In-Reply-To: References: Message-ID: On Mon, Oct 3, 2016 at 7:15 PM, Charles R Harris wrote: > Hi All, > > I'm pleased to announce the release of Numpy 1.11.2. This release supports > Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in > Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found on PyPI. > Sources are available on both PyPI and Sourceforge. > > Thanks to all who were involved in this release. Contributors and merged > pull requests are listed below. > > > Contributors to v1.11.2 > > Allan Haldane > Bertrand Lefebvre > Charles Harris > Julian Taylor > Lo?c Est?ve > Marshall Bockrath-Vandegrift + > Michael Seifert + > Pauli Virtanen > Ralf Gommers > Sebastian Berg > Shota Kawabuchi + > Thomas A Caswell > Valentin Valls + > Xavier Abellan Ecija + > > A total of 14 people contributed to this release. People with a "+" by their > names contributed a patch for the first time. 
Thanks very much for doing all the release work, congratulations on the release,

Cheers,

Matthew

From m.h.vankerkwijk at gmail.com Tue Oct 4 00:45:04 2016
From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk)
Date: Mon, 3 Oct 2016 21:45:04 -0700
Subject: [Numpy-discussion] automatically avoiding temporary arrays
In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> <0450ca67-f674-8bdf-5686-f8cc490719a8@googlemail.com> <44a7e6d8-f796-1c36-bb4e-cb1514ab3d3c@googlemail.com>
Message-ID:
Note that numpy does store some larger arrays already, in the fft module. (In fact, this was a cache of unlimited size until #7686.) It might not be bad if the same cache were used more generally. That said, if newer versions of python are offering ways of doing this better, maybe that is the best way forward. -- Marten

From evgeny.burovskiy at gmail.com Tue Oct 4 01:29:53 2016
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Tue, 4 Oct 2016 08:29:53 +0300
Subject: [Numpy-discussion] NumPy 1.11.2 released
In-Reply-To: References: Message-ID:
Thank you Chuck!

On 04.10.2016 at 5:15, Charles R Harris wrote:
> I'm pleased to announce the release of Numpy 1.11.2. This release
> supports Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions
> found in Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found
> on PyPI. Sources are available on both PyPI and Sourceforge.

From ralf.gommers at gmail.com Tue Oct 4 06:18:22 2016
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 4 Oct 2016 23:18:22 +1300
Subject: [Numpy-discussion] update on mailing list issues
Message-ID:
Hi all,

We've had a number of issues with the reliability of the mailman setup that powers the mailing lists for NumPy, SciPy and several other projects. To address that we'll start migrating to the python.org provided infrastructure, which should be much more reliable.

The full set of lists is here: https://mail.scipy.org/mailman/listinfo. Looks like we have to migrate at least:
AstroPy
IPython-dev
IPython-user
NumPy-Discussion
SciPy-Dev
SciPy-User
SciPy-organisers

Some of the other ones that are not clearly obsolete but have almost zero activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets may be useful to archive. The other ones will just be cleaned up, unless someone indicates that there's a reason to keep them around.

And a pre-emptive thanks to Didrik and Enthought for taking on the task of migrating the archives and user details.

Cheers,
Ralf

From ndbecker2 at gmail.com Tue Oct 4 06:50:01 2016
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 04 Oct 2016 06:50:01 -0400
Subject: [Numpy-discussion] update on mailing list issues
References: Message-ID:
Ralf Gommers wrote:
> Hi all,
>
> We've had a number of issues with the reliability of the mailman setup
> that powers the mailing lists for NumPy, SciPy and several other projects.
> To address that we'll start migrating to the python.org provided
> infrastructure, which should be much more reliable.

Someone will need to update gmane nntp/mail gateway then, I suppose?

From ralf.gommers at gmail.com Tue Oct 4 13:51:25 2016
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 5 Oct 2016 06:51:25 +1300
Subject: [Numpy-discussion] update on mailing list issues
In-Reply-To: References: Message-ID:
On Tue, Oct 4, 2016 at 11:50 PM, Neal Becker wrote:
> Ralf Gommers wrote:
>
> > Hi all,
> >
> > We've had a number of issues with the reliability of the mailman setup
> > that powers the mailing lists for NumPy, SciPy and several other projects.
> > To address that we'll start migrating to the python.org provided
> > infrastructure, which should be much more reliable.
> >
> > The full set of lists is here: https://mail.scipy.org/mailman/listinfo.
> > Enjoy, > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 4 06:18:22 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 4 Oct 2016 23:18:22 +1300 Subject: [Numpy-discussion] update on mailing list issues Message-ID: Hi all, We've had a number of issues with the reliability of the mailman setup that powers the mailing lists for NumPy, SciPy and several other projects. To address that we'll start migrating to the python.org provided infrastructure, which should be much more reliable. The full set of lists is here: https://mail.scipy.org/mailman/listinfo. Looks like we have to migrate at least: AstroPy IPython-dev IPython-user NumPy-Discussion SciPy-Dev SciPy-User SciPy-organisers Some of the other ones that are not clearly obsolete but have almost zero activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets may be useful to archive. The other ones will just be cleaned up, unless someone indicates that there's a reason to keep them around. And a pre-emptive thanks to Didrik and Enthought for taking on the task of migrating the archives and user details. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Oct 4 06:50:01 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 04 Oct 2016 06:50:01 -0400 Subject: [Numpy-discussion] update on mailing list issues References: Message-ID: Ralf Gommers wrote: > Hi all, > > We've had a number of issues with the reliability of the mailman setup > that powers the mailing lists for NumPy, SciPy and several other projects. > To address that we'll start migrating to the python.org provided > infrastructure, which should be much more reliable. > > The full set of lists is here: https://mail.scipy.org/mailman/listinfo. > Looks like we have to migrate at least: > AstroPy > IPython-dev > IPython-user > NumPy-Discussion > SciPy-Dev > SciPy-User > SciPy-organisers > > Some of the other ones that are not clearly obsolete but have almost zero > activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets > may be useful to archive. The other ones will just be cleaned up, unless > someone indicates that there's a reason to keep them around. > > And a pre-emptive thanks to Didrik and Enthought for taking on the task of > migrating the archives and user details. > > Cheers, > Ralf Someone will need to update gmane nntp/mail gateway then, I suppose? From ralf.gommers at gmail.com Tue Oct 4 13:51:25 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Oct 2016 06:51:25 +1300 Subject: [Numpy-discussion] update on mailing list issues In-Reply-To: References: Message-ID: On Tue, Oct 4, 2016 at 11:50 PM, Neal Becker wrote: > Ralf Gommers wrote: > > > Hi all, > > > > We've had a number of issues with the reliability of the mailman setup > > that powers the mailing lists for NumPy, SciPy and several other > projects. > > To address that we'll start migrating to the python.org provided > > infrastructure, which should be much more reliable. > > > > The full set of lists is here: https://mail.scipy.org/mailman/listinfo. 
> > Looks like we have to migrate at least: > > AstroPy > > IPython-dev > > IPython-user > > NumPy-Discussion > > SciPy-Dev > > SciPy-User > > SciPy-organisers > > > > Some of the other ones that are not clearly obsolete but have almost zero > > activity (APUG, Nipy-devel) we'll have to contact the owners. *-tickets > > may be useful to archive. The other ones will just be cleaned up, unless > > someone indicates that there's a reason to keep them around. > > > > And a pre-emptive thanks to Didrik and Enthought for taking on the task > of > > migrating the archives and user details. > > > > Cheers, > > Ralf > > Someone will need to update gmane nntp/mail gateway then, I suppose? > Thanks for the reminder. Yes, guess we need to do something there. Not just yet though, this is what I got when I looked at how to edit list details on gmane: "Not all of Gmane is back yet - We're working hard to restore everything" Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Oct 4 14:44:35 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 4 Oct 2016 11:44:35 -0700 Subject: [Numpy-discussion] NumPy 1.11.2 released In-Reply-To: References: Message-ID: I'm pleased to announce the release of Numpy 1.11.2. This release supports > Python 2.6 - 2.7, and 3.2 - 3.5 and fixes bugs and regressions found in > Numpy 1.11.1. Wheels for Linux, Windows, and OSX can be found on PyPI. > Sources are available on both PyPI and Sourceforge > . > and on conda-forge: https://anaconda.org/conda-forge/numpy Hmm, not Windows (darn fortran an openblas!) -- but thanks for getting that up fast! And of course, thanks to all in the numpy community for getting this build out. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Oct 5 02:45:11 2016 From: srean.list at gmail.com (srean) Date: Wed, 5 Oct 2016 12:15:11 +0530 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Good discussion, but was surprised by the absence of numexpr in the discussion., given how relevant it (numexpr) is to the topic. Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? On Fri, Sep 30, 2016 at 7:08 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > hi, > Temporary arrays generated in expressions are expensive as the imply > extra memory bandwidth which is the bottleneck in most numpy operations. > For example: > > r = a + b + c > > creates the b + c temporary and then adds a to it. > This can be rewritten to be more efficient using inplace operations: > > r = b + c > r += a > > This saves some memory bandwidth and can speedup the operation by 50% > for very large arrays or even more if the inplace operation allows it to > be completed completely in the cpu cache. > > The problem is that inplace operations are a lot less readable so they > are often only used in well optimized code. But due to pythons > refcounting semantics we can actually do some inplace conversions > transparently. > If an operand in python has a reference count of one it must be a > temporary so we can use it as the destination array. 
CPython itself does > this optimization for string concatenations. > > In numpy we have the issue that we can be called from the C-API directly > where the reference count may be one for other reasons. > To solve this we can check the backtrace until the python frame > evaluation function. If there are only numpy and python functions in > between that and our entry point we should be able to elide the temporary. > > This PR implements this: > https://github.com/numpy/numpy/pull/7997 > > It currently only supports Linux with glibc (which has reliable > backtraces via unwinding) and maybe MacOS depending on how good their > backtrace is. On windows the backtrace APIs are different and I don't > know them but in theory it could also be done there. > > A problem is that checking the backtrace is quite expensive, so should > only be enabled when the involved arrays are large enough for it to be > worthwhile. In my testing this seems to be around 180-300KiB sized > arrays, basically where they start spilling out of the CPU L2 cache. > > I made a little crappy benchmark script to test this cutoff in this branch: > https://github.com/juliantaylor/numpy/tree/elide-bench > > If you are interested you can run it with: > python setup.py build_ext -j 4 --inplace > ipython --profile=null check.ipy > > At the end it will plot the ratio between elided and non-elided runtime. > It should get larger than one around 180KiB on most cpus. > > If no one points out some flaw in the approach, I'm hoping to get this > into the next numpy version. > > cheers, > Julian > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Wed Oct 5 05:46:21 2016 From: faltet at gmail.com (Francesc Alted) Date: Wed, 5 Oct 2016 11:46:21 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: 2016-10-05 8:45 GMT+02:00 srean : > Good discussion, but was surprised by the absence of numexpr in the > discussion., given how relevant it (numexpr) is to the topic. > > Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? > Yes, the question about merging numexpr into numpy has been something that periodically shows up in this list. I think mostly everyone agree that it is a good idea, but things are not so easy, and so far nobody provided a good patch for this. Also, the fact that numexpr relies on grouping an expression by using a string (e.g. (y = ne.evaluate("x**3 + tanh(x**2) + 4")) does not play well with the way in that numpy evaluates expressions, so something should be suggested to cope with this too. > > On Fri, Sep 30, 2016 at 7:08 PM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> hi, >> Temporary arrays generated in expressions are expensive as the imply >> extra memory bandwidth which is the bottleneck in most numpy operations. >> For example: >> >> r = a + b + c >> >> creates the b + c temporary and then adds a to it. >> This can be rewritten to be more efficient using inplace operations: >> >> r = b + c >> r += a >> >> This saves some memory bandwidth and can speedup the operation by 50% >> for very large arrays or even more if the inplace operation allows it to >> be completed completely in the cpu cache. 
>> >> The problem is that inplace operations are a lot less readable so they >> are often only used in well optimized code. But due to pythons >> refcounting semantics we can actually do some inplace conversions >> transparently. >> If an operand in python has a reference count of one it must be a >> temporary so we can use it as the destination array. CPython itself does >> this optimization for string concatenations. >> >> In numpy we have the issue that we can be called from the C-API directly >> where the reference count may be one for other reasons. >> To solve this we can check the backtrace until the python frame >> evaluation function. If there are only numpy and python functions in >> between that and our entry point we should be able to elide the temporary. >> >> This PR implements this: >> https://github.com/numpy/numpy/pull/7997 >> >> It currently only supports Linux with glibc (which has reliable >> backtraces via unwinding) and maybe MacOS depending on how good their >> backtrace is. On windows the backtrace APIs are different and I don't >> know them but in theory it could also be done there. >> >> A problem is that checking the backtrace is quite expensive, so should >> only be enabled when the involved arrays are large enough for it to be >> worthwhile. In my testing this seems to be around 180-300KiB sized >> arrays, basically where they start spilling out of the CPU L2 cache. >> >> I made a little crappy benchmark script to test this cutoff in this >> branch: >> https://github.com/juliantaylor/numpy/tree/elide-bench >> >> If you are interested you can run it with: >> python setup.py build_ext -j 4 --inplace >> ipython --profile=null check.ipy >> >> At the end it will plot the ratio between elided and non-elided runtime. >> It should get larger than one around 180KiB on most cpus. >> >> If no one points out some flaw in the approach, I'm hoping to get this >> into the next numpy version. >> >> cheers, >> Julian >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Wed Oct 5 06:56:20 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Wed, 5 Oct 2016 12:56:20 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: All, On Wed, Oct 5, 2016 at 11:46 AM, Francesc Alted wrote: > 2016-10-05 8:45 GMT+02:00 srean : > >> Good discussion, but was surprised by the absence of numexpr in the >> discussion., given how relevant it (numexpr) is to the topic. >> >> Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? >> > > Yes, the question about merging numexpr into numpy has been something that > periodically shows up in this list. I think mostly everyone agree that it > is a good idea, but things are not so easy, and so far nobody provided a > good patch for this. Also, the fact that numexpr relies on grouping an > expression by using a string (e.g. (y = ne.evaluate("x**3 + tanh(x**2) + > 4")) does not play well with the way in that numpy evaluates expressions, > so something should be suggested to cope with this too. 
> As Francesc said, Numexpr is going to get most of its power through grouping a series of operations so it can send blocks to the CPU cache and run the entire series of operations on the cache before returning the block to system memory. If it was just used to back-end NumPy, it would only gain from the multi-threading portion inside each function call. I'm not sure how one would go about grouping successive numpy expressions without modifying the Python interpreter? I put a bit of effort into extending numexpr to use 4-byte word opcodes instead of 1-byte. Progress has been very slow, however, due to time constraints, but I have most of the numpy data types (u[1-4], i[1-4], f[4,8], c[8,16], S[1-4], U[1-4]). On Tuesday I finished writing a Python generator script that writes all the C-side opcode macros for opcodes.hpp. Now I have about 900 opcodes, and this could easily grow into thousands if more functions are added, so I also built a reverse lookup tree (based on collections.defaultdict) for the Python-side of numexpr. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Oct 5 07:11:15 2016 From: srean.list at gmail.com (srean) Date: Wed, 5 Oct 2016 16:41:15 +0530 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: Thanks Francesc, Robert for giving me a broader picture of where this fits in. I believe numexpr does not handle slicing, so that might be another thing to look at. On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod wrote: > > As Francesc said, Numexpr is going to get most of its power through > grouping a series of operations so it can send blocks to the CPU cache and > run the entire series of operations on the cache before returning the block > to system memory. If it was just used to back-end NumPy, it would only > gain from the multi-threading portion inside each function call. > Is that so ? I thought numexpr also cuts down on number of temporary buffers that get filled (in other words copy operations) if the same expression was written as series of operations. My understanding can be wrong, and would appreciate correction. The 'out' parameter in ufuncs can eliminate extra temporaries but its not composable. Right now I have to manually carry along the array where the in place operations take place. I think the goal here is to eliminate that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Wed Oct 5 08:06:06 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Wed, 5 Oct 2016 14:06:06 +0200 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: On Wed, Oct 5, 2016 at 1:11 PM, srean wrote: > Thanks Francesc, Robert for giving me a broader picture of where this fits > in. I believe numexpr does not handle slicing, so that might be another > thing to look at. > Dereferencing would be relatively simple to add into numexpr, as it would just be some getattr() calls. Personally I will add that at some point because it will clean up my code. Slicing, maybe only for continuous blocks in memory? I.e. 
imageStack[0,:,:] would be possible, but imageStack[:, ::2, ::2] would not be trivial (I think...). I seem to remember someone asked David Cooke about slicing and he said something along the lines of, "that's what Numba is for." Perhaps NumPy backended by Numba is more so what you are looking for, as it hooks into the byte compiler? The main advantage of numexpr is that a series of numpy functions in can be enclosed in ne.evaluate( "" ) and it provides a big acceleration for little programmer effort, but it's not nearly as sophisticated as Numba or PyPy. > On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod > wrote: > >> >> As Francesc said, Numexpr is going to get most of its power through >> grouping a series of operations so it can send blocks to the CPU cache and >> run the entire series of operations on the cache before returning the block >> to system memory. If it was just used to back-end NumPy, it would only >> gain from the multi-threading portion inside each function call. >> > > Is that so ? > > I thought numexpr also cuts down on number of temporary buffers that get > filled (in other words copy operations) if the same expression was written > as series of operations. My understanding can be wrong, and would > appreciate correction. > > The 'out' parameter in ufuncs can eliminate extra temporaries but its not > composable. Right now I have to manually carry along the array where the in > place operations take place. I think the goal here is to eliminate that. > The numexpr virtual machine does create temporaries where needed when it parses the abstract syntax tree for all the operations it has to do. I believe the main advantage is that the temporaries are created on the CPU cache, and not in system memory. It's certainly true that numexpr doesn't create a lot of OP_COPY operations, rather it's optimized to minimize them, so probably it's fewer ops than naive successive calls to numpy within python, but I'm unsure if there's any difference in operation count between a hand-optimized numpy with out= set and numexpr. Numexpr just does it for you. This blog post from Tim Hochberg is useful for understanding the performance advantages of blocking versus multithreading: http://www.bitsofbits.com/2014/09/21/numpy-micro-optimization-and-numexpr/ Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Thu Oct 6 05:51:07 2016 From: srean.list at gmail.com (srean) Date: Thu, 6 Oct 2016 15:21:07 +0530 Subject: [Numpy-discussion] automatically avoiding temporary arrays In-Reply-To: References: <283e3000-0b9c-886c-e322-1ff4d2e8cb26@googlemail.com> Message-ID: On Wed, Oct 5, 2016 at 5:36 PM, Robert McLeod wrote: > > It's certainly true that numexpr doesn't create a lot of OP_COPY > operations, rather it's optimized to minimize them, so probably it's fewer > ops than naive successive calls to numpy within python, but I'm unsure if > there's any difference in operation count between a hand-optimized numpy > with out= set and numexpr. Numexpr just does it for you. > That was my understanding as well. If it automatically does what one could achieve by carrying the state along in the 'out' parameter, that's as good as it can get in terms removing unnecessary ops. 
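To make sure we are talking about the same thing, here is a rough sketch of what I mean by carrying the state along in 'out' (the array names and sizes are made up, none of this is taken from the PR itself):

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)
c = np.random.rand(1000000)

# plain numpy: builds a temporary for a*b, then another for (a*b) + c
r = a * b + c

# hand-optimized numpy: reuse one output buffer by threading it through out=
r = np.multiply(a, b)
np.add(r, c, out=r)

# numexpr: a single call, blocking and buffer reuse happen inside its VM
r = ne.evaluate("a * b + c")

If the elision in the PR effectively gives the middle version's memory behaviour while letting me write the first version, that is exactly what I was hoping for.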
There are other speedup opportunities of course, but that's a separate matter. > This blog post from Tim Hochberg is useful for understanding the > performance advantages of blocking versus multithreading: > > http://www.bitsofbits.com/2014/09/21/numpy-micro-optimization-and-numexpr/ > Hadnt come across that one before. Great link. Thanks. using caches and vector registers well trumps threading, unless one has a lot of data and it helps to disable hyper-threading. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 7 21:12:53 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Oct 2016 19:12:53 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. Message-ID: Hi All, The time for NumPy 1.12.0 approaches and I like to have a final decision on the treatment of integers to negative integer powers with the `**` operator. The two alternatives looked to be *Raise an error for arrays and numpy scalars, including 1 and -1 to negative powers.* *Pluses* - Backward compatible - Allows common powers to be integer, e.g., arange(3)**2 - Consistent with inplace operators - Fixes current wrong behavior. - Preserves type *Minuses* - Integer overflow - Computational inconvenience - Inconsistent with Python integers *Always return a float * *Pluses* - Computational convenience *Minuses* - Loss of type - Possible backward incompatibilities - Not applicable to inplace operators Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 7 21:38:02 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Oct 2016 21:38:02 -0400 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Fri, Oct 7, 2016 at 9:12 PM, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final decision on > the treatment of integers to negative integer powers with the `**` operator. > The two alternatives looked to be > > Raise an error for arrays and numpy scalars, including 1 and -1 to negative > powers. > > Pluses > > Backward compatible > Allows common powers to be integer, e.g., arange(3)**2 > Consistent with inplace operators > Fixes current wrong behavior. > Preserves type > > > Minuses > > Integer overflow > Computational inconvenience > Inconsistent with Python integers > > > Always return a float > > Pluses > > Computational convenience > > > Minuses > > Loss of type > Possible backward incompatibilities > Not applicable to inplace operators > > > > Thoughts? 2: +1 I'm still in favor of number 2: less buggy code and less mental gymnastics (watch out for that int, or which int do I need) (upcasting is not applicable for any inplace operators, AFAIU *=0.5 ? zz = np.arange(5) zz**(-1) zz *= 0.5 tried in >>> np.__version__ '1.9.2rc1' >>> np.__version__ '1.10.4' backwards compatibility ? ) Josef > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From alan.isaac at gmail.com Fri Oct 7 23:13:21 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Fri, 7 Oct 2016 23:13:21 -0400 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. 
In-Reply-To: References: Message-ID: On 10/7/2016 9:12 PM, Charles R Harris wrote: > *Always return a float * > /Pluses/ > * Computational convenience Is the behavior of C++11 of any relevance to the choice? http://www.cplusplus.com/reference/cmath/pow/ Alan Isaac From sole at esrf.fr Sat Oct 8 01:33:49 2016 From: sole at esrf.fr (V. Armando Sole) Date: Sat, 08 Oct 2016 07:33:49 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: Hi all, Just to have the options clear. Is the operator '**' going to be handled in any different manner than pow? Thanks. Armando From njs at pobox.com Sat Oct 8 06:40:50 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 03:40:50 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final decision on > the treatment of integers to negative integer powers with the `**` operator. > The two alternatives looked to be > > Raise an error for arrays and numpy scalars, including 1 and -1 to negative > powers. > > Pluses > > Backward compatible > Allows common powers to be integer, e.g., arange(3)**2 > Consistent with inplace operators > Fixes current wrong behavior. > Preserves type > > > Minuses > > Integer overflow > Computational inconvenience > Inconsistent with Python integers > > > Always return a float > > Pluses > > Computational convenience > > > Minuses > > Loss of type > Possible backward incompatibilities > Not applicable to inplace operators I guess I could be wrong, but I think the backwards incompatibilities are going to be *way* too severe to make option 2 possible in practice. -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Sat Oct 8 09:59:06 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Oct 2016 07:59:06 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: > On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris > wrote: > > Hi All, > > > > The time for NumPy 1.12.0 approaches and I like to have a final decision > on > > the treatment of integers to negative integer powers with the `**` > operator. > > The two alternatives looked to be > > > > Raise an error for arrays and numpy scalars, including 1 and -1 to > negative > > powers. > > > > Pluses > > > > Backward compatible > > Allows common powers to be integer, e.g., arange(3)**2 > > Consistent with inplace operators > > Fixes current wrong behavior. > > Preserves type > > > > > > Minuses > > > > Integer overflow > > Computational inconvenience > > Inconsistent with Python integers > > > > > > Always return a float > > > > Pluses > > > > Computational convenience > > > > > > Minuses > > > > Loss of type > > Possible backward incompatibilities > > Not applicable to inplace operators > > I guess I could be wrong, but I think the backwards incompatibilities > are going to be *way* too severe to make option 2 possible in > practice. > > Backwards compatibility is also a major concern for me. Here are my current thoughts - Add an fpow ufunc that always converts to float, it would not accept object arrays. - Raise errors in current power ufunc (**), for ints to negative ints. 
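Roughly, the behaviour I have in mind for those two points is the following (only a sketch to illustrate the semantics; the name of the new ufunc and the exact error are not settled, and the wrapper below is purely illustrative, not the implementation):

import numpy as np

a = np.arange(1, 4)      # array([1, 2, 3]), integer dtype

a ** 2                   # stays integer: array([1, 4, 9])
a ** -1                  # would raise an error instead of returning array([1, 0, 0])

# illustrative stand-in for the proposed float-returning ufunc
def fpow(x, y):
    return np.power(np.asarray(x, dtype=float), y)

fpow(a, -1)              # array([ 1.        ,  0.5       ,  0.33333333])

The corner cases for the existing power ufunc are spelled out below.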
The power ufunc will change in the following ways - +1, -1 to negative ints will error, currently they work - n > 1 ints to negative ints will error, currently warn and return zero - 0 to negative ints will error, they currently return the minimum integer The `**` operator currently calls the power ufunc, leave that as is for backward almost compatibility. The remaining question is numpy scalars, which we can make either compatible with Python, or with NumPy arrays. I'm leaning towards NumPy array compatibility mostly on account of type preservation and the close relationship between zero dimensionaly arrays and scalars. The fpow function could be backported to NumPy 1.11 if that would be helpful going forward. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Oct 8 11:12:31 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 08:12:31 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris wrote: > > > On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: >> >> On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > The time for NumPy 1.12.0 approaches and I like to have a final decision >> > on >> > the treatment of integers to negative integer powers with the `**` >> > operator. >> > The two alternatives looked to be >> > >> > Raise an error for arrays and numpy scalars, including 1 and -1 to >> > negative >> > powers. >> > >> > Pluses >> > >> > Backward compatible >> > Allows common powers to be integer, e.g., arange(3)**2 >> > Consistent with inplace operators >> > Fixes current wrong behavior. >> > Preserves type >> > >> > >> > Minuses >> > >> > Integer overflow >> > Computational inconvenience >> > Inconsistent with Python integers >> > >> > >> > Always return a float >> > >> > Pluses >> > >> > Computational convenience >> > >> > >> > Minuses >> > >> > Loss of type >> > Possible backward incompatibilities >> > Not applicable to inplace operators >> >> I guess I could be wrong, but I think the backwards incompatibilities >> are going to be *way* too severe to make option 2 possible in >> practice. >> > > Backwards compatibility is also a major concern for me. Here are my current > thoughts > > Add an fpow ufunc that always converts to float, it would not accept object > arrays. Maybe call it `fpower` or even `float_power`, for consistency with `power`? > Raise errors in current power ufunc (**), for ints to negative ints. > > The power ufunc will change in the following ways > > +1, -1 to negative ints will error, currently they work > n > 1 ints to negative ints will error, currently warn and return zero > 0 to negative ints will error, they currently return the minimum integer > > The `**` operator currently calls the power ufunc, leave that as is for > backward almost compatibility. The remaining question is numpy scalars, > which we can make either compatible with Python, or with NumPy arrays. I'm > leaning towards NumPy array compatibility mostly on account of type > preservation and the close relationship between zero dimensionaly arrays and > scalars. Sounds good to me. I agree that we should prioritize within-numpy consistency over consistency with Python. > The fpow function could be backported to NumPy 1.11 if that would be helpful > going forward. I'm not a big fan of this kind of backport. 
Violating the "bug-fixes-only" rule makes it hard for people to understand our release versions. And it creates the situation where people can write code that they think requires numpy 1.11 (because it works with their numpy 1.11!), but then breaks on other people's computers (because those users have 1.11.(x-1)). And if there's some reason why people aren't willing to upgrade to 1.12 for new features, then probably better to spend energy addressing those instead of on putting together 1.11-and-a-half releases. -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Sat Oct 8 14:38:08 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Oct 2016 12:38:08 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 9:12 AM, Nathaniel Smith wrote: > On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris > wrote: > > > > > > On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: > >> > >> On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > The time for NumPy 1.12.0 approaches and I like to have a final > decision > >> > on > >> > the treatment of integers to negative integer powers with the `**` > >> > operator. > >> > The two alternatives looked to be > >> > > >> > Raise an error for arrays and numpy scalars, including 1 and -1 to > >> > negative > >> > powers. > >> > > >> > Pluses > >> > > >> > Backward compatible > >> > Allows common powers to be integer, e.g., arange(3)**2 > >> > Consistent with inplace operators > >> > Fixes current wrong behavior. > >> > Preserves type > >> > > >> > > >> > Minuses > >> > > >> > Integer overflow > >> > Computational inconvenience > >> > Inconsistent with Python integers > >> > > >> > > >> > Always return a float > >> > > >> > Pluses > >> > > >> > Computational convenience > >> > > >> > > >> > Minuses > >> > > >> > Loss of type > >> > Possible backward incompatibilities > >> > Not applicable to inplace operators > >> > >> I guess I could be wrong, but I think the backwards incompatibilities > >> are going to be *way* too severe to make option 2 possible in > >> practice. > >> > > > > Backwards compatibility is also a major concern for me. Here are my > current > > thoughts > > > > Add an fpow ufunc that always converts to float, it would not accept > object > > arrays. > > Maybe call it `fpower` or even `float_power`, for consistency with `power`? > > > Raise errors in current power ufunc (**), for ints to negative ints. > > > > The power ufunc will change in the following ways > > > > +1, -1 to negative ints will error, currently they work > > n > 1 ints to negative ints will error, currently warn and return zero > > 0 to negative ints will error, they currently return the minimum integer > > > > The `**` operator currently calls the power ufunc, leave that as is for > > backward almost compatibility. The remaining question is numpy scalars, > > which we can make either compatible with Python, or with NumPy arrays. > I'm > > leaning towards NumPy array compatibility mostly on account of type > > preservation and the close relationship between zero dimensionaly arrays > and > > scalars. > > Sounds good to me. I agree that we should prioritize within-numpy > consistency over consistency with Python. > > > The fpow function could be backported to NumPy 1.11 if that would be > helpful > > going forward. > > I'm not a big fan of this kind of backport. 
Violating the > "bug-fixes-only" rule makes it hard for people to understand our > release versions. And it creates the situation where people can write > code that they think requires numpy 1.11 (because it works with their > numpy 1.11!), but then breaks on other people's computers (because > those users have 1.11.(x-1)). And if there's some reason why people > aren't willing to upgrade to 1.12 for new features, then probably > better to spend energy addressing those instead of on putting together > 1.11-and-a-half releases. > The power ufunc is updated in https://github.com/numpy/numpy/pull/8127. -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sat Oct 8 15:31:56 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sat, 8 Oct 2016 21:31:56 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: Hello, I think it should be consistent with Python3. So, it should give back a float. Best regards, Krisztian On Sat, Oct 8, 2016 at 3:12 AM, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final decision > on the treatment of integers to negative integer powers with the `**` > operator. The two alternatives looked to be > > > *Raise an error for arrays and numpy scalars, including 1 and -1 to > negative powers.* > *Pluses* > > - Backward compatible > - Allows common powers to be integer, e.g., arange(3)**2 > - Consistent with inplace operators > - Fixes current wrong behavior. > - Preserves type > > > *Minuses* > > - Integer overflow > - Computational inconvenience > - Inconsistent with Python integers > > > *Always return a float * > > *Pluses* > > - Computational convenience > > > *Minuses* > > - Loss of type > - Possible backward incompatibilities > - Not applicable to inplace operators > > > > Thoughts? > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 8 15:36:40 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Oct 2016 13:36:40 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sat, Oct 8, 2016 at 1:31 PM, Kriszti?n Horv?th wrote: > Hello, > > I think it should be consistent with Python3. So, it should give back a > float. > > Best regards, > Krisztian > > Can't do that and also return integers for positive powers. It isn't possible to have behavior completely compatible with python for arrays: can't have mixed type returns, can't have arbitrary precision integers. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sat Oct 8 15:43:06 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sat, 8 Oct 2016 21:43:06 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: Sorry, I was not clear enough. I meant that the second option (always float) would be more coherent with Python3. On Oct 8, 2016 9:36 PM, "Charles R Harris" wrote: On Sat, Oct 8, 2016 at 1:31 PM, Kriszti?n Horv?th wrote: > Hello, > > I think it should be consistent with Python3. 
So, it should give back a > float. > > Best regards, > Krisztian > > Can't do that and also return integers for positive powers. It isn't possible to have behavior completely compatible with python for arrays: can't have mixed type returns, can't have arbitrary precision integers. Chuck _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From sole at esrf.fr Sat Oct 8 16:40:51 2016 From: sole at esrf.fr (V. Armando Sole) Date: Sat, 08 Oct 2016 22:40:51 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Well, testing under windows 64 bit, Python 3.5.2, positive powers of integers give integers and negative powers of integers give floats. So, do you want to raise an exception when taking a negative power of an element of an array of integers? Because not doing so would be inconsistent with raising the exception when applying the same operation to the array. Clearly things are broken now (I get zeros when calculating negative powers of numpy arrays of integers others than 1), but that behavior was consistent with python itself under python 2.x because the division of two integers was an integer. That does not hold under Python 3.5 where the division of two integers is a float. You have offered either to raise an exception or to always return a float (i.e. even with positive exponents). You have never offered to be consistent with what Python does. This last option would be my favorite. If it cannot be implemented, then I would prefer always float. At least one would be consistent with something and we would not invent yet another convention. On 08.10.2016 21:36, Charles R Harris wrote: > On Sat, Oct 8, 2016 at 1:31 PM, Kriszti?n Horv?th > wrote: > >> Hello, >> >> I think it should be consistent with Python3. So, it should give >> back a float. >> >> Best regards, >> Krisztian > > Can't do that and also return integers for positive powers. It isn't > possible to have behavior completely compatible with python for > arrays: can't have mixed type returns, can't have arbitrary precision > integers. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Sat Oct 8 17:51:58 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 14:51:58 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> References: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Message-ID: On Sat, Oct 8, 2016 at 1:40 PM, V. Armando Sole wrote: > Well, testing under windows 64 bit, Python 3.5.2, positive powers of > integers give integers and negative powers of integers give floats. So, do > you want to raise an exception when taking a negative power of an element of > an array of integers? Because not doing so would be inconsistent with > raising the exception when applying the same operation to the array. > > Clearly things are broken now (I get zeros when calculating negative powers > of numpy arrays of integers others than 1), but that behavior was consistent > with python itself under python 2.x because the division of two integers was > an integer. 
That does not hold under Python 3.5 where the division of two > integers is a float. Even on Python 2, negative powers gave floats: >>> sys.version_info sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0) >>> 2 ** -2 0.25 > You have offered either to raise an exception or to always return a float > (i.e. even with positive exponents). You have never offered to be consistent > with what Python does. This last option would be my favorite. If it cannot > be implemented, then I would prefer always float. At least one would be > consistent with something and we would not invent yet another convention. Numpy tries to be consistent with Python when it makes sense, but this is only one of several considerations. The use cases for numpy objects are different from the use cases for Python scalar objects, so we also consistently deviate in cases when that makes sense -- e.g., numpy bools are very different from Python bools (Python barely distinguishes between bools and integers, because they don't need to; indexing makes the distinction much more important to numpy), numpy integers are very different from Python integers (Python's arbitrary-width integers provide great semantics, but don't play nicely with large fixed-size arrays), numpy pays much more attention to type consistency between inputs and outputs than Python does (again because of the extra constraints imposed by working with memory-intensive type-consistent arrays), etc. For python, 2 ** 2 -> int, 2 ** -2 -> float. But numpy can't do this, because then 2 ** np.array([2, -2]) would have to be both int *and* float, which it can't be. Not a problem that Python has. Or we could say that the output is int if all the inputs are positive, and float if any of them are negative... but then that violates the numpy principle that output dtypes should be determined entirely by input dtypes, without peeking at the actual values. (And this rule is very important for avoiding nasty surprises when you run your code on new inputs.) And then there's backwards compatibility to consider. As mentioned, we *could* deviate from Python by making ** always return float... but this would almost certainly break tons and tons of people's code that is currently doing integer ** positive integer and expecting to get an integer back. Which is something we don't do without very careful weighing of the trade-offs, and my intuition is that this one is so disruptive we probably can't pull it off. Breaking working code needs a *very* compelling reason. -n -- Nathaniel J. Smith -- https://vorpus.org From saxri89 at gmail.com Sat Oct 8 18:11:49 2016 From: saxri89 at gmail.com (Xristos Xristoou) Date: Sun, 9 Oct 2016 01:11:49 +0300 Subject: [Numpy-discussion] delete pixel from the raster image with specific range value Message-ID: any idea how to delete pixel from the raster image with specific range value using numpy/scipy or gdal? for example i have a raster image with the 5 class : 1. 0-100 2. 100-200 3. 200-300 4. 300-500 5. 500-1000 and i want to delete class 1 range value or maybe i want to delete class 1,2,4,5 if i need only class 3 -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sat Oct 8 18:18:45 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sun, 9 Oct 2016 00:18:45 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. 
In-Reply-To: References: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Message-ID: but then that violates the numpy > principle that output dtypes should be determined entirely by input > dtypes, without peeking at the actual values. (And this rule is very > important for avoiding nasty surprises when you run your code on new > inputs.) > At division you get back an array of floats. >>> y = np.int64([1,2,4]) >>> y/1 array([ 1., 2., 4.]) >>> y/y array([ 1., 1., 1.]) Why is it different, if you calculate the power of something? > And then there's backwards compatibility to consider. As mentioned, we > *could* deviate from Python by making ** always return float... but > this would almost certainly break tons and tons of people's code that > is currently doing integer ** positive integer and expecting to get an > integer back. Which is something we don't do without very careful > weighing of the trade-offs, and my intuition is that this one is so > disruptive we probably can't pull it off. Breaking working code needs > a *very* compelling reason. > This is a valid reasoning. But it could be solved with raising an exception to warn the users for the new behaviour. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Oct 8 19:34:30 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 8 Oct 2016 16:34:30 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: <71300e52e43daf80c9a3b3f279d562be@esrf.fr> Message-ID: On Sat, Oct 8, 2016 at 3:18 PM, Kriszti?n Horv?th wrote: > > > >> but then that violates the numpy >> principle that output dtypes should be determined entirely by input >> dtypes, without peeking at the actual values. (And this rule is very >> important for avoiding nasty surprises when you run your code on new >> inputs.) > > At division you get back an array of floats. > >>>> y = np.int64([1,2,4]) >>>> y/1 > array([ 1., 2., 4.]) >>>> y/y > array([ 1., 1., 1.]) > > Why is it different, if you calculate the power of something? The difference is that Python division always returns float. Python int ** int sometimes returns int and sometimes returns float, depending on which particular integers are used. We can't be consistent with Python because Python isn't consistent with itself. >> >> And then there's backwards compatibility to consider. As mentioned, we >> *could* deviate from Python by making ** always return float... but >> this would almost certainly break tons and tons of people's code that >> is currently doing integer ** positive integer and expecting to get an >> integer back. Which is something we don't do without very careful >> weighing of the trade-offs, and my intuition is that this one is so >> disruptive we probably can't pull it off. Breaking working code needs >> a *very* compelling reason. > > This is a valid reasoning. But it could be solved with raising an exception > to warn the users for the new behaviour. That is generally the best conservative strategy for making a backwards incompatible change like this: instead of going straight to the new behavior, first make it raise an error, and then once people have had time to stop depending on the old behavior, then you can add the new behavior. But in this case if we were going to make int ** int return float, this rule would mean that we have to make int ** int always raise an error for a few years, i.e. remove integer power support from numpy altogether. That's a non-starter. -n -- Nathaniel J. 
Smith -- https://vorpus.org From Permafacture at gmail.com Sat Oct 8 20:55:00 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Sat, 8 Oct 2016 19:55:00 -0500 Subject: [Numpy-discussion] delete pixel from the raster image with specific range value In-Reply-To: References: Message-ID: What do you mean delete? Set to zero or NaN? You want an (N-1) dimensional array of all the acceptable values from the N dimensional array? Elliot On Oct 8, 2016 5:11 PM, "Xristos Xristoou" wrote: > any idea how to delete pixel from the raster image with > specific range value using numpy/scipy or gdal? > > for example i have a raster image with the > 5 class : > > 1. 0-100 > 2. 100-200 > 3. 200-300 > 4. 300-500 > 5. 500-1000 > > and i want to delete class 1 range value > or maybe i want to delete class 1,2,4,5 if i need only class 3 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Oct 9 02:43:06 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 9 Oct 2016 19:43:06 +1300 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: On Sun, Oct 9, 2016 at 4:12 AM, Nathaniel Smith wrote: > On Sat, Oct 8, 2016 at 6:59 AM, Charles R Harris > wrote: > > > > > > On Sat, Oct 8, 2016 at 4:40 AM, Nathaniel Smith wrote: > >> > >> On Fri, Oct 7, 2016 at 6:12 PM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > The time for NumPy 1.12.0 approaches and I like to have a final > decision > >> > on > >> > the treatment of integers to negative integer powers with the `**` > >> > operator. > >> > The two alternatives looked to be > >> > > >> > Raise an error for arrays and numpy scalars, including 1 and -1 to > >> > negative > >> > powers. > >> > > >> > Pluses > >> > > >> > Backward compatible > >> > Allows common powers to be integer, e.g., arange(3)**2 > >> > Consistent with inplace operators > >> > Fixes current wrong behavior. > >> > Preserves type > >> > > >> > > >> > Minuses > >> > > >> > Integer overflow > >> > Computational inconvenience > >> > Inconsistent with Python integers > >> > > >> > > >> > Always return a float > >> > > >> > Pluses > >> > > >> > Computational convenience > >> > > >> > > >> > Minuses > >> > > >> > Loss of type > >> > Possible backward incompatibilities > >> > Not applicable to inplace operators > >> > >> I guess I could be wrong, but I think the backwards incompatibilities > >> are going to be *way* too severe to make option 2 possible in > >> practice. > >> > > > > Backwards compatibility is also a major concern for me. Here are my > current > > thoughts > > > > Add an fpow ufunc that always converts to float, it would not accept > object > > arrays. > > Maybe call it `fpower` or even `float_power`, for consistency with `power`? > > > Raise errors in current power ufunc (**), for ints to negative ints. > > > > The power ufunc will change in the following ways > > > > +1, -1 to negative ints will error, currently they work > > n > 1 ints to negative ints will error, currently warn and return zero > > 0 to negative ints will error, they currently return the minimum integer > > > > The `**` operator currently calls the power ufunc, leave that as is for > > backward almost compatibility. 
The remaining question is numpy scalars, > > which we can make either compatible with Python, or with NumPy arrays. > I'm > > leaning towards NumPy array compatibility mostly on account of type > > preservation and the close relationship between zero dimensionaly arrays > and > > scalars. > > Sounds good to me. I agree that we should prioritize within-numpy > consistency over consistency with Python. > +1 sounds good to me too. > > > The fpow function could be backported to NumPy 1.11 if that would be > helpful > > going forward. > > I'm not a big fan of this kind of backport. Violating the > "bug-fixes-only" rule makes it hard for people to understand our > release versions. And it creates the situation where people can write > code that they think requires numpy 1.11 (because it works with their > numpy 1.11!), but then breaks on other people's computers (because > those users have 1.11.(x-1)). And if there's some reason why people > aren't willing to upgrade to 1.12 for new features, then probably > better to spend energy addressing those instead of on putting together > 1.11-and-a-half releases. > Agreed, this is not something we want to backport. Ralf > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raksi.raksi at gmail.com Sun Oct 9 06:42:44 2016 From: raksi.raksi at gmail.com (=?UTF-8?Q?Kriszti=C3=A1n_Horv=C3=A1th?=) Date: Sun, 9 Oct 2016 12:42:44 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: > Sounds good to me. I agree that we should prioritize within-numpy > consistency over consistency with Python. > I agree with that. Because of numpy consitetncy, the `**` operator should always return float. Right now the case is: >>> aa = np.arange(2, 10, dtype=int) array([2, 3, 4, 5, 6, 7, 8, 9]) >>> bb = np.linspace(0, 7, 8, dtype=int) array([0, 1, 2, 3, 4, 5, 6, 7]) >>> 1/aa array([ 0.5 , 0.33333333, 0.25 , 0.2 , 0.16666667, 0.14285714, 0.125 , 0.11111111]) >>> aa**-1 array([0, 0, 0, 0, 0, 0, 0, 0]) >>> 1/aa**2 array([ 0.25 , 0.11111111, 0.0625 , 0.04 , 0.02777778, 0.02040816, 0.015625 , 0.01234568]) >>> aa**-2 array([0, 0, 0, 0, 0, 0, 0, 0]) >>> aa**bb array([ 1, 3, 16, 125, 1296, 16807, 262144, 4782969]) >>> 1/aa**bb array([ 1.00000000e+00, 3.33333333e-01, 6.25000000e-02, 8.00000000e-03, 7.71604938e-04, 5.94990183e-05, 3.81469727e-06, 2.09075158e-07]) >>> aa**(-bb) array([1, 0, 0, 0, 0, 0, 0, 0]) For me this behaviour is confusing. But I am not an expert just a user. I can live together with anything if I know what to expect. And I greatly appreciate the work of any developer for this excellent package. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Oct 9 09:25:11 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 09 Oct 2016 15:25:11 +0200 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: <1476019511.6762.10.camel@sipsolutions.net> On Fr, 2016-10-07 at 19:12 -0600, Charles R Harris wrote: > Hi All, > > The time for NumPy 1.12.0 approaches and I like to have a final > decision on the treatment of integers to negative integer powers with > the `**` operator. 
The two alternatives looked to be > > Raise an error for arrays and numpy scalars, including 1 and -1 to > negative powers. > For what its worth, I still feel it is probably the only real option to go with error, changing to float may have weird effects. Which does not mean it is impossible, I admit, though I would like some data on how downstream would handle it. Also would we need an int power? The fpower seems more straight forward/common pattern. If errors turned out annoying in some cases, a seterr might be plausible too (as well as a deprecation). - Sebastian > Pluses > Backward compatible > Allows common powers to be integer, e.g., arange(3)**2 > Consistent with inplace operators > Fixes current wrong behavior. > Preserves type > > Minuses > Integer overflow > Computational inconvenience > Inconsistent with Python integers > > Always return a float? > > Pluses > Computational convenience > > Minuses > Loss of type > Possible backward incompatibilities > Not applicable to inplace operators > > > Thoughts? > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From shoyer at gmail.com Sun Oct 9 14:59:10 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 9 Oct 2016 11:59:10 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: <1476019511.6762.10.camel@sipsolutions.net> References: <1476019511.6762.10.camel@sipsolutions.net> Message-ID: On Sun, Oct 9, 2016 at 6:25 AM, Sebastian Berg wrote: > For what its worth, I still feel it is probably the only real option to > go with error, changing to float may have weird effects. Which does not > mean it is impossible, I admit, though I would like some data on how > downstream would handle it. Also would we need an int power? The fpower > seems more straight forward/common pattern. > If errors turned out annoying in some cases, a seterr might be > plausible too (as well as a deprecation). > I agree with Sebastian and Nathaniel. I don't think we can deviating from the existing behavior (int ** int -> int) without breaking lots of existing code, and if we did, yes, we would need a new integer power function. I think it's better to preserve the existing behavior when it gives sensible results, and error when it doesn't. Adding another function float_power for the case that is currently broken seems like the right way to go. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Mon Oct 10 12:38:37 2016 From: rmay31 at gmail.com (Ryan May) Date: Mon, 10 Oct 2016 10:38:37 -0600 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: <1476019511.6762.10.camel@sipsolutions.net> Message-ID: On Sun, Oct 9, 2016 at 12:59 PM, Stephan Hoyer wrote: > On Sun, Oct 9, 2016 at 6:25 AM, Sebastian Berg > wrote: > >> For what its worth, I still feel it is probably the only real option to >> go with error, changing to float may have weird effects. Which does not >> mean it is impossible, I admit, though I would like some data on how >> downstream would handle it. Also would we need an int power? The fpower >> seems more straight forward/common pattern. 
>> If errors turned out annoying in some cases, a seterr might be >> plausible too (as well as a deprecation). >> > > I agree with Sebastian and Nathaniel. I don't think we can deviating from > the existing behavior (int ** int -> int) without breaking lots of existing > code, and if we did, yes, we would need a new integer power function. > > I think it's better to preserve the existing behavior when it gives > sensible results, and error when it doesn't. Adding another function > float_power for the case that is currently broken seems like the right way > to go. > +1 Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Tue Oct 11 21:23:44 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 11 Oct 2016 18:23:44 -0700 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. Message-ID: > On Sun, Oct 9, 2016 at 12:59 PM, Stephan Hoyer wrote: > >> >> I agree with Sebastian and Nathaniel. I don't think we can deviating from >> the existing behavior (int ** int -> int) without breaking lots of existing >> code, and if we did, yes, we would need a new integer power function. >> >> I think it's better to preserve the existing behavior when it gives >> sensible results, and error when it doesn't. Adding another function >> float_power for the case that is currently broken seems like the right way >> to go. >> > I actually suspect that the amount of code broken by int**int->float may be relatively small (though extremely annoying for those that it happens to, and it would definitely be good to have statistics). I mean, Numpy silently transitioned to int32+uint64->float64 not so long ago which broke my code, but the world didn?t end. If the primary argument against int**int->float seems to be the difficulty of managing the transition, with int**int->Error being the seen as the required yet *very* painful intermediate step for the large fraction of the int**int users who didn?t care if it was int or float (e.g. the output is likely to be cast to float in the next step anyway), and fail loudly for those users who need int**int->int, then if you are prepared to risk a less conservative transition (i.e. we think that latter group is small enough) you could skip the error on users and just throw a warning for a couple of releases, along the lines of: WARNING int**int -> int is going to be deprecated in favour of int**int->float in Numpy 1.16. To avoid seeing this message, either use ?from numpy import __future_float_power__? or explicitly set the type of one of your inputs to float, or use the new ipower(x,y) function for integer powers. Peter From m.h.vankerkwijk at gmail.com Wed Oct 12 12:02:59 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 12 Oct 2016 12:02:59 -0400 Subject: [Numpy-discussion] Integers to negative integer powers, time for a decision. In-Reply-To: References: Message-ID: I still strongly favour ending up at int**int -> float, and like Peter's suggestion of raising a general warning rather than an exception for negative powers. 
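For concreteness, the difference under discussion is easy to see with a couple of lines (illustration only; the truncating result in the comments is what current releases do, and raising an error is the other option on the table):

    import numpy as np

    a = np.arange(1, 5)                  # int64 array: [1, 2, 3, 4]

    # Integer ** negative integer currently truncates toward zero,
    # e.g. a ** -1 gives array([1, 0, 0, 0]) rather than reciprocals;
    # under the "raise an error" option it would fail instead.

    # The float result most users expect appears as soon as either operand
    # is a float, which is also what a float-returning power would give:
    print(a ** -1.0)                     # [1.  0.5  0.333...  0.25]
    print(a.astype(np.float64) ** -1)    # same values, via an explicit upcast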
-- Marten From allanhaldane at gmail.com Fri Oct 14 13:00:28 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 14 Oct 2016 13:00:28 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve Message-ID: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> Hi all, Eric Wieser has a PR which defines new functions np.ma.correlate and np.ma.convolve: https://github.com/numpy/numpy/pull/7922 We're deciding how to name the keyword arg which determines whether masked elements are "propagated" in the convolution sums. Currently we are leaning towards calling it "contagious", with default of True: def convolve(a, v, mode='full', contagious=True): Any thoughts? Cheers, Allan From sebastian at sipsolutions.net Fri Oct 14 13:08:17 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 14 Oct 2016 19:08:17 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> Message-ID: <1476464897.22194.2.camel@sipsolutions.net> On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > Hi all, > > Eric Wieser has a PR which defines new functions np.ma.correlate and > np.ma.convolve: > > https://github.com/numpy/numpy/pull/7922 > > We're deciding how to name the keyword arg which determines whether > masked elements are "propagated" in the convolution sums. Currently > we > are leaning towards calling it "contagious", with default of True: > > def convolve(a, v, mode='full', contagious=True): > > Any thoughts? > Sounds a bit overly odd to me to be honest. Just brain storming, you could think/name it the other way around maybe? Should the masked values be considered as zero/ignored?
> > - Sebastian > > > > Cheers, > > Allan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Oct 14 14:23:09 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 14 Oct 2016 14:23:09 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> Message-ID: <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> I think the possibilities that have been mentioned so far (here or in the PR) are: contagious contagious_mask propagate propagate_mask propagated `propogate_mask=False` seemed to imply that the mask would never be set, so Eric also suggested propagate_mask='any' or propagate_mask='all' I would be happy with 'propagated=False' as the name/default. As Eric pointed out, most MaskedArray functions like sum implicitly don't propagate, currently, so maybe we should do likewise here. Allan On 10/14/2016 01:44 PM, Benjamin Root wrote: > Why not "propagated"? > > On Fri, Oct 14, 2016 at 1:08 PM, Sebastian Berg > > wrote: > > On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > > Hi all, > > > > Eric Wieser has a PR which defines new functions np.ma.correlate and > > np.ma.convolve: > > > > https://github.com/numpy/numpy/pull/7922 > > > > > We're deciding how to name the keyword arg which determines whether > > masked elements are "propagated" in the convolution sums. Currently > > we > > are leaning towards calling it "contagious", with default of True: > > > > def convolve(a, v, mode='full', contagious=True): > > > > Any thoughts? > > > > Sounds a bit overly odd to me to be honest. Just brain storming, you > could think/name it the other way around maybe? Should the masked > values be considered as zero/ignored? > > - Sebastian > > > > Cheers, > > Allan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From jni.soma at gmail.com Fri Oct 14 19:49:48 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Sat, 15 Oct 2016 10:49:48 +1100 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> Message-ID: +1 for propagate_mask. That is the only proposal that immediately makes sense to me. "contagious" may be cute but I think approximately 0% of users would guess its purpose on first use. Can you elaborate on what happens with the masks exactly? 
I didn't quite get why propagate_mask=False was unintuitive. My expectation is that any mask present in the input will not be set in the output, but the mask will be "respected" by the function. On 15 Oct. 2016, 5:23 AM +1100, Allan Haldane , wrote: > I think the possibilities that have been mentioned so far (here or in > the PR) are: > > contagious > contagious_mask > propagate > propagate_mask > propagated > > `propogate_mask=False` seemed to imply that the mask would never be set, > so Eric also suggested > propagate_mask='any' or propagate_mask='all' > > > I would be happy with 'propagated=False' as the name/default. As Eric > pointed out, most MaskedArray functions like sum implicitly don't > propagate, currently, so maybe we should do likewise here. > > > Allan > > On 10/14/2016 01:44 PM, Benjamin Root wrote: > > Why not "propagated"? > > > > On Fri, Oct 14, 2016 at 1:08 PM, Sebastian Berg > > > wrote: > > > > On Fr, 2016-10-14 at 13:00 -0400, Allan Haldane wrote: > > > Hi all, > > > > > > Eric Wieser has a PR which defines new functions np.ma.correlate and > > > np.ma.convolve: > > > > > > https://github.com/numpy/numpy/pull/7922 > > > > > > > We're deciding how to name the keyword arg which determines whether > > > masked elements are "propagated" in the convolution sums. Currently > > > we > > > are leaning towards calling it "contagious", with default of True: > > > > > > def convolve(a, v, mode='full', contagious=True): > > > > > > Any thoughts? > > > > > > > Sounds a bit overly odd to me to be honest. Just brain storming, you > > could think/name it the other way around maybe? Should the masked > > values be considered as zero/ignored? > > > > - Sebastian > > > > > > > Cheers, > > > Allan > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sat Oct 15 21:21:13 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 15 Oct 2016 21:21:13 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> Message-ID: <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: > +1 for propagate_mask. That is the only proposal that immediately makes > sense to me. "contagious" may be cute but I think approximately 0% of > users would guess its purpose on first use. > > Can you elaborate on what happens with the masks exactly? I didn't quite > get why propagate_mask=False was unintuitive. My expectation is that any > mask present in the input will not be set in the output, but the mask > will be "respected" by the function. 
Here's an illustration of how the PR currently works with convolve, using the name "propagate_mask": >>> m = np.ma.masked >>> a = np.ma.array([1,1,1,m,1,1,1,m,m,m,1,1,1]) >>> b = np.ma.array([1,1,1]) >>> >>> print np.ma.convolve(a, b, propagate_mask=True) [1 2 3 -- -- -- 3 -- -- -- -- -- 3 2 1] >>> print np.ma.convolve(a, b, propagate_mask=False) [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] Allan > On 15 Oct. 2016, 5:23 AM +1100, Allan Haldane , > wrote: >> I think the possibilities that have been mentioned so far (here or in >> the PR) are: >> >> contagious >> contagious_mask >> propagate >> propagate_mask >> propagated >> >> `propogate_mask=False` seemed to imply that the mask would never be set, >> so Eric also suggested >> propagate_mask='any' or propagate_mask='all' >> >> >> I would be happy with 'propagated=False' as the name/default. As Eric >> pointed out, most MaskedArray functions like sum implicitly don't >> propagate, currently, so maybe we should do likewise here. >> >> >> Allan From klemm at phys.ethz.ch Sun Oct 16 05:52:57 2016 From: klemm at phys.ethz.ch (Hanno Klemm) Date: Sun, 16 Oct 2016 11:52:57 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: > On 16 Oct 2016, at 03:21, Allan Haldane wrote: > >> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: >> +1 for propagate_mask. That is the only proposal that immediately makes >> sense to me. "contagious" may be cute but I think approximately 0% of >> users would guess its purpose on first use. >> >> Can you elaborate on what happens with the masks exactly? I didn't quite >> get why propagate_mask=False was unintuitive. My expectation is that any >> mask present in the input will not be set in the output, but the mask >> will be "respected" by the function. > > Here's an illustration of how the PR currently works with convolve, using the name "propagate_mask": > > >>> m = np.ma.masked > >>> a = np.ma.array([1,1,1,m,1,1,1,m,m,m,1,1,1]) > >>> b = np.ma.array([1,1,1]) > >>> > >>> print np.ma.convolve(a, b, propagate_mask=True) > [1 2 3 -- -- -- 3 -- -- -- -- -- 3 2 1] > >>> print np.ma.convolve(a, b, propagate_mask=False) > [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] > > Allan > Given this behaviour, I'm actually more concerned about the logic ma.convolve uses in the propagate_mask=False case. It appears that the masked values are essentially replaced by zero. Is my interpretation correct and if so does this make sense? When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. Hanno From harrigan.matthew at gmail.com Sun Oct 16 08:47:38 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Sun, 16 Oct 2016 08:47:38 -0400 Subject: [Numpy-discussion] add elementwise addition & subtraction to einsum Message-ID: Hello, This is a follow on for issue 8139 . I propose adding elementwise addition and subtraction functionality to einsum. I love einsum as it clearly and concisely defines complex linear algebra. 
However elementwise addition is a very common linear algebra operation and einsum does not currently support it. The Einstein field equations , what the notation was originally developed to document, contain that functionality. It is fairly common in stress analysis (my background), for example see these lectures notes . Specifically I propose adding "+" and "-" characters which separate current einsum statements which are then combined elementwise. An example is A = np.einsum('ij,jk+ij,jk', B, C, D, E), which is A = B * C + D * E. I wrote a crude function to demonstrate the functionality. I believe the functionality is useful, in keeping with the spirit of a clean concise API, and doesn't break the existing API, which could warrant acceptance. Additionally I believe it opens the possibility of many interesting performance optimizations. For instance, many of the optimizations in this NEP could be done internally to the einsum function, which may be easier to accomplish given the narrower scope (but I am ignorant of all the low level C internals of numpy). The example in the beginning could become A = np.einsum('...+...+...', B, C, D). Thank you for your time and consideration. Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Mon Oct 17 13:01:14 2016 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 17 Oct 2016 19:01:14 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: Hi, Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : > When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. When estimating the autocorrelation of a signal, it make sense to drop missing pairs of values. Only in this use case, it opens the question of correcting or not correcting for the number of missing elements when computing the mean. I don't remember what R function "acf" is doing. Also, coming back to the initial question, I feel that it is necessary that the name "mask" (or "na" or similar) appears in the parameter name. Otherwise, people will wonder : "what on earth is contagious/being propagated...." just thinking of yet another keyword name : ignore_masked (or drop_masked) If I remember well, in R it is dropna. It would be nice if the boolean switch followed the same logic. Now of course the convolution function is more general than just autocorrelation... best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 837 bytes Desc: OpenPGP digital signature URL: From Shreyank.Amartya at itcinfotech.com Tue Oct 18 03:18:15 2016 From: Shreyank.Amartya at itcinfotech.com (Shreyank Amartya) Date: Tue, 18 Oct 2016 07:18:15 +0000 Subject: [Numpy-discussion] Scipy installation on Window with mingw32 Message-ID: Hi, I am trying install to theano which also requires numpy and scipy on windows 7 with mingw32 compilers. 
I have successfully installed numpy using mingw32 but however when trying to install scipy I get this error: Looking for python27.dll Building msvcr library: "c:\python27\libs\libmsvcr90.a" (from C:\Windows\win sxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\ msvcr90.dll) objdump.exe: C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0 .21022.8_none_750b37ff97f4f68b\msvcr90.dll: File format not recognized Traceback (most recent call last): File "", line 1, in File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\setup.py", line 415, in setup_package() File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\setup.py", line 411, in setup_package setup(**metadata) File "c:\python27\lib\site-packages\numpy\distutils\core.py", line 169, in setup return old_setup(**new_attr) File "c:\python27\lib\distutils\core.py", line 151, in setup dist.run_commands() File "c:\python27\lib\distutils\dist.py", line 953, in run_commands self.run_command(cmd) File "c:\python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", l ine 62, in run r = self.setuptools_run() File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", l ine 36, in setuptools_run return distutils_install.run(self) File "c:\python27\lib\distutils\command\install.py", line 563, in run self.run_command('build') File "c:\python27\lib\distutils\cmd.py", line 326, in run_command self.distribution.run_command(command) File "c:\python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "c:\python27\lib\site-packages\numpy\distutils\command\build.py", lin e 47, in run old_build.run(self) File "c:\python27\lib\distutils\command\build.py", line 127, in run self.run_command(cmd_name) File "c:\python27\lib\distutils\cmd.py", line 326, in run_command self.distribution.run_command(command) File "c:\python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 147, in run self.build_sources() File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 164, in build_sources self.build_extension_sources(ext) File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 323, in build_extension_sources sources = self.generate_sources(sources, ext) File "c:\python27\lib\site-packages\numpy\distutils\command\build_src.py", line 376, in generate_sources source = func(extension, build_dir) File "scipy\spatial\setup.py", line 35, in get_qhull_misc_config if config_cmd.check_func('open_memstream', decl=True, call=True): File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", li ne 312, in check_func self._check_compiler() File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", li ne 39, in _check_compiler old_config._check_compiler(self) File "c:\python27\lib\distutils\command\config.py", line 102, in _check_co mpiler dry_run=self.dry_run, force=1) File "c:\python27\lib\site-packages\numpy\distutils\ccompiler.py", line 59 6, in new_compiler compiler = klass(None, dry_run, force) File "c:\python27\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 96, in __init__ msvcr_success = build_msvcr_library() File "c:\python27\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 360, in build_msvcr_library generate_def(dll_file, def_file) File "c:\python27\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 274, in 
generate_def raise ValueError("Symbol table not found") ValueError: Symbol table not found ---------------------------------------- Command "c:\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\ \users\\22193\\appdata\\local\\temp\\pip-build-d3f_pb\\scipy\\setup.py';exec(com pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __f ile__, 'exec'))" install --record c:\users\22193\appdata\local\temp\pip-wxyrfu-r ecord\install-record.txt --single-version-externally-managed --compile" failed w ith error code 1 in c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\ C:\Python27\Scripts> I realize this is due to some problem with resolving symbols in python27.dll. Is there a workaround for this? I need this to work on this setup as this is my workstation at office and I cannot install Ubuntu which would have been way easier. Things I have tried: I used to get an error while installing scipy for lapack/blas resources not found, I was able to get through by downloading and compiling them from source. I have tried to install scipy from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and it does install scipy successfully but I get a 64-bit compatibility error when I try to import theano. Please help as I'm stuck here. Thanks Shreyank Disclaimer: This communication is for the exclusive use of the intended recipient(s) and shall not attach any liability on the originator or ITC Infotech India Ltd./its Holding company/ its Subsidiaries/ its Group Companies. If you are the addressee, the contents of this e-mail are intended for your use only and it shall not be forwarded to any third party, without first obtaining written authorization from the originator or ITC Infotech India Ltd./ its Holding company/its Subsidiaries/ its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with by any third party in any manner whatsoever without the specific consent of ITC Infotech India Ltd./ its Holding company/ its Subsidiaries/ its Group Companies. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 18 05:07:04 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Oct 2016 22:07:04 +1300 Subject: [Numpy-discussion] Scipy installation on Window with mingw32 In-Reply-To: References: Message-ID: Hi, A few comments: - you really really want to use a scientific Python distribution to avoid these issues on Windows. see http://scipy.org/install.html - we used to build scipy .exe installers with mingw32 but don't do that anymore because it's just too much of a pain. IIRC the last release we did that for was 0.16.0, with the toolchain in https://github.com/numpy/numpy-vendor. - I don't recognize the error; looks not specific to recent changes in scipy so there's probably something in your environment not set up quite right. Cheers, Ralf On Tue, Oct 18, 2016 at 8:18 PM, Shreyank Amartya < Shreyank.Amartya at itcinfotech.com> wrote: > Hi, > > > > I am trying install to theano which also requires numpy and scipy on > windows 7 with mingw32 compilers. 
> > I have successfully installed numpy using mingw32 but however when trying > to install scipy I get this error: > > > > Looking for python27.dll > > Building msvcr library: "c:\python27\libs\libmsvcr90.a" (from > C:\Windows\win > > sxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_ > none_750b37ff97f4f68b\ > > msvcr90.dll) > > objdump.exe: C:\Windows\winsxs\amd64_microsoft.vc90.crt_ > 1fc8b3b9a1e18e3b_9.0 > > .21022.8_none_750b37ff97f4f68b\msvcr90.dll: File format not recognized > > Traceback (most recent call last): > > File "", line 1, in > > File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\ > setup.py", > > line 415, in > > setup_package() > > File "c:\users\22193\appdata\local\temp\pip-build-d3f_pb\scipy\ > setup.py", > > line 411, in setup_package > > setup(**metadata) > > File "c:\python27\lib\site-packages\numpy\distutils\core.py", line > 169, in > > setup > > return old_setup(**new_attr) > > File "c:\python27\lib\distutils\core.py", line 151, in setup > > dist.run_commands() > > File "c:\python27\lib\distutils\dist.py", line 953, in run_commands > > self.run_command(cmd) > > File "c:\python27\lib\distutils\dist.py", line 972, in run_command > > cmd_obj.run() > > File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", > l > > ine 62, in run > > r = self.setuptools_run() > > File "c:\python27\lib\site-packages\numpy\distutils\command\install.py", > l > > ine 36, in setuptools_run > > return distutils_install.run(self) > > File "c:\python27\lib\distutils\command\install.py", line 563, in > run > > self.run_command('build') > > File "c:\python27\lib\distutils\cmd.py", line 326, in run_command > > self.distribution.run_command(command) > > File "c:\python27\lib\distutils\dist.py", line 972, in run_command > > cmd_obj.run() > > File "c:\python27\lib\site-packages\numpy\distutils\command\build.py", > lin > > e 47, in run > > old_build.run(self) > > File "c:\python27\lib\distutils\command\build.py", line 127, in run > > self.run_command(cmd_name) > > File "c:\python27\lib\distutils\cmd.py", line 326, in run_command > > self.distribution.run_command(command) > > File "c:\python27\lib\distutils\dist.py", line 972, in run_command > > cmd_obj.run() > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 147, in run > > self.build_sources() > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 164, in build_sources > > self.build_extension_sources(ext) > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 323, in build_extension_sources > > sources = self.generate_sources(sources, ext) > > File "c:\python27\lib\site-packages\numpy\distutils\ > command\build_src.py", > > line 376, in generate_sources > > source = func(extension, build_dir) > > File "scipy\spatial\setup.py", line 35, in get_qhull_misc_config > > if config_cmd.check_func('open_memstream', decl=True, call=True): > > File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", > li > > ne 312, in check_func > > self._check_compiler() > > File "c:\python27\lib\site-packages\numpy\distutils\command\config.py", > li > > ne 39, in _check_compiler > > old_config._check_compiler(self) > > File "c:\python27\lib\distutils\command\config.py", line 102, in > _check_co > > mpiler > > dry_run=self.dry_run, force=1) > > File "c:\python27\lib\site-packages\numpy\distutils\ccompiler.py", > line 59 > > 6, in new_compiler > > compiler = klass(None, dry_run, force) > > File 
"c:\python27\lib\site-packages\numpy\distutils\ > mingw32ccompiler.py", > > line 96, in __init__ > > msvcr_success = build_msvcr_library() > > File "c:\python27\lib\site-packages\numpy\distutils\ > mingw32ccompiler.py", > > line 360, in build_msvcr_library > > generate_def(dll_file, def_file) > > File "c:\python27\lib\site-packages\numpy\distutils\ > mingw32ccompiler.py", > > line 274, in generate_def > > raise ValueError("Symbol table not found") > > ValueError: Symbol table not found > > > > ---------------------------------------- > > Command "c:\python27\python.exe -u -c "import setuptools, > tokenize;__file__='c:\ > > \users\\22193\\appdata\\local\\temp\\pip-build-d3f_pb\\ > scipy\\setup.py';exec(com > > pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', > '\n'), __f > > ile__, 'exec'))" install --record c:\users\22193\appdata\local\ > temp\pip-wxyrfu-r > > ecord\install-record.txt --single-version-externally-managed --compile" > failed w > > ith error code 1 in c:\users\22193\appdata\local\ > temp\pip-build-d3f_pb\scipy\ > > > > C:\Python27\Scripts> > > > > I realize this is due to some problem with resolving symbols in > python27.dll. Is there a workaround for this? > > I need this to work on this setup as this is my workstation at office and > I cannot install Ubuntu which would have been way easier. > > > > Things I have tried: > > I used to get an error while installing scipy for lapack/blas resources > not found, I was able to get through by downloading and compiling them from > source. > > I have tried to install scipy from http://www.lfd.uci.edu/~ > gohlke/pythonlibs/ and it does install scipy successfully but I get a > 64-bit compatibility error when I try to import theano. > > Please help as I?m stuck here. > > > > Thanks > > Shreyank > > > Disclaimer: This communication is for the exclusive use of the intended > recipient(s) and shall not attach any liability on the originator or ITC > Infotech India Ltd./its Holding company/ its Subsidiaries/ its Group > Companies. If you are the addressee, the contents of this e-mail are > intended for your use only and it shall not be forwarded to any third > party, without first obtaining written authorization from the originator or > ITC Infotech India Ltd./ its Holding company/its Subsidiaries/ its Group > Companies. It may contain information which is confidential and legally > privileged and the same shall not be used or dealt with by any third party > in any manner whatsoever without the specific consent of ITC Infotech India > Ltd./ its Holding company/ its Subsidiaries/ its Group Companies. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 18 13:25:37 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Oct 2016 13:25:37 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On Mon, Oct 17, 2016 at 1:01 PM, Pierre Haessig wrote: > Hi, > > > Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >> When I have similar situations, I usually interpolate between the valid values. 
I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. > When estimating the autocorrelation of a signal, it make sense to drop > missing pairs of values. Only in this use case, it opens the question of > correcting or not correcting for the number of missing elements when > computing the mean. I don't remember what R function "acf" is doing. > > > Also, coming back to the initial question, I feel that it is necessary > that the name "mask" (or "na" or similar) appears in the parameter name. > Otherwise, people will wonder : "what on earth is contagious/being > propagated...." > > just thinking of yet another keyword name : ignore_masked (or drop_masked) > > If I remember well, in R it is dropna. It would be nice if the boolean > switch followed the same logic. > > Now of course the convolution function is more general than just > autocorrelation... I think "drop" or "ignore" is too generic, for correlation it would be for example ignore pairs versus ignore cases. To me propagate sounds ok to me, but something with `valid` might be more explicit for convolution or `correlate`, however `valid` also refers to the end points, so maybe valid_na or valid_masked=True Josef > > best, > Pierre > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Oct 18 13:30:52 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Oct 2016 13:30:52 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On Tue, Oct 18, 2016 at 1:25 PM, wrote: > On Mon, Oct 17, 2016 at 1:01 PM, Pierre Haessig > wrote: >> Hi, >> >> >> Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >>> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. >> When estimating the autocorrelation of a signal, it make sense to drop >> missing pairs of values. Only in this use case, it opens the question of >> correcting or not correcting for the number of missing elements when >> computing the mean. I don't remember what R function "acf" is doing. as aside: statsmodels has now an option for acf and similar missing : str A string in ['none', 'raise', 'conservative', 'drop'] specifying how the NaNs are to be treated. Josef >> >> >> Also, coming back to the initial question, I feel that it is necessary >> that the name "mask" (or "na" or similar) appears in the parameter name. >> Otherwise, people will wonder : "what on earth is contagious/being >> propagated...." >> >> just thinking of yet another keyword name : ignore_masked (or drop_masked) >> >> If I remember well, in R it is dropna. It would be nice if the boolean >> switch followed the same logic. >> >> Now of course the convolution function is more general than just >> autocorrelation... 
> > I think "drop" or "ignore" is too generic, for correlation it would be > for example ignore pairs versus ignore cases. > > To me propagate sounds ok to me, but something with `valid` might be > more explicit for convolution or `correlate`, however `valid` also > refers to the end points, so maybe valid_na or valid_masked=True > > Josef > >> >> best, >> Pierre >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> From josef.pktd at gmail.com Tue Oct 18 13:49:13 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Oct 2016 13:49:13 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On Tue, Oct 18, 2016 at 1:30 PM, wrote: > On Tue, Oct 18, 2016 at 1:25 PM, wrote: >> On Mon, Oct 17, 2016 at 1:01 PM, Pierre Haessig >> wrote: >>> Hi, >>> >>> >>> Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >>>> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. >>> When estimating the autocorrelation of a signal, it make sense to drop >>> missing pairs of values. Only in this use case, it opens the question of >>> correcting or not correcting for the number of missing elements when >>> computing the mean. I don't remember what R function "acf" is doing. > > as aside: statsmodels has now an option for acf and similar > > missing : str > A string in ['none', 'raise', 'conservative', 'drop'] > specifying how the NaNs > are to be treated. aside to the aside: statsmodels was just catching up in this The original for masked array acf including correct counting of "valid" terms is https://github.com/pierregm/scikits.timeseries/blob/master/scikits/timeseries/lib/avcf.py (which I looked at way before statsmodels had any acf) Josef > > Josef > >>> >>> >>> Also, coming back to the initial question, I feel that it is necessary >>> that the name "mask" (or "na" or similar) appears in the parameter name. >>> Otherwise, people will wonder : "what on earth is contagious/being >>> propagated...." >>> >>> just thinking of yet another keyword name : ignore_masked (or drop_masked) >>> >>> If I remember well, in R it is dropna. It would be nice if the boolean >>> switch followed the same logic. >>> >>> Now of course the convolution function is more general than just >>> autocorrelation... >> >> I think "drop" or "ignore" is too generic, for correlation it would be >> for example ignore pairs versus ignore cases. 
>> >> To me propagate sounds ok to me, but something with `valid` might be >> more explicit for convolution or `correlate`, however `valid` also >> refers to the end points, so maybe valid_na or valid_masked=True >> >> Josef >> >>> >>> best, >>> Pierre >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> From allanhaldane at gmail.com Tue Oct 18 18:37:56 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 18 Oct 2016 18:37:56 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On 10/17/2016 01:01 PM, Pierre Haessig wrote: > Hi, > > > Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. > When estimating the autocorrelation of a signal, it make sense to drop > missing pairs of values. Only in this use case, it opens the question of > correcting or not correcting for the number of missing elements when > computing the mean. I don't remember what R function "acf" is doing. > > > Also, coming back to the initial question, I feel that it is necessary > that the name "mask" (or "na" or similar) appears in the parameter name. > Otherwise, people will wonder : "what on earth is contagious/being > propagated...." > > just thinking of yet another keyword name : ignore_masked (or drop_masked) > > If I remember well, in R it is dropna. It would be nice if the boolean > switch followed the same logic. There is an old unimplemented NEP which uses similar language, like "ignorena", and np.NA. http://docs.scipy.org/doc/numpy/neps/missing-data.html But right now that isn't part of numpy, so I think it would be confusing to use that terminology. Allan From allanhaldane at gmail.com Tue Oct 18 18:49:16 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 18 Oct 2016 18:49:16 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: On 10/16/2016 05:52 AM, Hanno Klemm wrote: > > >> On 16 Oct 2016, at 03:21, Allan Haldane wrote: >> >>> On 10/14/2016 07:49 PM, Juan Nunez-Iglesias wrote: >>> +1 for propagate_mask. That is the only proposal that immediately makes >>> sense to me. "contagious" may be cute but I think approximately 0% of >>> users would guess its purpose on first use. >>> >>> Can you elaborate on what happens with the masks exactly? I didn't quite >>> get why propagate_mask=False was unintuitive. My expectation is that any >>> mask present in the input will not be set in the output, but the mask >>> will be "respected" by the function. 
>> >> Here's an illustration of how the PR currently works with convolve, using the name "propagate_mask": >> >> >>> m = np.ma.masked >> >>> a = np.ma.array([1,1,1,m,1,1,1,m,m,m,1,1,1]) >> >>> b = np.ma.array([1,1,1]) >> >>> >> >>> print np.ma.convolve(a, b, propagate_mask=True) >> [1 2 3 -- -- -- 3 -- -- -- -- -- 3 2 1] >> >>> print np.ma.convolve(a, b, propagate_mask=False) >> [1 2 3 2 2 2 3 2 1 -- 1 2 3 2 1] >> >> Allan >> > > Given this behaviour, I'm actually more concerned about the logic ma.convolve uses in the propagate_mask=False case. It appears that the masked values are essentially replaced by zero. Is my interpretation correct and if so does this make sense? > I think that's right. Its usefulness wasn't obvious to me either, but googling shows that in matlab people like the file "nanconv.m" which works this way, using nans similarly to how the mask is used here. Just as convolution functions often add zero-padding around an image, here the mask behavior would allow you to have different borders, eg [m,m,m,1,1,1,1,m,m,m,m] using my notation from before. Octave's "nanconv" does this too. I still agree that in most cases people should be handling the missing values more carefully (manually) if they are doing convolutions, but this default behaviour maybe seems reasonable to me. Allan From allanhaldane at gmail.com Tue Oct 18 19:18:18 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 18 Oct 2016 19:18:18 -0400 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> Message-ID: <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> On 10/17/2016 01:01 PM, Pierre Haessig wrote: > Le 16/10/2016 ? 11:52, Hanno Klemm a ?crit : >> When I have similar situations, I usually interpolate between the valid values. I assume there are a lot of use cases for convolutions but I have difficulties imagining that ignoring a missing value and, for the purpose of the computation, treating it as zero is useful in many of them. > When estimating the autocorrelation of a signal, it make sense to drop > missing pairs of values. Only in this use case, it opens the question of > correcting or not correcting for the number of missing elements when > computing the mean. I don't remember what R function "acf" is doing. > > > Also, coming back to the initial question, I feel that it is necessary > that the name "mask" (or "na" or similar) appears in the parameter name. > Otherwise, people will wonder : "what on earth is contagious/being > propagated...." Based on feedback so far, I think "propagate_mask" sounds like the best word to use. Let's go with that. As for whether it should default to "True" or "False", the arguments I see are: * False, because that is the way most functions like `np.ma.sum` already work, as well as matlab and octave's similar "nanconv". * True, because its effects are more visible and might lead to less surprises. The "False" case seems like it is often not what the user intended. Eg, it affects the overall normalization of normalized kernels, and the choice of 0 seems arbitrary. If no one says anything, I'd probably go with True. 
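To make the two candidate defaults concrete, the behaviours can be emulated with plain NumPy; this is only a sketch of the semantics (not the PR's actual implementation), using the same kind of input as the earlier illustration:

    import numpy as np

    m = np.ma.masked
    a = np.ma.array([1., 1., 1., m, 1., 1., 1.])
    b = np.ones(3)

    # propagate_mask=False: masked entries simply contribute zero.
    data = np.convolve(a.filled(0.0), b)              # [1 2 3 2 2 2 3 2 1]

    # propagate_mask=True: every output a masked input touched is masked.
    touched = np.convolve(a.mask.astype(float), b) > 0
    result = np.ma.array(data, mask=touched)          # [1 2 3 -- -- -- 3 2 1]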
Allan From shoyer at gmail.com Tue Oct 18 19:44:03 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 18 Oct 2016 16:44:03 -0700 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> Message-ID: On Tue, Oct 18, 2016 at 4:18 PM, Allan Haldane wrote: > As for whether it should default to "True" or "False", the arguments I > see are: > > * False, because that is the way most functions like `np.ma.sum` > already work, as well as matlab and octave's similar "nanconv". > > * True, because its effects are more visible and might lead to less > surprises. The "False" case seems like it is often not what the user > intended. Eg, it affects the overall normalization of normalized > kernels, and the choice of 0 seems arbitrary. > > If no one says anything, I'd probably go with True > I also have serious concerns about if it ever actually makes sense to use `propagate_mask=False`. So, I think it's definitely appropriate to default to `propagate_mask=True`. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Wed Oct 19 04:10:18 2016 From: pierre.haessig at crans.org (Pierre Haessig) Date: Wed, 19 Oct 2016 10:10:18 +0200 Subject: [Numpy-discussion] how to name "contagious" keyword in np.ma.convolve In-Reply-To: <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> References: <244b5cfd-ae8f-3a84-b19c-e7ab5c5213a4@gmail.com> <1476464897.22194.2.camel@sipsolutions.net> <11f34d3d-2f76-b6a2-bd12-fc2e7bf6d49a@gmail.com> <788316a6-c7b6-46c4-fdff-3683078f101a@gmail.com> <08820cf0-679b-d789-a70a-f449b346f3fd@gmail.com> Message-ID: <27f8ecde-e3a3-db74-9b2f-333a85b6ba78@crans.org> Le 19/10/2016 ? 01:18, Allan Haldane a ?crit : > Based on feedback so far, I think "propagate_mask" sounds like the best > word to use. Let's go with that. > > As for whether it should default to "True" or "False", the arguments I > see are: > > * False, because that is the way most functions like `np.ma.sum` > already work, as well as matlab and octave's similar "nanconv". > > * True, because its effects are more visible and might lead to less > surprises. The "False" case seems like it is often not what the user > intended. Eg, it affects the overall normalization of normalized > kernels, and the choice of 0 seems arbitrary. > > If no one says anything, I'd probably go with True. Sounds good! Pierre From charlesr.harris at gmail.com Thu Oct 20 13:16:03 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Oct 2016 11:16:03 -0600 Subject: [Numpy-discussion] assert_allclose equal_nan default value. Message-ID: Hi All, Just a heads up that there is a PR changing the default value of `equal_nan` to `True` in the `assert_allclose` test function. The `equal_nan` argument was previously ineffective due to a bug that has recently been fixed. The current default value of `False` is not backward compatible and causes test failures in scipy. See the extended argument at https://github.com/numpy/numpy/pull/8184. I think this change is the right thing to do but want to make sure everyone is aware of it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan12343 at gmail.com Thu Oct 20 13:18:52 2016 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Thu, 20 Oct 2016 12:18:52 -0500 Subject: [Numpy-discussion] assert_allclose equal_nan default value. In-Reply-To: References: Message-ID: Agreed, especially given the prevalence of using this function in downstream test suites: https://github.com/search?utf8=%E2%9C%93&q=numpy+assert_allclose&type=Code&ref=searchresults On Thu, Oct 20, 2016 at 12:16 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > Just a heads up that there is a PR changing the default value of > `equal_nan` to `True` in the `assert_allclose` test function. The > `equal_nan` argument was previously ineffective due to a bug that has > recently been fixed. The current default value of `False` is not backward > compatible and causes test failures in scipy. See the extended argument at > https://github.com/numpy/numpy/pull/8184. I think this change is the > right thing to do but want to make sure everyone is aware of it. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Oct 20 13:21:40 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 20 Oct 2016 13:21:40 -0400 Subject: [Numpy-discussion] assert_allclose equal_nan default value. In-Reply-To: References: Message-ID: +1. I was almost always setting it to True anyway. On Thu, Oct 20, 2016 at 1:18 PM, Nathan Goldbaum wrote: > Agreed, especially given the prevalence of using this function in > downstream test suites: > > https://github.com/search?utf8=%E2%9C%93&q=numpy+assert_ > allclose&type=Code&ref=searchresults > > On Thu, Oct 20, 2016 at 12:16 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Just a heads up that there is a PR changing the default value of >> `equal_nan` to `True` in the `assert_allclose` test function. The >> `equal_nan` argument was previously ineffective due to a bug that has >> recently been fixed. The current default value of `False` is not backward >> compatible and causes test failures in scipy. See the extended argument at >> https://github.com/numpy/numpy/pull/8184. I think this change is the >> right thing to do but want to make sure everyone is aware of it. >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Oct 20 16:38:27 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 20 Oct 2016 16:38:27 -0400 Subject: [Numpy-discussion] assert_allclose equal_nan default value. In-Reply-To: References: Message-ID: Good, that means I can revert some changes to astropy, which made the tests less readable. 
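In concrete terms the change means that matching NaNs no longer trip the assertion by default; a quick sketch (assuming a NumPy in which the keyword is actually honoured, i.e. after the bug fix mentioned above):

    import numpy as np
    from numpy.testing import assert_allclose

    a = np.array([1.0, np.nan, 3.0])
    b = np.array([1.0, np.nan, 3.0])

    assert_allclose(a, b, equal_nan=True)     # passes: NaNs compare as equal

    try:
        assert_allclose(a, b, equal_nan=False)
    except AssertionError:
        print("fails: nan != nan when equal_nan is False")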
-- Marten From rays at blue-cove.com Thu Oct 20 18:25:49 2016 From: rays at blue-cove.com (R Schumacher) Date: Thu, 20 Oct 2016 15:25:49 -0700 Subject: [Numpy-discussion] invalid value treatment, in filter_design Message-ID: <201610202226.u9KMQ4IO008573@blue-cove.com> In an attempt to computationally invert the effect of an analog RC filter on a data set and reconstruct the "true" signal, a co-worker suggested: "Mathematically, you just reverse the a and b parameters. Then the zeros become the poles, but if the new poles are not inside the unit circle, the filter is not stable." So, to "stabilize" the poles' issue seen, I test for the DIV/0 error and set it to 2./N+0.j in scipy/signal/filter_design.py ~ line 244 d = polyval(a[::-1], zm1) if d[0]==0.0+0.j: d[0] = 2./N+0.j h = polyval(b[::-1], zm1) / d - Question is, is this a mathematically valid treatment? - Is there a better way to invert a Butterworth filter, or work with the DIV/0 that occurs without modifying the signal library? - Should I post to *-users instead? I noted d[0] > 2./N+0.j makes the zero bin result spike low; 2/N gives a reasonable "extension" of the response curve. This whole tweak causes a zero offset however, which I remove. An example attached... Ray Schumacher Programmer/Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- import numpy as np from scipy.signal import butter, lfilter, freqz import matplotlib.pyplot as plt def butter_highpass(cutoff, fs, order=5): nyq = 0.5 * fs normal_cutoff = cutoff / nyq b, a = butter(order, Wn=normal_cutoff, btype='highpass', analog=False) return b, a def butter_inv_highpass(cutoff, fs, order=5): nyq = 0.5 * fs normal_cutoff = cutoff / nyq b, a = butter(order, Wn=normal_cutoff, btype='highpass', analog=False) ## swap the components return a, b def butter_highpass_filter(data, cutoff, fs, order=5): b, a = butter_highpass(cutoff, fs, order=order) y = lfilter(b, a, data) return y def butter_inv_highpass_filter(data, cutoff, fs, order=5): b, a = butter_inv_highpass(cutoff, fs, order=1) offset = data.mean() y = lfilter(b, a, data) ## remove new offset y -= (y.mean() - offset) return y # Filter requirements. order = 1 fs = 1024.0 # sample rate, Hz cutoff = 11.6 # desired cutoff frequency of the filter, Hz nyquist = fs/2. # Get the filter coefficients so we can check its frequency response. b, a = butter_highpass(cutoff, fs, order) bi, ai = butter_inv_highpass(cutoff, fs, order) # Plot the frequency response. plt.subplot(2, 1, 1) w, h = freqz(b, a, worN=8000) plt.plot(0.5*fs*w/np.pi, np.abs(h), 'g', label='high pass resp') wi, hi = freqz(bi, ai, worN=8000) plt.plot(0.5*fs*wi/np.pi, np.abs(hi), 'r', label='inv. high pass resp') plt.plot(cutoff, 0.5*np.sqrt(2), 'ko') plt.axvline(cutoff, color='k') plt.xlim(0, 0.05*fs) plt.ylim(0, 5) plt.title("Lowpass Filter Frequency Response") plt.xlabel('Frequency [Hz]') # add the legend in the middle of the plot leg = plt.legend(fancybox=True) # set the alpha value of the legend: it will be translucent leg.get_frame().set_alpha(0.5) plt.subplots_adjust(hspace=0.35) plt.grid() # Demonstrate the use of the filter. # First make some data to be filtered. T = 5.0 # seconds n = int(T * fs) # total number of samples t = np.linspace(0, T, n, endpoint=False) # "Noisy" data. We want to recover the 1.2 Hz signal from this. data = np.sin(1.2*2*np.pi*t)# + 1.5*np.cos(9*2*np.pi*t) + 0.5*np.sin(12.0*2*np.pi*t) # Filter the data, and plot both the original and filtered signals. 
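# Round trip: apply the high-pass filter, then the "inverse" filter built by
# swapping (b, a) as discussed in the message above.  The original zeros
# become the inverse filter's poles, so the swapped filter is only stable if
# those zeros lie inside the unit circle; presumably this is also why freqz
# divides by zero at the zero-frequency bin and why a DC offset has to be
# removed inside butter_inv_highpass_filter.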
y = butter_highpass_filter(data, cutoff, fs, order) yi = butter_inv_highpass_filter(y, cutoff, fs, order) plt.subplot(2, 1, 2) plt.plot(t, data, 'b-', label='1.2Hz "real" data') plt.plot(t, y, 'g-', linewidth=2, label='blue box data') plt.plot(t, yi, 'r--', linewidth=2, label='round-trip data') plt.xlabel('Time [sec]') plt.grid() #plt.legend() # add the legend in the middle of the plot leg = plt.legend(fancybox=True) # set the alpha value of the legend: it will be translucent leg.get_frame().set_alpha(0.5) plt.subplots_adjust(hspace=0.35) plt.show() From charlesr.harris at gmail.com Thu Oct 20 22:58:06 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Oct 2016 20:58:06 -0600 Subject: [Numpy-discussion] fpower ufunc Message-ID: Hi All, I've put up a preliminary PR for the proposed fpower ufunc. Apart from adding more tests and documentation, I'd like to settle a few other things. The first is the name, two names have been proposed and we should settle on one - fpower (short) - float_power (obvious) The second thing is the minimum precision. In the preliminary version I have used float32, but perhaps it makes more sense for the intended use to make the minimum precision float64 instead. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Oct 20 23:11:13 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Oct 2016 20:11:13 -0700 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: References: Message-ID: On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris wrote: > Hi All, > > I've put up a preliminary PR for the proposed fpower ufunc. Apart from > adding more tests and documentation, I'd like to settle a few other things. > The first is the name, two names have been proposed and we should settle on > one > > fpower (short) > float_power (obvious) +0.6 for float_power > The second thing is the minimum precision. In the preliminary version I have > used float32, but perhaps it makes more sense for the intended use to make > the minimum precision float64 instead. Can you elaborate on what you're thinking? I guess this is because float32 has limited range compared to float64, so is more likely to see overflow? float32 still goes up to 10**38 which is < int64_max**2, FWIW. Or maybe there's some subtlety with the int->float casting here? -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Thu Oct 20 23:38:33 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Oct 2016 21:38:33 -0600 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: References: Message-ID: On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith wrote: > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > wrote: > > Hi All, > > > > I've put up a preliminary PR for the proposed fpower ufunc. Apart from > > adding more tests and documentation, I'd like to settle a few other > things. > > The first is the name, two names have been proposed and we should settle > on > > one > > > > fpower (short) > > float_power (obvious) > > +0.6 for float_power > > > The second thing is the minimum precision. In the preliminary version I > have > > used float32, but perhaps it makes more sense for the intended use to > make > > the minimum precision float64 instead. > > Can you elaborate on what you're thinking? I guess this is because > float32 has limited range compared to float64, so is more likely to > see overflow? float32 still goes up to 10**38 which is < int64_max**2, > FWIW. 
Or maybe there's some subtlety with the int->float casting here? > logical, (u)int8, (u)int16, and float16 get converted to float32, which is probably sufficient to avoid overflow and such. My thought was that float32 is something of a "specialized" type these days, while float64 is the standard floating point precision for everyday computation. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Oct 21 03:45:11 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Oct 2016 09:45:11 +0200 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: References: Message-ID: <1477035911.18447.1.camel@sipsolutions.net> On Do, 2016-10-20 at 21:38 -0600, Charles R Harris wrote: > > > On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith > wrote: > > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > > wrote: > > > Hi All, > > > > > > I've put up a preliminary PR for the proposed fpower ufunc. Apart > > from > > > adding more tests and documentation, I'd like to settle a few > > other things. > > > The first is the name, two names have been proposed and we should > > settle on > > > one > > > > > > fpower (short) > > > float_power (obvious) > > > > +0.6 for float_power > > > > > The second thing is the minimum precision. In the preliminary > > version I have > > > used float32, but perhaps it makes more sense for the intended > > use to make > > > the minimum precision float64 instead. > > > > Can you elaborate on what you're thinking? I guess this is because > > float32 has limited range compared to float64, so is more likely to > > see overflow? float32 still goes up to 10**38 which is < > > int64_max**2, > > FWIW. Or maybe there's some subtlety with the int->float casting > > here? > logical, (u)int8, (u)int16, and float16 get converted to float32, > which is probably sufficient to avoid overflow and such. My thought > was that float32 is something of a "specialized" type these days, > while float64 is the standard floating point precision for everyday > computation. > Isn't the behaviour we already have (e.g. such as mean). ints -> float64 inexacts do not get upcast? - Sebastian > Chuck? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Fri Oct 21 04:29:30 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Oct 2016 10:29:30 +0200 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: <1477035911.18447.1.camel@sipsolutions.net> References: <1477035911.18447.1.camel@sipsolutions.net> Message-ID: <1477038570.18447.3.camel@sipsolutions.net> On Fr, 2016-10-21 at 09:45 +0200, Sebastian Berg wrote: > On Do, 2016-10-20 at 21:38 -0600, Charles R Harris wrote: > > > > > > > > On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith > > wrote: > > > > > > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > > > wrote: > > > > > > > > Hi All, > > > > > > > > I've put up a preliminary PR for the proposed fpower ufunc. > > > > Apart > > > from > > > > > > > > adding more tests and documentation, I'd like to settle a few > > > other things. 
> > > > > > > > The first is the name, two names have been proposed and we > > > > should > > > settle on > > > > > > > > one > > > > > > > > fpower (short) > > > > float_power (obvious) > > > +0.6 for float_power > > > > > > > > > > > The second thing is the minimum precision. In the preliminary > > > version I have > > > > > > > > used float32, but perhaps it makes more sense for the intended > > > use to make > > > > > > > > the minimum precision float64 instead. > > > Can you elaborate on what you're thinking? I guess this is > > > because > > > float32 has limited range compared to float64, so is more likely > > > to > > > see overflow? float32 still goes up to 10**38 which is < > > > int64_max**2, > > > FWIW. Or maybe there's some subtlety with the int->float casting > > > here? > > logical, (u)int8, (u)int16, and float16 get converted to float32, > > which is probably sufficient to avoid overflow and such. My thought > > was that float32 is something of a "specialized" type these days, > > while float64 is the standard floating point precision for everyday > > computation. > > > > Isn't the behaviour we already have (e.g. such as mean). > > ints -> float64 > inexacts do not get upcast? > Ah, on the other hand, some/most of the float only ufuncs probably do it as you made it work? > - Sebastian > > > > > > Chuck? > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Fri Oct 21 12:26:23 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Oct 2016 10:26:23 -0600 Subject: [Numpy-discussion] fpower ufunc In-Reply-To: <1477035911.18447.1.camel@sipsolutions.net> References: <1477035911.18447.1.camel@sipsolutions.net> Message-ID: On Fri, Oct 21, 2016 at 1:45 AM, Sebastian Berg wrote: > On Do, 2016-10-20 at 21:38 -0600, Charles R Harris wrote: > > > > > > On Thu, Oct 20, 2016 at 9:11 PM, Nathaniel Smith > > wrote: > > > On Thu, Oct 20, 2016 at 7:58 PM, Charles R Harris > > > wrote: > > > > Hi All, > > > > > > > > I've put up a preliminary PR for the proposed fpower ufunc. Apart > > > from > > > > adding more tests and documentation, I'd like to settle a few > > > other things. > > > > The first is the name, two names have been proposed and we should > > > settle on > > > > one > > > > > > > > fpower (short) > > > > float_power (obvious) > > > > > > +0.6 for float_power > > > > > > > The second thing is the minimum precision. In the preliminary > > > version I have > > > > used float32, but perhaps it makes more sense for the intended > > > use to make > > > > the minimum precision float64 instead. > > > > > > Can you elaborate on what you're thinking? I guess this is because > > > float32 has limited range compared to float64, so is more likely to > > > see overflow? float32 still goes up to 10**38 which is < > > > int64_max**2, > > > FWIW. Or maybe there's some subtlety with the int->float casting > > > here? > > logical, (u)int8, (u)int16, and float16 get converted to float32, > > which is probably sufficient to avoid overflow and such. 
My thought > > was that float32 is something of a "specialized" type these days, > > while float64 is the standard floating point precision for everyday > > computation. > > > > > Isn't the behaviour we already have (e.g. such as mean). > > ints -> float64 > inexacts do not get upcast? > > Hmm... The best way to do that would be to put the function in `fromnumeric` and do it in python rather than as a ufunc, then for integer types call power with `dtype=float64`. I like that idea better than the current implementation, my mind was stuck in the ufunc universe. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Mon Oct 24 08:44:46 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Mon, 24 Oct 2016 08:44:46 -0400 Subject: [Numpy-discussion] padding options for diff Message-ID: I posted a pull request which adds optional padding kwargs "to_begin" and "to_end" to diff. Those options are based on what's available in ediff1d. It closes this issue -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 24 11:14:32 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 24 Oct 2016 08:14:32 -0700 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: This looks like a welcome addition in functionality! It will be nice to be able to finally (soft) deprecate ediff1d. On Mon, Oct 24, 2016 at 5:44 AM, Matthew Harrigan < harrigan.matthew at gmail.com> wrote: > I posted a pull request which > adds optional padding kwargs "to_begin" and "to_end" to diff. Those > options are based on what's available in ediff1d. It closes this issue > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 24 18:41:00 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Oct 2016 16:41:00 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again Message-ID: Hi All, I've been thinking about this some (a lot) more and have an alternate proposal for the behavior of the `**` operator - if both base and power are numpy/python scalar integers, convert to python integers and call the `**` operator. That would solve both the precision and compatibility problems and I think is the option of least surprise. For those who need type preservation and modular arithmetic, the np.power function remains, although the type conversions can be surpirising as it seems that the base and power should play different roles in determining the type, at least to me. - Array, 0-d or not, are treated differently from scalars and integers raised to negative integer powers always raise an error. I think this solves most problems and would not be difficult to implement. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
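To make the proposal concrete, a rough sketch of the intended semantics; the lines marked "proposed" describe the suggested behaviour, not what numpy currently does:

import numpy as np

a, b = np.int64(2), np.int64(100)

# Proposed: scalar ** scalar would defer to Python ints, so the result is
# exact rather than wrapping modulo 2**64:
int(a) ** int(b)        # 1267650600228229401496703205376

# np.power keeps the integer dtype and therefore does modular arithmetic;
# it stays available for users who rely on type preservation:
np.power(a, b)

# Proposed: integer arrays raised to negative integer powers keep raising:
# np.arange(1, 5) ** -1   -> error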
URL: From njs at pobox.com Mon Oct 24 19:30:43 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Oct 2016 16:30:43 -0700 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris wrote: > Hi All, > > I've been thinking about this some (a lot) more and have an alternate > proposal for the behavior of the `**` operator > > if both base and power are numpy/python scalar integers, convert to python > integers and call the `**` operator. That would solve both the precision and > compatibility problems and I think is the option of least surprise. For > those who need type preservation and modular arithmetic, the np.power > function remains, although the type conversions can be surpirising as it > seems that the base and power should play different roles in determining > the type, at least to me. > Array, 0-d or not, are treated differently from scalars and integers raised > to negative integer powers always raise an error. > > I think this solves most problems and would not be difficult to implement. > > Thoughts? My main concern about this is that it adds more special cases to numpy scalars, and a new behavioral deviation between 0d arrays and scalars, when ideally we should be trying to reduce the duplication/discrepancies between these. It's also inconsistent with how other operations on integer scalars work, e.g. regular addition overflows rather than promoting to Python int: In [8]: np.int64(2 ** 63 - 1) + 1 /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: overflow encountered in long_scalars #!/home/njs/.user-python3.5-64bit/bin/python3.5 Out[8]: -9223372036854775808 So I'm inclined to try and keep it simple, like in your previous proposal... theoretically of course it would be nice to have the perfect solution here, but at this point it feels like we might be overthinking this trying to get that last 1% of improvement. The thing where 2 ** -1 returns 0 is just broken and bites people so we should definitely fix it, but beyond that I'm not sure it really matters *that* much what we do, and "special cases aren't special enough to break the rules" and all that. -n -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Tue Oct 25 12:14:40 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 25 Oct 2016 09:14:40 -0700 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: I am also concerned about adding more special cases for NumPy scalars vs arrays. These cases are already confusing (e.g., making no distinction between 0d arrays and scalars) and poorly documented. On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: > On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris > wrote: > > Hi All, > > > > I've been thinking about this some (a lot) more and have an alternate > > proposal for the behavior of the `**` operator > > > > if both base and power are numpy/python scalar integers, convert to > python > > integers and call the `**` operator. That would solve both the precision > and > > compatibility problems and I think is the option of least surprise. For > > those who need type preservation and modular arithmetic, the np.power > > function remains, although the type conversions can be surpirising as it > > seems that the base and power should play different roles in determining > > the type, at least to me. 
> > Array, 0-d or not, are treated differently from scalars and integers > raised > > to negative integer powers always raise an error. > > > > I think this solves most problems and would not be difficult to > implement. > > > > Thoughts? > > My main concern about this is that it adds more special cases to numpy > scalars, and a new behavioral deviation between 0d arrays and scalars, > when ideally we should be trying to reduce the > duplication/discrepancies between these. It's also inconsistent with > how other operations on integer scalars work, e.g. regular addition > overflows rather than promoting to Python int: > > In [8]: np.int64(2 ** 63 - 1) + 1 > /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: > overflow encountered in long_scalars > #!/home/njs/.user-python3.5-64bit/bin/python3.5 > Out[8]: -9223372036854775808 > > So I'm inclined to try and keep it simple, like in your previous > proposal... theoretically of course it would be nice to have the > perfect solution here, but at this point it feels like we might be > overthinking this trying to get that last 1% of improvement. The thing > where 2 ** -1 returns 0 is just broken and bites people so we should > definitely fix it, but beyond that I'm not sure it really matters > *that* much what we do, and "special cases aren't special enough to > break the rules" and all that. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Tue Oct 25 13:26:39 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 25 Oct 2016 10:26:39 -0700 Subject: [Numpy-discussion] padding options for diff Message-ID: > Date: Mon, 24 Oct 2016 08:44:46 -0400 > From: Matthew Harrigan > > I posted a pull request which > adds optional padding kwargs "to_begin" and "to_end" to diff. Those > options are based on what's available in ediff1d. It closes this issue > I like the proposal, though I suspect that making it general has obscured that the most common use-case for padding is to make the inverse of np.cumsum (at least that?s what I frequently need), and now in the multidimensional case you have the somewhat unwieldy: >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) rather than >>> np.diff(a, axis=axis, keep_left=True) which of course could just be an option upon what you already have. Best, Peter From shoyer at gmail.com Tue Oct 25 15:38:16 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 25 Oct 2016 12:38:16 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling Message-ID: With a custom wrapper class, it's possible to preserve NumPy views when pickling: https://stackoverflow.com/questions/13746601/preserving-numpy-view-when-pickling This can result in significant time/space savings with pickling views along with base arrays and brings the behavior of NumPy more in line with Python proper. Is this something that we can/should port into NumPy itself? -------------- next part -------------- An HTML attachment was scrubbed... 
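The linked recipe is not reproduced here, but a rough sketch of one possible wrapper gives the idea (names and details are illustrative only, and it assumes the view's base is a contiguous array that owns its data): describe a view by its base plus a byte offset, shape and strides, rebuild it on load, and let pickle's memo keep the base shared between arrays pickled in the same call.

import pickle
import numpy as np

def _rebuild_view(base, offset, shape, strides, dtype):
    # Re-create a view onto the unpickled base buffer.
    return np.ndarray(shape, dtype=dtype, buffer=base,
                      offset=offset, strides=strides)

class PickleableView(object):
    def __init__(self, arr):
        self.arr = arr

    def __reduce__(self):
        a = self.arr
        if isinstance(a.base, np.ndarray):
            offset = (a.__array_interface__['data'][0]
                      - a.base.__array_interface__['data'][0])
            return (_rebuild_view,
                    (a.base, offset, a.shape, a.strides, a.dtype))
        # Not a view: fall back to normal array pickling.
        return (np.asarray, (a,))

base = np.zeros(1000)
v1, v2 = base[:10], base[10:20]
out_base, out_v1, out_v2 = pickle.loads(
    pickle.dumps((base, PickleableView(v1), PickleableView(v2))))

# Unpickling yields plain ndarrays (not wrappers), and because pickle
# serializes the shared base object only once, the rebuilt views still
# alias it:
out_v1[:] = 1.0
assert out_base[0] == 1.0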
URL: From njs at pobox.com Tue Oct 25 16:07:52 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Oct 2016 13:07:52 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 12:38 PM, Stephan Hoyer wrote: > With a custom wrapper class, it's possible to preserve NumPy views when > pickling: > https://stackoverflow.com/questions/13746601/preserving-numpy-view-when-pickling > > This can result in significant time/space savings with pickling views along > with base arrays and brings the behavior of NumPy more in line with Python > proper. Is this something that we can/should port into NumPy itself? Concretely, what do would you suggest should happen with: base = np.zeros(100000000) view = base[:10] # case 1 pickle.dump(view, file) # case 2 pickle.dump(base, file) pickle.dump(view, file) # case 3 pickle.dump(view, file) pickle.dump(base, file) ? -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Tue Oct 25 18:07:04 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 25 Oct 2016 15:07:04 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith wrote: > Concretely, what do would you suggest should happen with: > > base = np.zeros(100000000) > view = base[:10] > > # case 1 > pickle.dump(view, file) > > # case 2 > pickle.dump(base, file) > pickle.dump(view, file) > > # case 3 > pickle.dump(view, file) > pickle.dump(base, file) > > ? > I see what you're getting at here. We would need a rule for when to include the base in the pickle and when not to. Otherwise, pickle.dump(view, file) always contains data from the base pickle, even with view is much smaller than base. The safe answer is "only use views in the pickle when base is already being pickled", but that isn't possible to check unless all the arrays are together in a custom container. So, this isn't really feasible for NumPy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Oct 25 19:28:22 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 16:28:22 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 3:07 PM, Stephan Hoyer wrote: > > On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith wrote: >> >> Concretely, what do would you suggest should happen with: >> >> base = np.zeros(100000000) >> view = base[:10] >> >> # case 1 >> pickle.dump(view, file) >> >> # case 2 >> pickle.dump(base, file) >> pickle.dump(view, file) >> >> # case 3 >> pickle.dump(view, file) >> pickle.dump(base, file) >> >> ? > > I see what you're getting at here. We would need a rule for when to include the base in the pickle and when not to. Otherwise, pickle.dump(view, file) always contains data from the base pickle, even with view is much smaller than base. > > The safe answer is "only use views in the pickle when base is already being pickled", but that isn't possible to check unless all the arrays are together in a custom container. So, this isn't really feasible for NumPy. It would be possible with a custom Pickler/Unpickler since they already keep track of objects previously (un)pickled. That would handle [base, view] okay but not [view, base], so it's probably not going to be all that useful outside of special situations. It would make a neat recipe, but I probably would not provide it in numpy itself. 
-- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Tue Oct 25 20:09:09 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Tue, 25 Oct 2016 20:09:09 -0400 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: It seems pickle keeps track of references for basic python types. x = [1] y = [x] x,y = pickle.loads(pickle.dumps((x,y))) x.append(2) print(y) >>> [[1,2]] Numpy arrays are different but references are forgotten after pickle/unpickle. Shared objects do not remain shared. Based on the quote below it could be considered bug with numpy/pickle. Object sharing (references to the same object in different places): This is similar to self-referencing objects; pickle stores the object once, and ensures that all other references point to the master copy. Shared objects remain shared, which can be very important for mutable objects. link Another example with ndarrays: x = np.arange(5) y = x[::-1] x, y = pickle.loads(pickle.dumps((x, y))) x[0] = 9 print(y) >>> [4, 3, 2, 1, 0] In this case the two arrays share the exact same object for the data buffer (although object might not be the right word here) On Tue, Oct 25, 2016 at 7:28 PM, Robert Kern wrote: > On Tue, Oct 25, 2016 at 3:07 PM, Stephan Hoyer wrote: > > > > On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith wrote: > >> > >> Concretely, what do would you suggest should happen with: > >> > >> base = np.zeros(100000000) > >> view = base[:10] > >> > >> # case 1 > >> pickle.dump(view, file) > >> > >> # case 2 > >> pickle.dump(base, file) > >> pickle.dump(view, file) > >> > >> # case 3 > >> pickle.dump(view, file) > >> pickle.dump(base, file) > >> > >> ? > > > > I see what you're getting at here. We would need a rule for when to > include the base in the pickle and when not to. Otherwise, > pickle.dump(view, file) always contains data from the base pickle, even > with view is much smaller than base. > > > > The safe answer is "only use views in the pickle when base is already > being pickled", but that isn't possible to check unless all the arrays are > together in a custom container. So, this isn't really feasible for NumPy. > > It would be possible with a custom Pickler/Unpickler since they already > keep track of objects previously (un)pickled. That would handle [base, > view] okay but not [view, base], so it's probably not going to be all that > useful outside of special situations. It would make a neat recipe, but I > probably would not provide it in numpy itself. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Oct 25 20:29:54 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 17:29:54 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan < harrigan.matthew at gmail.com> wrote: > > It seems pickle keeps track of references for basic python types. > > x = [1] > y = [x] > x,y = pickle.loads(pickle.dumps((x,y))) > x.append(2) > print(y) > >>> [[1,2]] > > Numpy arrays are different but references are forgotten after pickle/unpickle. Shared objects do not remain shared. 
Based on the quote below it could be considered bug with numpy/pickle. Not a bug, but an explicit design decision on numpy's part. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainwoodman at gmail.com Tue Oct 25 22:05:39 2016 From: rainwoodman at gmail.com (Feng Yu) Date: Tue, 25 Oct 2016 19:05:39 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: Hi, Just another perspective. base' and 'data' in PyArrayObject are two separate variables. base can point to any PyObject, but it is `data` that defines where data is accessed in memory. 1. There is no clear way to pickle a pointer (`data`) in a meaningful way. In order for `data` member to make sense we still need to 'readout' the values stored at `data` pointer in the pickle. 2. By definition base is not necessary a numpy array but it is just some other object for managing the memory. 3. One can surely pickle the `base` object as a reference, but it is useless if the data memory has been reconstructed independently during unpickling. 4. Unless there is clear way to notify the referencing numpy array of the new data pointer. There probably isn't. BTW, is the stride information is lost during pickling, too? The behavior shall probably be documented if not yet. Yu On Tue, Oct 25, 2016 at 5:29 PM, Robert Kern wrote: > On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan > wrote: >> >> It seems pickle keeps track of references for basic python types. >> >> x = [1] >> y = [x] >> x,y = pickle.loads(pickle.dumps((x,y))) >> x.append(2) >> print(y) >> >>> [[1,2]] >> >> Numpy arrays are different but references are forgotten after >> pickle/unpickle. Shared objects do not remain shared. Based on the quote >> below it could be considered bug with numpy/pickle. > > Not a bug, but an explicit design decision on numpy's part. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Tue Oct 25 22:39:14 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 19:39:14 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 7:05 PM, Feng Yu wrote: > > Hi, > > Just another perspective. base' and 'data' in PyArrayObject are two > separate variables. > > base can point to any PyObject, but it is `data` that defines where > data is accessed in memory. > > 1. There is no clear way to pickle a pointer (`data`) in a meaningful > way. In order for `data` member to make sense we still need to > 'readout' the values stored at `data` pointer in the pickle. > > 2. By definition base is not necessary a numpy array but it is just > some other object for managing the memory. In general, yes, but most often it's another ndarray, and the child is related to the parent by a slice operation that could be computed by comparing the `data` tuples. The exercise here isn't to always represent the general case in this way, but to see what can be done opportunistically and if that actually helps solve a practical problem. > 3. One can surely pickle the `base` object as a reference, but it is > useless if the data memory has been reconstructed independently during > unpickling. > > 4. Unless there is clear way to notify the referencing numpy array of > the new data pointer. There probably isn't. 
> > BTW, is the stride information is lost during pickling, too? The > behavior shall probably be documented if not yet. The stride information may be lost, yes. We reserve the right to retain it, though (for example, if .T is contiguous then we might well serialize the transposed data linearly and return a view on that data upon deserialization). I don't believe that we guarantee that the unpickled result is contiguous. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 25 23:36:29 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Oct 2016 20:36:29 -0700 Subject: [Numpy-discussion] Preserving NumPy views when pickling In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan wrote: > It seems pickle keeps track of references for basic python types. > > x = [1] > y = [x] > x,y = pickle.loads(pickle.dumps((x,y))) > x.append(2) > print(y) >>>> [[1,2]] Yes, but the problem is: suppose I have a 10 gigabyte array, and then take a 20 byte slice of it, and then pickle that slice. Do you expect the pickle file to be 20 bytes, or 10 gigabytes? Both options are possible, but you have to pick one, and numpy picks 20 bytes. The advantage is obviously that you don't have mysterious 10 gigabyte pickle files; the disadvantage is that you can't reconstruct the view relationships afterwards. (You might think: oh, but we can be clever, and only record the view relationships if the user pickles both objects together. But while pickle might know whether the user is pickling both objects together, it unfortunately doesn't tell numpy, so we can't really do anything clever or different in this case.) -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Wed Oct 26 00:34:42 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Oct 2016 22:34:42 -0600 Subject: [Numpy-discussion] Intel random number package Message-ID: Hi All, There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Oct 26 00:41:32 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 21:41:32 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris wrote: > > Hi All, > > There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from > oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random This is what ng-numpy-randomstate is for. https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Wed Oct 26 01:22:54 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Oct 2016 23:22:54 -0600 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern wrote: > On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > > > > Hi All, > > > > There is a proposed random number package PR now up on github: > https://github.com/numpy/numpy/pull/8209. It is from > > oleksandr-pavlyk and implements the number random number package using > MKL for increased speed. I think we are definitely interested in the > improved speed, but I'm not sure numpy is the best place to put the > package. I'd welcome any comments on the PR itself, as well as any thoughts > on the best way organize or use of this work. Maybe scikit-random > > This is what ng-numpy-randomstate is for. > > https://github.com/bashtage/ng-numpy-randomstate > Interesting, despite old fashioned original ziggurat implementation of the normal and gnu c style... Does that project seek to preserve all the bytestreams or is it still in flux? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Oct 26 01:29:29 2016 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Oct 2016 22:29:29 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 10:22 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern wrote: >> >> On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> > >> > Hi All, >> > >> > There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from >> > oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random >> >> This is what ng-numpy-randomstate is for. >> >> https://github.com/bashtage/ng-numpy-randomstate > > Interesting, despite old fashioned original ziggurat implementation of the normal and gnu c style... Does that project seek to preserve all the bytestreams or is it still in flux? I would assume some flux for now, but you can ask the author by submitting a corrected ziggurat PR as a trial balloon. ;-) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Oct 26 03:33:17 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Oct 2016 09:33:17 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: Message-ID: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> On 26.10.2016 06:34, Charles R Harris wrote: > Hi All, > > There is a proposed random number package PR now up on github: > https://github.com/numpy/numpy/pull/8209. It is from > oleksandr-pavlyk and implements > the number random number package using MKL for increased speed. I think > we are definitely interested in the improved speed, but I'm not sure > numpy is the best place to put the package. 
I'd welcome any comments on > the PR itself, as well as any thoughts on the best way organize or use > of this work. Maybe scikit-random > I'm not a fan of putting code depending on a proprietary library into numpy. This should be a standalone package which may provide the same interface as numpy. From ralf.gommers at gmail.com Wed Oct 26 04:59:23 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 26 Oct 2016 21:59:23 +1300 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 26.10.2016 06:34, Charles R Harris wrote: > > Hi All, > > > > There is a proposed random number package PR now up on github: > > https://github.com/numpy/numpy/pull/8209. It is from > > oleksandr-pavlyk and implements > > the number random number package using MKL for increased speed. I think > > we are definitely interested in the improved speed, but I'm not sure > > numpy is the best place to put the package. I'd welcome any comments on > > the PR itself, as well as any thoughts on the best way organize or use > > of this work. Maybe scikit-random > Note that this thread is a continuation of https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html > > I'm not a fan of putting code depending on a proprietary library into > numpy. > This should be a standalone package which may provide the same interface > as numpy. > I don't really see a problem with that in principle. Numpy can use Intel MKL (and Accelerate) as well if it's available. It needs some thought put into the API though - a ``numpy.random_intel`` module is certainly not what we want. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Wed Oct 26 09:05:41 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Wed, 26 Oct 2016 09:05:41 -0400 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: The inverse of cumsum is actually a little more unweildy since you can't drop a dimension with take. This returns the original array (numerical caveats aside): np.cumsum(np.diff(x, to_begin=x.take([0], axis=axis), axis=axis), axis=axis) That's certainly not going to win any beauty contests. The 1d case is clean though: np.cumsum(np.diff(x, to_begin=x[0])) I'm not sure if this means the API should change, and if so how. Higher dimensional arrays seem to just have extra complexity. On Tue, Oct 25, 2016 at 1:26 PM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > > Date: Mon, 24 Oct 2016 08:44:46 -0400 > > From: Matthew Harrigan > > > > I posted a pull request which > > adds optional padding kwargs "to_begin" and "to_end" to diff. Those > > options are based on what's available in ediff1d. It closes this issue > > > > I like the proposal, though I suspect that making it general has > obscured that the most common use-case for padding is to make the > inverse of np.cumsum (at least that?s what I frequently need), and now > in the multidimensional case you have the somewhat unwieldy: > > >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) > > rather than > > >>> np.diff(a, axis=axis, keep_left=True) > > which of course could just be an option upon what you already have. 
> > Best, > Peter > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Oct 26 12:00:21 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Oct 2016 18:00:21 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> Message-ID: <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> On 10/26/2016 10:59 AM, Ralf Gommers wrote: > > > On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor > > > wrote: > > On 26.10.2016 06:34, Charles R Harris wrote: > > Hi All, > > > > There is a proposed random number package PR now up on github: > > https://github.com/numpy/numpy/pull/8209 > . It is from > > oleksandr-pavlyk > and implements > > the number random number package using MKL for increased speed. I think > > we are definitely interested in the improved speed, but I'm not sure > > numpy is the best place to put the package. I'd welcome any comments on > > the PR itself, as well as any thoughts on the best way organize or use > > of this work. Maybe scikit-random > > > Note that this thread is a continuation of > https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html > > > > I'm not a fan of putting code depending on a proprietary library > into numpy. > This should be a standalone package which may provide the same interface > as numpy. > > > I don't really see a problem with that in principle. Numpy can use Intel > MKL (and Accelerate) as well if it's available. It needs some thought > put into the API though - a ``numpy.random_intel`` module is certainly > not what we want. > For me there is a difference between being able to optionally use a proprietary library as an alternative to free software libraries if the user wishes to do so and offering functionality that only works with non-free software. We are providing a form of advertisement for them by allowing it (hey if you buy this black box that you cannot modify or use freely you get this neat numpy feature!). I prefer for the full functionality of numpy to stay available with a stack of community owned software, even if it may be less powerful that way. From jtaylor.debian at googlemail.com Wed Oct 26 12:10:36 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Oct 2016 18:10:36 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On 10/26/2016 06:00 PM, Julian Taylor wrote: > On 10/26/2016 10:59 AM, Ralf Gommers wrote: >> >> >> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >> > >> wrote: >> >> On 26.10.2016 06:34, Charles R Harris wrote: >> > Hi All, >> > >> > There is a proposed random number package PR now up on github: >> > https://github.com/numpy/numpy/pull/8209 >> . It is from >> > oleksandr-pavlyk > > and implements >> > the number random number package using MKL for increased speed. >> I think >> > we are definitely interested in the improved speed, but I'm not >> sure >> > numpy is the best place to put the package. I'd welcome any >> comments on >> > the PR itself, as well as any thoughts on the best way organize >> or use >> > of this work. 
Maybe scikit-random >> >> >> Note that this thread is a continuation of >> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html >> >> >> >> I'm not a fan of putting code depending on a proprietary library >> into numpy. >> This should be a standalone package which may provide the same >> interface >> as numpy. >> >> >> I don't really see a problem with that in principle. Numpy can use Intel >> MKL (and Accelerate) as well if it's available. It needs some thought >> put into the API though - a ``numpy.random_intel`` module is certainly >> not what we want. >> > > For me there is a difference between being able to optionally use a > proprietary library as an alternative to free software libraries if the > user wishes to do so and offering functionality that only works with > non-free software. > We are providing a form of advertisement for them by allowing it (hey if > you buy this black box that you cannot modify or use freely you get this > neat numpy feature!). > > I prefer for the full functionality of numpy to stay available with a > stack of community owned software, even if it may be less powerful that > way. But then if this is really just the same random numbers numpy already provides just faster, it is probably acceptable in principle. I haven't actually looked at the PR yet. From robert.kern at gmail.com Wed Oct 26 12:29:42 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Oct 2016 09:29:42 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful that >> way. > > But then if this is really just the same random numbers numpy already provides just faster, it is probably acceptable in principle. I haven't actually looked at the PR yet. I think the stream is different in some places, at least. And it's not a silent backend drop-in like np.linalg being built against an optimized BLAS, just a separate module that is inoperative without MKL. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Oct 26 12:36:35 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 26 Oct 2016 18:36:35 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: <1477499795.12923.1.camel@sipsolutions.net> On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote: > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor mail.com> wrote: > > > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > > >> I prefer for the full functionality of numpy to stay available > with a > >> stack of community owned software, even if it may be less powerful > that > >> way. > > > > But then if this is really just the same random numbers numpy > already provides just faster, it is probably acceptable in principle. > I haven't actually looked at the PR yet. > > I think the stream is different in some places, at least. 
And it's > not a silent backend drop-in like np.linalg being built against an > optimized BLAS, just a separate module that is inoperative without > MKL. > I might be swayed, but my gut feeling would be that a backend change (if the default stream changes, an explicit one, though maybe one could make a "fastest") would be the only reasonable way to provide such a thing in numpy itself. - Sebastian > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From robert.kern at gmail.com Wed Oct 26 12:53:29 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Oct 2016 09:53:29 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <1477499795.12923.1.camel@sipsolutions.net> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <1477499795.12923.1.camel@sipsolutions.net> Message-ID: On Wed, Oct 26, 2016 at 9:36 AM, Sebastian Berg wrote: > > On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote: > > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor > mail.com> wrote: > > > > > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > > > > >> I prefer for the full functionality of numpy to stay available > > with a > > >> stack of community owned software, even if it may be less powerful > > that > > >> way. > > > > > > But then if this is really just the same random numbers numpy > > already provides just faster, it is probably acceptable in principle. > > I haven't actually looked at the PR yet. > > > > I think the stream is different in some places, at least. And it's > > not a silent backend drop-in like np.linalg being built against an > > optimized BLAS, just a separate module that is inoperative without > > MKL. > > I might be swayed, but my gut feeling would be that a backend change > (if the default stream changes, an explicit one, though maybe one could > make a "fastest") would be the only reasonable way to provide such a > thing in numpy itself. That mostly argues for distributing it as a separate package, not part of numpy at all. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 13:27:50 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 13:27:50 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call Message-ID: Hi all, I posted a pull request: https://github.com/numpy/numpy/pull/8211 which adds a function `numpy.covcorr` that calculates both the covariance matrix and correlation coefficient with a single call to `numpy.cov` (which is often an expensive call for large data-sets). A function `numpy.covtocorr` has also been added that converts a covariance matrix to a correlation coefficent, and `numpy.corrcoef` has been modified to call this. The motivation here is that one often needs the covariance for subsequent analysis and the correlation coefficient for visualization, so instead of forcing the user to write their own code to convert one to the other, we want to allow both to be obtained from `numpy` as efficiently as possible. 
Best, Mathew -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Oct 26 13:46:48 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 26 Oct 2016 10:46:48 -0700 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: I wonder if the goals of this addition could be achieved by simply adding an optional `cov` argument to np.corr, which would provide a pre-computed covariance. Either way, `covcorr` feels like a helper function that could exist in user code rather than numpy proper. On Wed, Oct 26, 2016 at 10:27 AM, Mathew S. Madhavacheril < mathewsyriac at gmail.com> wrote: > Hi all, > > I posted a pull request: > https://github.com/numpy/numpy/pull/8211 > > which adds a function `numpy.covcorr` that calculates both > the covariance matrix and correlation coefficient with a single > call to `numpy.cov` (which is often an expensive call for large > data-sets). A function `numpy.covtocorr` has also been added > that converts a covariance matrix to a correlation coefficent, > and `numpy.corrcoef` has been modified to call this. The > motivation here is that one often needs the covariance for > subsequent analysis and the correlation coefficient for > visualization, so instead of forcing the user to write their own > code to convert one to the other, we want to allow both to > be obtained from `numpy` as efficiently as possible. > > Best, > Mathew > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 14:03:36 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 14:03:36 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: > I wonder if the goals of this addition could be achieved by simply adding > an optional `cov` argument > to np.corr, which would provide a pre-computed covariance. > That's a fair suggestion which I'm happy to switch to. This eliminates the need for two new functions. I'll add an optional `cov = False` argument to numpy.corrcoef that returns a tuple (corr, cov) instead. > > Either way, `covcorr` feels like a helper function that could exist in > user code rather than numpy proper. > The user would have to re-implement the part that converts the covariance matrix to a correlation coefficient. I made this PR to avoid that code duplication. Mathew > > On Wed, Oct 26, 2016 at 10:27 AM, Mathew S. Madhavacheril < > mathewsyriac at gmail.com> wrote: > >> Hi all, >> >> I posted a pull request: >> https://github.com/numpy/numpy/pull/8211 >> >> which adds a function `numpy.covcorr` that calculates both >> the covariance matrix and correlation coefficient with a single >> call to `numpy.cov` (which is often an expensive call for large >> data-sets). A function `numpy.covtocorr` has also been added >> that converts a covariance matrix to a correlation coefficent, >> and `numpy.corrcoef` has been modified to call this. 
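For reference, the conversion being discussed is essentially the following normalisation (a short sketch, not the PR's actual implementation):

import numpy as np

def cov_to_corr(cov):
    # Normalise the covariance by the outer product of the standard
    # deviations to obtain the correlation coefficient matrix.
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)
    # Clip tiny rounding excursions outside [-1, 1].
    return np.clip(corr, -1.0, 1.0)

x = np.random.randn(3, 100)
assert np.allclose(cov_to_corr(np.cov(x)), np.corrcoef(x))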
The >> motivation here is that one often needs the covariance for >> subsequent analysis and the correlation coefficient for >> visualization, so instead of forcing the user to write their own >> code to convert one to the other, we want to allow both to >> be obtained from `numpy` as efficiently as possible. >> >> Best, >> Mathew >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Oct 26 14:13:54 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 26 Oct 2016 11:13:54 -0700 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril < mathewsyriac at gmail.com> wrote: > On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: > >> I wonder if the goals of this addition could be achieved by simply adding >> an optional `cov` argument >> > to np.corr, which would provide a pre-computed covariance. >> > > That's a fair suggestion which I'm happy to switch to. This eliminates the > need for two new functions. > I'll add an optional `cov = False` argument to numpy.corrcoef that returns > a tuple (corr, cov) instead. > > >> >> Either way, `covcorr` feels like a helper function that could exist in >> user code rather than numpy proper. >> > > The user would have to re-implement the part that converts the covariance > matrix to a correlation > coefficient. I made this PR to avoid that code duplication. > With the API I was envisioning (or even your proposed API, for that matter), this function would only be a few lines, e.g., def covcorr(x): cov = np.cov(x) corr = np.corrcoef(x, cov=cov) return (cov, corr) Generally, functions this short should be provided as recipes (if at all) rather than be added to numpy proper, unless the need for them is extremely common. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 14:26:32 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 14:26:32 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 2:13 PM, Stephan Hoyer wrote: > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril < > mathewsyriac at gmail.com> wrote: > >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: >> >>> I wonder if the goals of this addition could be achieved by simply >>> adding an optional `cov` argument >>> >> to np.corr, which would provide a pre-computed covariance. >>> >> >> That's a fair suggestion which I'm happy to switch to. This eliminates >> the need for two new functions. >> I'll add an optional `cov = False` argument to numpy.corrcoef that >> returns a tuple (corr, cov) instead. >> >> >>> >>> Either way, `covcorr` feels like a helper function that could exist in >>> user code rather than numpy proper. >>> >> >> The user would have to re-implement the part that converts the covariance >> matrix to a correlation >> coefficient. I made this PR to avoid that code duplication. 
>> > > With the API I was envisioning (or even your proposed API, for that > matter), this function would only be a few lines, e.g., > > def covcorr(x): > cov = np.cov(x) > corr = np.corrcoef(x, cov=cov) > return (cov, corr) > > Generally, functions this short should be provided as recipes (if at all) > rather than be added to numpy proper, unless the need for them is extremely > common. > Ah, I see what you were suggesting now. I agree that a function like covcorr need not be provided by numpy itself, but it would be tremendously useful if a pre-computed covariance could be provided to np.corrcoef. I can update this PR to just add `cov = None` to numpy.corrcoef and do an `if cov is not None` before calculating the covariance. Note however that in the case that `cov` is specified for np.corrcoef, the non-optional `x` argument is redundant. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 26 14:56:41 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 11:56:41 -0700 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer wrote: > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril > wrote: >> >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer wrote: >>> >>> I wonder if the goals of this addition could be achieved by simply adding >>> an optional `cov` argument >>> >>> to np.corr, which would provide a pre-computed covariance. >> >> >> That's a fair suggestion which I'm happy to switch to. This eliminates the >> need for two new functions. >> I'll add an optional `cov = False` argument to numpy.corrcoef that returns >> a tuple (corr, cov) instead. >> >>> >>> >>> Either way, `covcorr` feels like a helper function that could exist in >>> user code rather than numpy proper. >> >> >> The user would have to re-implement the part that converts the covariance >> matrix to a correlation >> coefficient. I made this PR to avoid that code duplication. > > > With the API I was envisioning (or even your proposed API, for that matter), > this function would only be a few lines, e.g., > > def covcorr(x): > cov = np.cov(x) > corr = np.corrcoef(x, cov=cov) IIUC, if you have a covariance matrix then you can compute the correlation matrix directly, without looking at 'x', so corrcoef(x, cov=cov) is a bit odd-looking. I think probably the API that makes the most sense is just to expose something like the covtocorr function (maybe it could have a less telegraphic name?)? And then, yeah, users can use that to build their own covcorr or whatever if they want it. -n -- Nathaniel J. Smith -- https://vorpus.org From mathewsyriac at gmail.com Wed Oct 26 15:11:22 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 15:11:22 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 2:56 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer wrote: > > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. 
Madhavacheril > > wrote: > >> > >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer > wrote: > >>> > >>> I wonder if the goals of this addition could be achieved by simply > adding > >>> an optional `cov` argument > >>> > >>> to np.corr, which would provide a pre-computed covariance. > >> > >> > >> That's a fair suggestion which I'm happy to switch to. This eliminates > the > >> need for two new functions. > >> I'll add an optional `cov = False` argument to numpy.corrcoef that > returns > >> a tuple (corr, cov) instead. > >> > >>> > >>> > >>> Either way, `covcorr` feels like a helper function that could exist in > >>> user code rather than numpy proper. > >> > >> > >> The user would have to re-implement the part that converts the > covariance > >> matrix to a correlation > >> coefficient. I made this PR to avoid that code duplication. > > > > > > With the API I was envisioning (or even your proposed API, for that > matter), > > this function would only be a few lines, e.g., > > > > def covcorr(x): > > cov = np.cov(x) > > corr = np.corrcoef(x, cov=cov) > > IIUC, if you have a covariance matrix then you can compute the > correlation matrix directly, without looking at 'x', so corrcoef(x, > cov=cov) is a bit odd-looking. I think probably the API that makes the > most sense is just to expose something like the covtocorr function > (maybe it could have a less telegraphic name?)? And then, yeah, users > can use that to build their own covcorr or whatever if they want it. > Right, agreed, this is why I said `x` becomes redundant when `cov` is specified when calling `numpy.corrcoef`. So we have two alternatives: 1) Have `np.corrcoef` accept a boolean optional argument `covmat = False` that lets one obtain a tuple containing the covariance and the correlation matrices in the same call 2) Modify my original PR so that `np.covtocorr` remains (with possibly a better name) but remove `np.covcorr` since this is easy for the user to add. My preference is option 2. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 26 15:20:15 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 15:20:15 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:11 PM, Mathew S. Madhavacheril < mathewsyriac at gmail.com> wrote: > > > On Wed, Oct 26, 2016 at 2:56 PM, Nathaniel Smith wrote: > >> On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer wrote: >> > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril >> > wrote: >> >> >> >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer >> wrote: >> >>> >> >>> I wonder if the goals of this addition could be achieved by simply >> adding >> >>> an optional `cov` argument >> >>> >> >>> to np.corr, which would provide a pre-computed covariance. >> >> >> >> >> >> That's a fair suggestion which I'm happy to switch to. This eliminates >> the >> >> need for two new functions. >> >> I'll add an optional `cov = False` argument to numpy.corrcoef that >> returns >> >> a tuple (corr, cov) instead. >> >> >> >>> >> >>> >> >>> Either way, `covcorr` feels like a helper function that could exist in >> >>> user code rather than numpy proper. >> >> >> >> >> >> The user would have to re-implement the part that converts the >> covariance >> >> matrix to a correlation >> >> coefficient. I made this PR to avoid that code duplication. 
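For concreteness, the covariance-to-correlation conversion under discussion (covtocorr / cov2corr in the PR) amounts to a few lines of array arithmetic. A minimal sketch, with cov_to_corr used only as an illustrative name rather than the API numpy would necessarily expose:

import numpy as np

def cov_to_corr(cov):
    # normalize a covariance matrix into a correlation matrix:
    # corr[i, j] = cov[i, j] / (std[i] * std[j])
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # guard against tiny floating-point excursions outside [-1, 1]
    return np.clip(corr, -1.0, 1.0)

For data x of shape (nvar, nobs), cov_to_corr(np.cov(x)) should agree with np.corrcoef(x) to floating-point precision, which is exactly the duplication the PR is trying to let users avoid.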
>> > >> > >> > With the API I was envisioning (or even your proposed API, for that >> matter), >> > this function would only be a few lines, e.g., >> > >> > def covcorr(x): >> > cov = np.cov(x) >> > corr = np.corrcoef(x, cov=cov) >> >> IIUC, if you have a covariance matrix then you can compute the >> correlation matrix directly, without looking at 'x', so corrcoef(x, >> cov=cov) is a bit odd-looking. I think probably the API that makes the >> most sense is just to expose something like the covtocorr function >> (maybe it could have a less telegraphic name?)? And then, yeah, users >> can use that to build their own covcorr or whatever if they want it. >> > > Right, agreed, this is why I said `x` becomes redundant when `cov` is > specified > when calling `numpy.corrcoef`. So we have two alternatives: > > 1) Have `np.corrcoef` accept a boolean optional argument `covmat = False` > that lets > one obtain a tuple containing the covariance and the correlation matrices > in the same call > 2) Modify my original PR so that `np.covtocorr` remains (with possibly a > better > name) but remove `np.covcorr` since this is easy for the user to add. > > My preference is option 2. > cov2corr is a useful function http://www.statsmodels.org/dev/generated/statsmodels.stats.moment_helpers.cov2corr.html I also wrote the inverse function corr2cov, but AFAIR use it only in some test cases. I don't think adding any of the options to corrcoef or covcor is useful since there is no computational advantage to it. What I'm missing are functions that return the intermediate results, e.g. var and mean or cov and mean. (For statsmodels I decided to return mean and cov or mean and var in the related functions. Some R packages return the mean as an option.) Josef > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Oct 26 15:23:08 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Oct 2016 13:23:08 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer wrote: > I am also concerned about adding more special cases for NumPy scalars vs > arrays. These cases are already confusing (e.g., making no distinction > between 0d arrays and scalars) and poorly documented. > > On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: > >> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > I've been thinking about this some (a lot) more and have an alternate >> > proposal for the behavior of the `**` operator >> > >> > if both base and power are numpy/python scalar integers, convert to >> python >> > integers and call the `**` operator. That would solve both the >> precision and >> > compatibility problems and I think is the option of least surprise. For >> > those who need type preservation and modular arithmetic, the np.power >> > function remains, although the type conversions can be surpirising as it >> > seems that the base and power should play different roles in >> determining >> > the type, at least to me. >> > Array, 0-d or not, are treated differently from scalars and integers >> raised >> > to negative integer powers always raise an error. >> > >> > I think this solves most problems and would not be difficult to >> implement. 
>> > >> > Thoughts? >> >> My main concern about this is that it adds more special cases to numpy >> scalars, and a new behavioral deviation between 0d arrays and scalars, >> when ideally we should be trying to reduce the >> duplication/discrepancies between these. It's also inconsistent with >> how other operations on integer scalars work, e.g. regular addition >> overflows rather than promoting to Python int: >> >> In [8]: np.int64(2 ** 63 - 1) + 1 >> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >> overflow encountered in long_scalars >> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >> Out[8]: -9223372036854775808 >> >> So I'm inclined to try and keep it simple, like in your previous >> proposal... theoretically of course it would be nice to have the >> perfect solution here, but at this point it feels like we might be >> overthinking this trying to get that last 1% of improvement. The thing >> where 2 ** -1 returns 0 is just broken and bites people so we should >> definitely fix it, but beyond that I'm not sure it really matters >> *that* much what we do, and "special cases aren't special enough to >> break the rules" and all that. >> >> What I have been concerned about are the follow combinations that currently return floats num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: num: , exp: , res: The other combinations of signed and unsigned integers to signed powers currently raise ValueError due to the change to the power ufunc. The exceptions that aren't covered by uint64 + signed (which won't change) seem to occur when the exponent can be safely cast to the base type. I suspect that people have already come to depend on that, especially as python integers on 64 bit linux convert to int64. So in those cases we should perhaps raise a FutureWarning instead of an error. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 26 15:24:48 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 12:24:48 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote: > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: >>> >>> >>> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >>> > >>> wrote: >>> >>> On 26.10.2016 06:34, Charles R Harris wrote: >>> > Hi All, >>> > >>> > There is a proposed random number package PR now up on github: >>> > https://github.com/numpy/numpy/pull/8209 >>> . It is from >>> > oleksandr-pavlyk >> > and implements >>> > the number random number package using MKL for increased speed. >>> I think >>> > we are definitely interested in the improved speed, but I'm not >>> sure >>> > numpy is the best place to put the package. I'd welcome any >>> comments on >>> > the PR itself, as well as any thoughts on the best way organize >>> or use >>> > of this work. Maybe scikit-random >>> >>> >>> Note that this thread is a continuation of >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html >>> >>> >>> >>> I'm not a fan of putting code depending on a proprietary library >>> into numpy. 
>>> This should be a standalone package which may provide the same >>> interface >>> as numpy. >>> >>> >>> I don't really see a problem with that in principle. Numpy can use Intel >>> MKL (and Accelerate) as well if it's available. It needs some thought >>> put into the API though - a ``numpy.random_intel`` module is certainly >>> not what we want. >>> >> >> For me there is a difference between being able to optionally use a >> proprietary library as an alternative to free software libraries if the >> user wishes to do so and offering functionality that only works with >> non-free software. >> We are providing a form of advertisement for them by allowing it (hey if >> you buy this black box that you cannot modify or use freely you get this >> neat numpy feature!). >> >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful that >> way. > > But then if this is really just the same random numbers numpy already > provides just faster, it is probably acceptable in principle. I haven't > actually looked at the PR yet. The RNG stream is totally different, so yeah, it can't just be a silent drop-in replacement like BLAS/LAPACK. The patch also adds ~10,000 lines of code; here's an example of what some of it looks like: https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 I don't see how we can realistically commit to maintaining this. I'm also not really seeing how shipping it as part of numpy provides extra benefits to maintainers or users? AFAICT right now it's basically structured as a standalone library that's been dropped into the numpy source tree, and it would be just as easy to ship separately (or am I wrong?). And since the public API is that all the functionality comes from importing this specific new module ('numpy.random_intel'), it'd be a one-line change for users to import from a non-numpy namespace, like 'mkl.random' or whatever. If it were more integrated with the rest of numpy then the trade-offs would be more complicated, but in its present form this seems like an easy call. The other question is whether it could/should change to *become* more integrated... that's more tricky. There's been some work towards supporting swappable backends inside np.random; but the focus has mostly been on allowing new core generators, though, and this code seems to want to take over the whole thing (core generator + distributions), so even once the swappable backends stuff is working I'm not sure it would be relevant here. The one case I can think of that does seem promising is that if we get an API for users to say "I don't care about stream compatibility, just give me un-reproducible variates as fast as you can", then it might make sense for that to silently use MKL if available -- this would be pretty analogous to the use of MKL in np.linalg. But we don't have that API yet, I'm not sure how the MKL fallback could be maintainably implemented given that it would require somehow swapping the entire RandomState implementation, and it's entirely possible that once we figure out solutions to those then it'd still make sense for the actual MKL wrappers to live in a third-party library that numpy imports. -n -- Nathaniel J. 
Smith -- https://vorpus.org From p.e.creasey.00 at googlemail.com Wed Oct 26 15:35:50 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Wed, 26 Oct 2016 12:35:50 -0700 Subject: [Numpy-discussion] padding options for diff Message-ID: > Date: Wed, 26 Oct 2016 09:05:41 -0400 > From: Matthew Harrigan > > np.cumsum(np.diff(x, to_begin=x.take([0], axis=axis), axis=axis), axis=axis) > > That's certainly not going to win any beauty contests. The 1d case is > clean though: > > np.cumsum(np.diff(x, to_begin=x[0])) > > I'm not sure if this means the API should change, and if so how. Higher > dimensional arrays seem to just have extra complexity. > >> >> I like the proposal, though I suspect that making it general has >> obscured that the most common use-case for padding is to make the >> inverse of np.cumsum (at least that?s what I frequently need), and now >> in the multidimensional case you have the somewhat unwieldy: >> >> >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) >> >> rather than >> >> >>> np.diff(a, axis=axis, keep_left=True) >> >> which of course could just be an option upon what you already have. >> So my suggestion was intended that you might want an additional keyword argument (keep_left=False) to make the inverse np.cumsum use-case easier, i.e. you would have something in your np.diff like: if keep_left: if to_begin is None: to_begin = np.take(a, [0], axis=axis) else: raise ValueError(?np.diff(a, keep_left=False, to_begin=None) can be used with either keep_left or to_begin, but not both.?) Generally I try to avoid optional keyword argument overlap, but in this case it is probably justified. Peter From josef.pktd at gmail.com Wed Oct 26 15:39:16 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 15:39:16 -0400 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:23 PM, Charles R Harris wrote: > > > On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer wrote: > >> I am also concerned about adding more special cases for NumPy scalars vs >> arrays. These cases are already confusing (e.g., making no distinction >> between 0d arrays and scalars) and poorly documented. >> >> On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: >> >>> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >>> wrote: >>> > Hi All, >>> > >>> > I've been thinking about this some (a lot) more and have an alternate >>> > proposal for the behavior of the `**` operator >>> > >>> > if both base and power are numpy/python scalar integers, convert to >>> python >>> > integers and call the `**` operator. That would solve both the >>> precision and >>> > compatibility problems and I think is the option of least surprise. For >>> > those who need type preservation and modular arithmetic, the np.power >>> > function remains, although the type conversions can be surpirising as >>> it >>> > seems that the base and power should play different roles in >>> determining >>> > the type, at least to me. >>> > Array, 0-d or not, are treated differently from scalars and integers >>> raised >>> > to negative integer powers always raise an error. >>> > >>> > I think this solves most problems and would not be difficult to >>> implement. >>> > >>> > Thoughts? 
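To make the inverse-of-cumsum use-case behind Peter's keep_left suggestion concrete, here is a rough sketch using only existing numpy calls; diff_keep_left is an illustrative helper for this thread, not a proposed API:

import numpy as np

def diff_keep_left(a, axis=-1):
    # pad the differences with the leading slice so that cumsum inverts them
    first = np.take(a, [0], axis=axis)
    return np.concatenate([first, np.diff(a, axis=axis)], axis=axis)

a = np.arange(12.0).reshape(3, 4) ** 2
d = diff_keep_left(a, axis=1)
assert np.allclose(np.cumsum(d, axis=1), a)   # cumsum undoes the padded diff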
>>> >>> My main concern about this is that it adds more special cases to numpy >>> scalars, and a new behavioral deviation between 0d arrays and scalars, >>> when ideally we should be trying to reduce the >>> duplication/discrepancies between these. It's also inconsistent with >>> how other operations on integer scalars work, e.g. regular addition >>> overflows rather than promoting to Python int: >>> >>> In [8]: np.int64(2 ** 63 - 1) + 1 >>> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >>> overflow encountered in long_scalars >>> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >>> Out[8]: -9223372036854775808 >>> >>> So I'm inclined to try and keep it simple, like in your previous >>> proposal... theoretically of course it would be nice to have the >>> perfect solution here, but at this point it feels like we might be >>> overthinking this trying to get that last 1% of improvement. The thing >>> where 2 ** -1 returns 0 is just broken and bites people so we should >>> definitely fix it, but beyond that I'm not sure it really matters >>> *that* much what we do, and "special cases aren't special enough to >>> break the rules" and all that. >>> >>> > What I have been concerned about are the follow combinations that > currently return floats > > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > > The other combinations of signed and unsigned integers to signed powers > currently raise ValueError due to the change to the power ufunc. The > exceptions that aren't covered by uint64 + signed (which won't change) seem > to occur when the exponent can be safely cast to the base type. I suspect > that people have already come to depend on that, especially as python > integers on 64 bit linux convert to int64. So in those cases we should > perhaps raise a FutureWarning instead of an error. > >>> np.int64(2)**np.array(-1, np.int64) 0.5 >>> np.__version__ '1.10.4' >>> np.int64(2)**np.array([-1, 2], np.int64) array([0, 4], dtype=int64) >>> np.array(2, np.uint64)**np.array([-1, 2], np.int64) array([0, 4], dtype=int64) >>> np.array([2], np.uint64)**np.array([-1, 2], np.int64) array([ 0.5, 4. ]) >>> np.array([2], np.uint64).squeeze()**np.array([-1, 2], np.int64) array([0, 4], dtype=int64) (IMO: If you have to break backwards compatibility, break forwards not backwards.) Josef http://www.stanlaurelandoliverhardy.com/nicemess.htm > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 26 15:39:15 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 12:39:15 -0700 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 12:23 PM, Charles R Harris wrote: [...] 
> What I have been concerned about are the follow combinations that currently > return floats > > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float32'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> > num: , exp: , res: 'numpy.float64'> What's this referring to? For both arrays and scalars I get: In [8]: (np.array(2, dtype=np.int8) ** np.array(2, dtype=np.int8)).dtype Out[8]: dtype('int8') In [9]: (np.int8(2) ** np.int8(2)).dtype Out[9]: dtype('int8') -n -- Nathaniel J. Smith -- https://vorpus.org From warren.weckesser at gmail.com Wed Oct 26 15:41:21 2016 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 26 Oct 2016 15:41:21 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor > wrote: > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > >> > >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: > >>> > >>> > >>> > >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor > >>> > > >>> wrote: > >>> > >>> On 26.10.2016 06:34, Charles R Harris wrote: > >>> > Hi All, > >>> > > >>> > There is a proposed random number package PR now up on github: > >>> > https://github.com/numpy/numpy/pull/8209 > >>> . It is from > >>> > oleksandr-pavlyk >>> > and implements > >>> > the number random number package using MKL for increased speed. > >>> I think > >>> > we are definitely interested in the improved speed, but I'm not > >>> sure > >>> > numpy is the best place to put the package. I'd welcome any > >>> comments on > >>> > the PR itself, as well as any thoughts on the best way organize > >>> or use > >>> > of this work. Maybe scikit-random > >>> > >>> > >>> Note that this thread is a continuation of > >>> https://mail.scipy.org/pipermail/numpy-discussion/ > 2016-July/075822.html > >>> > >>> > >>> > >>> I'm not a fan of putting code depending on a proprietary library > >>> into numpy. > >>> This should be a standalone package which may provide the same > >>> interface > >>> as numpy. > >>> > >>> > >>> I don't really see a problem with that in principle. Numpy can use > Intel > >>> MKL (and Accelerate) as well if it's available. It needs some thought > >>> put into the API though - a ``numpy.random_intel`` module is certainly > >>> not what we want. > >>> > >> > >> For me there is a difference between being able to optionally use a > >> proprietary library as an alternative to free software libraries if the > >> user wishes to do so and offering functionality that only works with > >> non-free software. > >> We are providing a form of advertisement for them by allowing it (hey if > >> you buy this black box that you cannot modify or use freely you get this > >> neat numpy feature!). > >> > >> I prefer for the full functionality of numpy to stay available with a > >> stack of community owned software, even if it may be less powerful that > >> way. 
> > > > But then if this is really just the same random numbers numpy already > > provides just faster, it is probably acceptable in principle. I haven't > > actually looked at the PR yet. > > The RNG stream is totally different, so yeah, it can't just be a > silent drop-in replacement like BLAS/LAPACK. > > The patch also adds ~10,000 lines of code; here's an example of what > some of it looks like: > > https://github.com/oleksandr-pavlyk/numpy/blob/ > b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/ > mklrand/mkl_distributions.cpp#L1724-L1833 > > I don't see how we can realistically commit to maintaining this. > > FYI: numpy already maintains code exactly like that: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 Perhaps the point should be that the numpy devs won't want to maintain two nearly identical versions of that code. Warren > I'm also not really seeing how shipping it as part of numpy provides > extra benefits to maintainers or users? AFAICT right now it's > basically structured as a standalone library that's been dropped into > the numpy source tree, and it would be just as easy to ship separately > (or am I wrong?). And since the public API is that all the > functionality comes from importing this specific new module > ('numpy.random_intel'), it'd be a one-line change for users to import > from a non-numpy namespace, like 'mkl.random' or whatever. If it were > more integrated with the rest of numpy then the trade-offs would be > more complicated, but in its present form this seems like an easy > call. > > The other question is whether it could/should change to *become* more > integrated... that's more tricky. There's been some work towards > supporting swappable backends inside np.random; but the focus has > mostly been on allowing new core generators, though, and this code > seems to want to take over the whole thing (core generator + > distributions), so even once the swappable backends stuff is working > I'm not sure it would be relevant here. The one case I can think of > that does seem promising is that if we get an API for users to say "I > don't care about stream compatibility, just give me un-reproducible > variates as fast as you can", then it might make sense for that to > silently use MKL if available -- this would be pretty analogous to the > use of MKL in np.linalg. But we don't have that API yet, I'm not sure > how the MKL fallback could be maintainably implemented given that it > would require somehow swapping the entire RandomState implementation, > and it's entirely possible that once we figure out solutions to those > then it'd still make sense for the actual MKL wrappers to live in a > third-party library that numpy imports. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Wed Oct 26 15:47:43 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Oct 2016 12:47:43 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith wrote: >> The patch also adds ~10,000 lines of code; here's an example of what >> some of it looks like: >> >> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 >> >> I don't see how we can realistically commit to maintaining this. > > FYI: numpy already maintains code exactly like that: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 > > Perhaps the point should be that the numpy devs won't want to maintain two nearly identical versions of that code. Indeed. That's how the algorithm was published. The /* sigh ... */ is my own. ;-) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 26 15:49:36 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 15:49:36 -0400 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:39 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 12:23 PM, Charles R Harris > wrote: > [...] > > What I have been concerned about are the follow combinations that > currently > > return floats > > > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > What's this referring to? For both arrays and scalars I get: > > In [8]: (np.array(2, dtype=np.int8) ** np.array(2, dtype=np.int8)).dtype > Out[8]: dtype('int8') > > In [9]: (np.int8(2) ** np.int8(2)).dtype > Out[9]: dtype('int8') > >>> (np.array([2], dtype=np.int8) ** np.array(-1, dtype=np.int8).squeeze()).dtype dtype('int8') >>> (np.array([2], dtype=np.int8)[0] ** np.array(-1, dtype=np.int8).squeeze()).dtype dtype('float32') >>> (np.int8(2)**np.int8(-1)).dtype dtype('float32') >>> (np.int8(2)**np.int8(2)).dtype dtype('int8') The last one looks like value dependent scalar dtype Josef > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Wed Oct 26 15:49:50 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Oct 2016 12:49:50 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser wrote: > > > On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith wrote: >> >> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor >> wrote: >> > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> >> >> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: >> >>> >> >>> >> >>> >> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >> >>> > >> >>> wrote: >> >>> >> >>> On 26.10.2016 06:34, Charles R Harris wrote: >> >>> > Hi All, >> >>> > >> >>> > There is a proposed random number package PR now up on github: >> >>> > https://github.com/numpy/numpy/pull/8209 >> >>> . It is from >> >>> > oleksandr-pavlyk > >>> > and implements >> >>> > the number random number package using MKL for increased speed. >> >>> I think >> >>> > we are definitely interested in the improved speed, but I'm not >> >>> sure >> >>> > numpy is the best place to put the package. I'd welcome any >> >>> comments on >> >>> > the PR itself, as well as any thoughts on the best way organize >> >>> or use >> >>> > of this work. Maybe scikit-random >> >>> >> >>> >> >>> Note that this thread is a continuation of >> >>> >> >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html >> >>> >> >>> >> >>> >> >>> I'm not a fan of putting code depending on a proprietary library >> >>> into numpy. >> >>> This should be a standalone package which may provide the same >> >>> interface >> >>> as numpy. >> >>> >> >>> >> >>> I don't really see a problem with that in principle. Numpy can use >> >>> Intel >> >>> MKL (and Accelerate) as well if it's available. It needs some thought >> >>> put into the API though - a ``numpy.random_intel`` module is certainly >> >>> not what we want. >> >>> >> >> >> >> For me there is a difference between being able to optionally use a >> >> proprietary library as an alternative to free software libraries if the >> >> user wishes to do so and offering functionality that only works with >> >> non-free software. >> >> We are providing a form of advertisement for them by allowing it (hey >> >> if >> >> you buy this black box that you cannot modify or use freely you get >> >> this >> >> neat numpy feature!). >> >> >> >> I prefer for the full functionality of numpy to stay available with a >> >> stack of community owned software, even if it may be less powerful that >> >> way. >> > >> > But then if this is really just the same random numbers numpy already >> > provides just faster, it is probably acceptable in principle. I haven't >> > actually looked at the PR yet. >> >> The RNG stream is totally different, so yeah, it can't just be a >> silent drop-in replacement like BLAS/LAPACK. >> >> The patch also adds ~10,000 lines of code; here's an example of what >> some of it looks like: >> >> >> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 >> >> I don't see how we can realistically commit to maintaining this. 
>> > > > FYI: numpy already maintains code exactly like that: > https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 > > Perhaps the point should be that the numpy devs won't want to maintain two > nearly identical versions of that code. Heh, good catch! Okay, if random_intel is a massive copy-paste of random with modifications applied on top, then that's its own issue... on the one hand, yeah, we definitely don't want to carry around massive copy/paste code. OTOH, it suggests that it might be possible to refactor the code so that common parts are shared, and this would be a benefit to integrating random and random_intel more closely. (And this benefit would then have to be weighed against all the other considerations, like how much sharing there actually was, maintainability of the remaining random_intel-specific bits, the desire to keep numpy free-and-open, etc.) Hard to make that call just from skimming a 10,000 line patch, though... Oleksandr, or others at Intel: how much possibility do you think there is for sharing code between random and random_intel? -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Wed Oct 26 15:57:29 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Oct 2016 13:57:29 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 1:39 PM, wrote: > > > On Wed, Oct 26, 2016 at 3:23 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer wrote: >> >>> I am also concerned about adding more special cases for NumPy scalars vs >>> arrays. These cases are already confusing (e.g., making no distinction >>> between 0d arrays and scalars) and poorly documented. >>> >>> On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: >>> >>>> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >>>> wrote: >>>> > Hi All, >>>> > >>>> > I've been thinking about this some (a lot) more and have an alternate >>>> > proposal for the behavior of the `**` operator >>>> > >>>> > if both base and power are numpy/python scalar integers, convert to >>>> python >>>> > integers and call the `**` operator. That would solve both the >>>> precision and >>>> > compatibility problems and I think is the option of least surprise. >>>> For >>>> > those who need type preservation and modular arithmetic, the np.power >>>> > function remains, although the type conversions can be surpirising as >>>> it >>>> > seems that the base and power should play different roles in >>>> determining >>>> > the type, at least to me. >>>> > Array, 0-d or not, are treated differently from scalars and integers >>>> raised >>>> > to negative integer powers always raise an error. >>>> > >>>> > I think this solves most problems and would not be difficult to >>>> implement. >>>> > >>>> > Thoughts? >>>> >>>> My main concern about this is that it adds more special cases to numpy >>>> scalars, and a new behavioral deviation between 0d arrays and scalars, >>>> when ideally we should be trying to reduce the >>>> duplication/discrepancies between these. It's also inconsistent with >>>> how other operations on integer scalars work, e.g. 
regular addition >>>> overflows rather than promoting to Python int: >>>> >>>> In [8]: np.int64(2 ** 63 - 1) + 1 >>>> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >>>> overflow encountered in long_scalars >>>> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >>>> Out[8]: -9223372036854775808 >>>> >>>> So I'm inclined to try and keep it simple, like in your previous >>>> proposal... theoretically of course it would be nice to have the >>>> perfect solution here, but at this point it feels like we might be >>>> overthinking this trying to get that last 1% of improvement. The thing >>>> where 2 ** -1 returns 0 is just broken and bites people so we should >>>> definitely fix it, but beyond that I'm not sure it really matters >>>> *that* much what we do, and "special cases aren't special enough to >>>> break the rules" and all that. >>>> >>>> >> What I have been concerned about are the follow combinations that >> currently return floats >> >> num: , exp: , res: > 'numpy.float32'> >> num: , exp: , res: > 'numpy.float32'> >> num: , exp: , res: > 'numpy.float32'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> num: , exp: , res: > 'numpy.float64'> >> >> The other combinations of signed and unsigned integers to signed powers >> currently raise ValueError due to the change to the power ufunc. The >> exceptions that aren't covered by uint64 + signed (which won't change) seem >> to occur when the exponent can be safely cast to the base type. I suspect >> that people have already come to depend on that, especially as python >> integers on 64 bit linux convert to int64. So in those cases we should >> perhaps raise a FutureWarning instead of an error. >> > > > >>> np.int64(2)**np.array(-1, np.int64) > 0.5 > >>> np.__version__ > '1.10.4' > >>> np.int64(2)**np.array([-1, 2], np.int64) > array([0, 4], dtype=int64) > >>> np.array(2, np.uint64)**np.array([-1, 2], np.int64) > array([0, 4], dtype=int64) > >>> np.array([2], np.uint64)**np.array([-1, 2], np.int64) > array([ 0.5, 4. ]) > >>> np.array([2], np.uint64).squeeze()**np.array([-1, 2], np.int64) > array([0, 4], dtype=int64) > > > (IMO: If you have to break backwards compatibility, break forwards not > backwards.) > Current master is different. I'm not too worried in the array cases as the results for negative exponents were zero except then raising -1 to a power. Since that result is incorrect raising an error falls on the fine line between bug fix and compatibility break. If the pre-releases cause too much trouble. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Oct 26 15:58:20 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Oct 2016 13:58:20 -0600 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 1:39 PM, Nathaniel Smith wrote: > On Wed, Oct 26, 2016 at 12:23 PM, Charles R Harris > wrote: > [...] 
> > What I have been concerned about are the follow combinations that > currently > > return floats > > > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float32'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > num: , exp: , res: > 'numpy.float64'> > > What's this referring to? For both arrays and scalars I get: > > In [8]: (np.array(2, dtype=np.int8) ** np.array(2, dtype=np.int8)).dtype > Out[8]: dtype('int8') > > In [9]: (np.int8(2) ** np.int8(2)).dtype > Out[9]: dtype('int8') > > You need a negative exponent to see the effect. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathewsyriac at gmail.com Wed Oct 26 16:12:05 2016 From: mathewsyriac at gmail.com (Mathew S. Madhavacheril) Date: Wed, 26 Oct 2016 16:12:05 -0400 Subject: [Numpy-discussion] Combining covariance and correlation coefficient into one numpy.cov call In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:20 PM, wrote: > > > On Wed, Oct 26, 2016 at 3:11 PM, Mathew S. Madhavacheril < > mathewsyriac at gmail.com> wrote: > >> >> >> On Wed, Oct 26, 2016 at 2:56 PM, Nathaniel Smith wrote: >> >>> On Wed, Oct 26, 2016 at 11:13 AM, Stephan Hoyer >>> wrote: >>> > On Wed, Oct 26, 2016 at 11:03 AM, Mathew S. Madhavacheril >>> > wrote: >>> >> >>> >> On Wed, Oct 26, 2016 at 1:46 PM, Stephan Hoyer >>> wrote: >>> >>> >>> >>> I wonder if the goals of this addition could be achieved by simply >>> adding >>> >>> an optional `cov` argument >>> >>> >>> >>> to np.corr, which would provide a pre-computed covariance. >>> >> >>> >> >>> >> That's a fair suggestion which I'm happy to switch to. This >>> eliminates the >>> >> need for two new functions. >>> >> I'll add an optional `cov = False` argument to numpy.corrcoef that >>> returns >>> >> a tuple (corr, cov) instead. >>> >> >>> >>> >>> >>> >>> >>> Either way, `covcorr` feels like a helper function that could exist >>> in >>> >>> user code rather than numpy proper. >>> >> >>> >> >>> >> The user would have to re-implement the part that converts the >>> covariance >>> >> matrix to a correlation >>> >> coefficient. I made this PR to avoid that code duplication. >>> > >>> > >>> > With the API I was envisioning (or even your proposed API, for that >>> matter), >>> > this function would only be a few lines, e.g., >>> > >>> > def covcorr(x): >>> > cov = np.cov(x) >>> > corr = np.corrcoef(x, cov=cov) >>> >>> IIUC, if you have a covariance matrix then you can compute the >>> correlation matrix directly, without looking at 'x', so corrcoef(x, >>> cov=cov) is a bit odd-looking. I think probably the API that makes the >>> most sense is just to expose something like the covtocorr function >>> (maybe it could have a less telegraphic name?)? And then, yeah, users >>> can use that to build their own covcorr or whatever if they want it. >>> >> >> Right, agreed, this is why I said `x` becomes redundant when `cov` is >> specified >> when calling `numpy.corrcoef`. 
So we have two alternatives: >> >> 1) Have `np.corrcoef` accept a boolean optional argument `covmat = False` >> that lets >> one obtain a tuple containing the covariance and the correlation matrices >> in the same call >> 2) Modify my original PR so that `np.covtocorr` remains (with possibly a >> better >> name) but remove `np.covcorr` since this is easy for the user to add. >> >> My preference is option 2. >> > > cov2corr is a useful function > http://www.statsmodels.org/dev/generated/statsmodels.stats. > moment_helpers.cov2corr.html > I also wrote the inverse function corr2cov, but AFAIR use it only in some > test cases. > > > I don't think adding any of the options to corrcoef or covcor is useful > since there is no computational advantage to it. > I'm not sure I agree with that statement. If a user wants to calculate both a covariance and correlation matrix, they currently have two options: A) Call np.cov and np.corrcoef separately, which takes at least twice as long as one call to np.cov. For data-sets that I am used to, a np.cov call takes 5-10 seconds. B) Call np.cov and then separately implement their own correlation matrix code, which means the user isn't able to fully take advantage of code that is already in numpy. In any case, I've updated the PR: https://github.com/numpy/numpy/pull/8211 Relative to my original PR, it: a) removes the numpy.covcorr function which the user can easily implement b) have numpy.cov2corr be the function exposed in the API (previously called numpy.covtocorr in the PR), which accepts a pre-calculated covariance matrix c) have numpy.corrcoef call numpy.cov2corr > > > >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From harrigan.matthew at gmail.com Wed Oct 26 16:18:05 2016 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Wed, 26 Oct 2016 16:18:05 -0400 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: Would it be preferable to have to_begin='first' as an option under the existing kwarg to avoid overlapping? On Wed, Oct 26, 2016 at 3:35 PM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > > Date: Wed, 26 Oct 2016 09:05:41 -0400 > > From: Matthew Harrigan > > > > np.cumsum(np.diff(x, to_begin=x.take([0], axis=axis), axis=axis), > axis=axis) > > > > That's certainly not going to win any beauty contests. The 1d case is > > clean though: > > > > np.cumsum(np.diff(x, to_begin=x[0])) > > > > I'm not sure if this means the API should change, and if so how. Higher > > dimensional arrays seem to just have extra complexity. > > > >> > >> I like the proposal, though I suspect that making it general has > >> obscured that the most common use-case for padding is to make the > >> inverse of np.cumsum (at least that?s what I frequently need), and now > >> in the multidimensional case you have the somewhat unwieldy: > >> > >> >>> np.diff(a, axis=axis, to_begin=np.take(a, 0, axis=axis)) > >> > >> rather than > >> > >> >>> np.diff(a, axis=axis, keep_left=True) > >> > >> which of course could just be an option upon what you already have. 
> >> > > So my suggestion was intended that you might want an additional > keyword argument (keep_left=False) to make the inverse np.cumsum > use-case easier, i.e. you would have something in your np.diff like: > > if keep_left: > if to_begin is None: > to_begin = np.take(a, [0], axis=axis) > else: > raise ValueError(?np.diff(a, keep_left=False, to_begin=None) > can be used with either keep_left or to_begin, but not both.?) > > Generally I try to avoid optional keyword argument overlap, but in > this case it is probably justified. > > Peter > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 26 16:20:55 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Oct 2016 16:20:55 -0400 Subject: [Numpy-discussion] Numpy integers to integer powers again again In-Reply-To: References: Message-ID: On Wed, Oct 26, 2016 at 3:57 PM, Charles R Harris wrote: > > > On Wed, Oct 26, 2016 at 1:39 PM, wrote: > >> >> >> On Wed, Oct 26, 2016 at 3:23 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Tue, Oct 25, 2016 at 10:14 AM, Stephan Hoyer >>> wrote: >>> >>>> I am also concerned about adding more special cases for NumPy scalars >>>> vs arrays. These cases are already confusing (e.g., making no distinction >>>> between 0d arrays and scalars) and poorly documented. >>>> >>>> On Mon, Oct 24, 2016 at 4:30 PM, Nathaniel Smith wrote: >>>> >>>>> On Mon, Oct 24, 2016 at 3:41 PM, Charles R Harris >>>>> wrote: >>>>> > Hi All, >>>>> > >>>>> > I've been thinking about this some (a lot) more and have an alternate >>>>> > proposal for the behavior of the `**` operator >>>>> > >>>>> > if both base and power are numpy/python scalar integers, convert to >>>>> python >>>>> > integers and call the `**` operator. That would solve both the >>>>> precision and >>>>> > compatibility problems and I think is the option of least surprise. >>>>> For >>>>> > those who need type preservation and modular arithmetic, the np.power >>>>> > function remains, although the type conversions can be surpirising >>>>> as it >>>>> > seems that the base and power should play different roles in >>>>> determining >>>>> > the type, at least to me. >>>>> > Array, 0-d or not, are treated differently from scalars and integers >>>>> raised >>>>> > to negative integer powers always raise an error. >>>>> > >>>>> > I think this solves most problems and would not be difficult to >>>>> implement. >>>>> > >>>>> > Thoughts? >>>>> >>>>> My main concern about this is that it adds more special cases to numpy >>>>> scalars, and a new behavioral deviation between 0d arrays and scalars, >>>>> when ideally we should be trying to reduce the >>>>> duplication/discrepancies between these. It's also inconsistent with >>>>> how other operations on integer scalars work, e.g. regular addition >>>>> overflows rather than promoting to Python int: >>>>> >>>>> In [8]: np.int64(2 ** 63 - 1) + 1 >>>>> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: >>>>> overflow encountered in long_scalars >>>>> #!/home/njs/.user-python3.5-64bit/bin/python3.5 >>>>> Out[8]: -9223372036854775808 >>>>> >>>>> So I'm inclined to try and keep it simple, like in your previous >>>>> proposal... 
theoretically of course it would be nice to have the >>>>> perfect solution here, but at this point it feels like we might be >>>>> overthinking this trying to get that last 1% of improvement. The thing >>>>> where 2 ** -1 returns 0 is just broken and bites people so we should >>>>> definitely fix it, but beyond that I'm not sure it really matters >>>>> *that* much what we do, and "special cases aren't special enough to >>>>> break the rules" and all that. >>>>> >>>>> >>> What I have been concerned about are the follow combinations that >>> currently return floats >>> >>> num: , exp: , res: >> 'numpy.float32'> >>> num: , exp: , res: >> 'numpy.float32'> >>> num: , exp: , res: >> 'numpy.float32'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> num: , exp: , res: >> 'numpy.float64'> >>> >>> The other combinations of signed and unsigned integers to signed powers >>> currently raise ValueError due to the change to the power ufunc. The >>> exceptions that aren't covered by uint64 + signed (which won't change) seem >>> to occur when the exponent can be safely cast to the base type. I suspect >>> that people have already come to depend on that, especially as python >>> integers on 64 bit linux convert to int64. So in those cases we should >>> perhaps raise a FutureWarning instead of an error. >>> >> >> >> >>> np.int64(2)**np.array(-1, np.int64) >> 0.5 >> >>> np.__version__ >> '1.10.4' >> >>> np.int64(2)**np.array([-1, 2], np.int64) >> array([0, 4], dtype=int64) >> >>> np.array(2, np.uint64)**np.array([-1, 2], np.int64) >> array([0, 4], dtype=int64) >> >>> np.array([2], np.uint64)**np.array([-1, 2], np.int64) >> array([ 0.5, 4. ]) >> >>> np.array([2], np.uint64).squeeze()**np.array([-1, 2], np.int64) >> array([0, 4], dtype=int64) >> >> >> (IMO: If you have to break backwards compatibility, break forwards not >> backwards.) >> > > Current master is different. I'm not too worried in the array cases as the > results for negative exponents were zero except then raising -1 to a power. > Since that result is incorrect raising an error falls on the fine line > between bug fix and compatibility break. If the pre-releases cause too much > trouble. > naive question: if cleaning up the inconsistencies already (kind of) breaks backwards compatibility and didn't result in a big outcry, why can we not go with a Future warning all the way to float. (i.e. use the power function with specified dtype instead of ** if you insist on int return) Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
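For reference, a small sketch of the explicit-dtype workaround suggested above: np.power is a ufunc, so it already accepts a dtype argument, letting users state the result type they want instead of relying on what `**` infers:

    import numpy as np

    np.power(2, 3, dtype=np.int64)     # 8    -- integer result on request
    np.power(2, -2, dtype=np.float64)  # 0.25 -- float result for negative exponents
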
URL: From oleksandr.pavlyk at intel.com Wed Oct 26 16:30:51 2016 From: oleksandr.pavlyk at intel.com (Pavlyk, Oleksandr) Date: Wed, 26 Oct 2016 20:30:51 +0000 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> Message-ID: <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> Hi, Thanks a lot everybody for the feedback. The package can certainly be made a stand-alone drop-in replacement for np.random. There are many points raised and unraised in favor of this, and it is easy to accomplish. I will create a stand-alone package on github, but would still appreciate some help in reviewing it and making it available at PyPI. Interestingly, Nathaniel's link to a representative changes, specifically https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 point at an unused code borrowed directly from mtrand/distributions.c: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L297 More representative change would be the implementation of Student's T-distribution: https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L232-L262 The module under review, similarly to randomstate package, provides alternative basic pseudo-random number generators (BRNGs), like MT2203, MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with randomstate implementing some generators absent in MKL and vice-versa. Thinking about the possibility of providing the functionality of this module within the framework of randomstate, I find that randomstate implements samplers from statistical distributions as functions that take the state of the underlying BRNG, and produce a single variate, e.g.: https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26 This design stands in a way of efficient use of MKL, which generates a whole vector of variates at a time. This can be done faster than sampling a variate at a time by using vectorized instructions. So I wrote mkl_distributions.cpp to provide functions that return a given size vector of sampled variates from each supported distribution. mklrand.pyx was then written by modifying mtrand.pyx to work with such vector generators. In particular, this allowed for efficient sampling from product distributions of Poisson distributions with different rate parameters, which is implemented in MKL: https://software.intel.com/en-us/node/521894 https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1071 Another point already raised by Nathaniel is that for numpy's randomness ideally should provide a way to override default algorithm for sampling from a particular distribution. For example RandomState object that implements PCG may rely on default acceptance-rejection algorithm for sampling from Gamma, while the RandomState object that provides interface to MKL might want to call into MKL directly. While at this topic, I also would like to point out the need for C-API interface to randomness, particularly felt writing parallel algorithms, where Python's GIL and use of Lock() in RandomState hurt scalability. 
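To make the vector-at-a-time argument concrete, here is a small timing sketch using only the existing numpy.random (nothing MKL-specific is assumed); filling a whole array in one call is typically one to two orders of magnitude faster than drawing the same variates one at a time:

    import numpy as np
    import timeit

    rs = np.random.RandomState(1234)

    # one call that fills a 100000-element vector, repeated 10 times
    t_vec = timeit.timeit(lambda: rs.standard_normal(100000), number=10)
    # the same number of variates drawn one at a time
    t_one = timeit.timeit(lambda: [rs.standard_normal() for _ in range(100000)],
                          number=10)
    print(t_vec, t_one)
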
Oleksandr -----Original Message----- From: NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Nathaniel Smith Sent: Wednesday, October 26, 2016 2:25 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Intel random number package On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor wrote: > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote: >>> >>> >>> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor >>> >> > >>> wrote: >>> >>> On 26.10.2016 06:34, Charles R Harris wrote: >>> > Hi All, >>> > >>> > There is a proposed random number package PR now up on github: >>> > https://github.com/numpy/numpy/pull/8209 >>> . It is from >>> > oleksandr-pavlyk >> > and implements >>> > the number random number package using MKL for increased speed. >>> I think >>> > we are definitely interested in the improved speed, but I'm >>> not sure >>> > numpy is the best place to put the package. I'd welcome any >>> comments on >>> > the PR itself, as well as any thoughts on the best way >>> organize or use >>> > of this work. Maybe scikit-random >>> >>> >>> Note that this thread is a continuation of >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.h >>> tml >>> >>> >>> >>> I'm not a fan of putting code depending on a proprietary library >>> into numpy. >>> This should be a standalone package which may provide the same >>> interface >>> as numpy. >>> >>> >>> I don't really see a problem with that in principle. Numpy can use >>> Intel MKL (and Accelerate) as well if it's available. It needs some >>> thought put into the API though - a ``numpy.random_intel`` module is >>> certainly not what we want. >>> >> >> For me there is a difference between being able to optionally use a >> proprietary library as an alternative to free software libraries if >> the user wishes to do so and offering functionality that only works >> with non-free software. >> We are providing a form of advertisement for them by allowing it (hey >> if you buy this black box that you cannot modify or use freely you >> get this neat numpy feature!). >> >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful >> that way. > > But then if this is really just the same random numbers numpy already > provides just faster, it is probably acceptable in principle. I > haven't actually looked at the PR yet. The RNG stream is totally different, so yeah, it can't just be a silent drop-in replacement like BLAS/LAPACK. The patch also adds ~10,000 lines of code; here's an example of what some of it looks like: https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 I don't see how we can realistically commit to maintaining this. I'm also not really seeing how shipping it as part of numpy provides extra benefits to maintainers or users? AFAICT right now it's basically structured as a standalone library that's been dropped into the numpy source tree, and it would be just as easy to ship separately (or am I wrong?). And since the public API is that all the functionality comes from importing this specific new module ('numpy.random_intel'), it'd be a one-line change for users to import from a non-numpy namespace, like 'mkl.random' or whatever. If it were more integrated with the rest of numpy then the trade-offs would be more complicated, but in its present form this seems like an easy call. 
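Concretely, the one-line change mentioned above could look like the following sketch; the module name mkl_random is purely hypothetical here, standing in for whatever name a standalone distribution of the code would use:

    try:
        import mkl_random as rnd      # hypothetical standalone package
    except ImportError:
        import numpy.random as rnd    # fall back to stock numpy

    rs = rnd.RandomState(1234)
    x = rs.standard_normal(1000)
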
The other question is whether it could/should change to *become* more integrated... that's more tricky. There's been some work towards supporting swappable backends inside np.random; but the focus has mostly been on allowing new core generators, though, and this code seems to want to take over the whole thing (core generator + distributions), so even once the swappable backends stuff is working I'm not sure it would be relevant here. The one case I can think of that does seem promising is that if we get an API for users to say "I don't care about stream compatibility, just give me un-reproducible variates as fast as you can", then it might make sense for that to silently use MKL if available -- this would be pretty analogous to the use of MKL in np.linalg. But we don't have that API yet, I'm not sure how the MKL fallback could be maintainably implemented given that it would require somehow swapping the entire RandomState implementation, and it's entirely possible that once we figure out solutions to those then it'd still make sense for the actual MKL wrappers to live in a third-party library that numpy imports. -n -- Nathaniel J. Smith -- https://vorpus.org _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From p.e.creasey.00 at googlemail.com Wed Oct 26 16:31:18 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Wed, 26 Oct 2016 13:31:18 -0700 Subject: [Numpy-discussion] padding options for diff Message-ID: > Date: Wed, 26 Oct 2016 16:18:05 -0400 > From: Matthew Harrigan > > Would it be preferable to have to_begin='first' as an option under the > existing kwarg to avoid overlapping? > >> if keep_left: >> if to_begin is None: >> to_begin = np.take(a, [0], axis=axis) >> else: >> raise ValueError(?np.diff(a, keep_left=False, to_begin=None) >> can be used with either keep_left or to_begin, but not both.?) >> >> Generally I try to avoid optional keyword argument overlap, but in >> this case it is probably justified. >> It works for me. I can't *think* of a case where you could have a np.diff on a string array and 'first' could be confused with an element, since you're not allowed diff on strings in the present numpy anyway (unless wiser heads than me know something!). Feel free to move the conversation to github btw. Peter From toddrjen at gmail.com Wed Oct 26 17:03:37 2016 From: toddrjen at gmail.com (Todd) Date: Wed, 26 Oct 2016 17:03:37 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> Message-ID: On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > > The module under review, similarly to randomstate package, provides > alternative basic pseudo-random number generators (BRNGs), like MT2203, > MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with > randomstate implementing some generators absent in MKL and vice-versa. > > Is there a reason that randomstate shouldn't implement those generators? 
> Thinking about the possibility of providing the functionality of this > module within the framework of randomstate, I find that randomstate > implements samplers from statistical distributions as functions that take > the state of the underlying BRNG, and produce a single variate, e.g.: > > https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/ > distributions.c#L23-L26 > > This design stands in a way of efficient use of MKL, which generates a > whole vector of variates at a time. This can be done faster than sampling a > variate at a time by using vectorized instructions. So I wrote > mkl_distributions.cpp to provide functions that return a given size vector > of sampled variates from each supported distribution. > I don't know a huge amount about pseudo-random number generators, but this seems superficially to be something that would benefit random number generation as a whole independently of whether MKL is used. Might it be possible to modify the numpy implementation to support this sort of vectorized approach? Another point already raised by Nathaniel is that for numpy's randomness > ideally should provide a way to override default algorithm for sampling > from a particular distribution. For example RandomState object that > implements PCG may rely on default acceptance-rejection algorithm for > sampling from Gamma, while the RandomState object that provides interface > to MKL might want to call into MKL directly. > The approach that pyfftw uses at least for scipy, which may also work here, is that you can monkey-patch the scipy.fftpack module at runtime, replacing it with pyfftw's drop-in replacement. scipy then proceeds to use pyfftw instead of its built-in fftpack implementation. Might such an approach work here? Users can either use this alternative randomstate replacement directly, or they can replace numpy's with it at runtime and numpy will then proceed to use the alternative. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oleksandr.pavlyk at intel.com Wed Oct 26 17:25:40 2016 From: oleksandr.pavlyk at intel.com (Pavlyk, Oleksandr) Date: Wed, 26 Oct 2016 21:25:40 +0000 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> Message-ID: <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Please see responses inline. From: NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Todd Sent: Wednesday, October 26, 2016 4:04 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Intel random number package On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > wrote: The module under review, similarly to randomstate package, provides alternative basic pseudo-random number generators (BRNGs), like MT2203, MCG31, MRG32K3A, Wichmann-Hill. The scope of support differ, with randomstate implementing some generators absent in MKL and vice-versa. Is there a reason that randomstate shouldn't implement those generators? No, randomstate certainly can implement all the BRNGs implemented in MKL. It is at developer?s discretion. 
Thinking about the possibility of providing the functionality of this module within the framework of randomstate, I find that randomstate implements samplers from statistical distributions as functions that take the state of the underlying BRNG, and produce a single variate, e.g.: https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26 This design stands in a way of efficient use of MKL, which generates a whole vector of variates at a time. This can be done faster than sampling a variate at a time by using vectorized instructions. So I wrote mkl_distributions.cpp to provide functions that return a given size vector of sampled variates from each supported distribution. I don't know a huge amount about pseudo-random number generators, but this seems superficially to be something that would benefit random number generation as a whole independently of whether MKL is used. Might it be possible to modify the numpy implementation to support this sort of vectorized approach? I also think that adopting vectorized mindset would benefit np.random. For example, Gaussians are currently generated using Box-Muller algorithm which produces two variate at a time, so one currently needs to be saved in the random state struct itself, along with an indicator that it should be used on the next iteration. With vectorized approach one could populate the vector two elements at a time with better memory locality, resulting in better performance. Vectorized approach has merits with or without use of MKL. Another point already raised by Nathaniel is that for numpy's randomness ideally should provide a way to override default algorithm for sampling from a particular distribution. For example RandomState object that implements PCG may rely on default acceptance-rejection algorithm for sampling from Gamma, while the RandomState object that provides interface to MKL might want to call into MKL directly. The approach that pyfftw uses at least for scipy, which may also work here, is that you can monkey-patch the scipy.fftpack module at runtime, replacing it with pyfftw's drop-in replacement. scipy then proceeds to use pyfftw instead of its built-in fftpack implementation. Might such an approach work here? Users can either use this alternative randomstate replacement directly, or they can replace numpy's with it at runtime and numpy will then proceed to use the alternative. I think the monkey-patching approach will work. RandomState was written with a view to replace numpy.random at some point in the future. It is standalone at the moment, from what I understand, only because it is still being worked on and extended. One particularly important development is the ability to sample continuous distributions in floats, or to populate a given preallocated buffer with random samples. These features are missing from numpy.random_intel and we thought it providing them. As I have said earlier, another missing feature in the C-API for randomness in numpy. Oleksandr -------------- next part -------------- An HTML attachment was scrubbed... 
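For what it's worth, the monkey-patching approach referred to above amounts to something like the sketch below; alt_random is a hypothetical drop-in module exposing the numpy.random API, and the patch only redirects lookups made through the numpy package attribute after it is applied:

    import numpy as np
    import alt_random          # hypothetical drop-in replacement for numpy.random

    np.random = alt_random                 # rebind the attribute on the package
    x = np.random.standard_normal(1000)    # now served by the replacement

    # Note: code that did `from numpy.random import standard_normal` before the
    # patch, or C extensions using numpy's internals, is not redirected.
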
URL: From ralf.gommers at gmail.com Thu Oct 27 04:25:33 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 27 Oct 2016 21:25:33 +1300 Subject: [Numpy-discussion] Intel random number package In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > Please see responses inline. > > > > *From:* NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] *On > Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > *Subject:* Re: [Numpy-discussion] Intel random number package > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr < > oleksandr.pavlyk at intel.com> wrote: > > Another point already raised by Nathaniel is that for numpy's randomness > ideally should provide a way to override default algorithm for sampling > from a particular distribution. For example RandomState object that > implements PCG may rely on default acceptance-rejection algorithm for > sampling from Gamma, while the RandomState object that provides interface > to MKL might want to call into MKL directly. > > > > The approach that pyfftw uses at least for scipy, which may also work > here, is that you can monkey-patch the scipy.fftpack module at runtime, > replacing it with pyfftw's drop-in replacement. scipy then proceeds to use > pyfftw instead of its built-in fftpack implementation. Might such an > approach work here? Users can either use this alternative randomstate > replacement directly, or they can replace numpy's with it at runtime and > numpy will then proceed to use the alternative. > The only reason that pyfftw uses monkeypatching is that the better approach is not possible due to license constraints with FFTW (it's GPL). > I think the monkey-patching approach will work. > It will work, for a while at least, but it's bad design. We're all on the same page I think that a separate submodule for random_intel is a no go, but as an explicitly switchable backend for functions with the same signature it would be fine imho. Of course we don't have that backend infrastructure today, but it's something we want and have been discussing anyway. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vikramsingh001 at gmail.com Thu Oct 27 06:30:59 2016 From: vikramsingh001 at gmail.com (Vikram Singh) Date: Thu, 27 Oct 2016 13:30:59 +0300 Subject: [Numpy-discussion] Problem with compiling openacc with f2py Message-ID: I am a newbie to f2py so I have been creating simple test cases. Eventually I want to be able to use openacc subroutine from python. 
So here's the test case:

module test

  use iso_c_binding, only: sp => C_FLOAT, dp => C_DOUBLE, i8 => C_INT
  use omp_lib
  use openacc

  implicit none

contains

  subroutine add_acc (a, b, n, c)
    integer(kind=i8), intent(in) :: n
    real(kind=dp), intent(in) :: a(n)
    real(kind=dp), intent(in) :: b(n)
    real(kind=dp), intent(out) :: c(n)
    integer(kind=i8) :: i

    !$acc kernels
    do i = 1, n
       c(i) = a(i) + b(i)
    end do
    !$acc end kernels
  end subroutine add_acc

  subroutine add_omp (a, b, n, c)
    integer(kind=i8), intent(in) :: n
    real(kind=dp), intent(in) :: a(n)
    real(kind=dp), intent(in) :: b(n)
    real(kind=dp), intent(out) :: c(n)
    integer(kind=i8) :: i, j

    !$omp parallel do
    do i = 1, n
       c(i) = a(i) + b(i)
    end do
    !$omp end parallel do
  end subroutine add_omp

  subroutine nt (c)
    integer(kind=i8), intent(out) :: c
    c = omp_get_max_threads()
  end subroutine nt

  subroutine mult (a, b, c)
    real(kind=dp), intent(in) :: a
    real(kind=dp), intent(in) :: b
    real(kind=dp), intent(out) :: c
    c = a * b
  end subroutine mult

end module test

I compile using:

f2py -c -m --f90flags='-fopenacc -foffload=nvptx-none -foffload=-O3 -O3 -fPIC' hello hello.f90 -L/usr/local/cuda/lib64 -lcublas -lcudart -lgomp

Now, until I add the acc directives everything works fine. But as soon as I add the acc directives I get this error. 
Vikram From toddrjen at gmail.com Thu Oct 27 10:30:36 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 10:30:36 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr < > oleksandr.pavlyk at intel.com> wrote: > >> Please see responses inline. >> >> >> >> *From:* NumPy-Discussion [mailto:numpy-discussion-bounces at scipy.org] *On >> Behalf Of *Todd >> *Sent:* Wednesday, October 26, 2016 4:04 PM >> *To:* Discussion of Numerical Python >> *Subject:* Re: [Numpy-discussion] Intel random number package >> >> >> >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr < >> oleksandr.pavlyk at intel.com> wrote: >> >> Another point already raised by Nathaniel is that for numpy's randomness >> ideally should provide a way to override default algorithm for sampling >> from a particular distribution. For example RandomState object that >> implements PCG may rely on default acceptance-rejection algorithm for >> sampling from Gamma, while the RandomState object that provides interface >> to MKL might want to call into MKL directly. >> >> >> >> The approach that pyfftw uses at least for scipy, which may also work >> here, is that you can monkey-patch the scipy.fftpack module at runtime, >> replacing it with pyfftw's drop-in replacement. scipy then proceeds to use >> pyfftw instead of its built-in fftpack implementation. Might such an >> approach work here? Users can either use this alternative randomstate >> replacement directly, or they can replace numpy's with it at runtime and >> numpy will then proceed to use the alternative. >> > > The only reason that pyfftw uses monkeypatching is that the better > approach is not possible due to license constraints with FFTW (it's GPL). > Yes, that is exactly why I brought it up. Better approaches are also not possible with MKL due to license constraints. It is a very similar situation overall. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 27 10:43:40 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 27 Oct 2016 16:43:40 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On 10/27/2016 04:30 PM, Todd wrote: > On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr > > wrote: > > Please see responses inline. > > > > *From:*NumPy-Discussion > [mailto:numpy-discussion-bounces at scipy.org > ] *On Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > > *Subject:* Re: [Numpy-discussion] Intel random number package > > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > > > wrote: > > Another point already raised by Nathaniel is that for > numpy's randomness ideally should provide a way to override > default algorithm for sampling from a particular > distribution. 
For example RandomState object that > implements PCG may rely on default acceptance-rejection > algorithm for sampling from Gamma, while the RandomState > object that provides interface to MKL might want to call > into MKL directly. > > > > The approach that pyfftw uses at least for scipy, which may also > work here, is that you can monkey-patch the scipy.fftpack module > at runtime, replacing it with pyfftw's drop-in replacement. > scipy then proceeds to use pyfftw instead of its built-in > fftpack implementation. Might such an approach work here? > Users can either use this alternative randomstate replacement > directly, or they can replace numpy's with it at runtime and > numpy will then proceed to use the alternative. > > > The only reason that pyfftw uses monkeypatching is that the better > approach is not possible due to license constraints with FFTW (it's > GPL). > > > Yes, that is exactly why I brought it up. Better approaches are also > not possible with MKL due to license constraints. It is a very similar > situation overall. > Its not that similar, the better approach is certainly possible with FFTW, the GPL is compatible with numpys license. It is only a concern users of binary distributions. Nobody provided the code to use fftw yet, but it would certainly be accepted. From toddrjen at gmail.com Thu Oct 27 10:52:48 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 10:52:48 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 10/27/2016 04:30 PM, Todd wrote: > >> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > > wrote: >> >> >> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr >> > >> wrote: >> >> Please see responses inline. >> >> >> >> *From:*NumPy-Discussion >> [mailto:numpy-discussion-bounces at scipy.org >> ] *On Behalf Of *Todd >> *Sent:* Wednesday, October 26, 2016 4:04 PM >> *To:* Discussion of Numerical Python > > >> *Subject:* Re: [Numpy-discussion] Intel random number package >> >> >> >> >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr >> > >> wrote: >> >> Another point already raised by Nathaniel is that for >> numpy's randomness ideally should provide a way to override >> default algorithm for sampling from a particular >> distribution. For example RandomState object that >> implements PCG may rely on default acceptance-rejection >> algorithm for sampling from Gamma, while the RandomState >> object that provides interface to MKL might want to call >> into MKL directly. >> >> >> >> The approach that pyfftw uses at least for scipy, which may also >> work here, is that you can monkey-patch the scipy.fftpack module >> at runtime, replacing it with pyfftw's drop-in replacement. >> scipy then proceeds to use pyfftw instead of its built-in >> fftpack implementation. Might such an approach work here? >> Users can either use this alternative randomstate replacement >> directly, or they can replace numpy's with it at runtime and >> numpy will then proceed to use the alternative. >> >> >> The only reason that pyfftw uses monkeypatching is that the better >> approach is not possible due to license constraints with FFTW (it's >> GPL). 
>> >> >> Yes, that is exactly why I brought it up. Better approaches are also >> not possible with MKL due to license constraints. It is a very similar >> situation overall. >> >> > Its not that similar, the better approach is certainly possible with FFTW, > the GPL is compatible with numpys license. It is only a concern users of > binary distributions. Nobody provided the code to use fftw yet, but it > would certainly be accepted. Although it is technically compatible, it would make numpy effectively GPL. Suggestions for this have been explicitly rejected on these grounds [1] [1] https://github.com/numpy/numpy/issues/3485 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 27 11:14:30 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 27 Oct 2016 17:14:30 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On 10/27/2016 04:52 PM, Todd wrote: > On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor > > > wrote: > > On 10/27/2016 04:30 PM, Todd wrote: > > On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > > >> > wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr > > >> wrote: > > Please see responses inline. > > > > *From:*NumPy-Discussion > [mailto:numpy-discussion-bounces at scipy.org > > >] *On Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > > >> > *Subject:* Re: [Numpy-discussion] Intel random number > package > > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > > >> > wrote: > > Another point already raised by Nathaniel is that for > numpy's randomness ideally should provide a way to > override > default algorithm for sampling from a particular > distribution. For example RandomState object that > implements PCG may rely on default acceptance-rejection > algorithm for sampling from Gamma, while the RandomState > object that provides interface to MKL might want to call > into MKL directly. > > > > The approach that pyfftw uses at least for scipy, which > may also > work here, is that you can monkey-patch the > scipy.fftpack module > at runtime, replacing it with pyfftw's drop-in replacement. > scipy then proceeds to use pyfftw instead of its built-in > fftpack implementation. Might such an approach work here? > Users can either use this alternative randomstate > replacement > directly, or they can replace numpy's with it at runtime and > numpy will then proceed to use the alternative. > > > The only reason that pyfftw uses monkeypatching is that the > better > approach is not possible due to license constraints with > FFTW (it's > GPL). > > > Yes, that is exactly why I brought it up. Better approaches are > also > not possible with MKL due to license constraints. It is a very > similar > situation overall. > > > Its not that similar, the better approach is certainly possible with > FFTW, the GPL is compatible with numpys license. It is only a > concern users of binary distributions. Nobody provided the code to > use fftw yet, but it would certainly be accepted. > > > Although it is technically compatible, it would make numpy effectively > GPL. 
Suggestions for this have been explicitly rejected on these > grounds [1] > > [1] https://github.com/numpy/numpy/issues/3485 > Yes it would make numpy GPL, but that is not a concern for a lot of users. Users for who it is a problem can still use the non-GPL version. A more interesting debate is whether our binary wheels should then be GPL wheels by default or not. Probably not, but that is something that should be discussed when its an actual issue. But to clarify what I said, it would be accepted if the value it provides is sufficient compared to the code maintenance it adds. Given that pyfftw already exists the value is probably relatively small, but personally I'd still be interested in code that allows switching the fft backend as that could also allow plugging e.g. gpu based implementations (though again this is already covered by other third party modules). From robbmcleod at gmail.com Thu Oct 27 11:42:36 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Thu, 27 Oct 2016 17:42:36 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: Releasing NumPy under GPL would make it incompatible with SciPy, which may be _slightly_ inconvenient to the scientific Python community: https://scipy.github.io/old-wiki/pages/License_Compatibility.html https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html Robert On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 10/27/2016 04:52 PM, Todd wrote: > >> On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor >> > >> wrote: >> >> On 10/27/2016 04:30 PM, Todd wrote: >> >> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers >> >> >> >> wrote: >> >> >> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr >> > >> > >> wrote: >> >> Please see responses inline. >> >> >> >> *From:*NumPy-Discussion >> [mailto:numpy-discussion-bounces at scipy.org >> >> > >] *On Behalf Of *Todd >> *Sent:* Wednesday, October 26, 2016 4:04 PM >> *To:* Discussion of Numerical Python >> >> > >> >> *Subject:* Re: [Numpy-discussion] Intel random number >> package >> >> >> >> >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr >> > >> > >> >> >> wrote: >> >> Another point already raised by Nathaniel is that for >> numpy's randomness ideally should provide a way to >> override >> default algorithm for sampling from a particular >> distribution. For example RandomState object that >> implements PCG may rely on default >> acceptance-rejection >> algorithm for sampling from Gamma, while the >> RandomState >> object that provides interface to MKL might want to >> call >> into MKL directly. >> >> >> >> The approach that pyfftw uses at least for scipy, which >> may also >> work here, is that you can monkey-patch the >> scipy.fftpack module >> at runtime, replacing it with pyfftw's drop-in >> replacement. >> scipy then proceeds to use pyfftw instead of its built-in >> fftpack implementation. Might such an approach work here? >> Users can either use this alternative randomstate >> replacement >> directly, or they can replace numpy's with it at runtime >> and >> numpy will then proceed to use the alternative. 
>> >> >> The only reason that pyfftw uses monkeypatching is that the >> better >> approach is not possible due to license constraints with >> FFTW (it's >> GPL). >> >> >> Yes, that is exactly why I brought it up. Better approaches are >> also >> not possible with MKL due to license constraints. It is a very >> similar >> situation overall. >> >> >> Its not that similar, the better approach is certainly possible with >> FFTW, the GPL is compatible with numpys license. It is only a >> concern users of binary distributions. Nobody provided the code to >> use fftw yet, but it would certainly be accepted. >> >> >> Although it is technically compatible, it would make numpy effectively >> GPL. Suggestions for this have been explicitly rejected on these >> grounds [1] >> >> [1] https://github.com/numpy/numpy/issues/3485 >> >> > Yes it would make numpy GPL, but that is not a concern for a lot of users. > Users for who it is a problem can still use the non-GPL version. > A more interesting debate is whether our binary wheels should then be GPL > wheels by default or not. Probably not, but that is something that should > be discussed when its an actual issue. > > But to clarify what I said, it would be accepted if the value it provides > is sufficient compared to the code maintenance it adds. Given that pyfftw > already exists the value is probably relatively small, but personally I'd > still be interested in code that allows switching the fft backend as that > could also allow plugging e.g. gpu based implementations (though again this > is already covered by other third party modules). > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Thu Oct 27 11:57:17 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 11:57:17 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: It would still be compatible with SciPy, it would "just" mean that SciPy (and anything else that uses numpy) would be effectively GPL. On Thu, Oct 27, 2016 at 11:42 AM, Robert McLeod wrote: > Releasing NumPy under GPL would make it incompatible with SciPy, which may > be _slightly_ inconvenient to the scientific Python community: > > https://scipy.github.io/old-wiki/pages/License_Compatibility.html > > https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html > > Robert > > On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 10/27/2016 04:52 PM, Todd wrote: >> >>> On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor >>> > >>> wrote: >>> >>> On 10/27/2016 04:30 PM, Todd wrote: >>> >>> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers >>> >>> >> >>> wrote: >>> >>> >>> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr >>> >> >>> >> >> wrote: >>> >>> Please see responses inline. 
>>> >>> >>> >>> *From:*NumPy-Discussion >>> [mailto:numpy-discussion-bounces at scipy.org >>> >>> >> >] *On Behalf Of >>> *Todd >>> *Sent:* Wednesday, October 26, 2016 4:04 PM >>> *To:* Discussion of Numerical Python >>> >>> >> >> >>> *Subject:* Re: [Numpy-discussion] Intel random number >>> package >>> >>> >>> >>> >>> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr >>> >> >>> >> >>> >> >>> wrote: >>> >>> Another point already raised by Nathaniel is that for >>> numpy's randomness ideally should provide a way to >>> override >>> default algorithm for sampling from a particular >>> distribution. For example RandomState object that >>> implements PCG may rely on default >>> acceptance-rejection >>> algorithm for sampling from Gamma, while the >>> RandomState >>> object that provides interface to MKL might want to >>> call >>> into MKL directly. >>> >>> >>> >>> The approach that pyfftw uses at least for scipy, which >>> may also >>> work here, is that you can monkey-patch the >>> scipy.fftpack module >>> at runtime, replacing it with pyfftw's drop-in >>> replacement. >>> scipy then proceeds to use pyfftw instead of its built-in >>> fftpack implementation. Might such an approach work >>> here? >>> Users can either use this alternative randomstate >>> replacement >>> directly, or they can replace numpy's with it at runtime >>> and >>> numpy will then proceed to use the alternative. >>> >>> >>> The only reason that pyfftw uses monkeypatching is that the >>> better >>> approach is not possible due to license constraints with >>> FFTW (it's >>> GPL). >>> >>> >>> Yes, that is exactly why I brought it up. Better approaches are >>> also >>> not possible with MKL due to license constraints. It is a very >>> similar >>> situation overall. >>> >>> >>> Its not that similar, the better approach is certainly possible with >>> FFTW, the GPL is compatible with numpys license. It is only a >>> concern users of binary distributions. Nobody provided the code to >>> use fftw yet, but it would certainly be accepted. >>> >>> >>> Although it is technically compatible, it would make numpy effectively >>> GPL. Suggestions for this have been explicitly rejected on these >>> grounds [1] >>> >>> [1] https://github.com/numpy/numpy/issues/3485 >>> >>> >> Yes it would make numpy GPL, but that is not a concern for a lot of >> users. Users for who it is a problem can still use the non-GPL version. >> A more interesting debate is whether our binary wheels should then be GPL >> wheels by default or not. Probably not, but that is something that should >> be discussed when its an actual issue. >> >> But to clarify what I said, it would be accepted if the value it provides >> is sufficient compared to the code maintenance it adds. Given that pyfftw >> already exists the value is probably relatively small, but personally I'd >> still be interested in code that allows switching the fft backend as that >> could also allow plugging e.g. gpu based implementations (though again this >> is already covered by other third party modules). >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Robert McLeod, Ph.D. 
> Center for Cellular Imaging and Nano Analytics (C-CINA) > Biozentrum der Universit?t Basel > Mattenstrasse 26, 4058 Basel > Work: +41.061.387.3225 > robert.mcleod at unibas.ch > robert.mcleod at bsse.ethz.ch > robbmcleod at gmail.com > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 27 12:01:41 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 27 Oct 2016 18:01:41 +0200 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: <577d02ab-4fa7-d4fa-98a8-7e1227b7ebb2@googlemail.com> As I understand it the wiki is talking about including code in numpy/scipy itself, all code in numpy and scipy must be permissively licensed so it is easy to reason about when building your binaries. The license of the binaries produced from the code is a different matter, which at that time didn't really exist as we didn't distribute binaries at all (except for windows). A GPL licensed binary containing numpy is perfectly compatible with SciPy. It may not be compatible with some other component which has an actually incompatible license (e.g. anything you cannot distribute the source of as required by the GPL). I it is not numpy that is GPL licensed it is the restriction of another component in the binary distribution that makes the full product adhere to the most restrictive license But numpy itself is always permissive, the distributor can always build a permissive numpy binary without the viral component in it. On 10/27/2016 05:42 PM, Robert McLeod wrote: > Releasing NumPy under GPL would make it incompatible with SciPy, which > may be _slightly_ inconvenient to the scientific Python community: > > https://scipy.github.io/old-wiki/pages/License_Compatibility.html > > https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html > > Robert > > On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor > > > wrote: > > On 10/27/2016 04:52 PM, Todd wrote: > > On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor > > >> > wrote: > > On 10/27/2016 04:30 PM, Todd wrote: > > On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers > > > > >>> > wrote: > > > On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr > > > > > >>> wrote: > > Please see responses inline. > > > > *From:*NumPy-Discussion > [mailto:numpy-discussion-bounces at scipy.org > > > > > >>] *On Behalf Of *Todd > *Sent:* Wednesday, October 26, 2016 4:04 PM > *To:* Discussion of Numerical Python > > > > > >>> > *Subject:* Re: [Numpy-discussion] Intel random > number > package > > > > > On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr > > > > > > >>> > wrote: > > Another point already raised by Nathaniel is > that for > numpy's randomness ideally should provide a > way to > override > default algorithm for sampling from a particular > distribution. For example RandomState > object that > implements PCG may rely on default > acceptance-rejection > algorithm for sampling from Gamma, while the > RandomState > object that provides interface to MKL might > want to call > into MKL directly. 
> > > > The approach that pyfftw uses at least for > scipy, which > may also > work here, is that you can monkey-patch the > scipy.fftpack module > at runtime, replacing it with pyfftw's drop-in > replacement. > scipy then proceeds to use pyfftw instead of its > built-in > fftpack implementation. Might such an approach > work here? > Users can either use this alternative randomstate > replacement > directly, or they can replace numpy's with it at > runtime and > numpy will then proceed to use the alternative. > > > The only reason that pyfftw uses monkeypatching is > that the > better > approach is not possible due to license constraints with > FFTW (it's > GPL). > > > Yes, that is exactly why I brought it up. Better > approaches are > also > not possible with MKL due to license constraints. It is > a very > similar > situation overall. > > > Its not that similar, the better approach is certainly > possible with > FFTW, the GPL is compatible with numpys license. It is only a > concern users of binary distributions. Nobody provided the > code to > use fftw yet, but it would certainly be accepted. > > > Although it is technically compatible, it would make numpy > effectively > GPL. Suggestions for this have been explicitly rejected on these > grounds [1] > > [1] https://github.com/numpy/numpy/issues/3485 > > > > Yes it would make numpy GPL, but that is not a concern for a lot of > users. Users for who it is a problem can still use the non-GPL version. > A more interesting debate is whether our binary wheels should then > be GPL wheels by default or not. Probably not, but that is something > that should be discussed when its an actual issue. > > But to clarify what I said, it would be accepted if the value it > provides is sufficient compared to the code maintenance it adds. > Given that pyfftw already exists the value is probably relatively > small, but personally I'd still be interested in code that allows > switching the fft backend as that could also allow plugging e.g. gpu > based implementations (though again this is already covered by other > third party modules). > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Robert McLeod, Ph.D. 
> Center for Cellular Imaging and Nano Analytics (C-CINA) > Biozentrum der Universit?t Basel > Mattenstrasse 26, 4058 Basel > Work: +41.061.387.3225 > robert.mcleod at unibas.ch > robert.mcleod at bsse.ethz.ch > robbmcleod at gmail.com > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Thu Oct 27 12:12:58 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 27 Oct 2016 09:12:58 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Oct 27, 2016 8:42 AM, "Robert McLeod" wrote: > > Releasing NumPy under GPL would make it incompatible with SciPy, which may be _slightly_ inconvenient to the scientific Python community: > > https://scipy.github.io/old-wiki/pages/License_Compatibility.html > > https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html There's 0 chance that numpy is going to switch to the GPL in general, so please don't panic. Also, you're misunderstanding license compatibility, so let's back up a step :-). The discussion was about whether numpy might potentially, at some unspecified future date, be available with *optional* GPL code. A numpy build with optional GPL bits included would be similar to how the numpy builds that many people use which that are linked to MKL, and thus subject to MKL's license terms. In both cases the license is no longer numpy's regular bsd, but has these extra bits added. Neither changes the availability of bsd-licensed numpy; they just give another option. And, both numpy+GPL-bits and numpy+MKL-bits are/would be license *compatible* with scipy in the sense that matters to end users: you can absolutely use and distribute numpy+(pick one of the above)+scipy together, and the licenses are happy to allow that. The sense in which they're both *in*compatible with scipy is just that if you want to *add code to scipy itself*, then that code can't be GPL like pyfftw, or proprietary like MKL, because the scipy devs have decided that they don't want to allow that. That's a decision they've made for good reasons, but it isn't a legal inevitability, and it doesn't stop *you* from using and distributing scipy and GPL code together, or scipy and proprietary code together. (The real license incompatibility is between GPL and proprietary. Either one can be mixed with BSD, but they can't be mixed with each other and then distributed. Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship both MKL and pyfftw, and they picked MKL. Even then, though, this license restriction only applies to software distributors: if you as an end user go and install MKL and pyfftw together in the privacy of your own cluster, then that's also totally legal.) -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Thu Oct 27 13:45:24 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 27 Oct 2016 13:45:24 -0400 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith wrote: > Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship > both MKL and pyfftw, and they picked MKL. Anaconda does ship GPL code [1]. They even ship GPL code that depends on numpy, such as cvxcanon and pystan, and there doesn't seem to be anything that prevents me from installing them alongside the MKL version of numpy. So I don't see how it would be any different for pyfftw. [1] https://docs.continuum.io/anaconda/pkg-docs -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Oct 27 14:01:08 2016 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 Oct 2016 11:01:08 -0700 Subject: [Numpy-discussion] Intel random number package In-Reply-To: References: <83dbd49b-0086-006e-4e1d-85ccbf16d25d@googlemail.com> <82123478-aa40-2cf8-a96a-c7c24b4b8a95@googlemail.com> <4C9EDA7282E297428F3986994EB0FBD387BBBE@ORSMSX110.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD387BC1D@ORSMSX110.amr.corp.intel.com> Message-ID: On Thu, Oct 27, 2016 at 10:45 AM, Todd wrote: > > On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith wrote: >> >> Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship both MKL and pyfftw, and they picked MKL. > > Anaconda does ship GPL code [1]. They even ship GPL code that depends on numpy, such as cvxcanon and pystan, and there doesn't seem to be anything that prevents me from installing them alongside the MKL version of numpy. So I don't see how it would be any different for pyfftw. I think we've exhausted the relevance of this tangent to Oleksander's contributions. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Thu Oct 27 14:29:58 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Thu, 27 Oct 2016 14:29:58 -0400 Subject: [Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d In-Reply-To: References: <1467880459.17128.10.camel@sipsolutions.net> Message-ID: Hi, I would like to revitalize the discussion on including PR#7804 (atleast_nd function) at Stephan Hoyer's request. atleast_nd has come up as a convenient workaround for #8206 (adding padding options to diff) to be able to do broadcasting with the required dimensions reversed. Regards, -Joe On Mon, Jul 11, 2016 at 10:41 AM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > I would like to follow up on my original PR (7804). While there > appears to be some debate as to whether the PR is numpy material to > begin with, there do not appear to be any technical issues with it. To > make the decision more straightforward, I factored out the > non-controversial bug fixes to masked arrays into PR #7823, along with > their regression tests. This way, the original enhancement can be > closed or left hanging indefinitely, (even though I hope neither > happens). PR 7804 still has the bug fixes duplicated in it. 
> > Regards, > > -Joe > > > On Thu, Jul 7, 2016 at 9:11 AM, Joseph Fox-Rabinovitz > wrote: > > On Thu, Jul 7, 2016 at 4:34 AM, Sebastian Berg > > wrote: > >> On Mi, 2016-07-06 at 15:30 -0400, Benjamin Root wrote: > >>> I don't see how one could define a spec that would take an arbitrary > >>> array of indices at which to place new dimensions. By definition, you > >>> > >> > >> You just give a reordered range, so that (1, 0, 2) would be the current > >> 3D version. If 1D, fill in `1` and `2`, if 2D, fill in only `2` (0D, > >> add everything of course). > > > > I was originally thinking (-1, 0) for the 2D case. Just go along the > > list and fill as many dims as necessary. Your way is much better since > > it does not require a different operation for positive and negative > > indices. > > > >> However, I have my doubts that it is actually easier to understand then > >> to write yourself ;). > > > > A dictionary or ragged list would be better for that: either {1: (1, > > 0), 2: (2,)} or [(1, 0), (2,)]. The first is more clear since the > > index in the list is the starting ndim - 1. > > > >> > >> - Sebastian > >> > >> > >>> don't know how many dimensions are going to be added. If you knew, > >>> then you wouldn't be calling this function. I can only imagine simple > >>> rules such as 'left' or 'right' or maybe something akin to what > >>> at_least3d() implements. > >>> > >>> On Wed, Jul 6, 2016 at 3:20 PM, Joseph Fox-Rabinovitz >>> @gmail.com> wrote: > >>> > On Wed, Jul 6, 2016 at 2:57 PM, Eric Firing > >>> > wrote: > >>> > > On 2016/07/06 8:25 AM, Benjamin Root wrote: > >>> > >> > >>> > >> I wouldn't have the keyword be "where", as that collides with > >>> > the notion > >>> > >> of "where" elsewhere in numpy. > >>> > > > >>> > > > >>> > > Agreed. Maybe "side"? > >>> > > >>> > I have tentatively changed it to "pos". The reason that I don't > >>> > like > >>> > "side" is that it implies only a subset of the possible ways that > >>> > that > >>> > the position of the new dimensions can be specified. The current > >>> > implementation only puts things on one side or the other, but I > >>> > have > >>> > considered also allowing an array of indices at which to place new > >>> > dimensions, and/or a dictionary keyed by the starting ndims. I do > >>> > not > >>> > think "side" would be appropriate for these extended cases, even if > >>> > they are very unlikely to ever materialize. > >>> > > >>> > -Joe > >>> > > >>> > > (I find atleast_1d and atleast_2d to be very helpful for handling > >>> > inputs, as > >>> > > Ben noted; I'm skeptical as to the value of atleast_3d and > >>> > atleast_nd.) > >>> > > > >>> > > Eric > >>> > > > >>> > > _______________________________________________ > >>> > > NumPy-Discussion mailing list > >>> > > NumPy-Discussion at scipy.org > >>> > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > >>> > _______________________________________________ > >>> > NumPy-Discussion mailing list > >>> > NumPy-Discussion at scipy.org > >>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion > >>> > > >>> _______________________________________________ > >>> NumPy-Discussion mailing list > >>> NumPy-Discussion at scipy.org > >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > -------------- next part -------------- An HTML attachment was scrubbed... 
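To make the behaviour under discussion a bit more concrete, here is a minimal sketch of the kind of thing `atleast_nd` is meant to do. This is not the code from PR 7804, and the `side` keyword below is only a stand-in for whichever spelling ("side", "pos", or something richer) the discussion above settles on:

```python
import numpy as np

def atleast_nd(a, ndim, side='left'):
    """Rough sketch only: pad with size-1 axes until `a` has at least `ndim` dims."""
    a = np.asanyarray(a)
    missing = ndim - a.ndim
    if missing <= 0:
        return a
    pad = (1,) * missing
    shape = pad + a.shape if side == 'left' else a.shape + pad
    return a.reshape(shape)

atleast_nd(np.arange(3), 3).shape                # (1, 1, 3)
atleast_nd(np.arange(3), 3, side='right').shape  # (3, 1, 1)
atleast_nd(np.ones((2, 2)), 4).shape             # (1, 1, 2, 2)
```

The open design question in the thread is precisely how much more than "left" or "right" the position argument should be able to express.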
URL: From djxvillain at gmail.com Thu Oct 27 15:58:25 2016 From: djxvillain at gmail.com (djxvillain) Date: Thu, 27 Oct 2016 12:58:25 -0700 (MST) Subject: [Numpy-discussion] How to use user input as equation directly Message-ID: <1477598305150-43665.post@n7.nabble.com> Hello all, I am an electrical engineer and new to numpy. I need the ability to take in user input, and use that input as a variable. For example: t = input('enter t: ') x = input('enter x: ') I need the user to be able to enter something like x = 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just typed it in the .py file. There's no clean way to cast or evaluate it that I've found. I could make a function to parse this string character by character, but I figured this is probably a common problem and someone else has probably figured it out and created an object for it. I can't find a library that does it though. If I can provide any more information please let me know. Thank you in advance for your help. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From rmay31 at gmail.com Thu Oct 27 17:26:28 2016 From: rmay31 at gmail.com (Ryan May) Date: Thu, 27 Oct 2016 15:26:28 -0600 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: <1477598305150-43665.post@n7.nabble.com> References: <1477598305150-43665.post@n7.nabble.com> Message-ID: On Thu, Oct 27, 2016 at 1:58 PM, djxvillain wrote: > Hello all, > > I am an electrical engineer and new to numpy. I need the ability to take > in > user input, and use that input as a variable. For example: > > t = input('enter t: ') > x = input('enter x: ') > > I need the user to be able to enter something like x = > 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just typed > it in the .py file. There's no clean way to cast or evaluate it that I've > found. > Are you aware of Python's eval function: https://docs.python.org/3/library/functions.html#eval ? Ryan -- Ryan May -------------- next part -------------- An HTML attachment was scrubbed... URL: From djxvillain at gmail.com Thu Oct 27 16:06:44 2016 From: djxvillain at gmail.com (djxvillain) Date: Thu, 27 Oct 2016 13:06:44 -0700 (MST) Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: <1477598804771-43667.post@n7.nabble.com> That worked perfectly. I've been googling how to do this, I guess I didn't phrase it correctly. Thank you very much. You just saved me a ton of time. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665p43667.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From jladasky at itu.edu Thu Oct 27 17:33:03 2016 From: jladasky at itu.edu (John Ladasky) Date: Thu, 27 Oct 2016 14:33:03 -0700 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: <1477598305150-43665.post@n7.nabble.com> References: <1477598305150-43665.post@n7.nabble.com> Message-ID: This isn't just a Numpy issue. You are interested in Python's eval(). Keep in mind that any programming language that blurs the line between code and data (many do not) has a potential security vulnerability. 
What if your user doesn't type "x = 2*np.sin(2*np.pi*44100*t+np.pi/2)" but instead types this: "import os ; os.remove('/home')" I do NOT recommend that you eval() the second statement. You can try to write code which traps unwanted input before you eval() it. It's apparently quite hard to stop everything bad from getting through. On Thu, Oct 27, 2016 at 12:58 PM, djxvillain wrote: > Hello all, > > I am an electrical engineer and new to numpy. I need the ability to take > in > user input, and use that input as a variable. For example: > > t = input('enter t: ') > x = input('enter x: ') > > I need the user to be able to enter something like x = > 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just typed > it in the .py file. There's no clean way to cast or evaluate it that I've > found. > > I could make a function to parse this string character by character, but I > figured this is probably a common problem and someone else has probably > figured it out and created an object for it. I can't find a library that > does it though. > > If I can provide any more information please let me know. Thank you in > advance for your help. > > > > -- > View this message in context: http://numpy-discussion.10968. > n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *John J. Ladasky Jr., Ph.D.* *Research Scientist* *International Technological University* *2711 N. First St, San Jose, CA 95134 USA* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Oct 27 17:35:53 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Oct 2016 17:35:53 -0400 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: Perhaps the numexpr package might be safer? Not exactly meant for this situation (meant for optimizations), but the evaluator is pretty darn safe. Ben Root On Thu, Oct 27, 2016 at 5:33 PM, John Ladasky wrote: > This isn't just a Numpy issue. You are interested in Python's eval(). > > Keep in mind that any programming language that blurs the line between > code and data (many do not) has a potential security vulnerability. What > if your user doesn't type > > "x = 2*np.sin(2*np.pi*44100*t+np.pi/2)" > > but instead types this: > > "import os ; os.remove('/home')" > > I do NOT recommend that you eval() the second statement. > > You can try to write code which traps unwanted input before you eval() > it. It's apparently quite hard to stop everything bad from getting through. > > > On Thu, Oct 27, 2016 at 12:58 PM, djxvillain wrote: > >> Hello all, >> >> I am an electrical engineer and new to numpy. I need the ability to take >> in >> user input, and use that input as a variable. For example: >> >> t = input('enter t: ') >> x = input('enter x: ') >> >> I need the user to be able to enter something like x = >> 2*np.sin(2*np.pi*44100*t+np.pi/2) and it be the same as if they just >> typed >> it in the .py file. There's no clean way to cast or evaluate it that I've >> found. >> >> I could make a function to parse this string character by character, but I >> figured this is probably a common problem and someone else has probably >> figured it out and created an object for it. 
I can't find a library that >> does it though. >> >> If I can provide any more information please let me know. Thank you in >> advance for your help. >> >> >> >> -- >> View this message in context: http://numpy-discussion.10968. >> n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665.html >> Sent from the Numpy-discussion mailing list archive at Nabble.com. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > *John J. Ladasky Jr., Ph.D.* > *Research Scientist* > *International Technological University* > *2711 N. First St, San Jose, CA 95134 USA* > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From djxvillain at gmail.com Thu Oct 27 16:21:18 2016 From: djxvillain at gmail.com (djxvillain) Date: Thu, 27 Oct 2016 13:21:18 -0700 (MST) Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: <1477599678422-43670.post@n7.nabble.com> This will not be a public product and will only be used by other engineers/scientists for research. I don't think security should be a huge issue, but I appreciate your input and concern for the quality of my code. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665p43670.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ben.v.root at gmail.com Thu Oct 27 17:52:47 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Oct 2016 17:52:47 -0400 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: <1477599678422-43670.post@n7.nabble.com> References: <1477598305150-43665.post@n7.nabble.com> <1477599678422-43670.post@n7.nabble.com> Message-ID: "only be used by engineers/scientists for research" Famous last words. I know plenty of scientists who would love to "do research" with an exposed eval(). Full disclosure, I personally added a security hole into matplotlib thinking I covered all my bases in protecting an eval() statement. Ben Root On Thu, Oct 27, 2016 at 4:21 PM, djxvillain wrote: > This will not be a public product and will only be used by other > engineers/scientists for research. I don't think security should be a huge > issue, but I appreciate your input and concern for the quality of my code. > > > > -- > View this message in context: http://numpy-discussion.10968. > n7.nabble.com/How-to-use-user-input-as-equation-directly- > tp43665p43670.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Fri Oct 28 09:23:20 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 28 Oct 2016 14:23:20 +0100 Subject: [Numpy-discussion] padding options for diff In-Reply-To: References: Message-ID: Matthew has made what looks like a very nice implementation of padding in np.diff in https://github.com/numpy/numpy/pull/8206. 
I raised two general questions about desired behaviour there that Matthew thought we should put out on the mailiing list as well. This indeed seemed a good opportunity to get feedback, so herewith a copy of https://github.com/numpy/numpy/pull/8206#issuecomment-256909027 -- Marten 1. I'm not sure that treating a 1-d array as something that will just extend the result along `axis` is a good idea, as it breaks standard broadcasting rules. E.g., consider ``` np.diff([[1, 2], [4, 8]], to_begin=[1, 4]) # with your PR: array([[1, 4, 1], [1, 4, 4]]) # but from regular broadcasting I would expect array([[1, 1], [4, 4]]) # i.e., the same as if I did to_begin=[[1, 4]] ``` I think it is slightly odd to break the broadcasting expectation here, especially since the regular use case surely is just to add a single element so that one keeps the original shape. The advantage of assuming this is that you do not have to do *any* array shaping of `to_begin` and `to_end` (which perhaps also suggests it is the right thing to do). 2. As I mentioned above, I think it may be worth thinking through a little what to do with higher order differences, at least for `to_begin='first'`. If the goal is to ensure that with that option, it becomes the inverse of `cumsum`, then I think for higher order one should add multiple elements in front, i.e., for that case, the recursive call should be ``` return np.diff(np.diff(a, to_begin='first'), n-1, to_begin='first') ``` From bennyrowland at mac.com Fri Oct 28 09:29:31 2016 From: bennyrowland at mac.com (Ben Rowland) Date: Fri, 28 Oct 2016 09:29:31 -0400 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> <1477599678422-43670.post@n7.nabble.com> Message-ID: It is important to bear in mind where the code is being run - if this is something running on a researcher?s own system, they almost certainly have lots of other ways of messing it up. These kind of security vulnerabilities are normally only relevant when you are running code that came from somewhere else. That being said, this use case sounds like it could work with the Jupyter notebook. If you want something that is like typing code into a .py file but evaluated at run time instead, why not just use an interactive Python REPL instead of eval(input()). Ben > On 27 Oct 2016, at 17:52, Benjamin Root wrote: > > "only be used by engineers/scientists for research" > > Famous last words. I know plenty of scientists who would love to "do research" with an exposed eval(). Full disclosure, I personally added a security hole into matplotlib thinking I covered all my bases in protecting an eval() statement. > > Ben Root > > On Thu, Oct 27, 2016 at 4:21 PM, djxvillain > wrote: > This will not be a public product and will only be used by other > engineers/scientists for research. I don't think security should be a huge > issue, but I appreciate your input and concern for the quality of my code. > > > > -- > View this message in context: http://numpy-discussion.10968.n7.nabble.com/How-to-use-user-input-as-equation-directly-tp43665p43670.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. 
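Returning to point 2 of the np.diff questions above: `to_begin` is a keyword proposed in PR 8206 and does not exist in released numpy, but the "inverse of cumsum" property it is meant to provide can be sketched with plain concatenation (an illustration of the intended semantics, not of the PR's implementation):

```python
import numpy as np

a = np.array([3, 5, 9, 17])

# What np.diff(a, to_begin='first') is proposed to return: the first
# element kept in front of the first-order differences.
d = np.concatenate(([a[0]], np.diff(a)))   # array([3, 2, 4, 8])

# cumsum then recovers the original array, i.e. diff with 'first'
# prepended acts as the inverse of cumsum.
np.cumsum(d)                               # array([ 3,  5,  9, 17])
```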
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Fri Oct 28 10:18:05 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Fri, 28 Oct 2016 16:18:05 +0200 Subject: [Numpy-discussion] How to use user input as equation directly In-Reply-To: References: <1477598305150-43665.post@n7.nabble.com> Message-ID: On Thu, Oct 27, 2016 at 11:35 PM, Benjamin Root wrote: > Perhaps the numexpr package might be safer? Not exactly meant for this > situation (meant for optimizations), but the evaluator is pretty darn safe. > > It would not be able to evaluate something like 'np.arange(50)' for example, since it only has a limited subset of numpy functionality. In the example provided, that or linspace is likely the natural input for the variable 't'. -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 28 20:18:23 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Oct 2016 18:18:23 -0600 Subject: [Numpy-discussion] Numpy scalar integers to negative scalar integer powers. Message-ID: Hi All, I've put up a PR to deal with the numpy scalar integer powers at https://github.com/numpy/numpy/pull/8221. Note that for now everything goes through the np.power function. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 29 09:02:21 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Oct 2016 07:02:21 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ Message-ID: Hi All, Does anyone remember discussion of numpy scalars apropos __numpy_ufunc__? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Oct 29 21:03:10 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 29 Oct 2016 18:03:10 -0700 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: I'm happy to revisit the __numpy_ufunc__ discussion (I still want to see it happen!), but I don't recall scalars being a point of contention. The obvious thing to do with scalars would be to treat them the same as 0-dimensional arrays, though I might be missing some nuance... On Sat, Oct 29, 2016 at 6:02 AM, Charles R Harris wrote: > Hi All, > > Does anyone remember discussion of numpy scalars apropos __numpy_ufunc__? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed...
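As a small illustration of the "scalars as 0-d arrays" point above: as far as the computed result goes, the two are already interchangeable, which is why treating them the same in the override protocol seems natural (this only demonstrates the values, not the override machinery):

```python
import numpy as np

x = np.arange(3)
np.multiply(np.float64(2.0), x)   # array([0., 2., 4.])  -- numpy scalar
np.multiply(np.array(2.0), x)     # array([0., 2., 4.])  -- 0-d array, same result
```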
URL: From charlesr.harris at gmail.com Sat Oct 29 21:56:12 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Oct 2016 19:56:12 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Sat, Oct 29, 2016 at 7:03 PM, Stephan Hoyer wrote: > I'm happy to revisit the __numpy_ufunc__ discussion (I still want to see > it happen!), but I don't recall scalars being a point of contention. > The __numpy_ufunc__ functionality is the last bit I want for 1.12.0, the rest of the remaining changes I can kick forward to 1.13.0. I will start taking a look tomorrow, probably starting with Nathaniel's work. > > The obvious thing to do with scalars would be to treat them the same as > 0-dimensional arrays, though I might be missing some nuance... > That's my thought. Currently they just look at __array_priority__ and call the corresponding array method if needed, so that maybe needs some improvement and a formal statement of intent. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Oct 30 06:34:35 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 30 Oct 2016 10:34:35 +0000 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: > The __numpy_ufunc__ functionality is the last bit I want for 1.12.0, the > rest of the remaining changes I can kick forward to 1.13.0. I will start > taking a look tomorrow, probably starting with Nathaniel's work. Great! I'll revive the Quantity PRs that implement __numpy_ufunc__! -- Marten From vikramsingh001 at gmail.com Sun Oct 30 10:12:27 2016 From: vikramsingh001 at gmail.com (Vikram Singh) Date: Sun, 30 Oct 2016 16:12:27 +0200 Subject: [Numpy-discussion] Problem with compiling openacc with f2py In-Reply-To: References: Message-ID: Ok, I got it to compile using f2py -c -m --f90flags='-fopenmp -fopenacc -foffload=nvptx-none -foffload=-O3 -O3 -fPIC' hello hello.f90 -L/usr/local/cuda/lib64 -lcublas -lcudart -lgomp But now I get the import error, /home/Experiments/fortran_python/hello.cpython-35m-x86_64-linux-gnu.so: undefined symbol: __offload_func_table Seems to me I have to link another library. But where is __offload_func_table On Thu, Oct 27, 2016 at 1:30 PM, Vikram Singh wrote: > I am a newbie to f2py so I have been creating simple test cases. > Eventually I want to be able to use openacc subroutine from python. 
So > here's the test case > > module test > > use iso_c_binding, only: sp => C_FLOAT, dp => C_DOUBLE, i8 => C_INT > use omp_lib > use openacc > > implicit none > > contains > > subroutine add_acc (a, b, n, c) > integer(kind=i8), intent(in) :: n > real(kind=dp), intent(in) :: a(n) > real(kind=dp), intent(in) :: b(n) > real(kind=dp), intent(out) :: c(n) > > integer(kind=i8) :: i > > !$acc kernels > do i = 1, n > c(i) = a(i) + b(i) > end do > !$acc end kernels > > end subroutine add_acc > > subroutine add_omp (a, b, n, c) > integer(kind=i8), intent(in) :: n > real(kind=dp), intent(in) :: a(n) > real(kind=dp), intent(in) :: b(n) > real(kind=dp), intent(out) :: c(n) > > integer(kind=i8) :: i, j > > !$omp parallel do > do i = 1, n > c(i) = a(i) + b(i) > end do > !$omp end parallel do > > end subroutine add_omp > > subroutine nt (c) > integer(kind=i8), intent(out) :: c > > c = omp_get_max_threads() > > end subroutine nt > > subroutine mult (a, b, c) > real(kind=dp), intent(in) :: a > real(kind=dp), intent(in) :: b > real(kind=dp), intent(out) :: c > > c = a * b > > end subroutine mult > > end module test > > I compile using > > f2py -c -m --f90flags='-fopenacc -foffload=nvptx-none -foffload=-O3 > -O3 -fPIC' hello hello.f90 -L/usr/local/cuda/lib64 -lcublas -lcudart > -lgomp > > Now, until I add the acc directives everything works fine. But as soon > as I add the acc directives I get this error. > > gfortran:f90: /tmp/tmpld6ssow3/src.linux-x86_64-3.5/hello-f2pywrappers2.f90 > /home//Experiments/Nvidia/OpenACC/OLCFHack15/gcc6/install/bin/gfortran > -Wall -g -Wall -g -shared > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hellomodule.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/fortranobject.o > /tmp/tmpld6ssow3/hello.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hello-f2pywrappers2.o > -L/usr/local/cuda/lib64 -L/home//usr/local/miniconda/lib -lcublas > -lcudart -lgomp -lpython3.5m -lgfortran -o > ./hello.cpython-35m-x86_64-linux-gnu.so > /usr/bin/ld: /tmp/cc2yQ89d.target.o: relocation R_X86_64_32 against > `.rodata' can not be used when making a shared object; recompile with > -fPIC > /tmp/cc2yQ89d.target.o: error adding symbols: Bad value > collect2: error: ld returned 1 exit status > /usr/bin/ld: /tmp/cc2yQ89d.target.o: relocation R_X86_64_32 against > `.rodata' can not be used when making a shared object; recompile with > -fPIC > /tmp/cc2yQ89d.target.o: error adding symbols: Bad value > collect2: error: ld returned 1 exit status > error: Command "/home//Experiments/Nvidia/OpenACC/OLCFHack15/gcc6/install/bin/gfortran > -Wall -g -Wall -g -shared > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hellomodule.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/fortranobject.o > /tmp/tmpld6ssow3/hello.o > /tmp/tmpld6ssow3/tmp/tmpld6ssow3/src.linux-x86_64-3.5/hello-f2pywrappers2.o > -L/usr/local/cuda/lib64 -L/home//usr/local/miniconda/lib -lcublas > -lcudart -lgomp -lpython3.5m -lgfortran -o > ./hello.cpython-35m-x86_64-linux-gnu.so" failed with exit status 1 > > I don't get why just putting acc directives should create errors, when > omp does not. > > Vikram From m.h.vankerkwijk at gmail.com Mon Oct 31 13:08:22 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 31 Oct 2016 17:08:22 +0000 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Hi Chuck, I've revived my Quantity PRs that use __numpy_ufunc__ but is it correct that at present in *dev, one cannot use it? 
All the best, Marten From charlesr.harris at gmail.com Mon Oct 31 13:31:06 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 31 Oct 2016 11:31:06 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Mon, Oct 31, 2016 at 11:08 AM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Chuck, > > I've revived my Quantity PRs that use __numpy_ufunc__ but is it > correct that at present in *dev, one cannot use it? > It's not enabled yet. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 31 13:39:51 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 31 Oct 2016 10:39:51 -0700 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Recall that I think we wanted to rename this to __array_ufunc__, so we could change the function signature: https://github.com/numpy/numpy/issues/5986 I'm still a little nervous about this. Chunk -- what is your proposal for resolving the outstanding issues from https://github.com/numpy/numpy/issues/5844? On Mon, Oct 31, 2016 at 10:31 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Mon, Oct 31, 2016 at 11:08 AM, Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Hi Chuck, >> >> I've revived my Quantity PRs that use __numpy_ufunc__ but is it >> correct that at present in *dev, one cannot use it? >> > > It's not enabled yet. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 31 13:47:16 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 31 Oct 2016 11:47:16 -0600 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: On Mon, Oct 31, 2016 at 11:39 AM, Stephan Hoyer wrote: > Recall that I think we wanted to rename this to __array_ufunc__, so we > could change the function signature: https://github.com/numpy/ > numpy/issues/5986 > > I'm still a little nervous about this. Chunk -- what is your proposal for > resolving the outstanding issues from https://github.com/numpy/ > numpy/issues/5844? > We were pretty close. IIRC, the outstanding issue was some sort of override. At the developer meeting at scipy 2015 it was agreed that it would be easy to finish things up under the rubric "make Pauli happy". But that wasn't happening which is why I asked Nathaniel to disable it for 1.10.0. It is now a year later, things have cooled, and, IMHO, it is time to take another shot at it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Oct 31 18:37:47 2016 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 31 Oct 2016 22:37:47 +0000 Subject: [Numpy-discussion] __numpy_ufunc__ In-Reply-To: References: Message-ID: Hi Chuck, > We were pretty close. IIRC, the outstanding issue was some sort of override. Correct. With a general sentiment of those downstream that it would be great to merge in any form, as it will be really helpful! (Generic speedup of factor of 2 for computationally expensive ufuncs (sin, cos, etc.) that needs scaling in Quantity...) > At the developer meeting at scipy 2015 it was agreed that it would be easy > to finish things up under the rubric "make Pauli happy". 
That would certainly make me happy too! Other items that were brought up (trying to summarize from issues linked above, and links therein):

1. Remove index argument
2. Out always a tuple
3. Let ndarray have a __numpy_ufunc__ stub, so one can super it.

Here, the first item implied a possible name change (to __array_ufunc__); if that's too troublesome, I don't think it really hurts to have the argument, though it is somewhat "unclean" for the case that only the output has __numpy_ufunc__. All the best, Marten
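For readers trying to picture what such an override looks like in practice, here is a very rough sketch. It assumes the renamed hook (__array_ufunc__) with the index argument removed and out always passed as a tuple, i.e. the protocol as discussed above, and a numpy build in which the hook is actually enabled (which, as noted above, was not yet the case in *dev at the time). It is not astropy's Quantity implementation, just a toy scaled-array class standing in for that use case:

```python
import numpy as np

class ScaledArray:
    """Toy stand-in for Quantity: plain data plus a scale factor."""

    def __init__(self, value, scale=1.0):
        self.value = np.asarray(value)
        self.scale = scale

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Item 1 above: no index argument -- the signature is just
        # (ufunc, method, *inputs, **kwargs).
        # Unwrap ScaledArray inputs, folding the scale into the data;
        # a real implementation would also unwrap the `out` tuple (item 2)
        # and handle methods such as 'reduce'.
        raw = [x.value * x.scale if isinstance(x, ScaledArray) else x
               for x in inputs]
        result = getattr(ufunc, method)(*raw, **kwargs)
        return ScaledArray(result)

s = ScaledArray([1.0, 2.0], scale=3.0)
print(np.add(s, 1.0).value)   # [4. 7.] -- np.add dispatched to the override
print(np.sin(s).value)        # sin applied once to the already-scaled data
```

Whatever the final name and signature end up being, the body of such an override would look essentially the same; the speedup Marten mentions above presumably comes from being able to convert the inputs and call the ufunc in a single pass rather than fixing up the result afterwards.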